Data Dependent Random Projections

Kang, Keegan

Files

Kang_cornellgrad_0058F_10216.pdf (8.01 MB)

Permanent Link(s)

https://doi.org/10.7298/X4XG9P8R

https://hdl.handle.net/1813/51575

Collections

Cornell Theses and Dissertations

Author(s)

Kang, Keegan

Abstract

Random projection is a technique used primarily in dimension reduction to estimate distances in data. It can be thought of as a linear transformation mapping a data matrix X to a lower-dimensional space, where distances are preserved in expectation. However, the preservation of distances can be thought of as a stepping stone to some eventual goal, such as classification, hypothesis testing, information retrieval, or even reconstructing the principal components of data. In this thesis, I first give the background of the basic random projection algorithm. Next, I look at the structure of random projection matrices and propose modifications that result in more accurate estimation of distances, which would help in information retrieval and the reconstruction of principal components. Finally, I show that it is possible to combine Monte Carlo variance reduction methods with random projections to improve the accuracy of distance estimates, which can then be used in an algorithm or procedure of the user's choice. Theoretical justifications are given, and empirical results are shown on synthetic data and in experiments on publicly available datasets.
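
As an illustration of the claim that distances are preserved in expectation, the following is a minimal sketch of a basic random projection. It assumes a projection matrix with i.i.d. standard normal entries scaled by 1/sqrt(k), which is a common textbook choice and not necessarily the construction used in the thesis. The function names, the dimensions p and k, and the number of trials are all hypothetical and chosen only for illustration.

```python
import math
import random


def random_projection_matrix(p, k, rng):
    """Return a p x k matrix with i.i.d. N(0, 1) entries (assumed choice)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(p)]


def project(x, R):
    """Map a length-p vector x to k dimensions: v = x R / sqrt(k)."""
    k = len(R[0])
    scale = 1.0 / math.sqrt(k)
    return [scale * sum(x[i] * R[i][j] for i in range(len(x))) for j in range(k)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


rng = random.Random(0)
p, k = 20, 10  # original and reduced dimensions (illustrative values)
x = [rng.uniform(-1.0, 1.0) for _ in range(p)]
y = [rng.uniform(-1.0, 1.0) for _ in range(p)]
true_d2 = squared_distance(x, y)

# A single projection gives a noisy estimate of the squared distance;
# averaging over independent matrices shows it is unbiased.
trials = 1000
estimates = []
for _ in range(trials):
    R = random_projection_matrix(p, k, rng)
    estimates.append(squared_distance(project(x, R), project(y, R)))
mean_estimate = sum(estimates) / trials

print("true squared distance:", true_d2)
print("mean projected estimate:", mean_estimate)
```

Any single estimate can deviate noticeably from the true value when k is small. That variance is what the matrix modifications and Monte Carlo variance reduction methods described in the abstract aim to reduce.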

Date Issued

2017-05-30

Keywords

information retrieval; random projections; Computer science; Statistics; control variates

Committee Chair

Hooker, Giles J

Committee Member

Sridharan, Karthik
Mimno, David

Degree Discipline

Statistics

Degree Name

Ph. D., Statistics

Degree Level

Doctor of Philosophy

Rights

Attribution-ShareAlike 4.0 International

Rights URI

https://creativecommons.org/licenses/by-sa/4.0/

Types

dissertation or thesis
