Data Dependent Random Projections

File(s)
Kang_cornellgrad_0058F_10216.pdf (8.01 MB)
Permanent Link(s)
https://doi.org/10.7298/X4XG9P8R
https://hdl.handle.net/1813/51575
Collections
Cornell Theses and Dissertations
Author
Kang, Keegan
Abstract

Random projection is a technique used primarily in dimension reduction to estimate distances in data. It can be thought of as a linear transformation mapping a data matrix X to a lower-dimensional space, where distances are preserved in expectation. However, the preservation of distances can be thought of as a stepping stone to some eventual goal, such as classification, hypothesis testing, information retrieval, or even reconstructing principal components of data. In this thesis, I first give the background of the basic random projection algorithm. Next, I look at the structure of random projection matrices and propose modifications that yield more accurate distance estimates, which would help in information retrieval and in the reconstruction of principal components. Finally, I show that it is possible to combine Monte Carlo variance reduction methods with random projections to improve the accuracy of distance estimates, which can then be used in an algorithm or procedure of the user's choice. Theoretical justifications are given, and empirical results are shown on synthetic data and in experiments on publicly available datasets.
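
As a rough illustration of the basic idea only (not the modified projections or the control variate estimators proposed in the thesis), the sketch below projects a data matrix X with a random Gaussian matrix and checks that the squared distance between two rows is recovered on average. The Gaussian entries, the 1/sqrt(k) scaling, the dimensions, and the names `random_projection` and `estimate_sq_distance` are all assumptions made for this example.

```python
import numpy as np


def random_projection(X, k, rng):
    """Map the rows of X (n x p) to k dimensions.

    Assumption: the projection matrix R has i.i.d. N(0, 1) entries and the
    result is scaled by 1/sqrt(k), so squared distances are unbiased.
    """
    p = X.shape[1]
    R = rng.standard_normal((p, k))
    return X @ R / np.sqrt(k)


def estimate_sq_distance(V, i, j):
    """Squared Euclidean distance between rows i and j of the projected data."""
    diff = V[i] - V[j]
    return float(diff @ diff)


rng = np.random.default_rng(0)

# Synthetic data: 5 points in 1000 dimensions (hypothetical sizes).
X = rng.standard_normal((5, 1000))
true_sq = float(np.sum((X[0] - X[1]) ** 2))

# Repeat the projection many times; each run uses a fresh random matrix.
estimates = [
    estimate_sq_distance(random_projection(X, 50, rng), 0, 1)
    for _ in range(2000)
]
mean_estimate = float(np.mean(estimates))

print(f"true squared distance:   {true_sq:.2f}")
print(f"mean projected estimate: {mean_estimate:.2f}")
```

A single projection to k = 50 dimensions gives a noisy estimate; only the average over many independent projections settles near the true value. That spread in individual estimates is the kind of variance that variance reduction methods aim to shrink.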

Date Issued
2017-05-30
Keywords
information retrieval
random projections
Computer science
Statistics
control variates
Committee Chair
Hooker, Giles J
Committee Member
Sridharan, Karthik
Mimno, David
Degree Discipline
Statistics
Degree Name
Ph. D., Statistics
Degree Level
Doctor of Philosophy
Rights
Attribution-ShareAlike 4.0 International
Rights URI
https://creativecommons.org/licenses/by-sa/4.0/
Type
dissertation or thesis
