Cornell University
Library
Cornell UniversityLibrary

eCommons

Help
Log In(current)
  1. Home
  2. Cornell University Graduate School
  3. Cornell Theses and Dissertations
  4. PROVABLE AND PRACTICAL ALGORITHMS FOR NON-CONVEX PROBLEMS IN MACHINE LEARNING

PROVABLE AND PRACTICAL ALGORITHMS FOR NON-CONVEX PROBLEMS IN MACHINE LEARNING

File(s)
Yuan_cornellgrad_0058F_10809.pdf (2.43 MB)
Permanent Link(s)
https://doi.org/10.7298/X40R9MMS
https://hdl.handle.net/1813/59273
Collections
Cornell Theses and Dissertations
Author
Yuan, Yang
Abstract

Machine learning has become one of the most exciting research areas in the world, with various applications. However, there exists a noticeable gap between theory and practice. On one hand, a simple algorithm like stochastic gradient descent (SGD) works very well in practice, without satisfactory theoretical explanations. On the other hand, the algorithms analyzed in the theoretical machine learning literature, although with solid guarantees, tend to be less efficient compared with the techniques widely used in practice, which are usually hand tuned or ad hoc based on intuition. This dissertation is about bridging the gap between theory and practice from two directions. The first direction is "practice to theory", i.e., to explain and analyze the existing algorithms and empirical observations in machine learning. Along this direction, we provide sufficient conditions for SGD to escape saddle points and local minima, as well as SGD dynamics analysis for the two-layer neural network with ReLU activation. The other direction is "theory to practice", i.e., using theoretical tools to obtain new, better and practical algorithms. Along this direction, we introduce a new algorithm Harmonica that uses Fourier analysis and compressed sensing for tuning hyperparameters. Harmonica supports parallel sampling and works well for tuning neural networks with more than 30 hyperparameters.

Date Issued
2018-05-30
Keywords
Artificial intelligence
•
Computer science
•
Hyperparameter tuning
•
Local minima
•
Non-convex optimization
•
Saddle points
•
Stochastic Gradient Descent
•
machine learning
Committee Chair
Kleinberg, Robert David
Committee Member
Kozen, Dexter Campbell
Joachims, Thorsten
Degree Discipline
Computer Science
Degree Name
Ph. D., Computer Science
Degree Level
Doctor of Philosophy
Rights
Attribution 4.0 International
Rights URI
https://creativecommons.org/licenses/by/4.0/
Type
dissertation or thesis

Site Statistics | Help

About eCommons | Policies | Terms of use | Contact Us

copyright © 2002-2026 Cornell University Library | Privacy | Web Accessibility Assistance