eCommons

Learning from Less: Improving and Understanding Model Selection in Penalized Machine Learning Problems

dc.contributor.author: Seto, Skyler
dc.contributor.chair: Wells, Martin Timothy
dc.contributor.committeeMember: Wilson, Andrew Gordon
dc.contributor.committeeMember: Joachims, Thorsten
dc.date.accessioned: 2021-03-12T17:40:39Z
dc.date.available: 2022-08-27T06:00:26Z
dc.date.issued: 2020-08
dc.description: 142 pages
dc.description.abstract: Model selection is the task of selecting a "good" model from a set of candidate models given data. In machine learning, it is important that models fit the training data well; however, it is more important for a model to generalize to unseen data. Additionally, it is desirable for a model to be as small as possible, allowing for deployment in low-resource settings. In this thesis, we focus on three problems whose structure benefits from reducing the size of the models. In the first problem setting, we explore word embedding models, propose a framework that connects popular word embedding methods to low-rank matrix models, and suggest new models for computing word vectors. In the second problem setting, we explore sparsity-inducing penalties for deep neural networks in order to obtain highly sparse networks that perform competitively with their over-parametrized counterparts. Finally, we explore the task of robot planar pushing and propose a novel penalty that adapts the parameters of a neural network to unseen examples, allowing robots to better interact in unseen environments. Our results on simulations and real-world data applications indicate that penalization is effective for learning models that perform well in settings with less computational budget, storage, or labeled data.
dc.identifier.doi: https://doi.org/10.7298/fz7r-bz15
dc.identifier.other: Seto_cornellgrad_0058F_12148
dc.identifier.other: http://dissertations.umi.com/cornellgrad:12148
dc.identifier.uri: https://hdl.handle.net/1813/103002
dc.language.iso: en
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Deep Learning
dc.subject: Machine Learning
dc.subject: Matrix Factorization
dc.subject: Model Selection
dc.subject: Penalization
dc.subject: Regularization
dc.title: Learning from Less: Improving and Understanding Model Selection in Penalized Machine Learning Problems
dc.type: dissertation or thesis
dcterms.license: https://hdl.handle.net/1813/59810
thesis.degree.discipline: Statistics
thesis.degree.grantor: Cornell University
thesis.degree.level: Doctor of Philosophy
thesis.degree.name: Ph.D., Statistics
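
Note: the abstract describes penalization (sparsity-inducing penalties) as the common thread across the three problem settings. As a minimal, hedged illustration of that idea, and not code taken from the thesis, the sketch below fits an L1-penalized least-squares (lasso) model by proximal gradient descent; the names lasso_ista, soft_threshold, and lambda_reg are hypothetical.

    # Minimal sketch (illustration only, not from the dissertation):
    # L1-penalized least squares solved by proximal gradient descent (ISTA).
    import numpy as np

    def soft_threshold(v, tau):
        # Proximal operator of tau * ||.||_1 (soft-thresholding).
        return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

    def lasso_ista(X, y, lambda_reg=0.1, n_iters=500):
        # Minimize 0.5 * ||X w - y||^2 + lambda_reg * ||w||_1.
        n, d = X.shape
        w = np.zeros(d)
        # Step size from the Lipschitz constant of the smooth term.
        step = 1.0 / np.linalg.norm(X, ord=2) ** 2
        for _ in range(n_iters):
            grad = X.T @ (X @ w - y)  # gradient of the least-squares term
            w = soft_threshold(w - step * grad, step * lambda_reg)
        return w

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.standard_normal((100, 20))
        w_true = np.zeros(20)
        w_true[:3] = [2.0, -1.5, 1.0]          # sparse ground truth
        y = X @ w_true + 0.01 * rng.standard_normal(100)
        w_hat = lasso_ista(X, y, lambda_reg=0.5)
        print("nonzero coefficients:", int(np.sum(np.abs(w_hat) > 1e-3)))

The penalty drives most coefficients exactly to zero, which is the same mechanism the abstract invokes for obtaining small models that still generalize.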

Files

Original bundle
Name: Seto_cornellgrad_0058F_12148.pdf
Size: 8.91 MB
Format: Adobe Portable Document Format