eCommons

Learning from Less: Improving and Understanding Model Selection in Penalized Machine Learning Problems

dc.contributor.author: Seto, Skyler
dc.contributor.chair: Wells, Martin Timothy
dc.contributor.committeeMember: Wilson, Andrew Gordon
dc.contributor.committeeMember: Joachims, Thorsten
dc.date.accessioned: 2021-03-12T17:40:39Z
dc.date.available: 2022-08-27T06:00:26Z
dc.date.issued: 2020-08
dc.description: 142 pages
dc.description.abstract: Model selection is the task of selecting a "good" model from a set of candidate models given data. In machine learning, it is important that models fit the training data well; however, it is more important for a model to generalize to unseen data. Additionally, it is desirable for a model to be as small as possible, allowing for deployment in low-resource settings. In this thesis, we focus on three problems whose structure benefits from reducing the size of the models. In the first problem setting, we explore word embedding models, propose a framework that connects popular word embedding methods to low-rank matrix models, and suggest new models for computing word vectors. In the second problem setting, we explore sparsity-inducing penalties for deep neural networks in order to obtain highly sparse networks that perform competitively with their over-parametrized counterparts. Finally, we explore the task of robot planar pushing and propose a novel penalty that adapts the parameters of a neural network to unseen examples, allowing robots to better interact in unseen environments. Our results on simulations and real-world data applications indicate that penalization is effective for learning models that perform well in settings with less computational budget, storage, or labeled data.
dc.identifier.doi: https://doi.org/10.7298/fz7r-bz15
dc.identifier.other: Seto_cornellgrad_0058F_12148
dc.identifier.other: http://dissertations.umi.com/cornellgrad:12148
dc.identifier.uri: https://hdl.handle.net/1813/103002
dc.language.iso: en
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Deep Learning
dc.subject: Machine Learning
dc.subject: Matrix Factorization
dc.subject: Model Selection
dc.subject: Penalization
dc.subject: Regularization
dc.title: Learning from Less: Improving and Understanding Model Selection in Penalized Machine Learning Problems
dc.type: dissertation or thesis
dcterms.license: https://hdl.handle.net/1813/59810
thesis.degree.discipline: Statistics
thesis.degree.grantor: Cornell University
thesis.degree.level: Doctor of Philosophy
thesis.degree.name: Ph.D., Statistics
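
Note: the abstract describes penalization (sparsity-inducing penalties) as the common thread across the three problem settings. As a minimal, hedged illustration of that idea, and not code taken from the thesis, the sketch below fits an L1-penalized least-squares (lasso) model by proximal gradient descent; the names lasso_ista, soft_threshold, and lambda_reg are hypothetical.

    # Minimal sketch (illustration only, not from the dissertation):
    # L1-penalized least squares solved by proximal gradient descent (ISTA).
    import numpy as np

    def soft_threshold(v, tau):
        # Proximal operator of tau * ||.||_1 (soft-thresholding).
        return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

    def lasso_ista(X, y, lambda_reg=0.1, n_iters=500):
        # Minimize 0.5 * ||X w - y||^2 + lambda_reg * ||w||_1.
        n, d = X.shape
        w = np.zeros(d)
        # Step size from the Lipschitz constant of the smooth term.
        step = 1.0 / np.linalg.norm(X, ord=2) ** 2
        for _ in range(n_iters):
            grad = X.T @ (X @ w - y)  # gradient of the least-squares term
            w = soft_threshold(w - step * grad, step * lambda_reg)
        return w

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.standard_normal((100, 20))
        w_true = np.zeros(20)
        w_true[:3] = [2.0, -1.5, 1.0]          # sparse ground truth
        y = X @ w_true + 0.01 * rng.standard_normal(100)
        w_hat = lasso_ista(X, y, lambda_reg=0.5)
        print("nonzero coefficients:", int(np.sum(np.abs(w_hat) > 1e-3)))

The penalty drives most coefficients exactly to zero, which is the same mechanism the abstract invokes for obtaining small models that still generalize.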

Files

Original bundle
Name: Seto_cornellgrad_0058F_12148.pdf
Size: 8.91 MB
Format: Adobe Portable Document Format