Matrix factorization and Deep Learning in Scientific Domains: Understanding When and Why It Works

File(s)
Bjorck_cornellgrad_0058F_12891.pdf (18.23 MB)
Permanent Link(s)
https://doi.org/10.7298/8q68-0894
https://hdl.handle.net/1813/110845
Collections
Cornell Theses and Dissertations
Author
Bjorck, Nils Johan Bertil
Abstract

Propelled by large datasets and parallel compute accelerators, deep neural networks have recently demonstrated human-like performance in many domains previously beyond the reach of machines. Computers can now recognize objects in images, transcribe speech, and exhibit reading comprehension at the level of an average human. However, many domains require expert knowledge beyond that of the average person, e.g. medical diagnosis and scientific analysis. This raises the question: beyond average humans, can modern statistical models perform as well as domain experts? Attempting to answer this question, this thesis considers modern machine learning as applied to several expert domains. In particular, we consider problems related to sustainability, placing our work in the domain of computational sustainability, a nascent field at the intersection of computer science and sustainability. First, we consider using neural networks to identify invasive-species habitats from remote sensing images, showing that unsupervised learning can make use of sparse expert labels and cheap satellite images. Second, we consider passive acoustic monitoring of endangered animals and introduce a novel data-driven compression scheme for this setting. Third, we consider the application of non-negative matrix factorization (NMF) to spectroscopic datasets in materials science, and show how combining discrete and continuous optimization can yield solutions that accelerate scientific discovery. In expert domains such as these, empirical performance is not always enough: one needs to know when and why machine learning methods work in order to utilize model predictions. Paradoxically, most machine learning models are NP-hard to optimize, yet work remarkably well in practice. Inspired by our practical problems, we make empirical and theoretical contributions towards a principled understanding of when and why machine learning works.
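For context, the NMF problem mentioned above can be sketched as follows. This is a generic illustration, not the thesis's own algorithm: it uses the standard Lee–Seung multiplicative updates for the Frobenius objective on a small synthetic data matrix (all names and sizes here are illustrative assumptions).

```python
import numpy as np

def nmf(X, rank, n_iter=2000, seed=0, eps=1e-10):
    """Non-negative matrix factorization via multiplicative updates.

    Approximates X ~ W @ H with W, H >= 0, minimizing the Frobenius
    norm ||X - W H||_F (classic Lee-Seung update rules).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)  # update H with W fixed
        W *= (X @ H.T) / (W @ H @ H.T + eps)  # update W with H fixed
    return W, H

# Synthetic non-negative data with exact rank 2.
rng = np.random.default_rng(1)
X = rng.random((20, 2)) @ rng.random((2, 30))
W, H = nmf(X, rank=2)
err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

Because the updates are multiplicative and the factors start positive, non-negativity is preserved automatically; on exact low-rank data the relative reconstruction error becomes small, though the objective is non-convex in general, which is what motivates the average-case analysis described below.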
On the theoretical side, we introduce a randomized average-case model for NMF and prove that certain convexity properties arise naturally in this model. On the empirical side, we consider a popular method in deep learning, batch normalization, which cannot improve model expressivity yet improves performance in practice. We demonstrate that the improved conditioning this normalization confers enables larger learning rates, which in turn has a regularizing effect.
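The batch normalization operation discussed above can be sketched in a few lines; this is a minimal training-mode forward pass for illustration only (not the thesis's experimental setup), assuming a 2-D batch of activations and learnable per-feature parameters `gamma` and `beta`.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch normalization (training mode) for a (batch, features) array.

    Normalizes each feature to zero mean and unit variance over the
    batch, then applies a learnable affine map (gamma, beta).
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # whitened activations
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 4))  # poorly scaled input
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
```

Note that the affine parameters can in principle undo the normalization, which is why the layer adds no expressivity; the benefit claimed above comes instead from the better-conditioned optimization landscape it induces.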

Description
145 pages
Date Issued
2021-12
Committee Chair
Gomes, Carla P.
Selman, Bart
Committee Member
Agarwal, Rachit
van Dover, R. B.
Weinberger, Kilian Quirin
Degree Discipline
Computer Science
Degree Name
Ph. D., Computer Science
Degree Level
Doctor of Philosophy
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International
Rights URI
https://creativecommons.org/licenses/by-nc-nd/4.0/
Type
dissertation or thesis
Link(s) to Catalog Record
https://newcatalog.library.cornell.edu/catalog/15312701