Matrix Factorization and Deep Learning in Scientific Domains: Understanding When and Why It Works

Abstract

Propelled by large datasets and parallel compute accelerators, deep neural networks have recently demonstrated human-like performance in many domains previously beyond the reach of machines. Computers can now recognize objects in images, transcribe speech, and exhibit reading comprehension at the level of an average human. However, many domains require expert knowledge beyond that of the average person, e.g. medical diagnosis and scientific analysis. This raises the question: beyond average humans, can modern statistical models perform as well as domain experts?

In an attempt to answer this question, this thesis considers modern machine learning as applied to several expert domains. In particular, we consider problems related to sustainability, placing our work in the domain of computational sustainability, a nascent field at the intersection of computer science and sustainability. First, we use neural networks to identify invasive species habitats from remote sensing images, showing that unsupervised learning can make use of sparse expert labels and cheap satellite imagery. Second, we consider passive acoustic monitoring of endangered animals and introduce a novel data-driven compression scheme for this setting. Third, we apply non-negative matrix factorization (NMF) to spectroscopic datasets in materials science and show how combining discrete and continuous optimization can yield solutions that accelerate scientific discovery.

In expert domains such as these, empirical performance is not always enough: one needs to know when and why machine learning methods work in order to utilize model predictions. Paradoxically, most machine learning models are NP-hard to optimize, yet they work remarkably well in practice. Motivated by our practical problems, we make empirical and theoretical contributions toward a principled understanding of when and why machine learning works.
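The NMF contribution above seeks non-negative factors W and H with V ≈ WH, where the rows of V are measured spectra. As a minimal, self-contained sketch of the underlying optimization problem (the classic multiplicative updates of Lee and Seung, not the thesis's combined discrete/continuous method):

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=500, seed=0):
    """Factor a non-negative matrix V (m x n) as W @ H with W, H >= 0,
    using multiplicative updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k)) + 1e-3
    H = rng.random((k, n)) + 1e-3
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update H with W fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update W with H fixed
    return W, H

# Toy "spectroscopy-like" data: each row of V is a non-negative
# mixture of k = 3 hidden basis spectra.
rng = np.random.default_rng(1)
V = rng.random((30, 3)) @ rng.random((3, 40))

W, H = nmf_multiplicative(V, k=3)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {err:.4f}")
```

The updates keep both factors element-wise non-negative by construction, which is why they are a standard baseline for spectral unmixing.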
On the theoretical side, we introduce a randomized average-case model for NMF and prove that certain convexity properties arise naturally in this model. On the empirical side, we study a popular method in deep learning, batch normalization, which cannot improve model expressivity yet improves performance in practice. We demonstrate that the improved conditioning conferred by this normalization enables larger learning rates, which in turn have a regularizing effect.
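As a minimal NumPy illustration of what batch normalization computes (a sketch only, not the thesis's experimental setup): normalizing each feature of a mini-batch removes wildly different feature scales, the kind of mismatch that makes the loss surface ill-conditioned for gradient descent.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature of a mini-batch to zero mean and unit
    variance, then apply an affine transform (gamma, beta), which is
    learnable in an actual network."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Badly scaled activations: features differ by orders of magnitude.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 3)) * np.array([1.0, 100.0, 0.05])

y = batch_norm(x)
print(y.mean(axis=0))  # each feature's mean is ~0 after normalization
print(y.std(axis=0))   # each feature's std is ~1 after normalization
```

After normalization every feature lives on the same scale regardless of its input scale, which is the conditioning improvement the abstract refers to.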


145 pages


Committee Chair

Gomes, Carla P.
Selman, Bart

Committee Member

Agarwal, Rachit
van Dover, R. B.
Weinberger, Kilian Quirin

Degree Discipline

Computer Science

Degree Name

Ph.D., Computer Science

Degree Level

Doctor of Philosophy


License

Attribution-NonCommercial-NoDerivatives 4.0 International

Type

dissertation or thesis
