Cornell University Library

eCommons

A Scalable and Flexible Framework for Gaussian Processes via Matrix-Vector Multiplication

File(s)
Pleiss_cornellgrad_0058F_12063.pdf (4.15 MB)
Permanent Link(s)
https://doi.org/10.7298/9m1e-te84
https://hdl.handle.net/1813/102953
Collections
Cornell Theses and Dissertations
Author
Pleiss, Geoff
Abstract

Gaussian processes (GPs) exhibit a classic tension of many machine learning methods: they possess desirable modelling capabilities yet suffer from important practical limitations. In many settings, GPs offer well-calibrated uncertainty estimates, interpretable predictions, and the ability to encode prior knowledge. These properties have made them an indispensable tool for black-box optimization, time series forecasting, and high-risk applications like health care. Despite these benefits, GPs are typically not applied to datasets with more than a few thousand data points. This is in part due to an inference procedure that requires matrix inverses, determinants, and other expensive operations. Moreover, specialty models often require significant implementation effort. This thesis aims to alleviate these practical concerns through a single simple design decision. Taking inspiration from neural network libraries, we construct GP inference algorithms using only matrix-vector multiplications (MVMs) and other linear operations. This MVM-based approach simultaneously addresses several of these practical concerns: it reduces asymptotic complexity, effectively utilizes GPU hardware, and provides straightforward implementations for many specialty GP models. The chapters of this thesis each address a different aspect of Gaussian process inference. Chapter 3 introduces an MVM method for training Gaussian process regression models (i.e. optimizing kernel/likelihood hyperparameters). This approach unifies several existing methods into a highly parallel and stable algorithm. Chapter 4 focuses on making predictions with Gaussian processes. A memory-efficient cache, which can be computed through MVMs, significantly reduces the cost of computing predictive distributions. Chapter 5 introduces a multi-purpose MVM algorithm that can be used to draw samples from GP posteriors and perform approximate Gaussian process inference.

All three of these methods offer speedups ranging from 4x to 40x. Importantly, applying any of these algorithms to specialty models (e.g. multitask GPs and scalable approximations) simply requires a matrix-vector multiplication routine that exploits the covariance structure afforded by the model. The MVM methods from this thesis form the building blocks of the GPyTorch library, an open-source GP implementation designed for scalability and ease of implementation. In the final chapter, we evaluate GPyTorch models on several large-scale regression datasets. Using the proposed MVM methods, we can apply exact Gaussian processes to datasets that are two orders of magnitude larger than previously reported: up to 1 million data points.
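To make the abstract's central idea concrete, the sketch below solves the GP training system (K + sigma^2 I)^{-1} y with the conjugate gradient method, touching the kernel matrix only through matrix-vector products. This is a minimal NumPy illustration of the MVM principle, not the thesis's actual GPyTorch implementation; all function names, parameters, and data here are illustrative.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    # Squared-exponential (RBF) kernel matrix for 1-D inputs X.
    sq_dists = (X[:, None] - X[None, :]) ** 2
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def conjugate_gradients(mvm, b, tol=1e-8, max_iters=200):
    # Solve A x = b given only a routine mvm(v) that returns A @ v.
    # A itself is never inverted or factorized.
    x = np.zeros_like(b)
    r = b - mvm(x)          # residual
    p = r.copy()            # search direction
    rs = r @ r
    for _ in range(max_iters):
        Ap = mvm(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy regression data (illustrative).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=50)
y = np.sin(X) + 0.1 * rng.standard_normal(50)

noise = 0.1
K_hat = rbf_kernel(X) + noise * np.eye(50)  # K + sigma^2 I

# The solve (K + sigma^2 I)^{-1} y, the workhorse of GP training and
# prediction, uses K_hat only through matrix-vector products.
solve = conjugate_gradients(lambda v: K_hat @ v, y)
print(np.linalg.norm(K_hat @ solve - y))  # residual norm, near zero
```

Because the solver sees the kernel only through `mvm`, swapping in a structured covariance (e.g. one admitting a fast multiply) changes nothing but that one routine, which is the flexibility argument the abstract makes for specialty models.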

Description
212 pages
Date Issued
2020-08
Keywords
Bayesian nonparametrics • Conjugate gradients • Gaussian processes • GPU acceleration • Krylov subspaces • Regression
Committee Chair
Weinberger, Kilian Quirin
Committee Member
Sridharan, Karthik
Wilson, Andrew Gordon
Degree Discipline
Computer Science
Degree Name
Ph.D., Computer Science
Degree Level
Doctor of Philosophy
Rights
Attribution 4.0 International
Rights URI
https://creativecommons.org/licenses/by/4.0/
Type
dissertation or thesis
Link(s) to Catalog Record
https://catalog.library.cornell.edu/catalog/13277779
