Cornell University Library

eCommons

Robust and Geometry-Aware Machine Learning Using Optimal Transport

File(s)
Nietert_cornellgrad_0058F_15236.pdf (3.52 MB)
No Access Until
2026-03-09
Permanent Link(s)
https://doi.org/10.7298/ympf-7t60
https://hdl.handle.net/1813/120919
Collections
Cornell Theses and Dissertations
Author
Nietert, Sloan
Abstract

Modern machine learning relies on large, high-dimensional datasets with rich geometric structure. As the reach of these methods expands, so does the risk posed by data poisoning attacks and incorrect modeling assumptions. To address these concerns, my research provides statistical and computational guarantees for robust and geometry-aware learning algorithms. An overarching formalism for this thesis is that of optimal transport (OT), a powerful framework for metric-based comparisons and transformations between high-dimensional data distributions. My work employs tools from OT, high-dimensional statistics, and online algorithms, with applications including generative modeling, distributionally robust optimization, and dynamic pricing. In such data-driven settings, underlying population distributions are unknown and must be estimated from finite samples. This has driven the development of a wide and growing literature on statistical OT. In Part I, I discuss my contributions to this field, focusing on fundamental statistical questions like the estimation of transport maps and the characterization of limit distributions. In Part II, I explore the interplay between OT and robust statistics. First, I examine how the OT landscape shifts when a constant fraction of data are contaminated, characterizing the precise impact of such contamination and developing robust estimators based on partial transport and partial alignment. Then, I introduce a more flexible framework for data corruption, based on OT, which additionally supports local perturbations of all data points. Here, we extend recent tools from algorithmic robust statistics and distributionally robust optimization to develop efficient, robust algorithms for distribution estimation and stochastic optimization. Finally, in Part III, I present a distinct line of work on robust geometric search in online environments. In both settings considered, our algorithm's feedback is determined by the responses of strategic agents. We develop geometric tools for reasoning about these agents' states and robust learning algorithms to estimate and respond to them appropriately.

Description
473 pages
Date Issued
2025-08
Committee Chair
Goldfeld, Ziv
Committee Member
Kleinberg, Robert
Acharya, Jayadev
Degree Discipline
Computer Science
Degree Name
Ph. D., Computer Science
Degree Level
Doctor of Philosophy
Type
dissertation or thesis
