Cornell University
Library

eCommons


Threats and Countermeasures in Machine Learning Applications

File(s)
Guo_cornellgrad_0058F_11933.pdf (8.25 MB)
Permanent Link(s)
https://doi.org/10.7298/7c50-kr44
https://hdl.handle.net/1813/70425
Collections
Cornell Theses and Dissertations
Author
Guo, Chuan
Abstract

Machine learning, as a technique for automatically constructing programs from past data to make predictions on future data, has been adopted in many diverse application areas. Like traditional programs written by expert software engineers, these machine-learned programs are subject to safety requirements when applied in sensitive scenarios. If an adversary gains some degree of knowledge about the internal details of a machine-learned model or its training method, the model's behavior can no longer be guaranteed; worse, the adversary may be able to alter the model's behavior at will. Throughout the training and deployment pipeline, an adversary may leverage vulnerabilities in the training procedure or model architecture in several ways. First, an adversary may exploit the assumption that training and test data are identically distributed: at test time, inputs to the model can be carefully constructed to fall outside the natural data distribution and deliberately cause the system to malfunction. Second, during training, an adversary may inject secret functionality into the model and then release it publicly or sell it to another party; at test time, the injected functionality serves as a backdoor that lets the adversary manipulate the deployed system. Third, training data is often memorized, in full or in part, due to outlier samples or excess model capacity; these memorized samples may be extracted from the model's parameters or predictions, granting an adversary unauthorized access to a private training database. Our goal in this thesis is to expose these potential hazards in real-world applications, understand the root causes of these loopholes, and devise, or provide insight into, potential solutions.
We focus on three main topics: (i) black-box attacks against the predictions of machine learning models; (ii) embedding Trojan horse models into a benign transport model; and (iii) provable removal of training data from a model to protect data privacy. We hope that by publicly discussing these safety concerns of machine learning, we can raise awareness and encourage ongoing research to improve the security and privacy of learning systems.
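The first threat above — carefully constructing an input that falls outside the natural data distribution to cause a misprediction — can be illustrated with the classic fast-gradient-sign construction. This is a minimal sketch, not code from the thesis (which studies the harder black-box setting); the tiny logistic-regression model, its weights, and the perturbation budget `eps` are all illustrative assumptions.

```python
import numpy as np

def predict(w, b, x):
    """Logistic-regression score P(y=1 | x) for weights w, bias b."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def fgsm_perturb(w, b, x, eps):
    """Shift x by eps in the sign of the loss gradient w.r.t. the input.
    For logistic regression with true label y=1, the gradient of the
    negative log-likelihood w.r.t. x is (p - 1) * w."""
    p = predict(w, b, x)
    grad = (p - 1.0) * w               # d(loss)/dx for label y = 1
    return x + eps * np.sign(grad)     # worst-case step within an L-inf ball

# Illustrative model and input (assumed values, not from the thesis).
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.5])               # clean input, scored as class 1

x_adv = fgsm_perturb(w, b, x, eps=1.0)
print(predict(w, b, x) > 0.5)          # True:  clean input classified as 1
print(predict(w, b, x_adv) > 0.5)      # False: perturbed input flips the label
```

The point of the sketch is that a small, gradient-guided perturbation suffices to cross the decision boundary; the thesis's black-box attacks achieve the same effect without direct access to the gradient.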

Description
134 pages
Date Issued
2020-05
Keywords
machine learning • privacy • security
Committee Chair
Weinberger, Kilian
Committee Member
Sridharan, Karthik
Joachims, Thorsten
Degree Discipline
Computer Science
Degree Name
Ph. D., Computer Science
Degree Level
Doctor of Philosophy
Type
dissertation or thesis
Link(s) to Catalog Record
https://catalog.library.cornell.edu/catalog/13254330
