Threats and Countermeasures in Machine Learning Applications
Machine learning, the practice of automatically constructing programs from past data to make future predictions, has been adopted across many diverse application areas. Like traditional programs written by expert software engineers, these learned programs are subject to safety requirements when deployed in sensitive scenarios. If an adversary gains even partial knowledge of a model's internal details or its training method, the model's behavior can no longer be guaranteed; worse, the adversary may alter that behavior at will.

Throughout the training and deployment pipeline, an adversary may exploit vulnerabilities in the training procedure or model architecture in several ways. First, an adversary may exploit the assumption that training and test data are identically distributed: at test time, inputs to the model can be carefully constructed to fall outside the natural data distribution and deliberately cause the system to malfunction. Second, during training, an adversary may inject hidden functionality into a model and then release it publicly or sell it to another party; at test time, the injected functionality serves as a backdoor that lets the adversary manipulate the deployed system. Third, models often memorize their training data, in full or in part, due to outlier samples or excess capacity; these memorized samples can be extracted from the model's parameters or its predictions, granting an adversary unauthorized access to a private training database.

Our goal in this thesis is to expose these potential hazards in real-world applications, understand the root causes of these loopholes, and devise, or provide insight into, potential solutions.
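To make the first threat concrete, consider how an input can be pushed off the natural data distribution. The sketch below is a minimal white-box illustration of a one-step gradient-sign perturbation (in the style of the fast gradient sign method) applied to a toy logistic-regression model; the thesis itself studies the harder black-box setting, and the function name and parameters here are hypothetical, chosen only for exposition.

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps):
    """One-step gradient-sign perturbation of input x against a
    logistic-regression model with weights w and bias b.

    For the logistic loss, the gradient with respect to the input is
    (sigmoid(w.x + b) - y) * w; moving x a small step eps in the
    sign of that gradient increases the loss on the true label y.
    """
    p = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))  # model's predicted probability
    grad_x = (p - y) * w                            # d(loss)/dx for logistic loss
    return x + eps * np.sign(grad_x)                # adversarially perturbed input

# Example: a correctly classified point is flipped by a small perturbation.
w, b = np.array([2.0, -1.0]), 0.0
x, y = np.array([1.0, 1.0]), 1.0      # model score w.x + b = 1.0 > 0, so predicted 1
x_adv = fgsm_perturb(x, w, b, y, eps=0.5)
```

Here the perturbed input `x_adv` crosses the decision boundary (its score becomes negative) even though it differs from `x` by at most 0.5 per coordinate, illustrating how a small, deliberate shift off the training distribution changes the prediction.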
We will focus on three main topics: (i) black-box attacks against the predictions of machine learning models; (ii) embedding Trojan horse models into a benign transport model; and (iii) provable removal of training data from a model to protect data privacy. We hope that by discussing these possible safety concerns publicly, we can raise awareness and encourage ongoing research to improve the security and privacy of learning systems.
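To give intuition for topic (iii), one way removal can be made provable is to keep the model's parameters as decomposable statistics of the training set, so that deleting a point's contribution leaves exactly the model that would have been trained without it. The toy nearest-class-mean classifier below is a hedged sketch of this idea (class and method names are hypothetical), not the method developed in the thesis.

```python
import numpy as np

class MeanClassifier:
    """Toy nearest-class-mean classifier whose parameters are running
    per-class sums and counts. Because each training point contributes
    additively, it can be removed exactly: after remove(), the state is
    identical to retraining from scratch without that point."""

    def __init__(self, dim, n_classes):
        self.sums = np.zeros((n_classes, dim))  # per-class feature sums
        self.counts = np.zeros(n_classes)       # per-class sample counts

    def add(self, x, y):
        self.sums[y] += x
        self.counts[y] += 1

    def remove(self, x, y):
        # Exact unlearning: subtract the point's additive contribution.
        self.sums[y] -= x
        self.counts[y] -= 1

    def predict(self, x):
        means = self.sums / np.maximum(self.counts, 1)[:, None]
        return int(np.argmin(np.linalg.norm(means - x, axis=1)))
```

Most models trained by iterative optimization do not decompose this cleanly, which is why provable removal from general learned models is a nontrivial research problem.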