Bayesian Optimization with Parallel Function Evaluations and Multiple Information Sources: Methodology with Applications in Biochemistry, Aerospace Engineering, and Machine Learning
Bayesian optimization, a framework for the global optimization of expensive-to-evaluate functions, has recently gained popularity in machine learning and global optimization because it can find good feasible points with few function evaluations. In this dissertation, we present novel Bayesian optimization algorithms for problems with parallel function evaluations and multiple information sources, with applications in machine learning, biochemistry, and aerospace engineering.

First, we present a novel algorithm that extends expected improvement, a widely used Bayesian optimization algorithm that evaluates one point at a time, to settings with parallel function evaluations. This algorithm is based on a new, efficient solution method for finding the Bayes-optimal set of points to evaluate next in parallel Bayesian optimization. The author implemented this algorithm in an open-source software package co-developed with engineers at Yelp, which Yelp and Netflix used to automatically tune hyperparameters in machine learning algorithms and to choose parameters in online content-delivery systems based on evaluations in A/B tests on live traffic.

Second, we present a novel parallel Bayesian optimization algorithm with a worst-case approximation guarantee, applied to peptide optimization in biochemistry, where we face a large collection of peptides whose fitness is unknown prior to experimentation, and our goal is to identify high-scoring peptides using a small number of experiments. High-scoring peptides can be used for biolabeling, targeted drug delivery, and the self-assembly of metamaterials.
This problem has two novelties. First, unlike traditional Bayesian optimization, where the objective function has a continuous domain and a real-valued output well modeled by a Gaussian process, this problem has a discrete domain and a binary output not well modeled by a Gaussian process. Second, it uses hundreds of parallel function evaluations, a level of parallelism too large for previously proposed parallel Bayesian optimization methods.

Third, we present a novel Bayesian optimization algorithm for problems in which there are multiple methods or "information sources" for evaluating the objective function, each with its own bias, noise, and cost of evaluation. For example, in aerospace engineering, different computational models may simulate the performance of an aircraft wing design. Our algorithm learns the correlation and model discrepancy of each information source, and optimally chooses the information source to evaluate next and the point at which to evaluate it. We describe how this algorithm can be used in general multi-information-source optimization problems, and also how a related algorithm can be used in "warm start" problems, where we have results from previous optimizations of closely related objective functions and wish to leverage these results to more quickly optimize a new objective function.
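To make the expected-improvement criterion named above concrete, here is a minimal sketch of the standard single-point formula, assuming a Gaussian posterior N(mu, sigma^2) at a candidate point and a maximization objective. This is only the classical sequential criterion, not the dissertation's parallel extension, and the function name is our own.

```python
import math

def expected_improvement(mu, sigma, f_best):
    """Classical expected improvement E[max(f(x) - f_best, 0)] for
    maximization, where the surrogate posterior at x is N(mu, sigma^2)
    and f_best is the best objective value observed so far."""
    if sigma <= 0.0:
        # Degenerate posterior: improvement is deterministic.
        return max(mu - f_best, 0.0)
    z = (mu - f_best) / sigma
    # Standard normal CDF and PDF, written with math.erf to stay
    # dependency-free.
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - f_best) * cdf + sigma * pdf
```

The parallel algorithm in the dissertation optimizes the joint expected improvement of a whole batch of points at once; the formula above is the q = 1 special case it generalizes.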
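The multi-information-source idea of jointly choosing a source and a point can be illustrated with a toy value-per-cost rule: score each (point, source) pair by the posterior-variance reduction a noisy observation from that source would achieve, divided by the source's cost. This is a hypothetical stand-in for the dissertation's value-of-information criterion (which also models bias and correlation); all names here are illustrative.

```python
import math

def choose_source_and_point(candidates, sources, posterior):
    """Toy cost-aware selection of where and how to evaluate next.

    candidates: iterable of design points x.
    sources: dict mapping source name -> (noise_std, cost).
    posterior: callable x -> (mu, sigma) from some surrogate model.
    """
    best_pair, best_score = None, -math.inf
    for x in candidates:
        _, sigma = posterior(x)
        for name, (noise_std, cost) in sources.items():
            # Gaussian-update variance after one observation with noise
            # noise_std; the reduction sigma^4 / (sigma^2 + noise^2) is
            # what that observation is "worth" at x.
            reduction = sigma**4 / (sigma**2 + noise_std**2)
            score = reduction / cost
            if score > best_score:
                best_pair, best_score = (x, name), score
    return best_pair
```

Under this rule a cheap, noisy simulator is preferred whenever its per-dollar variance reduction beats that of an accurate but expensive one, which mirrors the trade-off the abstract describes.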
Operations research; Bayesian optimization; machine learning
Joachims, Thorsten; Clancy, Paulette
Ph.D. of Operations Research
Doctor of Philosophy
Attribution-ShareAlike 4.0 International
dissertation or thesis