PRACTICAL AND THEORETICAL ADVANCES IN CONSTRAINED BAYESIAN OPTIMIZATION AND BAYESIAN OPTIMIZATION FOR MACHINE LEARNING SYSTEMS
Abstract
Recent advances in computationally efficient non-myopic Bayesian optimization (BO) improve query efficiency over traditional myopic methods like expected improvement while only modestly increasing computational cost. These advances have, however, been largely limited to unconstrained optimization. For constrained optimization, the few existing non-myopic BO methods require heavy computation. For instance, one existing non-myopic constrained BO method [1] relies on computationally expensive and unreliable brute-force derivative-free optimization of a Monte Carlo rollout acquisition function. Methods that use the reparameterization trick for efficient derivative-based optimization of non-myopic acquisition functions in the unconstrained setting, such as sample average approximation and infinitesimal perturbation analysis, do not extend to the constrained setting: constraints introduce discontinuities in the sampled acquisition surface that hinder its optimization. Moreover, we argue that being non-myopic is even more important in constrained problems, because the fear of violating constraints pushes myopic methods away from sampling near the boundary between feasible and infeasible regions, slowing the discovery of optimal solutions when constraints are tight. In this work, we propose a computationally efficient two-step lookahead constrained Bayesian optimization acquisition function (2-OPT-C) that supports both sequential and batch settings. To enable fast acquisition-function optimization, we develop a novel likelihood-ratio-based unbiased estimator of the gradient of the two-step optimal acquisition function that does not rely on the reparameterization trick. In numerical experiments, 2-OPT-C typically improves query efficiency by 2x or more over previous methods, and in some cases by 10x or more.

Recent advances in Bayesian optimization with constraints (CBO) have significantly improved sample efficiency over the standard CBO algorithm, constrained expected improvement (EIC). Although EIC was first proposed by [2] and rediscovered by [3] more than twenty years ago, no prior work has examined its theoretical properties, in particular its consistency. In this work, we show that EIC is inconsistent: we construct a counterexample in which both the objective and constraint functions are piecewise linear and the Gaussian process priors are Wiener processes. To overcome this inconsistency, we propose a new algorithm, Constrained Expected Improvement with Perturbation (EIC-P), and prove that it is consistent in the reproducing kernel Hilbert space setting.

Machine learning systems, which consist of multiple models, have shown superiority over single-model approaches in both academia and industry. However, tuning the hyperparameters of these systems is challenging: they usually have numerous hyperparameters, and, because the models in a system interact, the hyperparameters of upstream models can also affect downstream models. In this work, we first provide a formal mathematical definition of a machine learning (ML) system and formulate the evaluation of an ML system and its components as a network of evaluation functions. We also provide extensive guidance, including evaluation metric design, on building effective Bayesian optimization of function networks (BOFN) approaches for this evaluation-function network.
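As a minimal illustration of this formulation (the notation here is illustrative, not taken from the dissertation): for a two-stage system in which an upstream model with hyperparameters x_1 feeds a downstream model with hyperparameters x_2, the system-level metric can be written as a composition of stage-level evaluation functions,

h(x_1, x_2) = f_2(f_1(x_1), x_2),

where f_1 returns the upstream model's quality and f_2 maps that quality, together with the downstream hyperparameters, to the final evaluation metric. Standard BO models h as a single black box, whereas a function-network approach also models f_1 and f_2 individually.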
We then investigate the efficacy of standard and grey-box [4] Bayesian optimization (BO) algorithms for tuning the hyperparameters of machine learning systems. The experimental results demonstrate that even when BOFN uses simple structures to leverage partial information about model quality within an ML system, it can still improve sample efficiency over, or compare favorably to, standard BO methods.
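To make the setting concrete, below is a small, hypothetical Python sketch of the kind of evaluation-function network such a tuner could query; the stage names, hyperparameters, and quality functions are illustrative stand-ins, not the dissertation's implementation or benchmarks.

# Hypothetical two-stage ML system expressed as a network of evaluation
# functions. All names and quality functions are illustrative toys.

def eval_feature_extractor(n_components: int) -> float:
    """Toy upstream metric: quality of the learned representation."""
    # Peaks near 32 components and degrades on either side.
    return max(0.0, 1.0 - abs(n_components - 32) / 64.0)

def eval_classifier(repr_quality: float, regularization: float) -> float:
    """Toy downstream metric: depends on the upstream output as well as
    on the classifier's own hyperparameter."""
    return repr_quality * max(0.0, 1.0 - (regularization - 0.1) ** 2)

def eval_system(n_components: int, regularization: float) -> float:
    """System-level objective. Standard BO observes only this value; a
    function-network approach can also use the intermediate upstream
    quality as partial information."""
    upstream_quality = eval_feature_extractor(n_components)
    return eval_classifier(upstream_quality, regularization)

if __name__ == "__main__":
    print(eval_system(n_components=16, regularization=0.2))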
Committee Chair
Committee Co-Chair
Committee Member
Bindel, David