eCommons

 

PRACTICAL AND THEORETICAL ADVANCES IN CONSTRAINED BAYESIAN OPTIMIZATION AND BAYESIAN OPTIMIZATION FOR MACHINE LEARNING SYSTEMS

Other Titles

Author(s)

Abstract

Recent advances in computationally efficient non-myopic Bayesian optimization (BO) improve query efficiency over traditional myopic methods like expected improvement while only modestly increasing computational cost. These advances have been largely limited, however, to unconstrained optimization. For constrained optimization, the few existing non-myopic BO methods require heavy computation. For instance, one existing non-myopic constrained BO method [1] relies on computationally expensive unreliable brute-force derivative-free optimization of a Monte Carlo rollout acquisition function. Methods that use the reparameterization trick for more efficient derivative-based optimization of non-myopic acquisition functions in the unconstrained setting, like sample average approximation and infinitesimal perturbation analysis, do not extend: constraints introduce discontinuities in the sampled acquisition function surface that hinder its optimization. Moreover, we argue here that being non-myopic is even more important in constrained problems because fear of violating constraints pushes myopic methods away from sampling the boundary between feasible and infeasible regions, slowing the discovery of optimal solutions with tight constraints. In this work, we propose a computationally efficient two-step lookahead constrained Bayesian optimization acquisition function (2-OPT-C) supporting both sequential and batch settings. To enable fast acquisition function optimization, we develop a novel likelihood-ratio-based unbiased estimator of the gradient of the two-step optimal acquisition function that does not use the reparameterization trick. In numerical experiments, 2-OPT-C typically improves query efficiency by 2x or more over previous methods, and in some cases by 10x or more. Recent advances in Bayesian optimization with constraints (CBO) have significantly improved the sample query efficiency compared to the standard CBO algorithm, constrained expected improvement (EIC). Although the EIC is first proposed by [2] and rediscovered by [3], which is more than twenty years ago, there is no work focusing on the theoretical aspect of the EIC algorithm, especially regarding its consistency property. In this work, we show that EIC is inconsistent. In detail, we construct a counterexample where both the objective and constraint functions are piece-wise linear, and the Gaussian process priors are Wiener processes. Moreover, to overcome the inconsistency of EIC, we propose a new algorithm named Constrained Expected Improvement with Perturbation (EIC-P). We prove that EIC-P is consistent in the setting of reproducing kernel Hilbert space. Machine learning systems, consisting of various models, have shown superiority over single-model approaches in both academia and industry. However, tuning the hyperparameters of these systems is challenging. First, machine learning systems usually have numerous hyperparameters. Moreover, due to the interaction between models in the system, the hyperparameters of upstream models might also affect the downstream models. In this paper, we first provide a formal mathematical definition of a machine learning (ML) system. Also, we formulate the evaluation of an ML system and its components as a network of evaluation functions. Moreover, we provide extensive guidance on building effective Bayesian optimization function networks (BOFN) for the evaluation function network including evaluation metric design. Then, we investigate the efficacy of standard and grey-box [4] Bayesian optimization (BO) algorithms for tuning hyperparameters for machine learning systems. The experiment results demonstrate that even if we utilize BOFN with simple structures to leverage partial information within ML systems regarding models' qualities, BOFN can still improve sample efficiency or compare favorably to standard BO methods.

Journal / Series

Volume & Issue

Description

145 pages

Sponsorship

Date Issued

2024-05

Publisher

Keywords

Location

Effective Date

Expiration Date

Sector

Employer

Union

Union Local

NAICS

Number of Workers

Committee Chair

Frazier, Peter

Committee Co-Chair

Committee Member

Scheinberg, Katya
Bindel, David

Degree Discipline

Operations Research and Information Engineering

Degree Name

Ph. D., Operations Research and Information Engineering

Degree Level

Doctor of Philosophy

Related Version

Related DOI

Related To

Related Part

Based on Related Item

Has Other Format(s)

Part of Related Item

Related To

Related Publication(s)

Link(s) to Related Publication(s)

References

Link(s) to Reference(s)

Previously Published As

Government Document

ISBN

ISMN

ISSN

Other Identifiers

Rights

Rights URI

Types

dissertation or thesis

Accessibility Feature

Accessibility Hazard

Accessibility Summary

Link(s) to Catalog Record