Machine Learning Methods for Data-driven Decision Making: Contextual Optimization, Causal Inference, and Algorithmic Fairness
Recent advances in machine learning (ML) hold much promise for using data to drive more effective decisions. However, many challenges remain to realizing this decision-making potential, given the limitations of predictive algorithms and imperfections in the available data. This thesis investigates these critical challenges in the areas of data-driven optimization, causal inference, and algorithmic fairness, and develops fundamental theory and new ML methods.

In Part I, we focus on data-driven optimization involving both uncertain quantities of interest (e.g., demand) and predictive contextual features (e.g., product characteristics). Chapters 2 and 3 study stochastic optimization problems and investigate two popular paradigms: an "estimate-then-optimize" paradigm, which first uses standard ML methods to predict the distribution of the uncertain quantities given contextual features and then plugs the distributional predictions into a stochastic optimization problem to solve for decisions; and an "end-to-end" paradigm, which integrates prediction and decision making by directly training predictive models to target good decisions. In Chapter 2, we develop an end-to-end stochastic optimization forest algorithm that constructs decision trees to directly optimize decision quality, which we show provides significant benefits for decision making over building trees that target prediction accuracy. In Chapter 3, we reveal a more nuanced landscape for the integration of estimation and optimization by identifying many common settings where end-to-end approaches can actually have much slower regret-convergence rates than the far simpler estimate-then-optimize approach. Chapter 4 considers the online setting, where we collect more data as we make decisions, and studies nonparametric contextual bandits with smooth expected-reward functions. We develop a novel algorithm that leverages this smoothness structure and show that its regret rate is minimax optimal.
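To make the contrast between the two Part I paradigms concrete, here is a minimal sketch on a toy newsvendor problem. Everything here — the data-generating process, the linear decision rule, and all names — is an illustrative assumption, not taken from the thesis itself.

```python
# Toy contrast of "estimate-then-optimize" vs. "end-to-end" on a
# newsvendor problem (illustrative assumptions throughout).
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
n, cu, co = 2000, 4.0, 1.0                       # underage / overage unit costs
x = rng.uniform(0.0, 1.0, n)                     # contextual feature
demand = 10 * x + rng.normal(0.0, 1.0 + 2 * x)   # heteroskedastic demand
X = np.column_stack([np.ones(n), x])

def avg_cost(order):
    """Empirical newsvendor cost of an order vector."""
    return np.mean(cu * np.maximum(demand - order, 0.0)
                   + co * np.maximum(order - demand, 0.0))

# Estimate-then-optimize: fit a conditional-mean model, assume
# homoskedastic Gaussian noise, then order at the critical fractile.
beta, *_ = np.linalg.lstsq(X, demand, rcond=None)
sd = np.std(demand - X @ beta)
z = NormalDist().inv_cdf(cu / (cu + co))         # critical-fractile quantile
eto_cost = avg_cost(X @ beta + z * sd)

# End-to-end: choose the linear ordering rule that directly minimizes
# empirical newsvendor cost (a pinball loss), via subgradient descent.
theta = np.zeros(2)
for _ in range(3000):
    grad = X.T @ np.where(demand > X @ theta, -cu, co) / n
    theta -= 0.05 * grad
e2e_cost = avg_cost(X @ theta)
```

Because the noise here is heteroskedastic, the plug-in homoskedastic model places its order quantile incorrectly per context, while the end-to-end rule minimizes the decision cost directly, so `e2e_cost` comes out no larger than `eto_cost` in sample.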
Our regret analysis reveals the full spectrum of relationships between regret in contextual bandits and the smoothness of the reward function, recovering existing results for Lipschitz and parametric reward functions at the two extremes.

In Part II, we study causal inference with observational data in complex settings. In Chapter 5, we consider the estimation of and inference on complex causal parameters, such as quantile treatment effects, whose efficient estimation requires learning nuisance functions that depend on the parameter itself. We propose a localized debiased machine learning approach that avoids this complex dependence and relies only on simple nuisance-function estimation that can be easily outsourced to standard ML algorithms. The resulting estimators are not only practically feasible but also theoretically grounded, with asymptotically optimal distributions under weak conditions. In Chapter 6, we tackle the common challenge that some confounders cannot be measured exactly and only noisy proxy observations of them are available. We propose using matrix factorization to infer confounders from noisy proxies and then estimating causal effects based on the inferred confounders. This provides a flexible and principled framework that adapts to missing values, accommodates many data types, and can enhance a wide variety of causal inference methods.

In Part III, we tackle a prevalent challenge in assessing the fairness of decision-making algorithms with respect to a protected class (e.g., race or ethnicity): the protected class is often unobserved in practice. In Chapter 7, we analyze the bias of proxy methods that impute class labels with ML algorithms and that have been extensively applied in consumer-finance and healthcare contexts. This is the first rigorous analysis of how such proxy methods can lead to biased disparity assessments.
In Chapter 8, we prove that exactly measuring decision disparity without class labels is fundamentally impossible, and propose algorithms that estimate and visualize the tightest possible set of true disparity values consistent with the observed data. Our proposal thus provides a robust and reliable fairness-auditing tool that fully accounts for the inherent ambiguity in disparity assessment due to missing protected classes.
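As a stylized illustration of the Part III partial-identification idea, the classical Fréchet-Hoeffding inequalities already show how a disparity can be bounded, but not pinned down, when only the marginals of the decision and a binary protected class are observed. This sketch is a simplification with made-up numbers; it is not the thesis's algorithm, which further tightens such sets using proxy data.

```python
# Stylized sketch: without joint observations of a binary decision Yhat
# and a binary protected class A, the demographic disparity is only
# set-identified. The marginal probabilities below are invented.
def disparity_bounds(p_d, p_a):
    """Bounds on P(Yhat=1|A=a) - P(Yhat=1|A=b) given only the marginals
    p_d = P(Yhat=1) and p_a = P(A=a), via Frechet-Hoeffding bounds on
    the unobserved joint probability q = P(Yhat=1, A=a)."""
    q_lo = max(0.0, p_d + p_a - 1.0)
    q_hi = min(p_d, p_a)

    def disparity(q):
        # Disparity as a function of q; increasing in q, so the
        # endpoints of [q_lo, q_hi] give the tightest interval.
        return q / p_a - (p_d - q) / (1.0 - p_a)

    return disparity(q_lo), disparity(q_hi)

lo, hi = disparity_bounds(p_d=0.3, p_a=0.4)   # -> (-0.5, 0.75)
```

With these made-up marginals, the true disparity could be anywhere from -0.5 to 0.75 — a wide interval that even leaves the sign ambiguous, which is exactly why an auditing tool must report the whole identified set rather than a single point estimate.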
Algorithmic Fairness; Causal Inference; Data-driven Decision Making; Data-driven Optimization; Machine Learning
Udell, Madeleine Richards; Frazier, Peter; Joachims, Thorsten
Ph.D., Statistics
Doctor of Philosophy
dissertation or thesis