Off-policy Evaluation and Learning for Interactive Systems

dc.contributor.author: Su, Yi
dc.contributor.chair: Joachims, Thorsten
dc.contributor.committeeMember: Sridharan, Karthik
dc.contributor.committeeMember: Kallus, Nathan
dc.description: 206 pages
dc.description.abstract: Recent advances in reinforcement learning (RL) provide exciting potential for making agents learn, plan, and act effectively in uncertain environments. Most existing RL algorithms rely on known environments or on the existence of a good simulator, where it is cheap to explore and collect training data. However, this is not the case for human-centered interactive systems, in which online sampling or experimentation is costly, dangerous, or even illegal. This dissertation advocates an alternative data-driven approach that aims to evaluate and improve the performance of intelligent systems using only the logged data from prior versions of the system (a.k.a. off-policy evaluation and learning). While such data is collected in large quantities as a byproduct of system operation, reasoning about it is difficult because the data is biased and partial in nature. We present our key contributions in off-policy evaluation and learning for the contextual bandit setting, a state-less form of RL that is highly relevant to many real-world applications. These contributions include the discovery of a general family of counterfactual estimators for off-policy evaluation, which subsumes most estimators proposed to date; a principled optimization-based framework for automatically designing estimators, instead of manually constructing them; a data-driven model selection technique in off-policy policy evaluation settings; and various approaches for handling support-deficient data in the off-policy learning setting.
dc.rights: Attribution 4.0 International
dc.title: Off-policy Evaluation and Learning for Interactive Systems
dc.type: dissertation or thesis
dcterms.license: University of Philosophy D., Statistics


Original bundle: 1 file, Adobe Portable Document Format, 4.48 MB