
dc.contributor.author Chen, Bangrui
dc.identifier.other bibid: 10474153
dc.description.abstract In this thesis, we study adaptive preference learning, in which a machine learning system learns users' preferences from feedback while simultaneously using these learned preferences to help them find preferred items. We study three different types of user feedback in three application settings: cardinal feedback with application in information filtering systems, ordinal feedback with application in personalized content recommender systems, and attribute feedback with application in review aggregators. We connect these settings respectively to existing work on classical multi-armed bandits, dueling bandits, and incentivizing exploration. For each type of feedback and application setting, we provide an algorithm and a theoretical analysis bounding its regret. We demonstrate through numerical experiments that our algorithms outperform existing benchmarks.
dc.subject Operations research
dc.subject Computer science
dc.subject adaptive preference learning
dc.subject bandit feedback
dc.subject dueling bandits
dc.subject incentivizing exploration
dc.subject information filtering
dc.subject multi-armed bandits
dc.title Adaptive Preference Learning With Bandit Feedback: Information Filtering, Dueling Bandits and Incentivizing Exploration
dc.type dissertation or thesis (Ph.D., Operations Research)
dc.contributor.chair Frazier, Peter
dc.contributor.committeeMember Topaloglu, Huseyin
dc.contributor.committeeMember Joachims, Thorsten
