Sequential Decision Making With Resource Constraints
Badanidiyuru Varadaraja, Ashwinkumar
In sequential decision making, an algorithm interacts with an environment and learns from the feedback on its past actions. A model for sequential decision making with partial feedback is the multi-armed bandit problem. This model has found applications in a diverse set of problems, such as sequential design of experiments (including medical decision making), learning click-through rates in search engines, economic theory, and network routing. We study a fundamental feature of many of these applications: the presence of one or more limited-supply resources that are consumed during the decision process. Existing literature lacked general models for this feature and offered only limited treatment of such problems. We propose models that capture many of these applications and give tight performance guarantees.
Kleinberg, Robert David
Gehrke, Johannes E.; Tardos, Eva
Ph.D. of Computer Science
Doctor of Philosophy
dissertation or thesis