Sequential Resource Allocation Under Uncertainty: An Index Policy Approach
We consider a class of stochastic sequential allocation problems - restless multi-armed bandits (RMAB) with a finite horizon and multiple pulls per period. Leveraging the Lagrangian relaxation of the problem, we propose an index-based policy that uses the optimal Lagrange multipliers to index individual arms, and prove that the policy is asymptotically optimal as the number of arms tends to infinity. We also demonstrate numerically that this index-based policy outperforms state-of-the-art heuristics in several instances of RMAB. In addition, we study two other applications of sequential resource allocation problems which are extensions of the RMAB problem, and demonstrate how our index policy can be adapted to these settings.
Index-based Policy; Restless Bandit; Sequential Resource Allocation; Stochastic Dynamic Program; Operations research
Topaloglu, Huseyin; Joachims, Thorsten
Ph. D., Operations Research
Doctor of Philosophy
dissertation or thesis