Dynamic Resource Management For Systems With Controllable Service Capacity
The rise in the Internet traffic volumes has led to a growing interest in reducing energy costs of IT infrastructure. Resource management policies for such systems, known as power aware policies, are becoming increasingly important. We propose a dynamic capacity management framework for such systems to save energy costs without sacrificing quality of service. The system incurs two types of costs; a holding cost which reflects the Quality of Service (QoS) experienced by the users, and a cost of effort (Utilization/Energy cost) based on the amount of resources being used. The key challenge for the service provider is balancing these two objectives, as using more resources improves the system performance but also leads to higher utilization costs. This tension between delivering good performance and reducing energy costs is the central feature of the models considered in this work. In order to make good capacity allocation decisions for such scenarios, we formulate two queueing control problems using the Markov Decision Process framework. We first consider the problem of service rate control of a single-server queueing system with a finite-state Markov-modulated Poisson arrival process. We show that the optimal service rate is non-decreasing in the number of jobs in the system; higher congestion levels warrant higher service rates. On the contrary, however, we show that the optimal service rate is not necessarily monotone in the current arrival rate. If the modulating process satisfies a stochastic mono- tonicity property, the monotonicity is recovered. We examine several heuristics and show where heuristics are reasonable substitutes for the optimal control. In the next model we consider the problem of dynamic resource allocation in the presence of different job classes. In this case, the service provider has to make two decisions; how many resources to utilize and how to allocate these resources among various job classes. These decisions, made dynamically based on the congestion levels of job classes, minimize a weighted sum of utilization and QoS related costs. We show that a priority policy is optimal for scheduling the jobs of different classes. We further show that the optimal resource utilization levels are non-decreasing in the number of jobs of each type. We extend this model to the case when server reallocations incur a switching penalty. In this case, we show that the optimal policy is hard to characterize and computing the optimal solution using the Dynamic Programming framework becomes intractable. Instead, we develop several heuristics and perform numerical studies to compare their performance.
Markov Decision Process; Service Rate Control; Dynamic Power Management
Henderson,Shane G.; Topaloglu,Huseyin
Ph. D., Operations Research
Doctor of Philosophy
dissertation or thesis