Prediction Strategies For Power-Aware Computing On Multicore Processors
Diminishing performance returns and increasing power consumption of single-threaded processors have made chip multiprocessors (CMPs) an industry imperative. Unfortunately, low power efficiency and bottlenecks in shared hardware structures can prevent optimal use when running multiple sequential programs. Furthermore, for multithreaded programs, adding a core may harm performance and increase power consumption. To make better use of cores that would otherwise offer limited benefit, software components such as hypervisors and operating systems can be provided with estimates of application performance and power consumption, which they can use to improve system-wide performance and reliability. Power estimates can also be useful to hardware and software developers. However, obtaining processor and system power consumption information can be nontrivial. First, we present a predictive approach for real-time, per-core power estimation on a CMP. We analytically derive functions for real-time estimation of processor and system power consumption using performance counter and temperature data on real hardware. Our model uses data gathered from microbenchmarks that capture potential application behavior. Because the model is independent of our test benchmarks, we expect it to be well suited for future applications. We achieve median errors of 3.8% on an AMD quad-core CMP, 2.0% on an Intel quad-core CMP, and 2.8% on an Intel eight-core CMP. We implement the same approach inside an Intel XScale simulator and achieve a median error of 1.3%. Next, we introduce and evaluate an approach to dynamically throttling concurrency in parallel programs. Using artificial neural networks (ANNs), we throttle concurrency to levels with higher predicted efficiency. One advantage of ANNs over similar techniques explored previously is that the training phase is greatly simplified, thereby reducing the burden on the end user.
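The per-core power estimation described above can be illustrated with a minimal sketch. The abstract does not give the form of the derived functions, so the additive model below, the counter names, the weights, and the temperature term are all hypothetical placeholders; in practice the weights would come from fitting measurements of the microbenchmarks.

```python
from statistics import median

# Hypothetical model weights (watts per unit event rate). These values are
# illustrative assumptions, not figures from the dissertation.
WEIGHTS = {
    "uops_per_cycle": 4.0,
    "l2_misses_per_cycle": 30.0,
    "fp_ops_per_cycle": 2.5,
}
TEMP_WEIGHT = 0.05   # assumed watts per degree Celsius above the reference
REF_TEMP_C = 40.0    # assumed reference temperature
IDLE_POWER_W = 5.0   # assumed per-core idle power


def estimate_core_power(rates, temp_c):
    """Estimate one core's power from counter rates and core temperature."""
    dynamic = sum(w * rates.get(name, 0.0) for name, w in WEIGHTS.items())
    thermal = TEMP_WEIGHT * (temp_c - REF_TEMP_C)
    return IDLE_POWER_W + dynamic + thermal


def estimate_chip_power(per_core_samples):
    """Sum per-core estimates; each sample is a (rates, temp_c) pair."""
    return sum(estimate_core_power(r, t) for r, t in per_core_samples)


def median_abs_pct_error(estimated, measured):
    """Median absolute percentage error between estimates and measurements."""
    return median(abs(e - m) / m * 100.0 for e, m in zip(estimated, measured))
```

A caller would sample the counters and temperature sensor at a fixed interval, pass the resulting rates to `estimate_core_power`, and compare the estimates against power meter readings using `median_abs_pct_error`.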
We effectively identify energy-efficient concurrency levels in multithreaded scientific applications on an Intel quad-core CMP. We improve the energy efficiency of many of our applications by predicting a more favorable number and placement of threads at runtime, and improve the average ED² by 17.2% and 22.6% on an Intel quad-core and an Intel eight-core CMP, respectively. Last, we propose a framework that combines both approaches. With the impending shift to many-core architectures, systems need power and energy information to use all cores more energy-efficiently. Any approach built on this framework must also scale to many cores. We implement an infrastructure that can schedule for power efficiency within a given power envelope, a given thermal envelope, or both. We expect the framework to scale well with the number of cores. We perform experiments on quad-core and eight-core platforms. We schedule for better power efficiency by suspending or slowing down (via DVFS) single-threaded programs, or by throttling concurrency for multithreaded programs. We use the per-core power predictor to schedule applications so that they remain under a given power envelope. We modify the scheduler policies to take advantage of all power-saving options to enforce the power envelope while minimizing performance loss.
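The envelope enforcement described above can be sketched as a simple greedy policy. This is not the scheduler from the dissertation: the `Task` type, the per-level power table, and the rule of slowing the fastest task first are assumptions made for illustration, with a lookup table standing in for the per-core power predictor.

```python
from dataclasses import dataclass

# Hypothetical predicted power (watts) at each DVFS level, fastest first.
POWER_W = [20.0, 14.0, 10.0, 7.0]
SUSPENDED_POWER_W = 1.0  # assumed residual power of a suspended task's core


@dataclass
class Task:
    name: str
    level: int = 0           # index into POWER_W; 0 is the highest frequency
    suspended: bool = False


def predicted_power(task):
    """Stand-in for the per-core power predictor."""
    return SUSPENDED_POWER_W if task.suspended else POWER_W[task.level]


def total_power(tasks):
    return sum(predicted_power(t) for t in tasks)


def enforce_envelope(tasks, envelope_w):
    """Slow down, then suspend, tasks until predicted power fits the envelope."""
    while total_power(tasks) > envelope_w:
        running = [t for t in tasks if not t.suspended]
        if not running:
            break  # nothing left to throttle; the envelope cannot be met
        # Pick the task at the highest frequency (lowest level index).
        victim = min(running, key=lambda t: t.level)
        if victim.level < len(POWER_W) - 1:
            victim.level += 1        # step down one DVFS level
        else:
            victim.suspended = True  # already at the slowest level
    return total_power(tasks)
```

A policy closer to the stated goal of minimizing performance loss would rank candidates by predicted slowdown per watt saved instead of by current frequency. For multithreaded programs, reducing the thread count would be a further option alongside DVFS and suspension.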
dissertation or thesis