Adaptive Thread Management For Power, Temperature, And Reliability In Future Microprocessors
With continued scaling of CMOS technology, power, thermal, and reliability issues threaten to significantly limit future performance improvements. The advent of microprocessors with multiple processing units creates a new opportunity to address these concerns through low-cost adaptive thread management techniques. In this dissertation we devise two types of dynamic management schemes, thread migration and power management, which leverage the inherent architectural characteristics of future microprocessors to dramatically mitigate thermal hotspots, variations, and hard faults. These techniques are applied both within the core, in clustered simultaneous multithreaded (SMT) architectures, and among the cores of unpredictably heterogeneous chip multiprocessors (CMPs). First, we investigate dynamic thermal management (DTM) in clustered SMT architectures. We propose novel thread migration algorithms that leverage the steering mechanism inherent in clustered architectures to cool hotspots more effectively than dynamic voltage and frequency scaling (DVFS) when executing thermally nonuniform workloads. In addition, we create a DTM mechanism that combines intelligent steering with DVFS power management to achieve efficient thermal control across all workloads. In future large-scale multi-core microprocessors, hard faults and process variations will create dynamic heterogeneity, causing performance and power characteristics to differ among the cores in an unanticipated manner. Contemporary CMP thread managers are oblivious to this heterogeneity, resulting in significant performance losses and excess power dissipation. We develop operating system scheduling and global power management policies that significantly reduce the loss in power-performance efficiency.
We further explore the scalability of these algorithms to many-core architectures with 4 to 256 cores and devise novel, scalable runtime management techniques that achieve high performance with low overhead.
dissertation or thesis