Using General-Purpose Processor Cores as Prefetching Engines in Chip Multiprocessor Architectures

Scaling the performance of applications with little thread-level parallelism is one of the most serious impediments to the success of multi-core architectures. At the same time, the long latency of memory accesses represents one of the largest performance bottlenecks for individual program threads. As a result, a typical microprocessor spends a significant amount of time waiting for data to be delivered from memory instead of performing useful computation. Fortunately, it is often possible to guess which memory data will be needed by a program thread in the near future. Various hardware and software prefetching techniques have been developed to fetch critical data before they are requested by the processor. This way prefetching can eliminate processor stalls otherwise induced by the slow response from the memory system. The main contribution of this dissertation is the development of two techniques that utilize extra cores of a chip multiprocessor (CMP) as prefetching engines to increase the performance of single program threads. The proposed approaches effectively leverage the execution capabilities of chip multiprocessors to compute data addresses that are likely to miss in the cache and prefetch them ahead of program thread load requests. I demonstrate the effectiveness of the proposed approaches by performing cycle-accurate simulations of a chip multiprocessor consisting of two four-way superscalar cores running the single-threaded SPEC CPU2000 benchmark suite. The proposed mechanisms provide significant performance improvements over a baseline that already includes an aggressive hardware stream prefetcher. A comparison with other multi-core prefetching mechanisms from the literature shows that the techniques proposed in this dissertation provide competitive performance, incur less energy overhead, and require considerably simpler hardware support.
computer engineering; computer architecture; chip multiprocessor; prefetching