Using Dynamic Binary Instrumentation To Create Faster, Validated, Multi-Core Simulations
The Memory Wall continues to be a problem with modern systems design. While the steady increase in processor speeds has abated somewhat, Moore's Law continues to provide more transistors to chip designers. This leads to an increase in the number of processors and threads located per chip, which increases the demands on memory systems. Current simulation technology is not able to keep up, leading to sacrifices in methodology and accuracy in order to get results in reasonable time. Because cycle-accurate simulators are so slow, various methods for reducing execution time can be used. Unfortunately these methods can introduce variations in results of between 10-50% when compared to full reference input sets. Limitations of academic simulators also constrain the architectures under study, with results generated for obsolete or uninteresting systems. We analyze the performance and accuracy of various limited-execution methodologies. We investigate how deterministic execution affects the measurement of error. We then evaluate using Dynamic Binary Instrumentation (DBI) as an alternative to cycle-accurate simulation. We compare our results to actual systems using hardware performance counters. We look first at a simple 32-bit RISC system, and then look at more complex 64-bit x86 based systems. Finally we investigate the feasibility of using the same methodology for modern multi-processors simulations.
dissertation or thesis