Fine-Grain Compilation for Pipelined Machines
Computer architecture design requires careful attention to the balance between the complexity of code scheduling problems and the cost and feasibility of building a machine. In this paper, we show that recently developed software pipelining algorithms produce optimal or near-optimal code for a large class of loops when the target architecture is a clean pipelined parallel machine. The important feature of these machines is the absence of structural hazards. We argue that the robustness of the scheduling algorithms and relatively simple hardware make these machines realistic and cost-effective. To illustrate the delicate balance between architecture and scheduling complexity, we show that scheduling with structural hazards is NP-hard, and that there are machines with simple structural hazards for which vectorization and the software pipelining techniques generate poor code.
computer science; technical report
Previously Published As