Perfect Pipelining: A New Loop Parallelization Technique
Parallelizing compilers do not handle loops in a satisfactory manner. Fine-grain transformations capture irregular parallelism inside a loop body that is not amenable to coarser approaches, but they have limited ability to exploit parallelism across iterations. Coarser methods sacrifice irregular forms of parallelism in favor of pipelining (overlapping) iterations. In this paper we present a new transformation, Perfect Pipelining, that bridges the gap between these fine- and coarse-grain transformations while retaining the desirable features of both. This is accomplished even in the presence of conditional branches and resource constraints. For loops typically encountered in practice, Perfect Pipelining achieves the effect of full loop unrolling coupled with fine-grain parallelization. To make our claims rigorous, we develop a formalism for parallelization. The formalism can also be used to compare transformations across computational models. As an illustration, we show that Doacross, a transformation intended for synchronous and asynchronous multiprocessors, can be expressed as a restriction of Perfect Pipelining.
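To give a rough picture of what "pipelining (overlapping) iterations" means, the sketch below rewrites a three-stage loop by hand so that its steady-state body mixes stages from three consecutive iterations. It is not the Perfect Pipelining algorithm, which the abstract does not spell out. It only shows the general shape (prologue, repeating body, epilogue) that overlapping iterations produces. The function names and the stage arithmetic are invented for the illustration.

```python
def sequential(xs):
    """Reference loop: each iteration runs three dependent stages in order."""
    out = []
    for x in xs:
        a = x + 1          # stage 1 (stands in for a load)
        b = a * 2          # stage 2 (stands in for a computation)
        out.append(b - 3)  # stage 3 (stands in for a store)
    return out


def pipelined(xs):
    """Same loop with iterations overlapped by hand (illustrative only).

    The steady-state body holds stage 3 of iteration i, stage 2 of
    iteration i+1, and stage 1 of iteration i+2. Those three statements
    do not depend on one another, so a machine with enough functional
    units could issue them in the same step.
    """
    n = len(xs)
    if n < 2:
        return sequential(xs)  # too short to overlap anything
    out = []
    # Prologue: fill the pipeline.
    a = xs[0] + 1              # stage 1 of iteration 0
    b = a * 2                  # stage 2 of iteration 0
    a = xs[1] + 1              # stage 1 of iteration 1
    # Steady state: one stage from each of three iterations.
    for i in range(n - 2):
        out.append(b - 3)          # stage 3 of iteration i
        b_next = a * 2             # stage 2 of iteration i+1
        a_next = xs[i + 2] + 1     # stage 1 of iteration i+2
        a, b = a_next, b_next
    # Epilogue: drain the pipeline.
    out.append(b - 3)          # stage 3 of iteration n-2
    b = a * 2                  # stage 2 of iteration n-1
    out.append(b - 3)          # stage 3 of iteration n-1
    return out
```

Both functions return the same list for any input. The difference is that the body of the second loop contains no dependence chain between its statements, whereas each iteration of the first loop is a chain of length three.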