Static Scheduling for Dynamic Dataflow Machines
Beck, Micah; Pingali, Keshav; Nicolau, Alexandru
Dataflow machines can "unravel" loops automatically so that many iterations of a loop can execute in parallel. Unbounded loop unraveling can strain the resources available on the machine and, in extreme cases, deadlock can occur due to overcommitment of resources. Previous efforts to address this problem have focused mainly on runtime mechanisms of debatable utility. Loop bounding, a compile-time technique, controls parallelism by introducing dependencies between loop iterations. The loop is given enough resources for the concurrent execution of some number of iterations, say $k$. The $(k+1)$st iteration uses the same resources as the first iteration and starts only after the first iteration is complete, and so on. Thus, the granularity of resource allocation is based on the rather arbitrary syntactic notion of a loop iteration. In this paper, we argue that loop bounding can lead to inefficient use of resources and propose an alternative way of compiling loops for pipelined execution. We introduce the notion of a stage decomposition of a loop, which defines a partition of a loop iteration into stages. We show how the problem of choosing a stage decomposition for a particular loop can be tackled by applying compile-time analyses and static scheduling techniques. Such techniques have been developed for scheduling loops on very long instruction word (VLIW) machines which, like dataflow machines, can exploit fine-grained parallelism in programs. These analyses permit the compiler to allocate resources according to expected patterns of usage, thus reducing overall resource requirements. Finally, we show how our scheme can be implemented on the Monsoon dataflow machine being built at M.I.T.
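The loop-bounding discipline described above can be sketched in software. The following is a minimal illustration (not the paper's implementation) using a counting semaphore of size $k$ to model the $k$ iteration contexts: at most $k$ iterations are in flight, and iteration $i + k$ can start only after an earlier iteration has released its slot. The function names and the use of threads are illustrative assumptions.

```python
import threading

def run_bounded(num_iters, k, body):
    """Illustrative sketch of loop bounding: execute `body(i)` for each
    iteration, with at most k iterations concurrently in flight.
    The semaphore models the k iteration contexts: a slot freed by a
    completed iteration is reused by a later one, as in the paper's
    description of the (k+1)st iteration reusing the first's resources."""
    slots = threading.Semaphore(k)
    results = [None] * num_iters
    threads = []

    def worker(i):
        results[i] = body(i)
        slots.release()  # free this iteration context for a later iteration

    for i in range(num_iters):
        slots.acquire()  # blocks while k iterations are already in flight
        t = threading.Thread(target=worker, args=(i,))
        t.start()
        threads.append(t)

    for t in threads:
        t.join()
    return results

# Example: 10 iterations with a bound of k = 3 contexts.
print(run_bounded(10, 3, lambda i: i * i))
```

Note that the semaphore enforces only the resource bound ($\le k$ iterations concurrently), not the arbitrary inter-iteration dataflow dependencies a real compiler-inserted dependence would; it is intended only to make the granularity of resource allocation concrete.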
computer science; technical report