Optimization and Parallelization of a Commodity Trade Model for the SP1, Using Parallel Programming Tools
Bergmark, Donna; Pottle, Marcia
We compare two different approaches to parallelization of Fortran programs. The first approach is to optimize the serial code so that it runs as fast as possible on a single processor, and then parallelize it. The second approach is to parallelize the original code first, and then optimize the parallel version. In this paper a variety of parallel programming tools is used to obtain an optimal, parallel version of an economic policy modelling application for the IBM SP1. We apply a new technique called Data Access Normalization; we use an extended ParaScope as our parallel programming environment; we use FORGE 90 as our parallelizer; and we use KAP as our optimizer. We make a number of observations about the effectiveness of these tools. Both strategies obtain a working, parallel program, but use different tools to get there. On this occasion, both KAP and Data Access Normalization lead to the same critical transformation of inverting four of the twelve loop nests in the original program. The next most important optimization is parallel I/O, one of the few transformations that had to be done by hand. Speedups are obtained on the SP1 (using MPLp communication over the High Speed Switch).
theory center; multiprocessors; program transformations; parallel programming tools; data access normalization; ParaScope; Lambda Toolkit; Fortran; HPF; FORGE; SP1; SPMD; KAP; parallel I/O; PED LAMBDA; data parallel; loop distribution; loop fusion; trace analyzers