Relational Algebraic Techniques for the Synthesis of Sparse Matrix Programs
Abstract
Sparse matrix computations are ubiquitous in computational science. However, developing high-performance software for sparse matrix computations is a tedious and error-prone task, for two reasons. First, there is no standard way of storing sparse matrices: a variety of formats are used to avoid storing zeros, and the best choice of format depends on the problem and the architecture. Second, for most algorithms, substantial code reorganization is required to produce an efficient sparse program tuned to a particular format.

We view the problem of supporting effective development of high-performance sparse matrix codes as one of generic programming. Generic programming is a discipline of designing and implementing software components that can be used when there is a set of related data structures supporting a common semantics described by an API or protocol, and a set of common algorithms that can be formulated in terms of this API. When designing a generic programming system, one must address the following fundamental questions:

-- How do we represent efficient algorithms independently of any particular data-representation scheme?
-- How do we provide an interface to a diverse set of data structures?
-- How do we "knit" together the representation of the algorithms and the representation of the data to obtain an efficient implementation?

This dissertation presents a relational algebraic model for automatically generating efficient sparse codes, starting from dense matrix codes and a specification of the sparse matrix formats. Our techniques are based on viewing arrays as relations and the execution of DOALL loop nests and loops with reductions as the evaluation of queries over these relations. Storage formats are specified to the compiler through search and enumeration access methods and their costs. Code restructuring is then formulated as the search for the most efficient plan for the query. The main step in this process is the identification of simultaneous enumerations of data structures (relational joins) and the determination of the best implementations of these enumerations.

This software architecture not only provides for a clean design of the compiler, but also exposes additional opportunities for code optimization and has led us to more general transformation algorithms than previously reported in the literature. We present experimental data demonstrating that the code generated by our compiler achieves performance competitive with that of hand-written codes for important computational kernels.
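To make the relational view concrete, the following is a minimal, hypothetical sketch in Python (not drawn from the dissertation or its compiler). It treats the matrix as a relation A(i, j, v) and the input vector as a relation x(j, w), and contrasts a dense matrix-vector product loop nest with the same reduction evaluated as a join on j over a compressed sparse row (CSR) enumeration, which is the kind of query plan the approach searches for.

```python
# Hypothetical illustration: a dense DOALL-plus-reduction loop nest versus the
# same computation evaluated as a relational query over a CSR representation.
# Neither function is taken from the dissertation; both are assumptions made
# for exposition.

def spmv_dense(A, x):
    """Dense loop nest: y[i] = sum over j of A[i][j] * x[j]."""
    n = len(A)
    y = [0.0] * n
    for i in range(n):
        for j in range(len(x)):
            y[i] += A[i][j] * x[j]
    return y

def spmv_csr(rowptr, colind, vals, x):
    """The same computation viewed as a join of A(i, j, v) with x(j, w) on j,
    followed by a sum aggregation grouped by i. CSR enumerates the nonzeros
    of each row directly, so the join on j is realized by indexed lookup into
    x rather than by scanning zero entries."""
    n = len(rowptr) - 1
    y = [0.0] * n
    for i in range(n):                        # group by row index i
        for k in range(rowptr[i], rowptr[i + 1]):
            j, v = colind[k], vals[k]         # one tuple (i, j, v) of A
            y[i] += v * x[j]                  # join with x(j, w); aggregate
    return y

if __name__ == "__main__":
    # 3x3 example; the CSR form stores only the five nonzeros.
    A = [[2.0, 0.0, 1.0],
         [0.0, 3.0, 0.0],
         [4.0, 0.0, 5.0]]
    x = [1.0, 2.0, 3.0]
    rowptr = [0, 2, 3, 5]
    colind = [0, 2, 1, 0, 2]
    vals = [2.0, 1.0, 3.0, 4.0, 5.0]
    assert spmv_dense(A, x) == spmv_csr(rowptr, colind, vals, x)
```

The point of the sketch is only that the sparse loop is the dense loop with the join over the column index j carried out in the enumeration order provided by the storage format, rather than by visiting every (i, j) pair.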