Relational Algebraic Techniques for the Synthesis of Sparse Matrix Programs

Abstract
Sparse matrix computations are ubiquitous in computational science. However, developing high-performance software for sparse matrix computations is tedious and error-prone, for two reasons. First, there is no standard way of storing sparse matrices: a variety of formats are used to avoid storing zeros, and the best choice of format depends on the problem and the architecture. Second, for most algorithms, substantial code reorganization is needed to produce an efficient sparse program that is tuned to a particular format.

We view the problem of supporting effective development of high-performance sparse matrix codes as one of generic programming. Generic programming is a discipline of designing and implementing software components that can be used when there is a set of related data structures supporting a common semantics described by an API or protocol, and a set of common algorithms that can be formulated in terms of this API. When designing a generic programming system, one must address the following fundamental questions:

- How do we represent efficient algorithms independently of any particular data-representation scheme?
- How do we provide an interface to a diverse set of data structures?
- How do we "knit" together the representation of the algorithms and the representation of the data to obtain an efficient implementation?

This dissertation presents a relational algebraic model for automatically generating efficient sparse codes, starting from dense matrix codes and specifications of sparse matrix formats. Our techniques are based on viewing arrays as relations, and the execution of DOALL loop nests and loops with reductions as the evaluation of queries over these relations. Storage formats are specified to the compiler through search and enumeration access methods and their costs. Code restructuring is then formulated as the search for the most efficient plan for the query. The main step in this process is identifying simultaneous enumerations of data structures (relational joins) and determining the best implementation of each such enumeration. This software architecture not only provides for a clean design of the compiler, but also exposes additional opportunities for code optimization, and it has led us to more general transformation algorithms than previously reported in the literature. We present experimental data demonstrating that the code generated by our compiler achieves performance competitive with that of hand-written codes for important computational kernels.
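
As a rough illustration of the "arrays as relations" view described in the abstract, the sketch below treats a matrix-vector product as a join on the column index j between a matrix relation A(i, j, a) and a vector relation X(j, x). It is not taken from the dissertation or its compiler: the class names `CSRMatrix` and `SparseVector`, the methods `enumerate_nonzeros` and `search`, and the choice of compressed sparse row storage are all assumptions made for this example. The plan shown (enumerate the matrix, search the vector) is only one of several a query planner might consider.

```python
def dense_matvec(A, x):
    """Dense reference loop nest: y[i] = sum over j of A[i][j] * x[j]."""
    y = [0.0] * len(A)
    for i in range(len(A)):
        for j in range(len(x)):
            y[i] += A[i][j] * x[j]
    return y


class CSRMatrix:
    """Hypothetical compressed-sparse-row matrix, viewed as a relation A(i, j, a)."""

    def __init__(self, n_rows, row_ptr, col_idx, values):
        self.n_rows = n_rows
        self.row_ptr = row_ptr
        self.col_idx = col_idx
        self.values = values

    def enumerate_nonzeros(self):
        """Enumeration access method: yield every stored tuple (i, j, a)."""
        for i in range(self.n_rows):
            for k in range(self.row_ptr[i], self.row_ptr[i + 1]):
                yield i, self.col_idx[k], self.values[k]


class SparseVector:
    """Hypothetical sparse vector, viewed as a relation X(j, x)."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def search(self, j):
        """Search access method: return x[j] if stored, otherwise None."""
        return self.entries.get(j)


def sparse_matvec(A, x):
    """One possible plan for the join of A and X on j:
    enumerate the tuples of A and search X for each matching j."""
    y = [0.0] * A.n_rows
    for i, j, a in A.enumerate_nonzeros():
        xj = x.search(j)
        if xj is not None:  # the tuple joins only if X stores index j
            y[i] += a * xj
    return y


# Small example: the sparse plan should agree with the dense loop nest.
dense_A = [[1.0, 0.0, 2.0],
           [0.0, 0.0, 3.0],
           [4.0, 5.0, 0.0]]
dense_x = [1.0, 0.0, 2.0]

csr_A = CSRMatrix(3, [0, 2, 3, 5], [0, 2, 2, 0, 1], [1.0, 2.0, 3.0, 4.0, 5.0])
sparse_x = SparseVector({0: 1.0, 2: 2.0})

y_dense = dense_matvec(dense_A, dense_x)
y_sparse = sparse_matvec(csr_A, sparse_x)
```

In this sketch the dense loop nest plays the role of the input program and the two access methods play the role of the format specification. Swapping which relation is enumerated and which is searched gives a different plan for the same query, which is the kind of choice the abstract describes as a search for the most efficient plan.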
Date Issued
1999-02
Publisher
Cornell University
Keywords
computer science; technical report
Previously Published As
http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cs/TR99-1732
Types
technical report