Block Factorizations on a Cluster of RS/6000s
Henry, Greg; Hoisie, Adolfy
This paper discusses optimizing computational linear algebra algorithms on a ring cluster of IBM RS/6000s. We present results for a block Cholesky factorization and its underlying BLAS to demonstrate the advantage of using blocked algorithms on such architectures. A thorough analysis of the complexities of the problem is provided. Different communication protocols, serial versus parallel execution, and optimization of data traffic are explored. We provide insight into some of the techniques we used to exploit this particular design. The implementations demonstrate that this important architecture can be used effectively for sufficiently large dense matrix computations.
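For readers unfamiliar with the term, the following is a minimal serial sketch of a right-looking block Cholesky factorization, written in plain Python. It is not the authors' implementation and does not model the ring cluster or any communication protocol. The function names (`factor_diagonal_block`, `solve_panel`, `update_trailing`, `block_cholesky`) and the block-size parameter `nb` are hypothetical, chosen only for illustration. The three steps play the roles that an unblocked Cholesky routine and the Level 3 BLAS operations TRSM and SYRK would play in a tuned code.

```python
import math


def factor_diagonal_block(a, lo, hi):
    """Unblocked Cholesky of the diagonal block a[lo:hi][lo:hi], in place."""
    for j in range(lo, hi):
        s = a[j][j] - sum(a[j][k] ** 2 for k in range(lo, j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        a[j][j] = math.sqrt(s)
        for i in range(j + 1, hi):
            t = a[i][j] - sum(a[i][k] * a[j][k] for k in range(lo, j))
            a[i][j] = t / a[j][j]


def solve_panel(a, lo, hi, n):
    """Triangular solve for the rows below the diagonal block (TRSM role)."""
    for i in range(hi, n):
        for j in range(lo, hi):
            t = a[i][j] - sum(a[i][k] * a[j][k] for k in range(lo, j))
            a[i][j] = t / a[j][j]


def update_trailing(a, lo, hi, n):
    """Symmetric rank-nb update of the trailing lower triangle (SYRK role)."""
    for i in range(hi, n):
        for j in range(hi, i + 1):
            a[i][j] -= sum(a[i][k] * a[j][k] for k in range(lo, hi))


def block_cholesky(matrix, nb):
    """Return lower-triangular L with L * L^T == matrix, using blocks of size nb."""
    if nb < 1:
        raise ValueError("block size must be positive")
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]  # work on a copy
    for lo in range(0, n, nb):
        hi = min(lo + nb, n)
        factor_diagonal_block(a, lo, hi)
        solve_panel(a, lo, hi, n)
        update_trailing(a, lo, hi, n)
    # Only the lower triangle was updated; clear the strict upper triangle.
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = 0.0
    return a
```

Calling `block_cholesky(A, nb)` on a symmetric positive definite matrix `A`, given as a list of rows, returns the lower-triangular factor. In this sketch most of the arithmetic falls in `update_trailing`, which is the kind of matrix-matrix work that blocked algorithms are designed to concentrate; how the block size should be chosen on a given machine is outside the scope of the sketch.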