Parallel Finite Element Analysis of Biomechanical Structures on the Ncube 6400
This paper presents parallel 3-D finite element analysis for distributed memory multiprocessors. Traditionally, finite element analysis has been performed on sequential computers. Current research in high performance finite element analysis shows considerable promise for fast, efficient implementation on MIMD and SIMD computers. This paper demonstrates the use of a standard banded Cholesky method for solving the finite element system of equations. The uniformity of the underlying data distribution balances the load across processors, which is what gives the method its high performance. Moreover, since a distributed banded Cholesky algorithm is likely to be part of a standard parallel numerical library, it reduces the burden on the applications programmer, making this method simpler to implement than the substructuring method. Since a parallel solver requires the rows of the coefficient matrix to be distributed in a wrap fashion, it might appear that the assembly of the element stiffness matrices would not be efficient. However, as shown in this paper, the calculation of element stiffness matrices, their assembly, and the calculation of Gauss-point stresses can all be done efficiently in parallel without any inter-processor communication. In fact, once the nodal coordinates and element connectivity are made available to all processors, message passing is required only during the factorization and solution stages. The next few sections describe how parallelism was exploited during the assembly, solution and stress recovery stages of the finite element analysis. The parallel program was tested on an Ncube 6400, using large 3-D finite element problems arising from biomechanical structural systems. High performance Basic Linear Algebra Subprograms (BLAS) were used to improve the execution speed.
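The communication-free assembly described above can be illustrated with a small sketch. It is not the paper's implementation: it assumes that "wrap fashion" means global row i is owned by processor i mod P, it uses 1-D two-node bar elements as a stand-in for the 3-D elements, it stores full rows rather than a band, and it simulates the processors in a single Python process. All function names are hypothetical. Each simulated processor holds the full nodal coordinates and connectivity, visits the elements that touch one of its rows, and accumulates only the rows it owns, so no messages are exchanged.

```python
import numpy as np

def owner(row, nprocs):
    """Assumed wrap (cyclic) mapping: global row i lives on processor i mod P."""
    return row % nprocs

def element_stiffness(x0, x1, ea=1.0):
    """Stiffness of a 2-node bar element (illustrative stand-in for a 3-D element)."""
    k = ea / (x1 - x0)
    return np.array([[k, -k], [-k, k]])

def assemble_local(rank, nprocs, coords, connectivity):
    """Assemble only the global rows owned by `rank`, using replicated mesh data."""
    n = len(coords)
    rows = {i: np.zeros(n) for i in range(n) if owner(i, nprocs) == rank}
    for elem in connectivity:
        # Skip elements that contribute nothing to the rows owned here.
        if not any(owner(g, nprocs) == rank for g in elem):
            continue
        ke = element_stiffness(coords[elem[0]], coords[elem[1]])
        for a, ga in enumerate(elem):
            if owner(ga, nprocs) != rank:
                continue  # this row belongs to another processor
            for b, gb in enumerate(elem):
                rows[ga][gb] += ke[a, b]
    return rows

def gather_rows(coords, connectivity, nprocs):
    """Collect every processor's rows into one matrix (for checking only)."""
    n = len(coords)
    K = np.zeros((n, n))
    for rank in range(nprocs):
        for i, row in assemble_local(rank, nprocs, coords, connectivity).items():
            K[i] = row
    return K

if __name__ == "__main__":
    coords = [0.0, 1.0, 2.0, 3.0, 4.0]
    connectivity = [(0, 1), (1, 2), (2, 3), (3, 4)]
    print(gather_rows(coords, connectivity, nprocs=3))
```

In this sketch an element whose nodes map to several processors is evaluated once on each of them, trading some redundant arithmetic for the absence of communication. Whether the paper handles shared elements in the same way is not stated in the abstract.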