Multicoloring of Grid-Structured PDE Solvers on Shared-Memory Multiprocessors
November 1995 (vol. 6 no. 11)
pp. 1195-1205

Abstract: To execute a parallel PDE (partial differential equation) solver on a shared-memory multiprocessor, memory conflicts in accessing multidimensional data grids must be avoided. A new multicoloring technique is proposed for speeding up sparse matrix operations. The technique enables parallel access to grid-structured data elements in shared memory without causing conflicts. The coloring scheme is formulated as an algebraic mapping that can be implemented easily and with low overhead on commercial multiprocessors. The proposed multicoloring scheme has been tested on an Alliant FX/80 multiprocessor for solving 2D and 3D problems using the CGNR method. Compared to the results reported by Saad (1989) on an identical Alliant system, our results show 30 times higher performance in Mflops. Multicoloring transforms sparse matrices into ones with a diagonal diagonal block (DDB) structure, enabling parallel LU decomposition in solving PDE problems. The multicoloring technique can also be extended to other scientific problems characterized by sparse matrices.
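
The idea of a coloring expressed as an algebraic mapping can be illustrated with a small sketch. The abstract does not give the paper's actual mapping, so the formulas below are assumptions: the textbook red-black rule for a 5-point stencil and a four-color rule for a 9-point stencil. The function names (color, is_conflict_free, color_ordering) are hypothetical and chosen only for this illustration.

```python
# Illustrative sketch only: these are standard textbook colorings,
# not the mapping defined in the paper.

NEIGHBORS = {
    # Offsets of the grid points coupled to (i, j) by each stencil.
    "5pt": [(-1, 0), (1, 0), (0, -1), (0, 1)],
    "9pt": [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)],
}


def color(i, j, stencil="5pt"):
    """Return the color of grid point (i, j) from a closed-form rule."""
    if stencil == "5pt":
        return (i + j) % 2           # red-black: two colors
    return (i % 2) + 2 * (j % 2)     # four colors for the 9-point stencil


def is_conflict_free(n, stencil):
    """Check that no two coupled points on an n x n grid share a color."""
    for i in range(n):
        for j in range(n):
            for di, dj in NEIGHBORS[stencil]:
                ni, nj = i + di, j + dj
                inside = 0 <= ni < n and 0 <= nj < n
                if inside and color(i, j, stencil) == color(ni, nj, stencil):
                    return False
    return True


def color_ordering(n, stencil):
    """List grid points grouped by color (stable within each color)."""
    points = [(i, j) for i in range(n) for j in range(n)]
    return sorted(points, key=lambda p: color(p[0], p[1], stencil))


if __name__ == "__main__":
    for stencil in ("5pt", "9pt"):
        print(stencil, "conflict-free:", is_conflict_free(6, stencil))
```

Because points of the same color are never coupled by the stencil, numbering the unknowns color by color (as color_ordering does) makes each diagonal block of the reordered matrix itself diagonal, and all points of one color can be updated concurrently.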

[1] L. Adams, “m-step preconditioned conjugate gradient methods,” SIAM J. Sci. Stat. Comp., vol. 6, pp. 452-463, 1985.
[2] Alliant Computer Systems Corp., FX/Series Product Summary, 1987.
[3] A.J. Bernstein, “Analysis of programs for parallel processing,” IEEE Trans. Elec. Computers, pp. 746-757, Oct. 1966.
[4] J.H. Bramble, J.E. Pasciak, and A.H. Schatz, “The construction of preconditioners for elliptic problems by substructuring: I,” Math. Comp., vol. 47, no. 175, pp. 103-134, 1986.
[5] T.F. Chan and Y. Saad, “Multigrid algorithms on the hypercube multiprocessor,” IEEE Trans. Computers, pp. 969-977, Nov. 1986.
[6] P. Concus, G.H. Golub, and G. Meurant, “Block preconditioning for the conjugate gradient method,” SIAM J. Sci. Stat. Comp., vol. 6, pp. 309-332, 1985. Also Report LBL-14856, Lawrence Berkeley Laboratory, 1982.
[7] C.C. Douglas and W.L. Miranker, “Constructive interference in parallel algorithms,” SIAM J. Numer. Anal., vol. 25, pp. 376-398, 1988.
[8] I.S. Duff, R. Grimes, and J. Lewis, “Sparse Matrix Test Problems,” ACM Trans. Mathematical Software, vol. 15, pp. 1-14, Mar. 1989.
[9] T. Dupont, R.P. Kendall, and H.H. Rachford, Jr., “An approximate factorization procedure for solving self-adjoint difference equations,” SIAM J. Numer. Anal., vol. 5, pp. 559-573, 1968.
[10] H.C. Elman and E. Agron, “Ordering techniques for the preconditioning of conjugate gradient methods on parallel computers,” Technical Report UMIACS-TR-88-53, UMIACS, Univ. of Maryland, 1988.
[11] I. Garcia, J.J. Merelo, J.D. Bruguera, and E.L. Zapata, “Parallel quadrant interlocking factorization on hypercube computers,” Parallel Computing, vol. 15, pp. 87-100, 1990.
[12] K. Hwang, Advanced Computer Architecture: Parallelism, Scalability, Programmability. McGraw-Hill, 1993.
[13] K. Hwang and H.C. Wang, “A multigrid Schwarz alternating method for parallel solution of elliptic PDE problems,” Proc. Int’l Conf. on Advances in Parallel Computing, D.J. Evans et al., Ed., pp. 105-120, 1989.
[14] R.B. Lee, “Empirical results on the speedup, efficiency, redundancy, and quality of parallel computations,” Proc. Int’l Conf. on Parallel Processing, Aug. 1980, pp. 91-96.
[15] O.A. McBryan, P.O. Frederickson, J. Linden, A. Schuller, K. Solchenbach, K. Stuben, C.A. Thole, and U. Trottenberg, “Multigrid methods on parallel computers: A survey of recent developments,” Impact Comput. Sci. Eng., vol. 3, pp. 1-75, 1991.
[16] J.A. Meijerink and H.A. van der Vorst, “An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix,” Math. Comp., vol. 31, pp. 148-162, 1977.
[17] R. Melhem and K. Ramarao, “Multicolor reordering of sparse matrices resulting from irregular grids,” ACM Trans. Math. Software, vol. 14, no. 2, pp. 117-138, 1988.
[18] N.M. Nachtigal, S.C. Reddy, and L.N. Trefethen, “How fast are nonsymmetric matrix iterations?” SIAM J. Matrix Anal. Appl., vol. 13, no. 3, pp. 778-795, 1992.
[19] J.M. Ortega and R.G. Voigt, “Solution of partial differential equations on vector and parallel computers,” SIAM Rev., vol. 27, no. 2, pp. 149-240, 1985.
[20] E.L. Poole and J.M. Ortega, “Multicolor ICCG methods for vector computers,” SIAM J. Numer. Anal., vol. 24, pp. 1394-1418, 1987.
[21] Y. Saad, “Krylov subspace methods on supercomputers,” SIAM J. Sci. Stat. Comp., vol. 10, no. 6, pp. 1200-1232, 1989.
[22] R. Schreiber and W. Tang, “Vectorizing the conjugate gradient method,” Proc. Symp. CYBER 205 Applications, Ft. Collins, Colo., 1982.
[23] H.C. Wang, “Parallelization of iterative PDE solvers on shared-memory multiprocessors,” PhD thesis, Department of Electrical Engineering-Systems, Univ. of Southern California, 1992.
[24] H.C. Wang and K. Hwang, “Multicoloring for fast sparse matrix-vector multiplication in solving PDE problems,” Proc. Int’l Conf. Parallel Processing, St. Charles, Ill., Aug. 1993, vol. 3: Algorithms and Applications, pp. 215-222.
[25] M.J. Wolfe, “Automatic vectorization, data dependence, and optimizations for parallel computers,” Parallel Processing for Supercomputing and Artificial Intelligence, K. Hwang and DeGroot, Eds., chap. 11, McGraw-Hill, New York, 1989.

Index Terms:
Parallel processing, conjugate gradient methods, multicoloring, sparse matrix, PDE solvers, memory access conflicts, cache saturation, multiprocessor performance.
Hwang-Cheng Wang, Kai Hwang, "Multicoloring of Grid-Structured PDE Solvers on Shared-Memory Multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 11, pp. 1195-1205, Nov. 1995, doi:10.1109/71.476191