Umit V. Catalyurek, Cevdet Aykanat, "Hypergraph-Partitioning-Based Decomposition for Parallel Sparse-Matrix Vector Multiplication," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 7, pp. 673-693, July 1999.
ISSN: 1045-9219. DOI: http://doi.ieeecomputersociety.org/10.1109/71.780863. Publisher: IEEE Computer Society, Los Alamitos, CA, USA.
Index Terms: Sparse matrices, matrix multiplication, parallel processing, matrix decomposition, computational graph model, graph partitioning, computational hypergraph model, hypergraph partitioning.
Abstract—In this work, we show that the standard graph-partitioning-based decomposition of sparse matrices does not reflect the actual communication volume requirement for parallel matrix-vector multiplication. We propose two computational hypergraph models which avoid this crucial deficiency of the graph model. The proposed models reduce the decomposition problem to the well-known hypergraph partitioning problem. The recently proposed successful multilevel framework is exploited to develop a multilevel hypergraph partitioning tool PaToH for the experimental verification of our proposed hypergraph models. Experimental results on a wide range of realistic sparse test matrices confirm the validity of the proposed hypergraph models. In the decomposition of the test matrices, the hypergraph models using PaToH and hMeTiS result in up to 63 percent less communication volume (30 to 38 percent less on the average) than the graph model using MeTiS, while PaToH is only 1.3–2.3 times slower than MeTiS on the average.
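The abstract does not spell out the two hypergraph models. As a minimal sketch of why a graph cut can misstate communication volume, the Python below assumes a rowwise decomposition of a square matrix in which the processor that owns row i also owns x[i] and y[i], and treats each column as a net connecting the rows that have a nonzero in it (a column-net view). The function names `comm_volume` and `cut_nonzeros` and the small arrow-shaped test matrix are illustrative only; they are not taken from PaToH, hMeTiS, or MeTiS.

```python
from collections import defaultdict

def comm_volume(nonzeros, part):
    """Words of x actually sent when computing y = A x.

    nonzeros: iterable of (row, col) positions of nonzeros.
    part:     part[i] is the processor owning row i, x[i], and y[i].
    """
    # For each column j, collect the processors that need x[j].
    parts_of_col = defaultdict(set)
    for i, j in nonzeros:
        parts_of_col[j].add(part[i])
    volume = 0
    for j, parts in parts_of_col.items():
        parts.add(part[j])          # the owner of x[j] already has it
        volume += len(parts) - 1    # one word per *other* processor
    return volume

def cut_nonzeros(nonzeros, part):
    """Off-diagonal nonzeros whose row and column lie in different parts.

    This is the quantity a graph edge cut counts: one unit per cut nonzero,
    even when several cut nonzeros in a column share the same destination.
    """
    return sum(1 for i, j in nonzeros if part[i] != part[j])

# 4x4 symmetric "arrow" pattern: full first row, full first column, diagonal.
arrow = ([(i, i) for i in range(4)]
         + [(i, 0) for i in range(1, 4)]
         + [(0, j) for j in range(1, 4)])
part = [0, 1, 1, 1]                 # row 0 on processor 0, rows 1-3 on processor 1

print(cut_nonzeros(arrow, part))    # 6
print(comm_volume(arrow, part))     # 4
```

In this example x[0] is needed by three rows on processor 1 but has to be sent only once, so the cut-based count (6) overstates the true volume (4). Counting, per column, the number of parts it touches minus one removes that overcounting.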

