Wavefront Diffusion and LMSR: Algorithms for Dynamic Repartitioning of Adaptive Meshes
May 2001 (vol. 12 no. 5)
pp. 451-466

Abstract: Current multilevel repartitioning schemes tend to perform well on certain types of problems while obtaining worse results on others. We present two new multilevel algorithms for repartitioning adaptive meshes that improve the performance of multilevel schemes on the types of problems for which current schemes perform poorly, while maintaining similar or better results on the problems for which current schemes perform well. Specifically, we present a new scratch-remap scheme called Locally-matched Multilevel Scratch-remap (or simply LMSR) for repartitioning adaptive meshes. LMSR tries to compute a high-quality partitioning that has a large amount of overlap with the original partitioning. We show that LMSR generally decreases the data redistribution costs required to balance the load compared to current scratch-remap schemes. We also present a new diffusion-based scheme that we refer to as Wavefront Diffusion. In Wavefront Diffusion, the flow of vertices moves in a wavefront from overweight to underweight subdomains. We show that Wavefront Diffusion obtains significantly lower data redistribution costs while maintaining similar or better edge-cut results compared to existing diffusion algorithms. We also compare Wavefront Diffusion with LMSR and show that the two schemes provide a trade-off between edge-cut and data redistribution costs for a wide range of problems. Our experimental results on a Cray T3E, an IBM SP2, and a cluster of Pentium Pro workstations show that both schemes are fast and scalable. For example, both are capable of repartitioning a seven-million-vertex graph in under three seconds on 128 processors of a Cray T3E. Our schemes obtained relative speedups of between 9 and 12 when the number of processors was increased by a factor of 16 on a Cray T3E.
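The two quantities the abstract trades off, edge-cut and data redistribution cost, can be made concrete with a small sketch. The Python below is illustrative only and is not the authors' LMSR or Wavefront Diffusion implementation: the function names, the unit vertex weights, and the greedy relabeling heuristic are all assumptions chosen to show why relabeling a from-scratch partitioning to overlap the original one lowers the amount of data that has to move.

```python
from collections import Counter

def edge_cut(edges, part):
    """Count edges whose endpoints lie in different subdomains."""
    return sum(1 for u, v in edges if part[u] != part[v])

def redistribution_cost(old_part, new_part):
    """Count vertices that change subdomain (unit vertex weights assumed)."""
    return sum(1 for v in old_part if old_part[v] != new_part[v])

def greedy_remap(old_part, new_part):
    """Relabel new subdomains so they overlap the old ones as much as possible.

    Hypothetical greedy heuristic: repeatedly pair the (new, old) labels
    that share the most vertices, skipping labels already paired.
    """
    overlap = Counter((new_part[v], old_part[v]) for v in old_part)
    mapping, used = {}, set()
    for (new_label, old_label), _ in overlap.most_common():
        if new_label not in mapping and old_label not in used:
            mapping[new_label] = old_label
            used.add(old_label)
    # Any new label left unpaired takes one of the remaining labels.
    all_labels = sorted(set(old_part.values()) | set(new_part.values()))
    free = [label for label in all_labels if label not in used]
    for new_label in sorted(set(new_part.values())):
        if new_label not in mapping:
            mapping[new_label] = free.pop(0)
    return {v: mapping[p] for v, p in new_part.items()}

# A 6-vertex path graph whose original partitioning is unbalanced (4 vs 2).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
old = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
# A balanced partitioning computed from scratch, with arbitrary labels.
scratch = {0: 1, 1: 1, 2: 1, 3: 0, 4: 0, 5: 0}

remapped = greedy_remap(old, scratch)
print(edge_cut(edges, scratch), redistribution_cost(old, scratch))    # 1 5
print(edge_cut(edges, remapped), redistribution_cost(old, remapped))  # 1 1
```

In this toy case the relabeling leaves the edge-cut unchanged and reduces the number of migrated vertices from five to one. Relabeling never changes the edge-cut, since it only renames subdomains; only the redistribution cost depends on how the new labels line up with the old ones.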


Index Terms:
Dynamic graph partitioning, multilevel diffusion, scratch-remap, wavefront diffusion, LMSR, adaptive mesh computations.
Citation:
Kirk Schloegel, George Karypis, Vipin Kumar, "Wavefront Diffusion and LMSR: Algorithms for Dynamic Repartitioning of Adaptive Meshes," IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 5, pp. 451-466, May 2001, doi:10.1109/71.926167