A Parallelization Domain Oriented Multilevel Graph Partitioner
December 2002 (vol. 51 no. 12)
pp. 1435-1441

Abstract: In this paper, we present a novel multilevel graph partitioning algorithm, KACE, which uses knowledge about the domain and employs several graph transformation techniques. Both functional and structural parallelism in the sequential code are explored to improve the quality of parallel tasks. Statistical information about communication times between nodes, as a function of message size and/or other factors, is used to better estimate balancing factors, code replication, and synchronization penalties. This enables us to use a task cohesion algorithm to obtain a coarse version of the partitioned graph. Many of KACE's parameters are shown to have a definite impact on the parallelized program code.
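
The abstract does not give KACE's actual procedure, so the following is only a generic sketch of the coarsening idea that multilevel partitioners share: tasks that communicate heavily are merged so the coarse graph hides that traffic. Everything here is an assumption made for illustration, including the names `comm_time` and `coarsen_once`, the linear cost model (fixed latency plus message size over bandwidth), and the greedy heavy-edge matching, which stands in for the paper's task cohesion algorithm rather than reproducing it.

```python
from collections import defaultdict


def comm_time(message_bytes, latency=1e-5, bandwidth=1e8):
    """Hypothetical cost model: fixed latency plus size / bandwidth (seconds)."""
    return latency + message_bytes / bandwidth


def coarsen_once(node_weight, edges):
    """One greedy coarsening pass over a task graph (illustrative only).

    node_weight: dict mapping node -> compute cost
    edges:       dict mapping (u, v) -> message size in bytes
    Nodes are assumed to be mutually comparable (e.g. all strings).
    Returns (coarse_weight, coarse_edges, mapping), where mapping sends
    each original node to the tuple naming its coarse node.
    """
    # Visit edges from most to least expensive communication.
    ranked = sorted(edges, key=lambda e: comm_time(edges[e]), reverse=True)

    mapping = {}
    for u, v in ranked:
        # Merge two tasks only if neither has been merged already.
        if u != v and u not in mapping and v not in mapping:
            mapping[u] = mapping[v] = (u, v)
    for n in node_weight:
        mapping.setdefault(n, (n,))  # unmatched nodes survive alone

    # A coarse node's compute cost is the sum of its members' costs.
    coarse_weight = defaultdict(float)
    for n, w in node_weight.items():
        coarse_weight[mapping[n]] += w

    # Edges inside a merged node vanish; parallel edges are summed.
    coarse_edges = defaultdict(int)
    for (u, v), size in edges.items():
        cu, cv = mapping[u], mapping[v]
        if cu != cv:
            coarse_edges[tuple(sorted((cu, cv)))] += size

    return dict(coarse_weight), dict(coarse_edges), mapping
```

For a four-task chain with weights `{"a": 2.0, "b": 3.0, "c": 1.0, "d": 4.0}` and message sizes `{("a", "b"): 1000, ("b", "c"): 10, ("c", "d"): 500}`, one pass merges `a` with `b` and `c` with `d`, leaving two coarse tasks of weight 5.0 joined by the single 10-byte edge. A full multilevel scheme would repeat this pass, partition the small graph, and then refine while uncoarsening.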

Index Terms:
Code replication, data-flow, dependence analysis, domain, graph transformations, multi-level graph partitioning, superimposition graph, task cohesion.
Eric A. Schweitz, Dharma P. Agrawal, "A Parallelization Domain Oriented Multilevel Graph Partitioner," IEEE Transactions on Computers, vol. 51, no. 12, pp. 1435-1441, Dec. 2002, doi:10.1109/TC.2002.1146709