Compile-Time Techniques for Data Distribution in Distributed Memory Machines
October 1991 (vol. 2 no. 4)
pp. 472-482

This paper discusses a solution to the problem of partitioning data for distributed memory machines. The solution uses a matrix notation to describe array accesses in fully parallel loops, which allows the derivation of sufficient conditions for communication-free partitioning (decomposition) of arrays. The paper presents a series of examples that illustrate the effectiveness of the technique for linear references, shows how loop transformations help in deriving the necessary data decompositions, and gives a formulation that aids in deriving heuristics for minimizing communication when communication-free partitions are not feasible.
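As a rough illustration of the idea, and not the paper's own derivation, the sketch below writes each array reference in a two-deep fully parallel loop as an access matrix F and offset f, so that iteration (i, j) touches element F·(i, j) + f. It then checks by brute force whether a chosen distribution is communication-free under an owner-computes assumption. The names `apply_access`, `owner`, and `is_communication_free`, and the rule that one index coordinate names the owning processor, are hypothetical choices made for this example; the paper derives its conditions algebraically at compile time rather than by enumeration.

```python
def apply_access(F, f, it):
    """Return the array index F * it + f for an iteration vector `it`."""
    return tuple(
        sum(F[r][c] * it[c] for c in range(len(it))) + f[r]
        for r in range(len(F))
    )


def owner(index, dim):
    """Assumed distribution: coordinate `dim` of an element names its processor."""
    return index[dim]


def is_communication_free(n, write, reads):
    """Brute-force check over an n x n fully parallel loop nest.

    `write` and every entry of `reads` are (F, f, dim) triples: an access
    matrix, an offset vector, and the array dimension that is distributed.
    Assumes owner-computes: the processor owning the written element runs
    the iteration, so every element it reads must live on that processor.
    """
    Fw, fw, dw = write
    for i in range(n):
        for j in range(n):
            it = (i, j)
            p = owner(apply_access(Fw, fw, it), dw)
            for Fr, fr, dr in reads:
                if owner(apply_access(Fr, fr, it), dr) != p:
                    return False
    return True


IDENTITY = ((1, 0), (0, 1))   # reference A[i][j]
TRANSPOSE = ((0, 1), (1, 0))  # reference B[j][i]
ZERO = (0, 0)

# Loop body: A[i][j] = B[j][i]
# A distributed by rows (dim 0), B distributed by columns (dim 1).
rows_and_cols = is_communication_free(
    4, (IDENTITY, ZERO, 0), [(TRANSPOSE, ZERO, 1)]
)
# A and B both distributed by rows.
rows_and_rows = is_communication_free(
    4, (IDENTITY, ZERO, 0), [(TRANSPOSE, ZERO, 0)]
)
```

For the transpose-style loop body, distributing A by rows and B by columns keeps every read local, while distributing both arrays by rows does not. This is the kind of distinction the matrix formulation is meant to expose without enumerating iterations.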

Index Terms:
compile time; data partitioning; data distribution; distributed memory machines; matrix notation; array accesses; parallel loops; sufficient conditions; communication-free partitioning; linear references; loop transformations; data decompositions; heuristics; matrix algebra; parallel programming; program compilers
J. Ramanujam, P. Sadayappan, "Compile-Time Techniques for Data Distribution in Distributed Memory Machines," IEEE Transactions on Parallel and Distributed Systems, vol. 2, no. 4, pp. 472-482, Oct. 1991, doi:10.1109/71.97903