An Implementation Framework for HPF Distributed Arrays on Message-Passing Parallel Computer Systems
September 1996 (vol. 7 no. 9)
pp. 897-914

Abstract—Data parallel languages, like High Performance Fortran (HPF), support the notion of distributed arrays. However, implementing such distributed array structures and accessing them on message-passing computers is not straightforward. This holds especially for distributed arrays that are aligned to each other and given a block-cyclic distribution.
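As a point of reference for the block-cyclic case, the standard CYCLIC(b) mapping can be written down directly. The sketch below (in C, with illustrative names and parameters that are not taken from the paper) computes, for a one-dimensional array, the owning processor and the local index of a global index under a block-cyclic distribution with block size b over p processors.

    #include <stdio.h>

    /* CYCLIC(b) mapping of a one-dimensional array over p processors
     * (illustrative names, not the paper's notation):
     *   owner(i) = (i / b) mod p
     *   local(i) = (i / (b*p)) * b + i mod b
     */
    static int owner(int i, int b, int p)     { return (i / b) % p; }
    static int local_idx(int i, int b, int p) { return (i / (b * p)) * b + i % b; }

    int main(void) {
        int b = 3, p = 4;                          /* block size 3, four processors */
        for (int i = 0; i < 12; i++)
            printf("global %2d -> owner %d, local %d\n",
                   i, owner(i, b, p), local_idx(i, b, p));
        return 0;
    }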

In this paper, an implementation framework is presented for HPF distributed arrays on message-passing computers. Methods are presented for local index enumeration, local storage, and communication that are efficient in both space and time.
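One simple local storage scheme, sketched below purely for illustration (it is not the paper's scheme), allocates exactly as many elements as a processor owns and stores them contiguously. The helper counts the elements owned by processor q under CYCLIC(b), so a dense local buffer of that size can be allocated.

    #include <stdio.h>

    /* Number of elements of an n-element array owned by processor q under a
     * CYCLIC(b) distribution over p processors; a dense local buffer of exactly
     * this size can then be allocated. */
    static int local_size(int n, int b, int p, int q) {
        int count = 0;
        /* blocks owned by q start at global indices q*b, (p+q)*b, (2p+q)*b, ... */
        for (int g0 = q * b; g0 < n; g0 += p * b) {
            int rest = n - g0;                 /* elements left from g0 onwards */
            count += rest < b ? rest : b;      /* the last block may be partial */
        }
        return count;
    }

    int main(void) {
        int n = 20, b = 3, p = 4;
        for (int q = 0; q < p; q++)
            printf("P%d owns %d elements\n", q, local_size(n, b, p, q));
        return 0;
    }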

Techniques for local set enumeration provide the basis for constructing local iteration sets and communication sets. It is shown that local set enumeration and local storage schemes can be derived from the same equation, and that the two are orthogonal, i.e., they can be freely combined. Moreover, for linear access sequences generated by our enumeration methods, the local address calculations can be moved out of the enumeration loop, yielding efficient local memory address generation.
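To illustrate what moving the address calculation out of the enumeration loop means, the sketch below enumerates the elements owned by one processor under CYCLIC(b) for the simplest case of a unit-stride access: the local address is maintained as a running counter updated with constant increments, rather than recomputed from the global index in every iteration. Names and loop structure are illustrative assumptions, not the paper's generated code.

    #include <stdio.h>

    /* Enumerate the elements of an n-element CYCLIC(b)-distributed array owned
     * by processor `me` (unit-stride access). The local address `addr` is a
     * running counter; it is never rederived from the global index inside the
     * loop. */
    static void local_enumerate(int n, int b, int p, int me) {
        int addr = 0;                                 /* local memory address */
        for (int g0 = me * b; g0 < n; g0 += p * b) {  /* first index of each owned block */
            int len = (n - g0) < b ? (n - g0) : b;    /* the last block may be partial */
            for (int off = 0; off < len; off++, addr++)
                printf("P%d: global %d -> local %d\n", me, g0 + off, addr);
        }
    }

    int main(void) {
        local_enumerate(20, 3, 4, 2);                 /* processor 2 of 4, n = 20, b = 3 */
        return 0;
    }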

The local set enumeration methods are implemented using a relatively simple, general transformation rule for absorbing ownership tests. This transformation rule can be applied repeatedly to absorb multiple ownership tests. Performance figures are presented for local iteration overhead, a simple communication pattern, and storage efficiency.
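A minimal illustration of absorbing an ownership test, again as a generic sketch rather than the paper's transformation rule: a global loop guarded by the owner-computes test is rewritten so that only owned indices are enumerated and no guard is evaluated at run time. All names and parameters below are hypothetical.

    #include <stdio.h>

    static int f(int i) { return 2 * i; }            /* placeholder computation */

    int main(void) {
        enum { N = 20, B = 3, P = 4, ME = 1 };        /* illustrative parameters */
        int a_local[N];                               /* generously sized local buffer */

        /* Before: a global loop guarded by an ownership test (owner-computes rule). */
        for (int i = 0; i < N; i++)
            if ((i / B) % P == ME)                            /* owner(i) == ME ?  */
                a_local[(i / (B * P)) * B + i % B] = f(i);    /* local index of i  */

        /* After: the ownership test is absorbed into the loop bounds, so only the
         * indices owned by ME are enumerated and the run-time guard disappears. */
        int addr = 0;
        for (int g0 = ME * B; g0 < N; g0 += P * B)            /* owned blocks only */
            for (int i = g0; i < g0 + B && i < N; i++)
                a_local[addr++] = f(i);

        printf("a_local[0] = %d\n", a_local[0]);
        return 0;
    }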

[1] "High Performance Fortran Forum," High Performance Fortran Language Specification, ver. 1.1, Nov. 1994.
[2] A.H. Veen and M. de Lange, "Overview of the PREPARE Project," Proc. Fourth Int'l Workshop Compilers for Parallel Computers, H.J. Sips, ed., pp. 365-371, Dec. 1993.
[3] F. Andre, P. Brezany, O. Chéron, W. Denissen, J.L. Pazat, and K. Sanjari, "A New Compiler Technology for Handling HPF Data Parallel Constructs," Proc. Third Workshop on Languages, Compilers, and Run-time Systems, B.K. Szymanski and B. Sinharoy, eds., pp. 279-282, 1995.
[4] R. Ponnusamy,J. Saltz,, and A. Choudhary,“Runtime-compilation techniques for data partitioning and communication schedule reuse,” Proc. Supercomputing’93, pp. 361-370.Los Alamitos, Calif.: IEEE CS Press, Nov. 1993. Also available as Univ. ofMaryland Technical Report CS-TR-3055 and UMIACS-TR-93-32.
[5] P. Brezany, O. Chéron, K. Sanjari, and E. van Konijnenburg, "Processing Irregular Codes Containing Arrays with Multi-Dimensional Distributions by the PREPARE HPF Compiler," HPCN Europe '95, pp. 526-531. Springer-Verlag, 1995.
[6] S. Chatterjee, J. Gilbert, F. Long, R. Schreiber, and S. Tseng, “Generating Local Adresses and Communication Sets for Data Parallel Programs,” J. Parallel and Distributed Computing, vol. 26,pp. 72–84, 1995.
[7] D. Callahan and K. Kennedy, "Compiling Programs for Distributed-Memory Multiprocessors," J. Supercomputing, vol. 2, no. 2, pp. 151-169, Oct. 1988.
[8] M. Gerndt, "Array Distribution in SUPERB," Proc. Third Int'l Conference on Supercomputing,Crete, Greece, June 1989.
[9] A. Rogers and K. Pingali,“Process decomposition through locality of reference,” Proc. SIGPLAN’89 Conf. Program Language Design and Implementation,Portland, Ore., June 1989.
[10] E.M. Paalvast, A.J. van Gemund, and H.J. Sips, "A Method for Parallel Program Generation with an Application to the Booster Language," Proc. 1990 Int'l Conf. Supercomputing, pp. 457-469, June11-15 1990.
[11] C. Koelbel and P. Mehrotra, "Compiling Global Name-Space Parallel Loops for Distributed Execution," IEEE Trans. Parallel and Distributed Systems, vol. 2, no. 10, pp. 440-451, Oct. 1991.
[12] M. Le Fur, J.-L Pazat, and F. Andre, "Static Domain Analysis for Compiling Commutative Loop Nests," Publication interne 757, IRISA, Campus Universitaire de Beaulieu - 35042 Rennes Cedex, France, Sept 1993. URL: ftp//irisa.irisa.fr/techreports/1993/PI-757.ps.Z.
[13] C.-W. Tseng, “An Optimizing Fortran D Compiler for Distributed-Memory Machines,” doctoral thesis, Center for Research on Parallel Computation, Rice Univ., Houston, TX, Jan. 1993.
[14] J.M. Stichnoth,“Efficient compilation of array statements for private memory multicomputers,” Tech. Report CMU-CS-93-109, School of Computer Science, Carnegie Mellon Univ., Feb. 1993.
[15] J. Stichnoth,D. O’Hallaron,, and T. Gross,“Generating communication for array statements: Design, implementation, and evaluation,” J. of Parallel and Distributed Computing, vol. 21, no. 1, pp. 150-159, 1994.
[16] S.K.S. Gupta, S.D. Kaushik, S. Mufti, S. Sharma, C.-H Huang, and P. Sadayappan, "On Compiling Array Expressions for Efficient Execution on Distributed Memory Machines," Proc. Int'l Conf. Parallel Processing, vol. II, pp. 301-305, Aug. 1993.
[17] S.K.S. Gupta, S.D. Kaushik, C.-H. Huang, and P. Sadayappan, "On Compiling Array Expressions for Efficient Execution on Distributed-Memory Machines," Technical Report OSU-CISRC-4/94-TR19, Ohio State Univ., Columbus, OH 43210, 1994. URL: ftp://archive.cis.ohio-state.edu/pub/tech-report/1994/TR19.ps.gz.
[18] S.D. Kaushik, C.-H. Huang, and P. Sadayappan, "Incremental Generation of Index Sets for Array Statement Execution on Distributed-Memory Machines," Proc. Seventh Ann. Workshop Languages and Compilers for Parallel Computing, pp. 17.1-17.18, Cornell Univ., Aug. 1994. Also published in LNCS 892, pp 251-265, 1995, Springer Verlag.
[19] S. Benkner, P. Brezany,, and H. Zima, “Processing Array Statements and Procedure Interfaces in the Prepare High Performance Fortran Compiler,” Proc. Fifth Int'l Conf. Compiler Construction, Apr. 1994.
[20] C. Ancourt, F. Irigoin, F. Coelho, and R. Keryell, "A Linear Algebra Framework for Static HPF Code Distribution," Technical Report A-278-CRI, Ecole des Mines, Paris, Nov. 1995. An earlier version was presented at the Fourth Int'l Workshop on Compilers for Parallel Computers,Delft, The Netherlands, pp. 117-132, Dec. 1993.
[21] S. Hiranandani, K. Kennedy, J. Mellor-Crammey, and A. Sethi, “Compilation Technique for Block-Cyclic Distribution,” Proc. ACM Int'l Conf. Supercomputing, pp. 392-403, July 1994.
[22] A. Thirumalai and J. Ramanujam, “Fast Address Sequence Generation for Data Parallel Programs Using Integer Lattices,” Languages and Compilers for Parallel Computing: Lecture Notes in Computer Science. P. Sadayappan et al., eds., Springer-Verlag, 1996.
[23] K. Kennedy, N. ${\bf Nedeljkovi\acute c}$, and A. Sethi, “A Linear-Time Algorithm for Computing the Memory Access Sequence in Data Parallel Programs,” Proc. Fifth ACM SIGPLAN, Symp. Principles and Practice of Parallel Programming, 1995.
[24] S.D. Kaushik, C.H. Huang,, and P. Sadayappan, “Compiling Array Statements for Efficient Execution on Distributed Memory Machines: Two-Level Mappings,” Proc. Eighth Ann. Workshop Languages and Compilers for Parallel Computing, Aug 1995.
[25] K. Kennedy, N. ${\bf Nedeljkovi\acute c}$, and A. Sethi, “A Linear-Time Algorithm for Computing the Memory Access Sequence in Data Parallel Programs,” Proc. Fifth ACM SIGPLAN, Symp. Principles and Practice of Parallel Programming, 1995.
[26] K. Kennedy, N. Nedeljkovic, and A. Sethi, “Efficient Address Generation for Block-Cyclic Distribution,” Proc. Int'l Conf. Supercomputing, pp. 180-184, July 1995.
[27] C. Koelbel, “Compiler-Time Generation of Communication for Scientific Programs,” Supercomputing '91, pp. 101-110, Nov. 1991.
[28] E.M. Paalvast, H.J. Sips, and L.C. Breebaart, "Booster: A High-Level Language for Portable Parallel Algorithms," Applied Numerical Mathematics, vol. 8, no. 6, pp. 177-192, 1991.
[29] E.M. Paalvast, "Programming for Parallelism and Compiling for Efficiency," PhD thesis, Delft Univ. of Tech nology, June 1992.
[30] E.M. Paalvast, H.J. Sips, and A.J. van Gemund, "Automatic Parallel Program Generation and Optimization from Data Decompositions," Proc. 1991 Int'l Conf. Parallel Processing, pp. II 124-131, Aug. 1991.
[31] Y. Mahéo and J.-L. Pazat, "Distributed Array Management for HPF Compilers," Publication interne 787, IRISA, Campus Universitaire de Beaulieu-35042 Rennes Cedex, France, Dec. 1993. URL: ftp://irisa.irisa.fr/techreports/1993/PI-787.ps.Z.
[32] K.H. Rosen, Elementary Number Theory And Its Applications. Addison Wesley, 1984.
[33] H. Zima and B. Chapman, Supercompilers for Parallel and Vector Computers. ACM Press, 1990.
[34] M. Wolfe, High Performance Compilers for Parallel Computing. Addison-Wesley, 1996.

Index Terms:
HPF, message passing, message aggregation, distributed arrays, parallel computers.
Citation:
Kees van Reeuwijk, Will Denissen, Henk J. Sips, Edwin M.R.M. Paalvast, "An Implementation Framework for HPF Distributed Arrays on Message-Passing Parallel Computer Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 9, pp. 897-914, Sept. 1996, doi:10.1109/71.536935