Ching-Hsien Hsu, Yeh-Ching Chung, Don-Lin Yang, Chyi-Ren Dow, "A Generalized Processor Mapping Technique for Array Redistribution," IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 7, pp. 743-757, July 2001. doi: 10.1109/71.940748
Abstract: In many scientific applications, array redistribution is required to enhance data locality and reduce remote memory access in parallel programs on distributed memory multicomputers. Because the redistribution is performed at runtime, there is a performance trade-off between the efficiency of the new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present a generalized processor mapping technique to minimize the amount of data exchange for BLOCK-CYCLIC(
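As a rough illustration of the cost the abstract refers to, the sketch below counts how many array elements change processors when a one-dimensional array moves from one block-cyclic distribution to another, and then searches for a relabeling of the destination processors that lowers that count. It is not the paper's algorithm: the ownership rule `(i // block) % procs` is the standard block-cyclic definition, but the function names (`owner`, `moved_elements`, `best_mapping`), the block sizes, and the brute-force search over permutations are all assumptions chosen for illustration, and the search is only practical for a handful of processors.

```python
from itertools import permutations


def owner(i, block, procs):
    """Processor that owns global index i under BLOCK-CYCLIC(block) on procs processors."""
    return (i // block) % procs


def moved_elements(n, src_block, dst_block, procs, mapping=None):
    """Count elements whose source and destination processors differ.

    mapping[q] is the physical processor assigned to logical destination
    processor q (hypothetical convention); None means the identity mapping.
    """
    if mapping is None:
        mapping = tuple(range(procs))
    return sum(
        1
        for i in range(n)
        if owner(i, src_block, procs) != mapping[owner(i, dst_block, procs)]
    )


def best_mapping(n, src_block, dst_block, procs):
    """Brute-force the relabeling that moves the fewest elements (small procs only)."""
    return min(
        permutations(range(procs)),
        key=lambda m: moved_elements(n, src_block, dst_block, procs, m),
    )


if __name__ == "__main__":
    n, src_block, dst_block, procs = 12, 2, 4, 3
    identity_cost = moved_elements(n, src_block, dst_block, procs)
    mapping = best_mapping(n, src_block, dst_block, procs)
    mapped_cost = moved_elements(n, src_block, dst_block, procs, mapping)
    print(identity_cost, mapping, mapped_cost)
```

For 12 elements on 3 processors, going from block size 2 to block size 4 moves 8 elements under the identity mapping and 6 under the relabeling `(0, 2, 1)`, which shows in miniature why the choice of processor mapping affects the amount of data exchanged.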

