Chris H.Q. Ding, "An Optimal Index Reshuffle Algorithm for Multidimensional Arrays and Its Applications for Parallel Architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 3, pp. 306-315, March 2001. DOI: 10.1109/71.914776. IEEE Computer Society, Los Alamitos, CA, USA. ISSN 1045-9219.

Keywords: multidimensional arrays, index reshuffle, vacancy tracking cycles, global exchange, dynamic remapping.
Abstract: Reshuffling the elements of a multidimensional array according to an index operation traditionally requires an auxiliary buffer of the same size as the original array. Here, we describe a new in-place algorithm that uses vacancy tracking cycles with minimum memory access. It eliminates the buffer array and the associated copy-back, speeding up the reshuffle significantly for large arrays. The algorithm can be parallelized with a multithreaded approach on shared-memory multiprocessor computers. On distributed-memory multiprocessor computers, the index reshuffle of a distributed multidimensional array amounts to a remapping of processor domains and is carried out by combining the in-place local algorithm with a global exchange algorithm. Implementation and test results on the CRAY T3E and IBM SP indicate the effectiveness of the algorithm.
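To make the in-place idea concrete, the following is a minimal sketch of a cycle-following transpose of a two-dimensional array stored as a flat row-major list. It is an illustration in the spirit of vacancy tracking, not the paper's own implementation: the function names `source_index` and `transpose_in_place` are hypothetical, the restriction to two dimensions is an assumption, and the way cycles are located (by checking whether the starting slot is the smallest index on its cycle) is a simple stand-in chosen to keep extra memory constant.

```python
def source_index(d, rows, cols):
    """Return the original flat index of the element that belongs in slot d.

    Assumes a rows x cols row-major matrix being transposed to cols x rows.
    Slot d of the transposed layout holds original element (i, j).
    """
    i, j = d % rows, d // rows
    return i * cols + j


def transpose_in_place(a, rows, cols):
    """Transpose a flat row-major rows x cols matrix in place (hypothetical helper)."""
    n = rows * cols
    assert len(a) == n
    for start in range(n):
        # Walk the cycle until we return to start or meet a smaller index.
        # A smaller index means this cycle was already processed.
        k = source_index(start, rows, cols)
        while k > start:
            k = source_index(k, rows, cols)
        if k < start:
            continue

        # Lift one element out, which opens a vacancy at `start`.
        saved = a[start]
        vacancy = start
        src = source_index(vacancy, rows, cols)
        # Fill the vacancy from its source; the vacancy then moves to the source.
        while src != start:
            a[vacancy] = a[src]
            vacancy = src
            src = source_index(vacancy, rows, cols)
        # The last vacancy receives the element lifted out at the beginning.
        a[vacancy] = saved
    return a


if __name__ == "__main__":
    m = [0, 1, 2, 3, 4, 5]           # 2 x 3 matrix [[0, 1, 2], [3, 4, 5]]
    print(transpose_in_place(m, 2, 3))
```

Each element is written exactly once and only a single temporary is held, so no buffer array or copy-back is required. The leader check here trades extra index arithmetic for constant memory; a production version would more likely precompute or otherwise record the cycle starting points.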

