Multiphase Complete Exchange: A Theoretical Analysis
February 1996 (vol. 45 no. 2)
pp. 220-229

Abstract: Complete Exchange requires each of $N$ processors to send a unique message to each of the remaining $N - 1$ processors. For a circuit-switched hypercube with $N = 2^d$ processors, the Direct and Standard algorithms for Complete Exchange are time-optimal for very large and very small message sizes, respectively. For intermediate sizes, a hybrid Multiphase algorithm is better. This carries out Direct exchanges on a set of subcubes whose dimensions are a partition of the integer $d$. The best such algorithm for a given message size $m$ could hitherto only be found by enumerating all partitions of $d$.
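
To give a feel for why enumerating every partition of d becomes costly, the following minimal Python sketch lists and counts the integer partitions of d. The function names are illustrative assumptions and do not come from the paper, and the sketch says nothing about the run time of any particular Multiphase algorithm.

```python
from typing import Iterator, Optional, Tuple


def partitions(d: int, largest: Optional[int] = None) -> Iterator[Tuple[int, ...]]:
    """Yield every partition of d as a non-increasing tuple of positive parts."""
    if largest is None:
        largest = d
    if d == 0:
        yield ()
        return
    # Choose the first (largest) part, then partition the remainder
    # using parts no larger than the one just chosen.
    for first in range(min(d, largest), 0, -1):
        for rest in partitions(d - first, first):
            yield (first,) + rest


def count_partitions(d: int) -> int:
    """Count the partitions of d without listing them (standard dynamic program)."""
    ways = [1] + [0] * d  # ways[t] = number of partitions of t found so far
    for part in range(1, d + 1):
        for total in range(part, d + 1):
            ways[total] += ways[total - part]
    return ways[d]
```

For a hypercube of dimension d = 4, `partitions(4)` yields five candidate subcube-dimension sets; `count_partitions` shows how quickly that number grows as d increases.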

The Multiphase algorithm is analyzed assuming a high performance communication network. It is proved that only algorithms corresponding to equipartitions of $d$ (partitions in which the maximum and minimum elements differ by at most one) can possibly be optimal. The run times of these algorithms plotted against $m$ form a hull of optimality. It is proved that, although there is an exponential number of partitions, 1) the number of faces on this hull is $\Theta(\sqrt{d})$, 2) the hull can be found in $\Theta(\sqrt{d})$ time, and 3) once it has been found, the optimal algorithm for any given $m$ can be found in $\Theta(\log d)$ time.
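
The definition of an equipartition can be made concrete with a short Python sketch. It relies only on the definition quoted above (maximum and minimum parts differ by at most one); the function names are hypothetical, and the code does not model run times or construct the hull of optimality.

```python
from typing import List, Tuple


def equipartition(d: int, k: int) -> Tuple[int, ...]:
    """Return the partition of d into k parts whose sizes differ by at most one."""
    if not 1 <= k <= d:
        raise ValueError("k must satisfy 1 <= k <= d")
    q, r = divmod(d, k)
    # r parts of size q + 1 followed by k - r parts of size q sum to d.
    return (q + 1,) * r + (q,) * (k - r)


def equipartitions(d: int) -> List[Tuple[int, ...]]:
    """Return all equipartitions of d, one for each possible number of parts."""
    return [equipartition(d, k) for k in range(1, d + 1)]
```

Because there is exactly one equipartition for each number of parts k, a hypercube of dimension d has only d such candidates. For example, `equipartition(7, 3)` gives `(3, 2, 2)`, while a partition such as `(3, 1)` of d = 4 is excluded because its parts differ by two.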

These results provide a very fast technique for minimizing communication overhead in many important applications, such as matrix transpose, fast Fourier transform, and alternating direction implicit (ADI) methods.

Index Terms:
Circuit switching, complete exchange, communication overhead, hypercube, multiphase algorithm, parallel computing.
Citation:
Shahid H. Bokhari, "Multiphase Complete Exchange: A Theoretical Analysis," IEEE Transactions on Computers, vol. 45, no. 2, pp. 220-229, Feb. 1996, doi:10.1109/12.485374