Computing Global Combine Operations in the Multiport Postal Model
August 1995 (vol. 6 no. 8)
pp. 896-900

Abstract: Consider a message-passing system of n processors, in which each processor initially holds one piece of data. The goal is to compute an associative and commutative reduction function on the n pieces of data and to make the result known to all n processors. This operation is frequently used in message-passing systems and is typically referred to as global combine, census computation, or gossiping. This paper explores the problem of global combine in the multiport postal model. The model is characterized by three parameters: n, the number of processors; k, the number of ports per processor; and λ, the communication latency. In this model, in every round r, each processor can send k distinct messages to k other processors, and it can receive k messages that were sent from k other processors λ − 1 rounds earlier. This paper provides an optimal algorithm for the global combine problem that uses the minimum number of communication rounds and minimizes the time spent by any processor in sending and receiving messages.
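
To make the model concrete, here is a minimal simulation sketch of a global combine under two simplifying assumptions that are not taken from the paper: the latency is λ = 1 (a message is received in the round in which it is sent), and n is an exact power of k + 1. It uses a generalized recursive-doubling schedule, which is a standard technique and not necessarily the optimal algorithm the paper presents. The function name `global_combine` and its parameters are hypothetical.

```python
from functools import reduce
import operator


def global_combine(values, k, op=operator.add):
    """Simulate a k-port global combine with latency 1 (an assumption).

    values: one initial datum per processor (n = len(values)).
    k:      number of ports per processor.
    op:     associative and commutative reduction function.
    Returns (final value held by each processor, rounds used).
    Requires n to be a power of k + 1 (a restriction of this sketch).
    """
    n = len(values)
    if n == 0 or k < 1:
        raise ValueError("need at least one processor and one port")

    # Check the power-of-(k+1) restriction before simulating.
    power = 1
    while power < n:
        power *= k + 1
    if power != n:
        raise ValueError("this sketch requires n to be a power of k + 1")

    partial = list(values)
    rounds = 0
    stride = 1
    while stride < n:
        # Each processor i sends its partial result on its k ports to
        # processors i + m * stride (mod n) for m = 1..k, so every
        # processor also receives exactly k messages in the round.
        incoming = [[] for _ in range(n)]
        for i in range(n):
            for m in range(1, k + 1):
                incoming[(i + m * stride) % n].append(partial[i])
        # Fold the k received partial results into the local one.
        partial = [reduce(op, incoming[i], partial[i]) for i in range(n)]
        stride *= k + 1
        rounds += 1
    return partial, rounds


if __name__ == "__main__":
    result, rounds = global_combine(list(range(9)), k=2)
    print(result, rounds)
```

With 9 processors and k = 2, every processor ends up holding the sum of all nine inputs after two rounds, because each source index is reached through exactly one base-(k + 1) combination of strides. Handling λ > 1 or arbitrary n requires the more careful scheduling that the paper addresses.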

Index Terms:
Census computation, distributed systems, global combine, gossiping, message-passing systems, multiple ports, parallel computers, postal model.
Citation:
Amotz Bar-Noy, Jehoshua Bruck, Ching-Tien Ho, Shlomo Kipnis, Baruch Schieber, "Computing Global Combine Operations in the Multiport Postal Model," IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 8, pp. 896-900, Aug. 1995, doi:10.1109/71.406965