Performance-Based Path Determination for Interprocessor Communication in Distributed Computing Systems
March 1999 (vol. 10 no. 3)
pp. 316-327

Abstract—The different types of messages used by a parallel application program executing in a distributed computing system can each have unique characteristics so that no single communication network can produce the lowest latency for all messages. For instance, short control messages may be sent with the lowest overhead on one type of network, such as Ethernet, while bulk data transfers may be better suited to a different type of network, such as Fibre Channel or HiPPI. This work investigates how to exploit multiple heterogeneous communication networks that interconnect the same set of processing nodes using a set of techniques we call performance-based path determination (PBPD) [9], [10]. The performance-based path selection (PBPS) technique selects the best (lowest latency) network among several for each individual message to reduce the communication overhead of parallel programs. The performance-based path aggregation (PBPA) technique, on the other hand, aggregates multiple networks into a single virtual network to increase the available bandwidth. We test the PBPD techniques on a cluster of SGI multiprocessors interconnected with Ethernet, Fibre Channel, and HiPPI networks using a custom communication library built on top of the TCP/IP protocol layers. We find that PBPS can reduce communication overhead in applications compared to using either network alone, while aggregating networks into a single virtual network can reduce communication latency for bandwidth-limited applications. The performance of the PBPD techniques depends on the mix of message sizes in the application program and the relative overheads of the networks, as demonstrated in our analytical models.
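The two techniques named in the abstract can be illustrated with a minimal sketch. It assumes a simple linear cost model in which sending a message costs a fixed per-message overhead plus its size divided by the bandwidth; this model is an illustrative stand-in, not the analytical model from the paper. The `Network` class, the functions `select_path` and `aggregate_split`, and all overhead and bandwidth figures are hypothetical and are not measurements from the SGI cluster. Path selection picks the network with the lowest estimated latency for a given message size; path aggregation splits one message across networks so that all pieces finish at the same time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    name: str
    overhead_s: float     # fixed per-message startup cost in seconds (assumed value)
    bandwidth_Bps: float  # sustained bandwidth in bytes per second (assumed value)

    def latency(self, size_bytes: float) -> float:
        """Estimated one-way latency under the assumed linear cost model."""
        return self.overhead_s + size_bytes / self.bandwidth_Bps


def select_path(networks, size_bytes):
    """Path selection sketch: choose the network with the lowest estimated latency."""
    return min(networks, key=lambda n: n.latency(size_bytes))


def aggregate_split(networks, size_bytes):
    """Path aggregation sketch: split one message so every piece finishes together.

    Solves sum_i B_i * (T - o_i) = size for the common finish time T, and drops
    any network whose startup overhead alone is not smaller than T.
    Returns (bytes per network name, finish time in seconds).
    """
    active = sorted(networks, key=lambda n: n.overhead_s)
    while True:
        total_bw = sum(n.bandwidth_Bps for n in active)
        weighted = sum(n.bandwidth_Bps * n.overhead_s for n in active)
        finish = (size_bytes + weighted) / total_bw
        # Stop once the slowest-starting network still contributes a positive share.
        if len(active) == 1 or active[-1].overhead_s < finish:
            break
        active.pop()
    shares = {n.name: n.bandwidth_Bps * (finish - n.overhead_s) for n in active}
    return shares, finish


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show a crossover between networks.
    nets = [
        Network("Ethernet", overhead_s=0.0005, bandwidth_Bps=1.25e6),
        Network("HiPPI", overhead_s=0.002, bandwidth_Bps=1.0e8),
    ]
    for size in (64, 1_000_000):
        best = select_path(nets, size)
        shares, finish = aggregate_split(nets, size)
        print(size, best.name, sorted(shares), finish)
```

Under these assumed parameters the low-overhead network wins for short messages and the high-bandwidth network wins for bulk transfers, which matches the dependence on message-size mix and relative network overheads described in the abstract. Aggregation only helps once the message is large enough that the slower-starting network can contribute a positive share.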

[1] D. Bailey, E. Barszcz, J. Barton, D. Browning, R. Carter, L. Dagum, R. Fatoohi, S. Fineberg, P. Frederickson, T. Lasinski, R. Schreiber, H. Simon, V. Venkatakrishnan, and S. Weeratunga, “The NAS Parallel Benchmarks,” NAS Report RNR-94-007, Mar. 1994.
[2] F. Berman, R. Wolski, S. Figueira, J. Schopf, and G. Shao, “Application-Level Scheduling on Distributed Heterogeneous Networks,” Proc. Supercomputing, 1996.
[3] I. Foster, J. Geisler, C. Kesselman, and S. Tuecke, “Multimethod Communication for High-Performance Metacomputing Applications,” Proc. Supercomputing, 1996.
[4] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam, PVM: Parallel Virtual Machine: A Users' Guide and Tutorial for Networked Parallel Computing. The MIT Press, 1994.
[5] J. Hsieh, D.H.C. Du, N.J. Troullier, and M. Lin, “Enhanced PVM Communications Over a HIPPI Networks,” Proc. Second Int'l Workshop High-Speed Network Computing, Apr. 1996.
[6] J.-M. Hsu and P. Banerjee, “Performance Measurement and Trace Driven Simulation of Parallel CAD and Numeric Applications on a Hypercube Multicomputer,” Proc. Int'l Symp. Computer Architecture, pp. 260-269, 1990.
[7] C. Huang, E.P. Kasten, and P.K. McKinley, “Design and Implementation of Multicast Operations for ATM-Based High Performance Computing,” Proc. Supercomputing '94, pp. 164-173, Aug. 1994.
[8] A.A. Khokhar, V.K. Prasanna, M.E. Shaaban, and C.L. Wang, “Heterogeneous Computing: Challenges and Opportunities,” Computer, pp. 18-27, June 1993.
[9] J. Kim and D.J. Lilja, “Exploiting Multiple Heterogeneous Networks to Reduce Communication Costs in Parallel Programs,” Proc. Heterogeneous Computing Workshop, Int'l Parallel Processing Symp., pp. 83-95, Apr. 1997.
[10] J. Kim and D.J. Lilja, “Utilizing Heterogeneous Networks in Distributed Parallel Computing Systems,” Proc. Int'l Symp. High Performance Distributed Computing, pp. 336-345, Aug. 1997.
[11] J. Kim and D.J. Lilja, “Characterization of Communication Patterns in Message-Passing Parallel Scientific Application Programs,” Proc. Workshop Communication, Architecture, and Applications for Network-Based Parallel Computing, Int'l Symp. High-Performance Computer Architecture, pp. 202-216, Jan. 1998.
[12] D.J. Lilja, “Partitioning Tasks Between a Pair of Interconnected Heterogeneous Processors: A Case Study,” Concurrency: Practice and Experience, pp. 209-223, May 1995.
[13] Message Passing Interface Forum, MPI: A Message-Passing Interface Standard, Version 1.1, June 1995.
[14] S. Nog and D. Kotz, “A Performance Comparison of TCP/IP and MPI on FDDI, Fast Ethernet, and Ethernet,” PCS-TR95-273, Dept. of Computer Science, Dartmouth College, 1995.
[15] L. Smarr and C.E. Catlett, “Metacomputing,” Comm. ACM, pp. 44-52, June 1992.
[16] R. Stevens, Unix Network Programming. Englewood Cliffs, N.J.: Prentice Hall, 1990.
[17] C.A. Thekkath and H.M. Levy, “Limits to Low-Latency Communication on High-Speed Networks,” ACM Trans. Computer Systems, pp. 179-203, May 1993.
[18] D. Tolmie and J. Renwick, “HiPPI: Simplicity Yields Success,” IEEE Network, pp. 28-32, Jan. 1993.
[19] A. Wolman, G. Voelker, and C.A. Thekkath, “Latency Analysis of TCP on an ATM Network,” Proc. USENIX, pp. 167-179, 1994.

Index Terms:
Cluster of workstations, network computing, performance estimation, network characteristics, heterogeneous computing.
JunSeong Kim, David J. Lilja, "Performance-Based Path Determination for Interprocessor Communication in Distributed Computing Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 3, pp. 316-327, March 1999, doi:10.1109/71.755832