Performance Evaluation of Deterministic Routings, Multicasts, and Topologies on RHiNET-2 Cluster
August 2005 (vol. 16 no. 8)
pp. 747-759

Abstract: System area networks (SANs), which usually accept arbitrary topologies, are used to connect nodes in PC/WS clusters and high-performance storage systems. Although deadlock-free routings, multicasts, and topologies for SANs have been widely developed, they have rarely been evaluated on real PC clusters. Evaluating them in real systems is therefore important, both to analyze their impact on the whole system and to validate simulation results. In this paper, we implement and evaluate deadlock-free routings and unicast-based multicasts under various topologies and channel buffer sizes on RHiNET-2, a PC cluster with 64 hosts. Execution results show that descending layers (DL) routing and structured channel pools improve bandwidth by up to 57 percent and barrier synchronization time by up to 34 percent compared with up*/down* routing. They also show that visiting hosts in numerical order improves the execution time of unicast-based barrier synchronization by up to 28 percent compared with visiting them in random order. However, channel buffer sizes do not affect the bandwidth in the RHiNET-2 cluster. In addition to this fundamental evaluation, we appraise them using the NAS Parallel Benchmarks, where DL routing improves execution time by 3.2 percent compared with up*/down* routing.
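
For readers unfamiliar with the up*/down* baseline named in the abstract, the following is a minimal sketch of its general rule, not the RHiNET-2 implementation: a breadth-first spanning tree assigns each switch a level, a hop toward the root counts as "up" (ties broken by switch ID), and a route is legal only if it never takes an up hop after a down hop. The four-switch topology and all function names here are hypothetical, chosen only for illustration.

```python
from collections import deque


def bfs_levels(adj, root):
    """Return the BFS depth of every switch reachable from root."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level


def is_up(u, v, level):
    """True if the hop u -> v moves toward the root.

    The 'up' end of a link is the switch with the smaller level;
    equal levels are ordered by the smaller switch ID.
    """
    return (level[v], v) < (level[u], u)


def is_legal_path(path, level):
    """A legal up*/down* path is zero or more up hops, then zero or more down hops."""
    gone_down = False
    for u, v in zip(path, path[1:]):
        if is_up(u, v, level):
            if gone_down:
                return False  # down -> up turn is prohibited
        else:
            gone_down = True
    return True


# Hypothetical topology: switches 0..3 connected in a ring 0-1-3-2-0.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
level = bfs_levels(adj, root=0)

legal_via_root = is_legal_path([1, 0, 2], level)   # up, then down
illegal_detour = is_legal_path([1, 3, 2], level)   # down, then up
```

In this sketch the route from switch 1 to switch 2 must pass through the root, even though the route through switch 3 is equally short. Forbidding the down-to-up turn is what breaks cyclic channel dependencies, at the cost of concentrating traffic near the root.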


Index Terms:
Deterministic routing, multicast, topology, performance evaluation, system area networks, RHiNET, interconnection networks, PC clusters.
Citation:
Michihiro Koibuchi, Konosuke Watanabe, Tomohiro Otsuka, Hideharu Amano, "Performance Evaluation of Deterministic Routings, Multicasts, and Topologies on RHiNET-2 Cluster," IEEE Transactions on Parallel and Distributed Systems, vol. 16, no. 8, pp. 747-759, Aug. 2005, doi:10.1109/TPDS.2005.97