Applying In-Transit Buffers to Boost the Performance of Networks with Source Routing
September 2003 (vol. 52 no. 9)
pp. 1134-1153
José Flich, IEEE
Pedro López, IEEE Computer Society
José Duato, IEEE

Abstract: Clusters of workstations (COWs) are becoming increasingly popular as a cost-effective alternative to parallel computers. In these systems, processors are connected using irregular topologies, which provide the wiring flexibility, scalability, and incremental expansion capability required in this environment. Myrinet is one of the most popular interconnection networks for COWs. It uses source routing and wormhole switching, and its routes are built with the up*/down* routing algorithm. In Myrinet, network behavior is controlled by the software running at the network interfaces, so new features such as new routing algorithms can be added by changing only this software. In previous work, we proposed the In-Transit Buffer (ITB) mechanism to improve the performance of source-routed networks. The ITB mechanism temporarily ejects packets from the network at some intermediate hosts and later reinjects them, performing a special kind of virtual cut-through switching at these hosts. We applied this mechanism to up*/down* routing in order to remove the forbidden down → up channel dependences that prevent minimal routing between every pair of hosts. Results showed that network throughput can be more than doubled on medium-sized (32-switch) networks. In this paper, we analyze in depth the effect of using ITBs in the network, showing that they not only guarantee minimal routing but are also a powerful mechanism for balancing network traffic and reducing network contention. To demonstrate these capabilities, we apply the ITB mechanism to improved routing schemes such as DFS and smart-routing. Without ITBs, these routing algorithms improve on the performance of up*/down* by 30 percent and 90 percent, respectively, for a 32-switch network.
The evaluation results show that, when ITBs are used together with these improved routing algorithms, the network throughput achieved by DFS and smart-routing can be improved by a further 56 percent and 23 percent, respectively. However, the time smart-routing needs to compute the routing tables grows rapidly with network size, making it impractical to build networks with more than 32 switches. This high computational cost stems mainly from the need to obtain deadlock-free routing tables. When ITBs are used, however, the stages of computing routing tables and breaking cycles can be decoupled. Moreover, as stated above, ITBs can be used to reduce network contention. We therefore also propose a completely new routing algorithm that tries to balance network traffic using a simple strategy with low computation time. The proposed algorithm guarantees deadlock freedom and reduces network contention through the use of ITBs. The evaluation results show that our algorithm obtains unprecedented throughput in 32-switch networks, tripling that of the original up*/down* and almost doubling that of smart-routing.
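
The way ITBs remove down → up dependences can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`bfs_levels`, `is_up`, `split_at_itbs`) and the 5-switch topology are hypothetical, and the sketch assumes the usual up*/down* link labeling (the "up" end of a link is the switch closer to the root of a BFS spanning tree, with ties going to the lower switch ID). It also assumes that a host is attached to every switch where an ITB is needed.

```python
from collections import deque

def bfs_levels(adj, root):
    """Return the BFS distance of every switch from the spanning-tree root."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level

def is_up(u, v, level):
    """A hop u -> v is 'up' if v is closer to the root; ties go to the lower ID."""
    return (level[v], v) < (level[u], u)

def split_at_itbs(path, level):
    """Split a switch path into segments that are each legal under up*/down*.

    A down -> up transition is forbidden, so the packet is ejected at a host
    attached to the switch where it occurs (an in-transit buffer) and then
    reinjected from there as the start of a new segment.
    """
    segments = [[path[0]]]
    gone_down = False
    for u, v in zip(path, path[1:]):
        up = is_up(u, v, level)
        if up and gone_down:
            segments.append([u])   # ITB at a host attached to switch u
            gone_down = False
        if not up:
            gone_down = True
        segments[-1].append(v)
    return segments

# Hypothetical 5-switch ring: links 0-1, 0-2, 1-3, 2-4, 3-4; root is switch 0.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1, 4], 4: [2, 3]}
level = bfs_levels(adj, 0)

legal = split_at_itbs([1, 0, 2], level)     # up then down: no ITB needed
minimal = split_at_itbs([3, 4, 2], level)   # down then up: one ITB at switch 4
```

In this example the minimal path 3 → 4 → 2 is forbidden by plain up*/down*, which would have to use the longer route 3 → 1 → 0 → 2. With one ITB at switch 4, the minimal path becomes two legal segments.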

Index Terms:
Networks of workstations, irregular topologies, wormhole switching, minimal routing, source routing.
José Flich, Pedro López, Manuel Perez Malumbres, José Duato, Tomas Rokicki, "Applying In-Transit Buffers to Boost the Performance of Networks with Source Routing," IEEE Transactions on Computers, vol. 52, no. 9, pp. 1134-1153, Sept. 2003, doi:10.1109/TC.2003.1228510