HIPIQS: A High-Performance Switch Architecture Using Input Queuing
March 2002 (vol. 13 no. 3)
pp. 275-289

Switch-based interconnects are used in a number of application domains, including parallel system interconnects, local area networks, and wide area networks. However, very few switches have been designed that are suitable for more than one of these application domains. Such a switch must offer both extremely low latency and very high throughput for a variety of message sizes. While some architectures with output queuing have been shown to perform extremely well in terms of throughput, their performance can suffer when used in systems where a significant portion of the packets are extremely small. On the other hand, architectures with input queuing either offer limited throughput or require fairly complex, centralized arbitration that increases latency. In this paper, we present a new input queue-based switch architecture called HIPIQS (HIgh-Performance Input-Queued Switch). It offers low latency for a range of message sizes and provides throughput comparable to that of output queuing approaches. Furthermore, it allows simple and distributed arbitration. HIPIQS uses a dynamically allocated multiqueue organization, pipelined access to multibank input buffers, and small cross-point buffers to deliver high performance. Our simulation results show that HIPIQS can deliver performance close to that of output queuing approaches over a range of message sizes, system sizes, and traffic. The switch architecture can therefore be used to build high-performance switches that are useful both for parallel system interconnects and for computer networks.
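
As a rough illustration of the "dynamically allocated multiqueue organization" mentioned in the abstract, the sketch below models a single input port whose buffer slots are drawn from one shared pool and linked into a separate FIFO per output port. It is a toy model written under stated assumptions, not the HIPIQS design itself: the class name `MultiQueueInputBuffer`, its methods, and the slot-per-packet granularity are all hypothetical, and the pipelined multibank access and cross-point buffers described in the paper are not modeled.

```python
from collections import deque


class MultiQueueInputBuffer:
    """Toy model of one input port's buffer (hypothetical names).

    All slots live in one shared pool; each output port has its own
    FIFO of slot indices, so queue lengths grow and shrink on demand.
    """

    def __init__(self, num_outputs, num_slots):
        self.free_slots = deque(range(num_slots))  # unallocated slot indices
        self.slots = [None] * num_slots            # packet storage
        self.queues = [deque() for _ in range(num_outputs)]

    def enqueue(self, packet, output):
        """Store a packet bound for `output`; return False if the pool is full."""
        if not self.free_slots:
            return False
        slot = self.free_slots.popleft()
        self.slots[slot] = packet
        self.queues[output].append(slot)
        return True

    def has_packet_for(self, output):
        """True if at least one packet is waiting for `output`."""
        return bool(self.queues[output])

    def dequeue(self, output):
        """Remove and return the oldest packet bound for `output`."""
        slot = self.queues[output].popleft()
        packet = self.slots[slot]
        self.slots[slot] = None
        self.free_slots.append(slot)  # slot returns to the shared pool
        return packet


# A packet for output 1 can leave even though an older packet for
# output 0 is still waiting, which a single FIFO would not allow.
buf = MultiQueueInputBuffer(num_outputs=2, num_slots=4)
buf.enqueue("A", 0)
buf.enqueue("B", 1)
assert buf.dequeue(1) == "B"
```

Keeping one queue per output inside each input buffer is what lets a blocked packet avoid delaying packets headed elsewhere, while sharing the slot pool avoids statically partitioning the buffer among outputs.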

Index Terms:
parallel architectures, switch/router design, high-speed interconnects, interconnection networks, networks of workstations
Rajeev Sivaram, Craig B. Stunkel, Dhabaleswar K. Panda, "HIPIQS: A High-Performance Switch Architecture Using Input Queuing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 275-289, March 2002, doi:10.1109/71.993207