PBC: A Partially Buffered Crossbar Packet Switch
November 2009 (vol. 58 no. 11)
pp. 1568-1581
Lotfi Mhamdi, Delft University of Technology, Delft
The crossbar fabric is widely used as the interconnect of high-performance packet switches due to its low cost and scalability. There are two main variants of the crossbar fabric: unbuffered and internally buffered. On one hand, unbuffered crossbar fabric switches have the advantage of using no internal buffers. However, they require a complex scheduler to resolve input and output port contention. Internally buffered crossbar fabric switches, on the other hand, overcome the scheduling complexity by means of distributed schedulers. However, they require expensive internal buffers, one per crosspoint. In this paper, we propose a novel architecture, namely, the Partially Buffered Crossbar (PBC) switching architecture, where a small number of separate internal buffers are maintained per output. Our goal is to design a PBC switch having the performance of buffered crossbars and a cost comparable to that of unbuffered crossbars. We propose a class of round robin scheduling algorithms for the PBC architecture. Simulation results show that using as few as eight buffers per fabric column, irrespective of the number N of input ports of the switch, we can achieve performance similar to that of buffered crossbars that use N buffers per fabric output.
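To make the cost argument concrete, the following is a minimal sketch of the buffer counts implied by the abstract. It assumes a square switch with N inputs and N outputs, which the abstract does not state explicitly. The function names `crosspoint_buffers` and `pbc_buffers` are hypothetical and chosen only for illustration. The sketch counts buffers only and says nothing about buffer size or scheduler cost.

```python
def crosspoint_buffers(n_ports: int) -> int:
    """Fully buffered crossbar: one buffer per crosspoint.

    Assumes an N x N switch, so there are N * N crosspoints.
    """
    return n_ports * n_ports


def pbc_buffers(n_ports: int, per_output: int = 8) -> int:
    """Partially buffered crossbar: a fixed number of buffers per output column.

    The default of 8 is the figure quoted in the abstract.
    """
    return n_ports * per_output


if __name__ == "__main__":
    # Compare total internal buffer counts as the port count grows.
    for n in (8, 16, 32, 64):
        full = crosspoint_buffers(n)
        partial = pbc_buffers(n)
        print(f"N={n:3d}  fully buffered={full:5d}  PBC={partial:4d}")
```

Under these assumptions the fully buffered count grows quadratically with N while the PBC count grows linearly, so the two coincide at N = 8 and diverge for larger switches.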

Index Terms:
Crossbar fabrics, partially buffered crossbars, scheduling.
Lotfi Mhamdi, "PBC: A Partially Buffered Crossbar Packet Switch," IEEE Transactions on Computers, vol. 58, no. 11, pp. 1568-1581, Nov. 2009, doi:10.1109/TC.2009.65