13th Symposium on High Performance Interconnects (HOTI'05)
Addressing Queuing Bottlenecks at High Speeds
Stanford, California, USA
August 17-19, 2005
ISBN: 0-7695-2449-4
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CONECT.2005.7
Modern routers and switch fabrics can have hundreds of input and output ports running at up to 10 Gb/s; 40 Gb/s systems are starting to appear. At these rates, the performance of the buffering and queuing subsystem becomes a significant bottleneck. In high-performance routers with more than a few queues, packet buffering is typically implemented using DRAM for data storage, off-chip SRAM for the linked-list nodes and packet lengths, and on-chip SRAM for the queue headers. This paper focuses on the performance bottlenecks associated with the use of off-chip SRAM. We show how the combination of implicit buffer pointers and multi-buffer list nodes can dramatically reduce the impact of the buffering and queuing subsystem on queuing performance. We also show how combining these techniques with coarse-grained scheduling can improve the performance of fair queuing algorithms, while also reducing the amount of off-chip memory and bandwidth needed. Assuming an aggregation degree of 16, these techniques can reduce the amount of SRAM needed to hold the list nodes by a factor of 10, at the cost of wasting about 10% of the DRAM space.
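The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 64-byte buffer size, the function names, and the exact address layout are assumptions made here for illustration only. The sketch assumes that each list node covers a fixed block of consecutive DRAM buffers (a multi-buffer node), so the DRAM address follows from the node index and no buffer pointer has to be stored in SRAM (an implicit pointer).

```python
import math

BUFFER_BYTES = 64   # assumed DRAM buffer size (hypothetical value)
DEGREE = 16         # aggregation degree quoted in the abstract


def buffer_address(node_index: int, slot: int,
                   degree: int = DEGREE,
                   buffer_bytes: int = BUFFER_BYTES) -> int:
    """Derive a DRAM byte address from a list-node index and a slot.

    The pointer is implicit: node i is assumed to own buffers
    i*degree .. i*degree + degree - 1, so nothing is stored per buffer.
    """
    if node_index < 0:
        raise ValueError("node_index must be non-negative")
    if not 0 <= slot < degree:
        raise ValueError("slot must lie in [0, degree)")
    return (node_index * degree + slot) * buffer_bytes


def nodes_needed(num_buffers: int, degree: int = DEGREE) -> int:
    """Number of list nodes needed to cover num_buffers buffers of one queue."""
    if num_buffers < 0:
        raise ValueError("num_buffers must be non-negative")
    return math.ceil(num_buffers / degree)


def unused_buffers(num_buffers: int, degree: int = DEGREE) -> int:
    """Buffers reserved by the last, partially filled node but not used."""
    return nodes_needed(num_buffers, degree) * degree - num_buffers
```

Under these assumptions, a queue holding 17 buffers needs 2 list nodes instead of 17 and leaves 15 reserved buffers unused. This is the kind of trade-off between SRAM savings and DRAM wastage that the abstract quantifies, although the paper's exact figures depend on details not given here.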
Citation:
Sailesh Kumar, Jonathan Turner, Patrick Crowley, "Addressing Queuing Bottlenecks at High Speeds," hoti, pp.107-113, 13th Symposium on High Performance Interconnects (HOTI'05), 2005