Dionisios Pnevmatikatos, FORTH-ICS, Heraklion
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MM.2014.56
High-radix single-chip routers have emerged as efficient building blocks for interconnection networks. It is widely believed that at high radices hierarchical switch architectures are needed, because crossbar cost scales with the square of the router radix. This article proposes a novel micro-architecture that allows flat crossbar switches to scale to 128 ports supporting 32 Gb/s/port while occupying 4.9 mm² and consuming 4.2 W, or supporting 64 Gb/s/port at 7.5 mm² and 7.5 W, in 45 nm CMOS. Key features include deep crossbar pipelining to cope with wire delay, a novel cross scheduler architecture to reduce wiring complexity, and catalytic custom gate placement within standard Electronic Design Automation (EDA) flows. It is also shown that, on chip, crossbar speedup with Combined Input-Output Queuing (CIOQ) is better than Hierarchical Queueing (HQ), providing top performance with orders of magnitude lower memory cost. Finally, a comparison with the recently developed Swizzle-Switch prototypes is presented, and the potential of high-radix crossbars for System-on-a-Chip interconnects is advocated.
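The figures quoted in the abstract can be put in perspective with a little arithmetic. The Python sketch below is illustrative only and is not taken from the article: the names `DESIGN_POINTS`, `crosspoints`, and `summarize` are hypothetical, the crosspoint count is an idealized model of the quadratic scaling mentioned above, and the energy-per-bit estimate assumes that the quoted power corresponds to all 128 ports running at full rate.

```python
# Design points as quoted in the abstract (128 ports, 45 nm CMOS).
DESIGN_POINTS = {
    "32 Gb/s/port": {"ports": 128, "gbps_per_port": 32, "area_mm2": 4.9, "power_w": 4.2},
    "64 Gb/s/port": {"ports": 128, "gbps_per_port": 64, "area_mm2": 7.5, "power_w": 7.5},
}


def crosspoints(radix):
    """Idealized crosspoint count of a flat radix x radix crossbar."""
    return radix * radix


def summarize(point):
    """Derive aggregate throughput and a rough energy-per-bit estimate.

    Assumption: the quoted power is drawn with every port at full rate.
    """
    aggregate_gbps = point["ports"] * point["gbps_per_port"]
    # W / (Gb/s) is nJ/bit, so multiply by 1000 to get pJ/bit.
    pj_per_bit = point["power_w"] / aggregate_gbps * 1000
    return {"aggregate_gbps": aggregate_gbps, "pj_per_bit": pj_per_bit}


if __name__ == "__main__":
    print("crosspoints at radix 64: ", crosspoints(64))
    print("crosspoints at radix 128:", crosspoints(128))
    for name, point in DESIGN_POINTS.items():
        s = summarize(point)
        print(f"{name}: {s['aggregate_gbps']} Gb/s aggregate, "
              f"about {s['pj_per_bit']:.2f} pJ/bit")
```

Under these assumptions, doubling the radix from 64 to 128 quadruples the crosspoint count, and both design points work out to roughly 1 pJ per bit at aggregate throughputs of about 4 Tb/s and 8 Tb/s respectively.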
Dionisios Pnevmatikatos, "The Combined Input-Output Queued (CIOQ) Crossbar Architecture for High-Radix On-Chip Switches", IEEE Micro, no. 1, pp. 1, PrePrints, doi:10.1109/MM.2014.56