Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2013)
Edinburgh, United Kingdom
Sept. 7, 2013 to Sept. 11, 2013
Jungju Oh, Sch. of Comput. Sci., Georgia Inst. of Technol., Atlanta, GA, USA
Alenka Zajic, Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
Milos Prvulovic, Sch. of Comput. Sci., Georgia Inst. of Technol., Atlanta, GA, USA
Growth in core count creates an increasing demand for interconnect bandwidth, driving a change from shared buses to packet-switched on-chip interconnects. However, this increases the latency between cores separated by many links and switches. In this paper, we show that a low-latency unswitched interconnect built with transmission lines can be synergistically used with a high-throughput switched interconnect. First, we design a broadcast ring as a chain of unidirectional transmission line structures with very low latency but limited throughput. Then, we create a new adaptive packet steering policy that judiciously uses the limited throughput of this ring by balancing expected latency benefit and ring utilization. Although the ring uses 1.3% of the on-chip metal area, our experimental results show that, in combination with our steering, it provides an execution time reduction of 12.4% over a mesh-only baseline.
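The steering idea in the abstract, weighing a packet's expected latency benefit against current ring utilization, can be illustrated with a minimal sketch. This is not the policy from the paper: the threshold rule, the latency constants, and the names SteeringConfig, mesh_hops, expected_benefit, and use_ring are all hypothetical, chosen only to show the kind of trade-off involved.

```python
from dataclasses import dataclass


@dataclass
class SteeringConfig:
    # All values below are illustrative assumptions, not figures from the paper.
    mesh_cycles_per_hop: float = 3.0  # assumed latency of one mesh link plus switch
    ring_cycles: float = 4.0          # assumed latency of one ring broadcast
    max_threshold: float = 20.0       # benefit demanded when the ring is saturated


def mesh_hops(src, dst):
    """Manhattan distance between two (x, y) cores on a 2D mesh."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def expected_benefit(src, dst, cfg):
    """Estimated cycles saved by sending on the ring instead of the mesh."""
    return mesh_hops(src, dst) * cfg.mesh_cycles_per_hop - cfg.ring_cycles


def use_ring(src, dst, ring_utilization, cfg):
    """Steer to the ring only if the benefit exceeds a utilization-scaled bar.

    The busier the ring (utilization in [0, 1]), the larger the latency
    benefit a packet must offer before it is allowed to use the ring.
    """
    u = min(max(ring_utilization, 0.0), 1.0)
    threshold = u * cfg.max_threshold
    return expected_benefit(src, dst, cfg) > threshold
```

Under these assumed constants, a packet crossing many mesh hops is steered to the ring even when the ring is moderately busy, while a packet going to a neighboring core stays on the mesh.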
Throughput, Delays, Couplers, Transmitters, Receivers, Switches, Wires
J. Oh, A. Zajic and M. Prvulovic, "Traffic steering between a low-latency unswitched TL ring and a high-throughput switched on-chip interconnect," Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT), Edinburgh, United Kingdom, 2013, pp. 309-318.