Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2013)
Edinburgh, United Kingdom
Sept. 7, 2013 to Sept. 11, 2013
ISSN: 1089-795X
ISBN: 978-1-4799-1018-2
pp: 309-318
Jungju Oh, Sch. of Comput. Sci., Georgia Inst. of Technol., Atlanta, GA, USA
Alenka Zajic, Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
Milos Prvulovic, Sch. of Comput. Sci., Georgia Inst. of Technol., Atlanta, GA, USA
ABSTRACT
Growth in core count creates an increasing demand for interconnect bandwidth, driving a change from shared buses to packet-switched on-chip interconnects. However, this increases the latency between cores separated by many links and switches. In this paper, we show that a low-latency unswitched interconnect built with transmission lines can be synergistically used with a high-throughput switched interconnect. First, we design a broadcast ring as a chain of unidirectional transmission line structures with very low latency but limited throughput. Then, we create a new adaptive packet steering policy that judiciously uses the limited throughput of this ring by balancing expected latency benefit and ring utilization. Although the ring uses 1.3% of the on-chip metal area, our experimental results show that, in combination with our steering, it provides an execution time reduction of 12.4% over a mesh-only baseline.
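The abstract does not spell out the steering policy itself, so the following is only a minimal sketch of how a per-packet decision that weighs expected latency benefit against ring utilization might look. It is not the authors' algorithm. Every name and constant here (SteeringConfig, mesh_cycles_per_hop, ring_cycles, max_required_benefit, and the linear threshold) is a hypothetical assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SteeringConfig:
    # All values are hypothetical placeholders, not figures from the paper.
    mesh_cycles_per_hop: int = 3    # assumed switch + link delay per mesh hop
    ring_cycles: int = 4            # assumed fixed latency of the broadcast ring
    max_required_benefit: int = 20  # benefit demanded when the ring is saturated


def expected_benefit(mesh_hops: int, cfg: SteeringConfig) -> int:
    """Cycles saved by using the ring instead of the mesh (may be negative)."""
    return mesh_hops * cfg.mesh_cycles_per_hop - cfg.ring_cycles


def steer(mesh_hops: int, ring_utilization: float,
          cfg: SteeringConfig = SteeringConfig()) -> str:
    """Return "ring" or "mesh" for one packet.

    The busier the ring, the larger the latency benefit a packet must
    offer before it is allowed to consume ring throughput.
    """
    if not 0.0 <= ring_utilization <= 1.0:
        raise ValueError("ring_utilization must be in [0, 1]")
    benefit = expected_benefit(mesh_hops, cfg)
    required = ring_utilization * cfg.max_required_benefit
    return "ring" if benefit > 0 and benefit >= required else "mesh"
```

Under these assumed constants, a packet with few mesh hops stays on the mesh even when the ring is idle, while a distant packet is sent over the ring unless the ring is heavily utilized.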
INDEX TERMS
Throughput, Delays, Couplers, Transmitters, Receivers, Switches, Wires
CITATION
Jungju Oh, Alenka Zajic, Milos Prvulovic, "Traffic steering between a low-latency unswitched TL ring and a high-throughput switched on-chip interconnect", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, pp. 309-318, 2013, doi:10.1109/PACT.2013.6618827