2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (2016)
Sept. 11, 2016 to Sept. 15, 2016
Yipeng Wang, ECE, North Carolina State University, United States
Ren Wang, Intel Corporation, United States
Andrew Herdrich, Intel Corporation, United States
James Tsai, Intel Corporation, United States
Yan Solihin, ECE, North Carolina State University, United States
As the number of cores in a multicore system increases, core-to-core (C2C) communication increasingly limits the performance scaling of workloads that share data frequently. Cores traditionally communicate through a memory space shared between them. However, shared-memory communication fundamentally involves coherence invalidations and cache misses, which cause large performance overheads and generate heavy network traffic. Many important workloads communicate heavily between cores and suffer significantly from these costs, including pipelined packet processing, which is widely used in software-based networking solutions. In these workloads, threads run on different cores and pass packets from one core to another through software queues, with each core handling a different stage of processing. In this paper, we analyze the behavior and overheads of software queue management. Based on this analysis, we propose a novel C2C Communication Acceleration Framework (CAF) to optimize C2C communication. CAF offloads substantial communication burdens from cores and memory to a designated, efficient hardware device, which we refer to as the Queue Management Device (QMD), attached to the Network on Chip. CAF combines hardware and software optimizations to effectively reduce queue-induced communication overheads and improves overall system performance by 2–12× over traditional software queue implementations.
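To make the software-queue baseline concrete, the following is a minimal sketch of a bounded single-producer/single-consumer ring of the general kind that pipelined packet processing relies on. It is not the queue implementation evaluated in the paper; the names `SpscRing` and `run_two_stage` are hypothetical, and the two pipeline stages are interleaved in one thread purely for illustration. The comments mark where, on real hardware, each side would read an index owned by the other core, which is the sort of access that leads to the coherence traffic described above.

```python
from typing import Any, Callable, Iterable, List, Optional


class SpscRing:
    """Bounded single-producer/single-consumer ring (illustrative sketch only)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be positive")
        # One slot stays unused so that "full" and "empty" are distinguishable.
        self._slots: List[Any] = [None] * (capacity + 1)
        self._head = 0  # next slot to read; written only by the consumer
        self._tail = 0  # next slot to write; written only by the producer

    def enqueue(self, packet: Any) -> bool:
        nxt = (self._tail + 1) % len(self._slots)
        if nxt == self._head:  # producer reads the consumer-owned index
            return False       # ring is full
        self._slots[self._tail] = packet
        self._tail = nxt       # publish the new entry to the consumer
        return True

    def dequeue(self) -> Optional[Any]:
        if self._head == self._tail:  # consumer reads the producer-owned index
            return None               # ring is empty
        packet = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        return packet


def run_two_stage(
    packets: Iterable[Any],
    stage_a: Callable[[Any], Any],
    stage_b: Callable[[Any], Any],
    capacity: int = 4,
) -> List[Any]:
    """Pass packets from stage A to stage B through one ring.

    Assumes no stage-A output is None, since None signals an empty ring here.
    """
    ring = SpscRing(capacity)
    staged = [stage_a(p) for p in packets]  # stage A's output, in order
    results: List[Any] = []
    i = 0
    while len(results) < len(staged):
        # Producer side: enqueue until the ring reports full.
        while i < len(staged) and ring.enqueue(staged[i]):
            i += 1
        # Consumer side: drain whatever is available and run stage B.
        while (pkt := ring.dequeue()) is not None:
            results.append(stage_b(pkt))
    return results
```

In a real deployment each stage would run on its own core and the index updates would need atomic operations with appropriate memory ordering; the sketch omits those details.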
Hardware, Multicore processing, Acceleration, Coherence, Software algorithms, Instruction sets
Y. Wang, R. Wang, A. Herdrich, J. Tsai and Y. Solihin, "CAF: Core to core Communication Acceleration Framework," 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT), Haifa, Israel, 2016, pp. 351-362.