2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW) (2015)
May 25, 2015 to May 29, 2015
As the number of cores per chip continues to scale, the memory system, both on-chip and off-chip, has increasingly become a system bottleneck due to inter-thread contention. The memory access streams emerging from many cores and simultaneously executing threads exhibit increasingly limited locality. Large, high-density DRAMs contribute significantly to system power consumption and data over-fetch. We develop a fine-grained Victim Row-Buffer (VRB) memory system to increase memory-system performance. The VRB mechanism helps reuse data accessed from the memory banks, avoids unnecessary data transfers, mitigates memory contention, and can thus improve system throughput and fairness by decoupling row-buffer contention. Through full-system cycle-accurate simulations of many-thread applications, we demonstrate that the proposed VRB technique improves system-level throughput by up to 19% (8.4% on average) and system fairness by up to 20% (7.2% on average), and saves 6.8% of power consumption across the whole suite.
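The intuition behind a victim row-buffer can be illustrated with a toy model. The sketch below is not the paper's design: it assumes a single DRAM bank with one open row, a small LRU buffer that keeps whole rows evicted from the row buffer, and a hypothetical trace in which two threads alternate between two rows. The class name `Bank`, the helper `run`, and the parameter `victim_entries` are illustrative names only, and the actual VRB is described as fine-grained, which this whole-row model does not capture.

```python
from collections import OrderedDict


class Bank:
    """Toy DRAM bank: one open row plus an optional LRU victim buffer.

    Assumption: whole rows are buffered. This is a simplification for
    illustration, not the fine-grained scheme proposed in the paper.
    """

    def __init__(self, victim_entries=0):
        self.open_row = None
        self.victims = OrderedDict()  # evicted rows, oldest first
        self.victim_entries = victim_entries
        self.row_hits = 0
        self.victim_hits = 0
        self.misses = 0

    def access(self, row):
        if row == self.open_row:
            self.row_hits += 1
            return "row_hit"
        if row in self.victims:
            # Served from the victim buffer; the open row is left alone.
            self.victims.move_to_end(row)
            self.victim_hits += 1
            return "victim_hit"
        # Row-buffer miss: the currently open row becomes a victim.
        self.misses += 1
        if self.open_row is not None and self.victim_entries > 0:
            self.victims[self.open_row] = True
            if len(self.victims) > self.victim_entries:
                self.victims.popitem(last=False)  # drop the oldest victim
        self.open_row = row
        return "miss"


def run(trace, victim_entries):
    """Replay a list of row numbers; return (row_hits, victim_hits, misses)."""
    bank = Bank(victim_entries)
    for row in trace:
        bank.access(row)
    return bank.row_hits, bank.victim_hits, bank.misses


# Hypothetical trace: two threads ping-pong between rows 1 and 2.
trace = [1, 2, 1, 2, 1, 2, 1, 2]
baseline = run(trace, victim_entries=0)
with_vrb = run(trace, victim_entries=2)
print("baseline (row_hits, victim_hits, misses):", baseline)
print("with victim buffer:", with_vrb)
```

In this toy trace, every access misses without the victim buffer because each thread closes the other's row. With the buffer, only the first two accesses miss. The model counts events only and says nothing about timing, power, or fairness, which the paper evaluates with full-system cycle-accurate simulation.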
Random access memory, Instruction sets, Arrays, Protocols, Bandwidth, Memory management, Timing
K. Gao, D. Fan, J. Wu and Z. Liu, "Decoupling Contention with Victim Row-Buffer on Multicore Memory Systems," 2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW), Hyderabad, India, 2015, pp. 454-463.