2014 23rd International Conference on Parallel Architecture and Compilation (PACT) (2014)
Edmonton, Canada
Aug. 23, 2014 to Aug. 27, 2014
ISBN: 978-1-5090-6607-0
pp: 87-98
Biswabandan Panda, Department of Computer Science & Engineering, Indian Institute of Technology Madras, Chennai, India
Shankar Balachandran, Department of Computer Science & Engineering, Indian Institute of Technology Madras, Chennai, India
ABSTRACT
Hardware prefetchers are commonly used to hide and tolerate off-chip memory latency. Prefetching techniques in the literature are designed for multiple independent sequential applications running on a multicore system. A single parallel application running on a multicore system behaves differently: cores share and communicate data and code among themselves, and there is commonality in the demand miss streams across multiple cores. This gives an opportunity to predict the demand miss streams and communicate the predicted streams from one core to another, which we refer to as cross-core stream communication. We propose cross-core spatial streaming (XStream), a practical and storage-efficient cross-core prefetching technique. XStream detects and predicts the cross-core spatial streams at the private mid-level caches (MLCs) and sends the predicted streams in advance to the MLC prefetchers of the predicted cores. We compare the effectiveness of XStream with the ideal cross-core spatial streamer. Experimental results demonstrate that, on average (geomean), compared to the state-of-the-art spatial memory streaming, the storage-efficient XStream reduces execution time by 11.3% (as high as 24%) and 9% (as high as 29.09%) for 4-core and 8-core systems, respectively.
INDEX TERMS
Prefetching, Hardware, History, Multicore processing, Random access memory, Training
CITATION
Biswabandan Panda, Shankar Balachandran, "XStream: Cross-core spatial streaming based MLC prefetchers for parallel applications in CMPs", 2014 23rd International Conference on Parallel Architecture and Compilation (PACT), pp. 87-98, 2014, doi:10.1145/2628071.2628073