Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques (2001)
Barcelona, Spain
Sept. 8, 2001 to Sept. 12, 2001
ISBN: 0-7695-1363-8
pp: 0268
Nicholas Kohout, Intel Corp.
Seungryul Choi, Univ. of Maryland, College Park
Dongkeun Kim, Univ. of Maryland, College Park
Donald Yeung, Univ. of Maryland, College Park
Abstract: Pointer-chasing applications tend to traverse composed data structures consisting of multiple independent pointer chains. While the traversal of any single pointer chain serializes memory operations, the traversal of independent pointer chains provides a source of memory parallelism. This paper presents multi-chain prefetching, a technique that uses off-line analysis and a hardware prefetch engine to prefetch multiple independent pointer chains simultaneously, thus exploiting inter-chain memory parallelism to tolerate memory latency. We make three contributions. First, we introduce a scheduling algorithm that identifies independent pointer chains in pointer-chasing codes and computes a prefetch schedule that overlaps serialized cache misses across separate chains. Our analysis focuses on static traversals; we also propose using speculation to identify independent pointer chains in dynamic traversals. Second, we present the design of a prefetch engine that traverses pointer-based data structures and overlaps multiple pointer chains according to our scheduling algorithm. Third, we conduct an experimental evaluation of multi-chain prefetching and compare its performance against two existing techniques, jump pointer prefetching [9] and prefetch arrays [6]. Our results show that multi-chain prefetching improves execution time by 40% across six pointer-chasing kernels from the Olden benchmark suite [14], and by 8% across four SPECInt CPU2000 benchmarks. Multi-chain prefetching also outperforms jump pointer prefetching and prefetch arrays by 28% on Olden, and by 12% on SPECInt. Furthermore, speculation can enable multi-chain prefetching for some dynamic traversal codes, but our technique loses its effectiveness when the pointer-chain traversal order is unpredictable. Finally, we show that combining multi-chain prefetching with prefetch arrays can potentially provide higher performance than either technique alone.
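The source of the benefit described in the abstract can be illustrated with a toy cost model. The sketch below is not the paper's scheduling algorithm or prefetch engine. It is a minimal illustration that assumes every node access misses in the cache, every miss costs the same fixed latency, and memory bandwidth is unlimited, so that independent chains overlap perfectly. The function names and the example numbers are hypothetical.

```python
from typing import List


def serial_cost(chain_lengths: List[int], miss_latency: int) -> int:
    """Cycles spent on misses when chains are traversed one after another.

    Each node's address depends on the previous node, so misses within a
    chain are serialized; without prefetching, misses across chains are
    serialized as well.
    """
    return sum(chain_lengths) * miss_latency


def overlapped_cost(chain_lengths: List[int], miss_latency: int) -> int:
    """Idealized cycles when all independent chains are fetched concurrently.

    Misses within a chain remain serialized, but misses from different
    chains overlap, so only the longest chain stays on the critical path.
    Assumes unlimited memory parallelism (a simplification).
    """
    return max(chain_lengths, default=0) * miss_latency


if __name__ == "__main__":
    chains = [4, 4, 4, 4]   # hypothetical: four independent chains of 4 nodes
    latency = 100           # hypothetical miss latency in cycles
    print(serial_cost(chains, latency))      # 1600
    print(overlapped_cost(chains, latency))  # 400
```

Under these assumptions the gap between the two costs grows with the number of independent chains, which is the inter-chain memory parallelism the technique aims to exploit. Real hardware limits the number of outstanding misses, so actual gains would be smaller.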

D. Kim, S. Choi, D. Yeung and N. Kohout, "Multi-Chain Prefetching: Effective Exploitation of Inter-Chain Memory Parallelism for Pointer-Chasing Codes," Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques (PACT), Barcelona, Spain, 2001, pp. 0268.