<p><b>Abstract:</b> As the disparity between processor and memory speeds continues to grow, memory latency is becoming an increasingly important performance bottleneck. While software-controlled prefetching is an attractive technique for tolerating this latency, its success has been limited thus far to array-based numeric codes. In this paper, we expand the scope of automatic compiler-inserted prefetching to also include the recursive data structures commonly found in pointer-based applications. We propose three compiler-based prefetching schemes, and automate the most widely applicable scheme (<i>greedy prefetching</i>) in an optimizing research compiler. Our experimental results demonstrate that compiler-inserted prefetching can offer significant performance gains on both uniprocessors and large-scale shared-memory multiprocessors.</p>
Index Terms: Caches, prefetching, pointer-based applications, recursive data structures, compiler optimization, shared-memory multiprocessors, performance evaluation.
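
As an illustration of the kind of code such a scheme targets, the following is a minimal hand-written sketch of greedy prefetching on a recursive data structure: on entering a node, prefetch hints are issued for all of its child pointers before any of them is followed. It is not taken from the paper or its compiler. The `struct node` layout, the `PREFETCH` macro, and the function `tree_sum_greedy` are hypothetical names chosen for this example, and the sketch assumes the GCC/Clang builtin `__builtin_prefetch` (on other compilers it degrades to a no-op).

```c
#include <stddef.h>

/* Hypothetical binary-tree node; the layout is an assumption for illustration. */
struct node {
    long value;
    struct node *left;
    struct node *right;
};

/* Non-binding prefetch hint. __builtin_prefetch is a GCC/Clang builtin and
   does not fault on invalid or null addresses; elsewhere this is a no-op. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(p) __builtin_prefetch((p))
#else
#define PREFETCH(p) ((void)(p))
#endif

/* Sum every value in the tree. Both children are prefetched as soon as the
   node is entered, so their cache misses can overlap with the work done on
   the current node and on the subtree that is traversed first. */
long tree_sum_greedy(const struct node *n)
{
    if (n == NULL)
        return 0;
    PREFETCH(n->left);
    PREFETCH(n->right);
    return n->value + tree_sum_greedy(n->left) + tree_sum_greedy(n->right);
}
```

Calling `tree_sum_greedy(root)` returns the same result as an unprefetched traversal; the hints only affect timing. Whether they help in practice depends on how much work is done per node and on the memory system, which is the question the experimental evaluation addresses.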

T. C. Mowry and C. Luk, "Automatic Compiler-Inserted Prefetching for Pointer-Based Applications," in IEEE Transactions on Computers, vol. 48, pp. 134-141, 1999.