Issue No.07 - July (1995 vol.6)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.395402
<p><it>Abstract</it>: To offset the effect of read miss penalties on processor utilization in shared-memory multiprocessors, several software- and hardware-based data prefetching schemes have been proposed. A major advantage of hardware techniques is that they need no support from the programmer or compiler.</p>
<p><it>Sequential prefetching</it> is a simple hardware-controlled prefetching technique that automatically prefetches the consecutive blocks following the block that misses in the cache, thus exploiting spatial locality. In its simplest form, the number of prefetched blocks on each miss is fixed throughout the execution. However, since prefetching efficiency varies during the execution of a program, we propose to adapt the number of prefetched blocks according to a dynamic measure of prefetching effectiveness. Simulations of this adaptive scheme show reductions in the number of read misses, the read penalty, and the execution time of up to 78%, 58%, and 25%, respectively.</p>
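The idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware mechanism evaluated in the paper: the class name, the window size, the two thresholds, the degree bounds, and the unbounded cache are all hypothetical choices made for illustration. On each read miss the model fetches the next `degree` consecutive blocks, counts how many prefetched blocks are later referenced, and raises or lowers `degree` according to that ratio.

```python
class AdaptiveSequentialPrefetcher:
    """Toy model of sequential prefetching with an adaptive degree.

    All parameters are illustrative assumptions, not values from the paper.
    """

    def __init__(self, degree=1, max_degree=8, window=4,
                 high=0.75, low=0.40, adaptive=True):
        self.degree = degree          # blocks prefetched on each miss
        self.max_degree = max_degree  # assumed upper bound on the degree
        self.window = window          # prefetches issued between adjustments (>= 1)
        self.high = high              # raise the degree at or above this ratio
        self.low = low                # lower the degree below this ratio
        self.adaptive = adaptive      # False models the fixed-degree scheme
        self.cache = set()            # resident blocks (unbounded, for simplicity)
        self.prefetched = set()       # prefetched blocks not yet referenced
        self.issued = 0               # prefetches issued in the current window
        self.useful = 0               # prefetched blocks referenced in the window
        self.read_misses = 0

    def _adapt(self):
        """Adjust the degree once enough prefetches have been issued."""
        if not self.adaptive or self.issued < self.window:
            return
        # Approximate effectiveness: some prefetches issued in this window
        # are only referenced (and counted) in the next one.
        ratio = self.useful / self.issued
        if ratio >= self.high:
            self.degree = min(self.degree + 1, self.max_degree)
        elif ratio < self.low:
            # Kept at 1 or more so the model can always recover.
            self.degree = max(self.degree - 1, 1)
        self.issued = 0
        self.useful = 0

    def read(self, block):
        """Simulate a read of one block number; return True on a hit."""
        hit = block in self.cache
        if hit:
            if block in self.prefetched:
                # First reference to a prefetched block: the prefetch was useful.
                self.prefetched.discard(block)
                self.useful += 1
        else:
            self.read_misses += 1
            self.cache.add(block)
            # Prefetch the consecutive blocks that follow the missing block.
            for nxt in range(block + 1, block + 1 + self.degree):
                if nxt not in self.cache:
                    self.cache.add(nxt)
                    self.prefetched.add(nxt)
                    self.issued += 1
        self._adapt()
        return hit


if __name__ == "__main__":
    fixed = AdaptiveSequentialPrefetcher(adaptive=False)
    adaptive = AdaptiveSequentialPrefetcher()
    for b in range(64):  # a purely sequential read stream
        fixed.read(b)
        adaptive.read(b)
    print("fixed degree misses:", fixed.read_misses)
    print("adaptive misses:", adaptive.read_misses)
```

On a purely sequential stream the adaptive model raises its degree and takes fewer read misses than the fixed-degree model. A stream with little spatial locality would instead push the degree back toward its lower bound. The model ignores cache capacity, coherence traffic, and timing, all of which matter in a real shared-memory multiprocessor.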
Index Terms: Hardware-controlled prefetching, latency tolerance, memory consistency models, performance evaluation, sequential prefetching, shared-memory multiprocessors.
Fredrik Dahlgren, Michel Dubois, Per Stenström, "Sequential Hardware Prefetching in Shared-Memory Multiprocessors", IEEE Transactions on Parallel & Distributed Systems, vol. 6, no. 7, pp. 733-746, July 1995, doi:10.1109/71.395402