A Performance Study of Instruction Cache Prefetching Methods
May 1998 (vol. 47 no. 5)
pp. 497-508

Abstract—Prefetching methods for instruction caches are studied via trace-driven simulation. The two primary methods are "fall-through" prefetch (sometimes referred to as "one block lookahead") and "target" prefetch. Fall-through prefetches are for sequential line accesses, and a key parameter is the distance from the end of the current line at which the prefetch for the next line is initiated. Target prefetches also work for nonsequential line accesses. A prediction table is used, and a key aspect is the prediction algorithm the table implements. Fall-through prefetch and target prefetch each improve performance significantly. When combined in a hybrid algorithm, their performance improvements are nearly additive. An instruction cache using a combined target and fall-through method can provide the same performance as a two to four times larger cache that does not prefetch. A good prediction method must not only be accurate; prefetches must also be initiated early enough to allow time for the instructions to return from main memory. To quantify this, we define a "prefetch efficiency" measure that reflects the amount of memory fetch delay that may be successfully hidden by prefetching. The better prefetch methods (in terms of miss rate) also have very high efficiencies, hiding approximately 90 percent of the miss delay for prefetched lines. Another performance measure of interest is memory traffic. Without prefetching, large line sizes give better hit rates; with prefetching, small line sizes tend to give better overall hit rates. Because smaller line sizes tend to reduce memory traffic, the top-performing prefetch caches produce less memory traffic than the top-performing nonprefetch caches of the same size.
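The two mechanisms described in the abstract can be sketched in a few lines of Python. This is a minimal illustrative simulator, not the authors' trace-driven setup: the line size, cache organization (direct-mapped), table structure, and the trigger rule for fall-through prefetch are all simplifying assumptions chosen for clarity.

```python
# Minimal sketch of fall-through and target prefetch for an instruction
# cache. All parameters and names are illustrative assumptions, not the
# configuration studied in the paper.

LINE_SIZE = 16   # bytes per cache line (assumed)
NUM_LINES = 64   # direct-mapped cache with 64 lines (assumed)

class PrefetchICache:
    def __init__(self, fetch_distance=4):
        self.lines = [None] * NUM_LINES   # tag array: which line occupies each slot
        self.target_table = {}            # predicts: current line -> next line taken
        self.fetch_distance = fetch_distance  # bytes from line end that trigger
                                              # the fall-through prefetch
        self.hits = self.misses = self.prefetches = 0
        self.prev_line = None

    def _resident(self, line):
        return self.lines[line % NUM_LINES] == line

    def _install(self, line):
        self.lines[line % NUM_LINES] = line

    def access(self, addr):
        line = addr // LINE_SIZE
        if self._resident(line):
            self.hits += 1
        else:
            self.misses += 1
            self._install(line)
            # Train the target table on a nonsequential line transition.
            if self.prev_line is not None and line != self.prev_line + 1:
                self.target_table[self.prev_line] = line
        # Fall-through prefetch: when execution nears the end of the
        # current line, fetch the next sequential line.
        if addr % LINE_SIZE >= LINE_SIZE - self.fetch_distance:
            if not self._resident(line + 1):
                self._install(line + 1)
                self.prefetches += 1
        # Target prefetch: consult the prediction table for this line.
        predicted = self.target_table.get(line)
        if predicted is not None and not self._resident(predicted):
            self._install(predicted)
            self.prefetches += 1
        self.prev_line = line
```

The `fetch_distance` argument corresponds to the key fall-through parameter in the abstract: how far from the end of the current line the prefetch for the next line is initiated. A real study would also model prefetch timing (whether the line returns from memory before it is needed), which is what the paper's "prefetch efficiency" measure captures.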

[1] W.Y. Chen et al., "The Effect of Code Expanding Optimizations on Instruction Cache Design," IEEE Trans. Computers, vol. 42, no. 9, pp. 1045-1057, Sept. 1993.
[2] "CRAY Y-MP System Programmer Reference Manual," Cray Research, Inc., 1988.
[3] L. Gwennap, "UltraSparc Unleashes SPARC Performance," Microprocessor Report, pp. 1, 6-9, Oct. 3, 1994.
[4] M.D. Hill and A.J. Smith, "Experimental Evaluation of Microprocessor Cache Memories," Proc. 11th Ann. Symp. Computer Architecture, pp. 158-166, June 1984.
[5] W. Hwu and P. Chang, "Achieving High Instruction Cache Performance with an Optimizing Compiler," Proc. Third Int'l Conf. Architectural Support for Programming Languages and Operating Systems, 1989.
[6] D. Kuck et al., "The Perfect Club Benchmarks: Effective Performance Evaluation of Supercomputers," CSRD report, Univ. of Illinois at Urbana-Champaign, 1988.
[7] J.K.F. Lee and A.J. Smith, "Branch Prediction Strategies and Branch Target Buffer Design," Computer, pp. 6-22, Jan. 1984.
[8] O. Lubeck, J. Moore, and R. Mendez, "A Benchmark Comparison of Three Supercomputers: Fujitsu VP-200, Hitachi S810/20, and Cray X-MP/2," Computer, Dec. 1985.
[9] S. McFarling, "Program Optimization for Instruction Caches," Proc. Third Int'l Conf. Architectural Support for Programming Languages and Operating Systems, 1989.
[10] T.C. Mowry, M.S. Lam, and A. Gupta, "Design and Evaluation of a Compiler Algorithm for Prefetching," Proc. Fifth Int'l Conf. Architectural Support for Programming Languages and Operating Systems, Oct. 1992.
[11] M. Slater, "AMD's K5 Designed to Outrun Pentium," Microprocessor Report, pp. 1, 6-11, Oct. 24, 1994.
[12] A.J. Smith, "Sequential Program Prefetching in Memory Hierarchies," Computer, pp. 7-21, Dec. 1978.
[13] J.E. Smith, "A Study of Branch Prediction Strategies," Proc. Eighth Ann. Int'l Symp. Computer Architecture, pp. 135-148, June 1981.
[14] A.J. Smith, "Cache Memories," ACM Computing Surveys, vol. 14, pp. 473-540, 1982.

Index Terms:
Instruction caches, prefetching, target prefetch, lookahead prefetch, scalar processors, supercomputers.
Wei-Chung Hsu, James E. Smith, "A Performance Study of Instruction Cache Prefetching Methods," IEEE Transactions on Computers, vol. 47, no. 5, pp. 497-508, May 1998, doi:10.1109/12.677221