Partial Resolution in Branch Target Buffers
October 1997 (vol. 46 no. 10)
pp. 1142-1145

Abstract:
Branch target buffers, or BTBs, are small caches for program branching information. As in data caches, addresses are divided into equivalence classes based on their low-order bits. Unlike data caches, however, a BTB does not need to completely resolve a single address within an equivalence class for execution to be correct. Substantial savings are therefore possible by employing partial resolution, that is, using fewer tag bits than are necessary to uniquely identify an address. We present the relationship between the number of tag bits in a branch target buffer and prediction accuracy, based on dynamic simulations of the SPECINT92 benchmark suite. For a 512-entry BTB, on average only two tag bits are necessary to obtain 99.9 percent of the accuracy obtainable with a full tag; no more than nine tag bits are required to obtain identical prediction accuracy. This suggests that microprocessors can achieve substantial area savings in their BTB tag stores by employing partial resolution.
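
To make the idea of partial resolution concrete, here is a minimal sketch of a BTB that keeps only a few tag bits per entry. It is illustrative and not taken from the paper: it assumes a direct-mapped organization, 32-bit addresses, and 4-byte-aligned instructions, and the names `PartialTagBTB` and `tag_store_bits` are hypothetical.

```python
class PartialTagBTB:
    """Direct-mapped BTB storing only `tag_bits` bits of each tag (assumed layout)."""

    def __init__(self, entries=512, tag_bits=2, align_bits=2):
        if entries <= 0 or entries & (entries - 1):
            raise ValueError("entries must be a power of two")
        self.index_bits = entries.bit_length() - 1  # 9 for 512 entries
        self.tag_bits = tag_bits
        self.align_bits = align_bits                # assumption: 4-byte instructions
        self.table = [None] * entries               # each slot: (partial_tag, target)

    def _split(self, pc):
        # Drop the alignment bits, then take the low-order bits as the index
        # (the equivalence class) and the next `tag_bits` bits as the partial tag.
        word = pc >> self.align_bits
        index = word & ((1 << self.index_bits) - 1)
        tag = (word >> self.index_bits) & ((1 << self.tag_bits) - 1)
        return index, tag

    def update(self, pc, target):
        """Record the target of a taken branch at address `pc`."""
        index, tag = self._split(pc)
        self.table[index] = (tag, target)

    def lookup(self, pc):
        """Return the predicted target, or None on a BTB miss."""
        index, tag = self._split(pc)
        slot = self.table[index]
        if slot is not None and slot[0] == tag:
            return slot[1]  # may be a false hit if the dropped tag bits differ
        return None


def tag_store_bits(entries, tag_bits):
    """Total tag storage in bits for a BTB with one tag per entry."""
    return entries * tag_bits
```

In this sketch, two addresses that share an index and the stored tag bits but differ in the dropped bits produce a false hit. That costs at most a misprediction, because the predicted target is checked when the branch resolves, which is why full resolution is not needed for correctness. Under the stated assumptions a full tag would be 32 - 2 - 9 = 21 bits, so `tag_store_bits(512, 21)` gives 10,752 bits against 1,024 bits for two-bit tags.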

Index Terms:
Branch prediction, branch target buffer, cache memory, computer architecture, microarchitecture.
Citation:
Barry Fagin, "Partial Resolution in Branch Target Buffers," IEEE Transactions on Computers, vol. 46, no. 10, pp. 1142-1145, Oct. 1997, doi:10.1109/12.628399