Power-Aware Branch Prediction: Characterization and Design
February 2004 (vol. 53 no. 2) pp. 168-186
Abstract: This paper uses Wattch and the SPEC 2000 integer and floating-point benchmarks to explore the role of branch predictor organization in power/energy/performance trade-offs for processor design. Even though the direction predictor by itself represents less than 1 percent of the processor's total power dissipation, prediction accuracy is nevertheless a powerful lever on processor behavior and program execution time. A thorough study of branch predictor organizations shows that, as a general rule, to reduce overall energy consumption in the processor, it is worthwhile to spend more power in the branch predictor if this results in more accurate predictions that improve running time. This not only improves performance, but can also improve the energy-delay product by up to 20 percent. Three techniques, however, can reduce power dissipation without harming accuracy. Banking reduces the portion of the branch predictor that is active at any one time. A new on-chip structure, the prediction probe detector (PPD), uses predecode bits to entirely eliminate unnecessary predictor and branch target buffer (BTB) accesses. Despite the extra power that must be spent accessing it, the PPD reduces local predictor power and energy dissipation by about 31 percent and overall processor power and energy dissipation by 3 percent. These savings can be further improved by using profiling to annotate branches, identifying those that are highly biased and do not require dynamic prediction. Finally, the paper explores the effectiveness of a previously proposed technique, pipeline gating, and finds that, even with adaptive control based on recent predictor accuracy, pipeline gating yields little or no energy savings.
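The abstract's general rule, that a more power-hungry but more accurate predictor can lower total energy, follows from energy being power multiplied by running time. The sketch below works through that arithmetic. Every number in it is hypothetical and chosen only for illustration (none comes from the paper), the `Design` class is an assumed helper, and the model simplifies by holding the rest of the chip's power constant when running time shrinks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    """One hypothetical processor configuration (all values are made up)."""
    name: str
    predictor_power_w: float  # power drawn by the branch predictor, in watts
    rest_power_w: float       # power drawn by the rest of the chip, in watts
    runtime_s: float          # program execution time, in seconds

    @property
    def total_power_w(self) -> float:
        return self.predictor_power_w + self.rest_power_w

    @property
    def energy_j(self) -> float:
        # Energy = average power x running time.
        return self.total_power_w * self.runtime_s

    @property
    def energy_delay_js(self) -> float:
        # Energy-delay product = energy x running time.
        return self.energy_j * self.runtime_s


# Hypothetical baseline: the predictor is under 1 percent of total power.
small = Design("small predictor", predictor_power_w=0.2,
               rest_power_w=39.8, runtime_s=1.00)
# Hypothetical alternative: predictor power doubles, runtime drops 5 percent.
large = Design("large predictor", predictor_power_w=0.4,
               rest_power_w=39.8, runtime_s=0.95)

for d in (small, large):
    print(f"{d.name}: power={d.total_power_w:.2f} W, "
          f"energy={d.energy_j:.2f} J, EDP={d.energy_delay_js:.2f} J*s")
```

Under these assumed numbers, the larger predictor raises total power slightly yet lowers both energy and energy-delay product, because the shorter running time applies to the whole chip's power while the extra cost applies only to the predictor.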
Index Terms:
Low-power design, energy-aware systems, processor architecture, branch prediction, target prediction, power, banking, highly-biased branches, pipeline gating, speculation control.
Citation:
Dharmesh Parikh, Kevin Skadron, Yan Zhang, Mircea Stan, "Power-Aware Branch Prediction: Characterization and Design," IEEE Transactions on Computers, vol. 53, no. 2, pp. 168-186, Feb. 2004, doi:10.1109/TC.2004.1261827