Isolating Short-Lived Operands for Energy Reduction
June 2004 (vol. 53 no. 6)
pp. 697-709

Abstract: This paper proposes a mechanism for reducing the power requirements of processors that use a separate architectural register file (ARF) to hold committed values. We exploit the notion of short-lived operands: values whose destination architectural registers have already been renamed by the time the instruction producing the value reaches the writeback stage. Our simulations of the SPEC 2000 benchmarks show that from 71 percent to 97 percent of the results are short-lived. Our technique caches (and isolates) short-lived operands within a small dedicated register file. This avoids unnecessary writebacks into the result repository, which is either a slot within the Reorder Buffer (ROB) or a physical register, as well as the writes into the ARF that result from unnecessary commitments. Operands are cached in this manner until they can be safely discarded without jeopardizing recovery from possible branch mispredictions or the reconstruction of the precise state in case of interrupts or exceptions. Additional energy savings are achieved by limiting the number of ports used for instruction commitment. The power/energy savings are validated using SPICE measurements of actual layouts in a 0.18 micron CMOS process. The energy reduction in the ROB and the ARF is about 20 percent, which translates into an overall chip energy reduction of about 5 percent, and it is achieved with no increase in cycle time, little additional complexity, and no degradation in the number of instructions committed per cycle.
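
The definition of a short-lived operand given in the abstract can be illustrated with a small trace-based sketch. This is not the authors' simulator or hardware design. It is a minimal model that assumes each instruction is described by a destination architectural register, the cycle at which it was renamed, and the cycle at which its result reaches writeback. The names `Instr` and `classify_short_lived` are hypothetical, and treating "by the time" as "at or before the writeback cycle" is an assumption. Branch mispredictions and exceptions are ignored.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Instr:
    """One instruction in program order (hypothetical trace record)."""
    dest: Optional[int]    # destination architectural register, None if no result
    rename_cycle: int      # cycle at which the instruction was renamed
    writeback_cycle: int   # cycle at which its result reaches writeback


def classify_short_lived(trace: List[Instr]) -> List[bool]:
    """Return flags where flags[i] is True if the result of trace[i] is short-lived.

    A result counts as short-lived when the next younger instruction that writes
    the same architectural register has been renamed at or before the producer's
    writeback cycle (assumed reading of the abstract's definition).
    """
    flags = [False] * len(trace)
    next_rename: Dict[int, int] = {}  # arch register -> rename cycle of nearest younger writer
    # Walk from youngest to oldest so the map always holds the nearest younger writer.
    for i in range(len(trace) - 1, -1, -1):
        ins = trace[i]
        if ins.dest is None:
            continue
        renamed_at = next_rename.get(ins.dest)
        flags[i] = renamed_at is not None and renamed_at <= ins.writeback_cycle
        next_rename[ins.dest] = ins.rename_cycle
    return flags


# Made-up trace for illustration only; the cycle numbers are not measured data.
trace = [
    Instr(dest=1, rename_cycle=0, writeback_cycle=5),     # r1 is renamed again at cycle 1
    Instr(dest=1, rename_cycle=1, writeback_cycle=3),     # no younger writer of r1
    Instr(dest=2, rename_cycle=2, writeback_cycle=4),     # r2 is renamed again only at cycle 8
    Instr(dest=None, rename_cycle=3, writeback_cycle=6),  # produces no register result
    Instr(dest=2, rename_cycle=8, writeback_cycle=10),    # no younger writer of r2
]
flags = classify_short_lived(trace)
short_lived_fraction = sum(flags) / sum(ins.dest is not None for ins in trace)
```

In this toy trace only the first instruction produces a short-lived result, because its destination register is renamed again before it writes back. In the scheme the abstract describes, such a value would be a candidate for the small dedicated register file instead of the ROB slot or physical register and the ARF.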

Index Terms:
Short-lived operands, superscalar datapath, energy reduction.
Citation:
Dmitry Ponomarev, Gurhan Kucuk, Oguz Ergin, Kanad Ghose, "Isolating Short-Lived Operands for Energy Reduction," IEEE Transactions on Computers, vol. 53, no. 6, pp. 697-709, June 2004, doi:10.1109/TC.2004.11