A Speculative Control Scheme for an Energy-Efficient Banked Register File
June 2005 (vol. 54 no. 6)
pp. 741-751
Multiported register files are critical components of modern superscalar and simultaneously multithreaded (SMT) processors, but conventional designs consume considerable die area and power as register counts and issue widths grow. Banked multiported register files, which consist of multiple interleaved banks of cells with fewer ports, can reduce area, power, and access time, and previous work has shown that such designs can provide sufficient bandwidth for a superscalar machine. These previous banked designs, however, rely on complex control structures to avoid bank conflicts or to buffer conflicting requests, which adds design complexity and would likely limit cycle time. This paper presents a much simpler and faster control scheme that speculatively issues potentially conflicting instructions, then quickly repairs the pipeline if conflicts occur. We show that, once optimizations to avoid register file reads are employed, the remaining read accesses observed in detailed simulations are close to randomly distributed, which contributes to the effectiveness of our speculative control scheme. For a four-issue superscalar processor with 64 physical registers, we show that we can reduce area by a factor of three, access time by 25 percent, and energy by 40 percent, while decreasing IPC by less than 5 percent. For an eight-issue SMT processor with 512 physical registers, area is reduced by a factor of seven, access time by 30 percent, and energy by 60 percent, while decreasing IPC by less than 2 percent.
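
To make the notion of a bank conflict concrete, the following is a minimal sketch and not the control scheme from the paper. It assumes low-order interleaving (register number modulo the bank count), a fixed number of read ports per bank, and uniformly random read addresses as a stand-in for the "close to randomly distributed" accesses mentioned above. The names `bank_of`, `has_conflict`, and `conflict_rate` are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def bank_of(reg, num_banks):
    # Assumption: low-order interleaving, so consecutive registers
    # map to consecutive banks.
    return reg % num_banks


def has_conflict(read_regs, num_banks, ports_per_bank):
    # A conflict occurs when one bank receives more reads in a cycle
    # than it has read ports. Every read is counted, including
    # repeated reads of the same register (a simplifying assumption).
    counts = Counter(bank_of(r, num_banks) for r in read_regs)
    return any(n > ports_per_bank for n in counts.values())


def conflict_rate(num_regs, num_banks, ports_per_bank,
                  reads_per_cycle, cycles=100_000, seed=0):
    # Monte Carlo estimate of the fraction of cycles that would need
    # a pipeline repair if reads were issued speculatively and the
    # register numbers were uniformly random (an assumption).
    rng = random.Random(seed)
    conflicts = 0
    for _ in range(cycles):
        reads = [rng.randrange(num_regs) for _ in range(reads_per_cycle)]
        if has_conflict(reads, num_banks, ports_per_bank):
            conflicts += 1
    return conflicts / cycles
```

Calling, for example, `conflict_rate(64, 8, 2, 4)` estimates how often four random reads would oversubscribe one of eight two-ported banks in a 64-register file. The bank count, port count, and read count here are illustrative values, not the configurations evaluated in the paper.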

Index Terms:
Low-power, register file, speculative control, superscalar, simultaneous multithreading.
Jessica H. Tseng, Krste Asanovic, "A Speculative Control Scheme for an Energy-Efficient Banked Register File," IEEE Transactions on Computers, vol. 54, no. 6, pp. 741-751, June 2005, doi:10.1109/TC.2005.88