Hardware and Software Techniques for Controlling DRAM Power Modes
November 2001 (vol. 50 no. 11)
pp. 1154-1173

Abstract: The anticipated explosive growth of pervasive and mobile computing devices, which are typically constrained by energy, has brought hardware and software techniques for energy conservation into the spotlight. While there have been several studies and proposals for energy conservation in CPUs and peripherals, energy optimization techniques for selective operating mode control of DRAMs have not been fully explored. It has been shown that, for some systems, as much as 90 percent of overall system energy (excluding I/O) is consumed by the DRAM modules, which makes them a good candidate for energy optimization. Further, DRAM technology has matured to provide several low-energy operating modes (power modes), making this an opportune moment to explore the potential benefits of mode control techniques. This paper conducts an in-depth investigation of software and hardware techniques that take advantage of DRAM mode control capabilities at module granularity for energy savings. Using a memory system architecture that captures five different energy modes and their corresponding resynchronization times, this paper presents several novel compilation techniques both to cluster data across memory banks and to detect module idleness and perform energy mode transitions. In addition, hardware-assisted approaches (called self-monitoring) based on predictions of module interaccess times are proposed. These techniques are extensively evaluated using a set of a dozen benchmarks. Compiler-directed mode control yields an average of 61 percent savings in DRAM energy. One of the self-monitored approaches gives as much as 89 percent savings (72 percent on average), coming as close as 8.8 percent to the optimal energy savings that one can expect with DRAM module mode control. The optimization techniques are demonstrated to be invaluable for energy savings as memory technologies continue to evolve.
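The trade-off behind mode control can be illustrated with a minimal sketch: a deeper mode draws less power while idle but takes longer to resynchronize, so it only pays off when the idle gap is long enough. Everything in the sketch is an assumption made for illustration and is not taken from the paper: the mode names `m0` to `m4`, the power and resynchronization figures, the cost model (resynchronization charged at active power), and the premise that the length of the idle gap is known in advance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    power: float        # energy per cycle in this mode (arbitrary units)
    resync_cycles: int  # cycles needed to return to the active mode


ACTIVE_POWER = 1.0  # hypothetical energy per cycle while active

# Five hypothetical modes; all values are placeholders, not measured data.
MODES = [
    Mode("m0", 1.0, 0),       # active
    Mode("m1", 0.6, 2),
    Mode("m2", 0.1, 30),
    Mode("m3", 0.02, 1000),
    Mode("m4", 0.0, 10000),   # deepest mode
]


def idle_energy(mode: Mode, idle_cycles: int) -> float:
    """Energy spent over an idle gap if the module sits in `mode`.

    Assumed cost model: the module stays in the low-power mode until it
    must start resynchronizing, and resynchronization costs active power.
    """
    low_power_cycles = idle_cycles - mode.resync_cycles
    return mode.power * low_power_cycles + ACTIVE_POWER * mode.resync_cycles


def best_mode(idle_cycles: int, modes=MODES) -> Mode:
    """Pick the cheapest mode whose resynchronization fits inside the gap."""
    feasible = [m for m in modes if m.resync_cycles <= idle_cycles]
    return min(feasible, key=lambda m: idle_energy(m, idle_cycles))


if __name__ == "__main__":
    for gap in (1, 10, 100, 100_000):
        chosen = best_mode(gap)
        print(gap, chosen.name, idle_energy(chosen, gap))
```

With these placeholder numbers the deepest mode is never chosen for the gaps shown, because its resynchronization cost outweighs its idle savings. A compiler-directed scheme would have to estimate the gap statically and a self-monitoring scheme would have to predict it at run time; this sketch simply takes the gap as given.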

Index Terms:
Memory architecture, low power, low power compilation, software-directed energy management.
Victor Delaluz, Mahmut Kandemir, N. Vijaykrishnan, Anand Sivasubramaniam, Mary Jane Irwin, "Hardware and Software Techniques for Controlling DRAM Power Modes," IEEE Transactions on Computers, vol. 50, no. 11, pp. 1154-1173, Nov. 2001, doi:10.1109/12.966492