Access Pattern Restructuring for Memory Energy
April 2004 (vol. 15 no. 4)
pp. 289-303
Victor De La Luz, IEEE Computer Society
Ismail Kadayif, IEEE Computer Society
Mahmut Kandemir
Uger Sezer, IEEE

Abstract: Reducing the memory energy consumption of programs that manipulate arrays is an important problem because these codes spend large amounts of energy accessing off-chip memory. In this paper, we propose a data-driven strategy to optimize memory energy consumption in a banked memory system. Our compiler-based strategy modifies the original execution order of loop iterations in array-dominated applications to lengthen the periods in which memory banks are idle (i.e., not accessed by any loop iteration). To achieve this, it first classifies loop iterations according to their bank access patterns and then, with the help of a polyhedral tool, tries to bring iterations with similar bank access patterns close together. Longer bank idle periods bring two major benefits: first, more memory banks can be placed into low-power operating modes and, second, a given bank can use a more aggressive (i.e., more energy-saving) operating mode instead of a less aggressive one. The proposed strategy can reduce memory energy consumption in both sequential and parallel applications. It has been implemented in an experimental compiler using a polyhedral tool and evaluated using nine array-dominated applications on both a cacheless system and a system with cache memory. Our experimental results indicate that the proposed strategy reduces memory energy by as much as 36.8 percent over a strategy that uses low-power modes without optimizing the data access pattern. Our results also show that optimizations that target off-chip memory energy can produce very different results from those that target only cache locality.
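
The classify-then-group idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain dictionary grouping in place of a polyhedral tool, and it assumes that all iterations are independent so that any reordering is legal. The array shape, the bank size, and the names `banks_touched`, `restructure`, and `pattern_changes` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

ROWS, COLS = 4, 4   # hypothetical 4x4 array stored in row-major order
BANK_SIZE = 4       # hypothetical: elements per memory bank (one row per bank)


def banks_touched(iteration):
    """Bank access pattern of one iteration of a column-major traversal."""
    col, row = divmod(iteration, ROWS)
    address = row * COLS + col
    return frozenset({address // BANK_SIZE})


def restructure(iterations):
    """Group iterations with identical bank access patterns.

    Assumes the iterations carry no dependences, so reordering is legal.
    The original relative order is kept inside each group.
    """
    groups = defaultdict(list)
    for it in iterations:
        groups[banks_touched(it)].append(it)
    return [it for pattern in sorted(groups, key=sorted) for it in groups[pattern]]


def pattern_changes(order):
    """Count how often the set of active banks changes between iterations."""
    patterns = [banks_touched(it) for it in order]
    return sum(1 for prev, cur in zip(patterns, patterns[1:]) if prev != cur)


original = list(range(ROWS * COLS))
reordered = restructure(original)
print(pattern_changes(original), pattern_changes(reordered))  # prints "15 3"
```

In this toy setting the original order switches banks on every iteration, while the grouped order visits each bank in one contiguous run, so every other bank stays idle for longer stretches and becomes a candidate for a deeper low-power mode.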

Index Terms:
Compiler optimization, energy consumption, embedded systems, banked memories, access pattern.
Citation:
Victor De La Luz, Ismail Kadayif, Mahmut Kandemir, Uger Sezer, "Access Pattern Restructuring for Memory Energy," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 4, pp. 289-303, April 2004, doi:10.1109/TPDS.2004.1271179