Issue No. 08 - Aug. 2012 (vol. 61)
pp: 1127-1139
Chen-Wei Huang, National Chiao Tung University, Hsinchu
Shiao-Li Tsao, National Chiao Tung University, Hsinchu
ABSTRACT
Code repositioning is a well-known method of reducing inefficient off-chip memory accesses by streamlining cache behavior. Embedded systems with predetermined applications can achieve further improvement by adding fast, energy-efficient on-chip scratchpad memory (SPM) and moving frequently accessed code and/or data from main memory to the SPM. While many researchers have attempted either to streamline cache accesses or to improve the effectiveness of SPM, few studies have explored the synergy between the two. This study proposes integer linear programming (ILP) models that combine code repositioning with SPM code selection to identify the optimal code layout and reduce energy consumption in embedded systems with both a cache and an SPM. It also proposes a two-stage metaheuristic algorithm. Experimental results reveal that 1) allocating a dedicated portion of the on-chip SRAM to the SPM is not always better than using a cache-only configuration and 2) selecting code objects for the SPM is not trivial. Applying both code repositioning and SPM code selection can save as much as 55 percent additional energy.
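To make the SPM code selection step concrete, the following is a minimal sketch, not the ILP models proposed in the paper. It assumes a deliberately simplified setting in which every fetch of a code object costs a fixed energy from main memory or from the SPM, and it ignores the cache entirely, so selection reduces to a 0/1 knapsack problem. The function name `select_for_spm`, the per-fetch energy parameters, and the sample objects are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical code object: (name, size in bytes, profiled fetch count).
CodeObject = Tuple[str, int, int]

def select_for_spm(objects: List[CodeObject], spm_size: int,
                   e_main: float, e_spm: float) -> Tuple[float, Tuple[str, ...]]:
    """Choose code objects for the SPM to maximize energy saved.

    Simplifying assumption (not from the paper): each fetch costs e_main
    from main memory and e_spm from the SPM, with no cache in between.
    Returns (energy saved, names of the selected objects).
    """
    # best[c] holds the best (saving, selection) that fits in c bytes.
    best: List[Tuple[float, Tuple[str, ...]]] = [(0.0, ())] * (spm_size + 1)
    for name, size, fetches in objects:
        gain = fetches * (e_main - e_spm)
        # Walk capacities downward so each object is placed at most once.
        for c in range(spm_size, size - 1, -1):
            candidate = best[c - size][0] + gain
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] + (name,))
    return best[spm_size]

# Hypothetical profile data: the hottest object "f" fills the whole SPM,
# yet two smaller objects together save more energy.
profile = [("f", 64, 1000), ("g", 32, 900), ("h", 32, 800), ("i", 96, 1200)]
saved, chosen = select_for_spm(profile, spm_size=64, e_main=10.0, e_spm=1.0)
print(chosen, saved)
```

Even in this cache-free sketch, picking the most frequently fetched object first is not optimal. With a cache present, moving an object to the SPM also changes which remaining objects conflict in the cache, which is why the study models code repositioning and SPM selection jointly.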
INDEX TERMS
Code layout, embedded systems, energy consumption, scratchpad memory.
CITATION
Chen-Wei Huang, Shiao-Li Tsao, "Minimizing Energy Consumption of Embedded Systems via Optimal Code Layout," IEEE Transactions on Computers, vol. 61, no. 8, pp. 1127-1139, Aug. 2012, doi:10.1109/TC.2011.122