Issue No.08 - August (2010 vol.59)
pp: 1047-1062
Bernhard Egger, Samsung Advanced Institute of Technology, Giheung
Seungkyun Kim, Seoul National University, Seoul
Choonki Jang, Seoul National University, Seoul
Jaejin Lee, Seoul National University, Seoul
Sang Lyul Min, Seoul National University, Seoul
Heonshik Shin, Seoul National University, Seoul
ABSTRACT
We propose a code scratchpad memory (SPM) management technique with demand paging for embedded systems that have no memory management unit. Based on profiling information, a postpass optimizer analyzes and optimizes application binaries in a fully automated process. It classifies the code of the application including libraries into three classes based on a mixed integer linear programming formulation: External code is executed directly from the external memory. Pinned code is loaded into the SPM when the application starts and stays in the SPM. Paged code is loaded into/unloaded from the SPM on demand. We evaluate the proposed technique by running 14 embedded applications both on a cycle-accurate ARM processor simulator and an ARM1136JF-S core. On the simulator, the reference case, a four-way set-associative cache, is compared to a direct-mapped cache and an SPM of comparable die area. On average, we observe an improvement of 12 percent in runtime performance and a 21 percent reduction in energy consumption. On the ARM11 board, the reference case run on the 16-KB four-way set-associative cache is compared to the demand paging solution on the 16-KB SPM, optionally supported by the cache. The measured results show both a runtime performance improvement and a reduction of the energy consumption by 23 percent, on average.
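The three-way classification described in the abstract can be illustrated with a toy example. The sketch below is not the paper's mixed integer linear programming formulation: it replaces the MILP with brute-force enumeration over a handful of code regions, and every name and number in it (`classify`, `EXAMPLE_REGIONS`, the per-fetch and per-load costs, the single shared page frame) is a hypothetical assumption chosen only to show how a profile-driven cost model can push code into the external, pinned, or paged class.

```python
from itertools import product

EXTERNAL, PINNED, PAGED = "external", "pinned", "paged"

# Hypothetical profile data: size in bytes, instruction fetches, and how
# often the region would have to be loaded if it were paged.
EXAMPLE_REGIONS = [
    {"name": "hot_loop", "size": 512, "fetches": 100_000, "loads": 1},
    {"name": "codec", "size": 1024, "fetches": 20_000, "loads": 4},
    {"name": "ui", "size": 1024, "fetches": 15_000, "loads": 3},
    {"name": "init", "size": 2048, "fetches": 100, "loads": 1},
]


def classify(regions, spm_size, frame_size,
             ext_cost=10, spm_cost=1, load_cost=500):
    """Return (assignment, cost) minimizing a toy cost model.

    Assumptions (not from the paper): pinned regions occupy SPM space
    permanently, all paged regions share one page frame of frame_size
    bytes, and the load counts are fixed inputs from profiling.
    """
    classes = (EXTERNAL, PINNED, PAGED)
    best_cost, best = None, None
    for choice in product(classes, repeat=len(regions)):
        pairs = list(zip(regions, choice))
        paged = [r for r, c in pairs if c == PAGED]
        # A paged region must fit into the shared page frame.
        if any(r["size"] > frame_size for r in paged):
            continue
        pinned_bytes = sum(r["size"] for r, c in pairs if c == PINNED)
        # The frame is only reserved if at least one region is paged.
        if pinned_bytes + (frame_size if paged else 0) > spm_size:
            continue
        cost = 0
        for r, c in pairs:
            if c == EXTERNAL:
                cost += r["fetches"] * ext_cost
            else:
                cost += r["fetches"] * spm_cost
                if c == PAGED:
                    cost += r["loads"] * load_cost
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, choice
    return {r["name"]: c for r, c in zip(regions, best)}, best_cost


if __name__ == "__main__":
    assignment, cost = classify(EXAMPLE_REGIONS, spm_size=1536, frame_size=1024)
    print(assignment, cost)
```

With these made-up numbers, the hot loop is pinned, the two medium-sized regions take turns in the page frame, and the rarely executed initialization code stays in external memory. Enumeration is exponential in the number of regions, so it only serves to make the trade-off concrete; a solver-based formulation such as the one the abstract mentions is needed for real binaries.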
INDEX TERMS
Compilers, postpass optimization, code placement, demand paging, scratchpad memory, embedded systems.
CITATION
Bernhard Egger, Seungkyun Kim, Choonki Jang, Jaejin Lee, Sang Lyul Min, Heonshik Shin, "Scratchpad Memory Management Techniques for Code in Embedded Systems without an MMU", IEEE Transactions on Computers, vol. 59, no. 8, pp. 1047-1062, August 2010, doi:10.1109/TC.2009.188