Power-Performance Simulation and Design Strategies for Single-Chip Heterogeneous Multiprocessors
June 2005 (vol. 54 no. 6)
pp. 684-697
Single-chip heterogeneous multiprocessors (SCHMs) are becoming more commonplace, especially in portable devices where reduced energy consumption is a priority. The use of coordinated collections of processors that are simpler or that execute at lower clock frequencies is widely recognized as a means of reducing power while maintaining latency and throughput. A primary limitation of using this approach to reduce power at the system level has been the time required to develop and simulate models of many processors at the instruction set simulator level. High-level models, simulators, and design strategies for SCHMs are required to enable designers to think in terms of collections of cooperating, heterogeneous processors in order to reduce power. Toward this end, this paper makes two contributions. The first is to extend a unique, preexisting high-level performance simulator, the Modeling Environment for Software and Hardware (MESH), to include power annotations. MESH can be thought of as a thread-level simulator instead of an instruction-level simulator. Thus, the problem is to understand how power might be calibrated and annotated with program fragments instead of at the instruction level. Program fragments are finer-grained than threads and coarser-grained than instructions. Our experimentation found that compilers produce instruction patterns that allow power to be annotated at this level using a single number over all compiler-generated fragments executing on a processor. Since energy is power × time, this makes system runtime (i.e., performance) the dominant factor to be dynamically calculated at this level of simulation. The second contribution arises from the observation that high-level modeling is most beneficial when it opens up new possibilities for organizing designs. Thus, we introduce a design strategy, enabled by the high-level power-performance simulation, which we refer to as spatial voltage scaling. The strategy both reduces overall system power consumption and improves performance in our example. The design space for this design strategy could not be explored without high-level SCHM power-performance simulation.
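The fragment-level energy accounting described in the abstract can be illustrated with a minimal sketch. It is not MESH code: the `Processor` class, the `avg_power_w` field, the function names, and all numeric values are hypothetical, chosen only to show how a single per-processor power number combined with simulated fragment runtimes yields energy as power × time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    """Hypothetical processor model with one calibrated power number."""
    name: str
    avg_power_w: float  # assumed single average power for all fragments (watts)


def fragment_energy_j(proc: Processor, runtime_s: float) -> float:
    """Energy of one program fragment: power (W) times runtime (s)."""
    return proc.avg_power_w * runtime_s


def system_energy_j(schedule: list[tuple[Processor, float]]) -> float:
    """Total energy over a list of (processor, fragment runtime) pairs."""
    return sum(fragment_energy_j(proc, t) for proc, t in schedule)


# Illustrative values only: two processors, three executed fragments.
fast = Processor("fast_core", avg_power_w=0.5)
slow = Processor("slow_core", avg_power_w=0.25)
schedule = [(fast, 2.0), (slow, 4.0), (slow, 2.0)]
total_j = system_energy_j(schedule)
```

Because the power number is fixed per processor in this sketch, the only quantity that changes from run to run is the runtime of each fragment, which is why performance dominates the dynamic calculation at this level.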

Index Terms:
System architectures, integration and modeling, power management, low-power design, energy-aware systems, performance analysis, design aids.
Brett H. Meyer, Joshua J. Pieper, JoAnn M. Paul, Jeffrey E. Nelson, Sean M. Pieper, Anthony G. Rowe, "Power-Performance Simulation and Design Strategies for Single-Chip Heterogeneous Multiprocessors," IEEE Transactions on Computers, vol. 54, no. 6, pp. 684-697, June 2005, doi:10.1109/TC.2005.103