Power/Performance/Thermal Design-Space Exploration for Multicore Architectures
May 2008 (vol. 19 no. 5)
pp. 666-681
Multicore architectures dominate recent microprocessor design. Several factors drive this trend: better performance, thread-level parallelism bounds in modern applications, diminishing returns from ILP, better thermal/power scaling (many small cores dissipate less than one large, complex core), and ease and reuse of design. This paper presents a thorough evaluation of multicore architectures. The target architecture is composed of a configurable number of cores, a memory hierarchy consisting of private L1 caches and a shared or private L2, and a shared-bus interconnect. The benchmark set comprises several parallel shared-memory applications. We explore the design space defined by the number of cores, L2 cache size, and processor complexity, showing how the different configurations and applications behave with respect to performance, energy consumption, and temperature. Design tradeoffs are analyzed, stressing the interdependency of the metrics and design factors. In particular, we evaluate several chip floorplans and analyze their power/thermal characteristics, showing the importance of considering thermal effects at the architectural level to achieve the best design choice.

Index Terms:
Parallel architectures, energy-aware systems, shared memory, measurement, evaluation, modeling, simulation of multiple-processor systems
Matteo Monchiero, Ramon Canal, Antonio González, "Power/Performance/Thermal Design-Space Exploration for Multicore Architectures," IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 5, pp. 666-681, May 2008, doi:10.1109/TPDS.2007.70756