Issue No.01 - Jan. (2013 vol.24)
pp: 170-183
Andrea Bartolini , University of Bologna, Bologna
Matteo Cacciari , University of Bologna, Bologna
Andrea Tilli , University of Bologna, Bologna
Luca Benini , University of Bologna, Bologna
ABSTRACT
As a result of technology scaling, the power density of single-chip multicores increases, and spatial and temporal workload variation leads to temperature hot spots, which may cause nonuniform ageing and accelerated chip failure. These critical issues can be tackled by closed-loop thermal and reliability management policies. Model predictive controllers (MPC) outperform classic feedback controllers since they are capable of minimizing performance loss while enforcing a safe working temperature. Unfortunately, MPC controllers rely on a priori knowledge of thermal models, and their complexity grows exponentially with the number of controlled cores. In this paper, we present a scalable, fully distributed, energy-aware thermal management solution for single-chip multicore platforms. The complexity of the model-predictive controller is drastically reduced by splitting it into a set of simpler interacting controllers, each one allocated to a core in the system. Locally, each node selects the optimal frequency to meet temperature constraints while minimizing the performance penalty and system energy. Performance comparable to that of state-of-the-art MPC controllers is achieved by letting controllers exchange a limited amount of information at runtime on a neighborhood basis. In addition, we address model uncertainty by supporting learning of the thermal model with a novel distributed self-calibration approach that matches the controller architecture well.
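The per-core decision described in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes a hypothetical first-order local thermal model, a cubic frequency-to-power law, a one-step prediction horizon, and a discrete set of frequency levels. All names and coefficients (`LocalThermalModel`, `core_power`, `select_frequency`, `k_dyn`, `p_static`) are illustrative assumptions. Each core predicts its next temperature from its own reading and its neighbors' readings, then picks the highest frequency, capped at the frequency the workload requests, that keeps the prediction under the limit.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LocalThermalModel:
    """Hypothetical one-step thermal model for a single core (assumption)."""
    a_self: float   # weight of the core's own current temperature
    a_neigh: float  # weight of each neighbor's current temperature
    b_power: float  # temperature rise per watt over one control step
    offset: float   # constant term, e.g. ambient contribution

    def predict(self, t_self: float, t_neighbors: Sequence[float],
                power: float) -> float:
        """Predicted core temperature after one control step."""
        return (self.a_self * t_self
                + self.a_neigh * sum(t_neighbors)
                + self.b_power * power
                + self.offset)


def core_power(freq_ghz: float, k_dyn: float = 2.0,
               p_static: float = 1.0) -> float:
    """Assumed power law: cubic dynamic term plus a static term (watts)."""
    return k_dyn * freq_ghz ** 3 + p_static


def select_frequency(model: LocalThermalModel, t_self: float,
                     t_neighbors: Sequence[float], f_target: float,
                     f_levels: Sequence[float], t_max: float) -> float:
    """Pick the highest level <= f_target whose prediction stays <= t_max.

    Capping at f_target avoids running faster than the workload requests;
    falling back to the lowest level covers the case with no feasible level.
    """
    feasible = [
        f for f in sorted(f_levels)
        if f <= f_target
        and model.predict(t_self, t_neighbors, core_power(f)) <= t_max
    ]
    return feasible[-1] if feasible else min(f_levels)


# Illustrative values only; none come from the paper.
model = LocalThermalModel(a_self=0.8, a_neigh=0.05, b_power=0.5, offset=4.0)
levels = [1.0, 1.5, 2.0, 2.5, 3.0]
hot = select_frequency(model, 70.0, [65.0, 75.0], 3.0, levels, 80.0)
cool = select_frequency(model, 50.0, [50.0, 50.0], 3.0, levels, 80.0)
print(hot, cool)
```

Only the neighbors' temperatures enter each local decision, so the work per core does not grow with the total core count, which is the scalability argument the abstract makes. A longer prediction horizon or a continuous frequency range would require a proper optimization step in place of the simple search used here.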
INDEX TERMS
Temperature sensors, Temperature measurement, Multicore processing, Complexity theory, Power demand, System identification, Thermal control, Energy minimization, Multicore, Model predictive controller
CITATION
Andrea Bartolini, Matteo Cacciari, Andrea Tilli, Luca Benini, "Thermal and Energy Management of High-Performance Multicores: Distributed and Self-Calibrating Model-Predictive Controller", IEEE Transactions on Parallel & Distributed Systems, vol. 24, no. 1, pp. 170-183, Jan. 2013, doi:10.1109/TPDS.2012.117