Issue No.12 - Dec. (2012 vol.61)
pp: 1752-1764
Osman Sarood, University of Illinois at Urbana-Champaign, Urbana
Phil Miller, University of Illinois at Urbana-Champaign, Urbana
Ehsan Totoni, University of Illinois at Urbana-Champaign, Urbana
Laxmikant V. Kalé, University of Illinois at Urbana-Champaign, Urbana
As we move to exascale machines, both peak power demand and total energy consumption have become prominent challenges. A significant portion of that power and energy consumption is devoted to cooling, which we strive to minimize in this work. We propose a scheme based on a combination of limiting processor temperatures using dynamic voltage and frequency scaling (DVFS) and frequency-aware load balancing that reduces cooling energy consumption and prevents hot spot formation. Our approach is particularly designed for parallel applications, which are typically tightly coupled, and tries to minimize the timing penalty associated with temperature control. This paper describes results from experiments using five different Charm++ and MPI applications with a range of power and utilization profiles. They were run on a 32-node (128-core) cluster with a dedicated air conditioning unit. The scheme is assessed based on three metrics: the ability to control processors' temperature and hence avoid hot spots, minimization of timing penalty, and cooling energy savings. Our results show cooling energy savings of up to 63 percent, with a timing penalty of only 2-23 percent.
Energy consumption, Energy efficiency, Load management, Runtime, Energy management, Green design, Temperature measurement, DVFS, Green IT, temperature aware, load balancing, cooling energy
Osman Sarood, Phil Miller, Ehsan Totoni, Laxmikant V. Kalé, "'Cool' Load Balancing for High Performance Computing Data Centers," IEEE Transactions on Computers, vol. 61, no. 12, pp. 1752-1764, Dec. 2012, doi:10.1109/TC.2012.143