The Community for Technology Leaders
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) (2015)
Shenzhen, China
May 4, 2015 to May 7, 2015
ISBN: 978-1-4799-8006-2
pp: 617-626
Energy consumption has become one of the most important factors in High Performance Computing platforms. However, while there are various algorithmic and programming techniques to save energy, a user has currently no incentive to employ them, as they might result in worse performance. We propose to manage the energy budget of a supercomputer through EnergyFairShare (EFS), a FairShare-like scheduling algorithm. FairShare is a classic scheduling rule that prioritizes jobs belonging to users who were assigned small amount of CPU-second in the past. Similarly, EFS keeps track of users 'consumption of Watt-seconds and prioritizes those whom jobs consumed less energy. Therefore, EFS incentives users to optimize their code for energy efficiency. Having higher priority, jobs have smaller queuing times and, thus, smaller turn-around time. To validate this principle, we implemented EFS in a scheduling simulator and processed workloads from various HPC centers. The results show that, by reducing it energy consumption, auser will reduce it stretch (slowdown), compared to increasing it energy consumption. To validate the general feasibility odour approach, we also implemented EFS as an extension forSLURM, a popular HPC resource and job management system.We validated our plugin both by emulating a large scale platform, and by experiments upon a real cluster with monitored energy consumption. We observed smaller waiting times for energy efficient users.
Energy consumption, Supercomputers, Program processors, Hardware, Standards, Monitoring, Power measurement

Y. Georgiou, D. Glesser, K. Rzadca and D. Trystram, "A Scheduler-Level Incentive Mechanism for Energy Efficiency in HPC," 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)(CCGRID), Shenzhen, China, 2015, pp. 617-626.
89 ms
(Ver 3.3 (11022016))