Issue No. 12 - Dec. (2012 vol. 61)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2012.166
Michael Cardosa, University of Minnesota, Minneapolis
Aameek Singh, IBM Almaden Research Center, San Jose
Himabindu Pucha, IBM Almaden Research Center, San Jose
Abhishek Chandra, University of Minnesota, Minneapolis
MapReduce is a distributed computing paradigm widely used for building large-scale data processing applications. When used in cloud environments, MapReduce clusters are dynamically created using virtual machines (VMs) and managed by the cloud provider. In this paper, we study the energy efficiency problem for such MapReduce clouds. We describe a unique spatio-temporal tradeoff that includes efficient spatial fitting of VMs on servers to achieve high utilization of machine resources, as well as balanced temporal fitting of servers with VMs having similar runtimes to ensure a server runs at a high utilization throughout its uptime. We propose VM placement algorithms that explicitly incorporate these tradeoffs. In addition, we propose techniques that dynamically scale MapReduce clusters to further reduce energy consumption while ensuring that jobs meet or improve on their expected runtimes. Our algorithms achieve energy savings over existing placement techniques, and an additional optimization technique achieves further savings while simultaneously improving job performance.
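The spatio-temporal tradeoff in the abstract can be illustrated with a small sketch. This is not the paper's placement algorithm: it is a plain first-fit packer used only to show why co-locating VMs with similar runtimes can lower total server uptime. The names `VM`, `Server`, `first_fit`, `place_spatial_only`, `place_runtime_aware`, and `total_server_hours` are hypothetical, each VM is assumed to need a single resource expressed as a fraction of one server, and server-hours stand in as a rough proxy for energy.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    cpu: float      # assumed: fraction of one server's capacity
    runtime: float  # assumed: expected runtime in hours


@dataclass
class Server:
    capacity: float = 1.0
    vms: list = field(default_factory=list)

    def used(self):
        return sum(vm.cpu for vm in self.vms)

    def uptime(self):
        # A server must stay powered on until its longest-running VM finishes.
        return max((vm.runtime for vm in self.vms), default=0.0)


def first_fit(vms, capacity=1.0):
    """Place each VM on the first server with enough spare capacity."""
    servers = []
    for vm in vms:
        for server in servers:
            if server.used() + vm.cpu <= capacity + 1e-9:
                server.vms.append(vm)
                break
        else:
            servers.append(Server(capacity, [vm]))
    return servers


def place_spatial_only(vms):
    # Spatial fitting only: pack VMs in arrival order, ignoring runtimes.
    return first_fit(vms)


def place_runtime_aware(vms):
    # Spatial plus temporal fitting: sort by runtime first so that VMs with
    # similar runtimes tend to share a server.
    return first_fit(sorted(vms, key=lambda vm: vm.runtime, reverse=True))


def total_server_hours(servers):
    # Rough energy proxy (an assumption): total hours servers stay powered on.
    return sum(server.uptime() for server in servers)


# Four half-server VMs: two long jobs and two short jobs, interleaved.
vms = [
    VM("a", 0.5, 10.0),
    VM("b", 0.5, 1.0),
    VM("c", 0.5, 10.0),
    VM("d", 0.5, 1.0),
]

print(total_server_hours(place_spatial_only(vms)))   # 20.0
print(total_server_hours(place_runtime_aware(vms)))  # 11.0
```

Both placements use two fully packed servers, so spatial utilization is identical. In the arrival-order placement each server hosts one long and one short VM and runs half empty for nine of its ten hours. In the runtime-aware placement the two short VMs share a server that can be shut down after one hour.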
Virtual machines, Runtime, Clustering algorithms, Measurement, Resource management, Optimization, Heuristic algorithms, Energy efficiency, Energy management, Cloud computing, energy-efficiency, MapReduce, Hadoop, virtualization, cloud
M. Cardosa, A. Singh, H. Pucha and A. Chandra, "Exploiting Spatio-Temporal Tradeoffs for Energy-Aware MapReduce in the Cloud," in IEEE Transactions on Computers, vol. 61, no. 12, pp. 1737-1751, Dec. 2012.