Issue No. 12 - Dec. 2012 (vol. 61)
pp: 1737-1751
Michael Cardosa, University of Minnesota, Minneapolis
Aameek Singh, IBM Almaden Research Center, San Jose
Himabindu Pucha, IBM Almaden Research Center, San Jose
Abhishek Chandra, University of Minnesota, Minneapolis
ABSTRACT
MapReduce is a distributed computing paradigm widely used for building large-scale data processing applications. When used in cloud environments, MapReduce clusters are dynamically created using virtual machines (VMs) and managed by the cloud provider. In this paper, we study the energy efficiency problem for such MapReduce clouds. We describe a unique spatio-temporal tradeoff with two components: efficient spatial fitting of VMs on servers to achieve high utilization of machine resources, and balanced temporal fitting, which co-locates VMs with similar runtimes so that a server runs at high utilization throughout its uptime. We propose VM placement algorithms that explicitly incorporate these tradeoffs. We also propose techniques that dynamically scale MapReduce clusters to reduce energy consumption further while ensuring that jobs meet or improve on their expected runtimes. Our algorithms achieve energy savings over existing placement techniques, and the additional scaling optimization yields further savings while simultaneously improving job performance.
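The spatio-temporal tradeoff in the abstract can be illustrated with a toy example. The Python sketch below is not the paper's algorithm: it assumes a single CPU dimension, a server that draws constant power while on (so energy is proportional to server-hours), and a hypothetical `bucket_hours` parameter for grouping VMs with similar runtimes. All class and function names are invented for illustration. It compares spatial-only first-fit-decreasing packing with a variant that first groups VMs by runtime.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    cpu: float      # share of one server's capacity, in (0, 1]
    runtime: float  # expected runtime in hours


@dataclass
class Server:
    vms: list = field(default_factory=list)

    def load(self) -> float:
        return sum(vm.cpu for vm in self.vms)

    def uptime(self) -> float:
        # A server stays powered on until its longest-running VM finishes.
        return max((vm.runtime for vm in self.vms), default=0.0)


def first_fit_decreasing(vms, capacity=1.0):
    """Spatial fitting only: pack VMs by CPU demand, largest first."""
    servers = []
    for vm in sorted(vms, key=lambda v: -v.cpu):
        for server in servers:
            if server.load() + vm.cpu <= capacity + 1e-9:
                server.vms.append(vm)
                break
        else:
            servers.append(Server(vms=[vm]))
    return servers


def place_spatial_only(vms):
    return first_fit_decreasing(vms)


def place_runtime_aware(vms, bucket_hours=5.0):
    """Assumed heuristic: group VMs into runtime buckets, then pack each bucket."""
    buckets = defaultdict(list)
    for vm in vms:
        buckets[int(vm.runtime // bucket_hours)].append(vm)
    servers = []
    for key in sorted(buckets):
        servers.extend(first_fit_decreasing(buckets[key]))
    return servers


def server_hours(servers):
    # Proxy for energy under the constant-power assumption.
    return sum(s.uptime() for s in servers)


vms = [VM("a", 0.5, 10.0), VM("b", 0.5, 1.0), VM("c", 0.5, 10.0), VM("d", 0.5, 1.0)]
print(server_hours(place_spatial_only(vms)))   # 20.0
print(server_hours(place_runtime_aware(vms)))  # 11.0
```

Both placements use two fully packed servers, but mixing long and short VMs keeps each server on for 10 hours, while grouping by runtime lets one server shut down after 1 hour.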
INDEX TERMS
Virtual machines, Runtime, Clustering algorithms, Measurement, Resource management, Optimization, Heuristic algorithms, Energy efficiency, Energy management, Cloud computing, energy-efficiency, MapReduce, Hadoop, virtualization, cloud
CITATION
Michael Cardosa, Aameek Singh, Himabindu Pucha, Abhishek Chandra, "Exploiting Spatio-Temporal Tradeoffs for Energy-Aware MapReduce in the Cloud", IEEE Transactions on Computers, vol. 61, no. 12, pp. 1737-1751, Dec. 2012, doi:10.1109/TC.2012.166