Issue No. 01 (2013 vol. 1)
pp: 1
Mukil Kesavan, Georgia Institute of Technology, Atlanta
Irfan Ahmad, CloudPhysics, Inc., Mountain View
Orran Krieger, Boston University, Boston
Ravi Soundararajan, VMware, Inc., Palo Alto
Ada Gavrilovska, Georgia Institute of Technology, Atlanta
Karsten Schwan, Georgia Institute of Technology, Atlanta
ABSTRACT
We present CCM (Cloud Capacity Manager), a prototype system and its methods for dynamically multiplexing the compute capacity of virtualized datacenters at scales of thousands of machines, for diverse workloads with variable demands. Extending prior studies primarily concerned with accurate capacity allocation and ensuring acceptable application performance, CCM also sheds light on the tradeoffs due to two unavoidable issues in large-scale commodity datacenters: (i) maintaining low operational overhead given the variable cost of performing the management operations necessary to allocate resources, and (ii) coping with the increased incidence of failures of these operations. CCM is implemented in an industry-strength cloud infrastructure built on top of the VMware vSphere virtualization platform and is currently deployed in a datacenter of 700 physical hosts. Its experimental evaluation uses production workload traces and a suite of representative cloud applications to generate dynamic scenarios. Results indicate that the pragmatic cloud-wide nature of CCM provides up to 25% more resources for workloads and improves datacenter utilization by up to 20%, compared to the common alternative approach of multiplexing capacity within multiple independent smaller datacenter partitions.
INDEX TERMS
Resource management, Distributed processing, Fault tolerance, Hierarchical systems, Virtualization, Data processing, Measurements, Distributed systems, Hierarchical design
CITATION
Mukil Kesavan, Irfan Ahmad, Orran Krieger, Ravi Soundararajan, Ada Gavrilovska, Karsten Schwan, "Practical Compute Capacity Management for Virtualized Datacenters", IEEE Transactions on Cloud Computing, vol. 1, no. 1, pp. 1, 2013, doi:10.1109/TCC.2013.8