Issue No.02 - February (2012 vol.23)
pp: 353-366
Koushik Chakraborty, Utah State University, Logan
Philip M. Wells, Google, Inc., Madison
Gurindar S. Sohi, University of Wisconsin-Madison, Madison
Multiprocessor operating systems (OSs) pose several unique and conflicting challenges to system virtual machines (system VMs). For example, most existing system VMs resort to gang scheduling a guest OS's virtual processors (VCPUs) to avoid OS synchronization overhead. However, gang scheduling is infeasible for some application domains and inflexible in others. In an overcommitted environment, an individual guest OS has more VCPUs than there are available physical processors (PCPUs), which precludes gang scheduling. In such an environment, we demonstrate a more than two-fold increase in application runtime when transparently virtualizing the cores of a chip multiprocessor (CMP). To combat this problem, we propose a hardware technique that detects when a VCPU is wasting CPU cycles and preempts that VCPU to run a different, more productive VCPU. Our technique can dramatically reduce the cycles wasted on OS synchronization without requiring any semantic information from the software. We then present a server consolidation case study to demonstrate the potential of the more flexible scheduling policies enabled by our technique. We propose one such policy, which logically partitions the CMP cores between guest VMs. This policy increases throughput by 10-25 percent for consolidated server workloads due to improved cache locality and core utilization.
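The cost of spinning in an overcommitted guest can be illustrated with a toy model. The Python sketch below is not the hardware mechanism proposed in the paper; it is a minimal simulation under assumed conditions: one PCPU shared round-robin by two VCPUs, one holding a lock and one spinning on it. The function name `simulate`, its parameters, and all cycle counts are hypothetical and chosen only for illustration.

```python
def simulate(holder_work, waiter_work, timeslice, spin_threshold=None):
    """Toy model: one PCPU, two VCPUs scheduled round-robin.

    The "holder" VCPU owns a lock for `holder_work` cycles. The "waiter"
    VCPU spins until the lock is released, then needs `waiter_work` cycles.
    If `spin_threshold` is set, a spinning VCPU is preempted after that many
    cycles instead of burning its whole timeslice (a stand-in for spin
    detection; the real detector is hardware and is not modeled here).

    Returns (total_cycles, wasted_cycles).
    """
    total = 0
    wasted = 0
    lock_held = holder_work > 0
    current = "waiter"  # worst case: the waiter is scheduled first

    while holder_work > 0 or waiter_work > 0:
        if current == "waiter":
            if waiter_work > 0:
                if lock_held:
                    # The waiter makes no progress; these cycles are wasted.
                    spin = timeslice
                    if spin_threshold is not None:
                        spin = min(timeslice, spin_threshold)
                    wasted += spin
                    total += spin
                else:
                    ran = min(timeslice, waiter_work)
                    waiter_work -= ran
                    total += ran
            current = "holder"
        else:
            if holder_work > 0:
                ran = min(timeslice, holder_work)
                holder_work -= ran
                total += ran
                if holder_work == 0:
                    lock_held = False  # lock released, waiter can proceed
            current = "waiter"

    return total, wasted


if __name__ == "__main__":
    # Hypothetical workload: 300 cycles in the critical section,
    # 100 cycles of work for the waiter, 100-cycle timeslices.
    print(simulate(300, 100, 100))                     # fixed timeslices
    print(simulate(300, 100, 100, spin_threshold=10))  # early preemption
```

In this assumed workload, 400 cycles of useful work take 700 cycles with fixed timeslices, because the waiter spins through three full timeslices, and 430 cycles when the spinning VCPU is preempted after 10 cycles. The model ignores context-switch cost, cache effects, and how spinning is actually detected, so it shows only the direction of the effect, not its magnitude.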
Multicore, virtualization, synchronization, operating systems.
Koushik Chakraborty, Philip M. Wells, Gurindar S. Sohi, "Supporting Overcommitted Virtual Machines through Hardware Spin Detection", IEEE Transactions on Parallel & Distributed Systems, vol.23, no. 2, pp. 353-366, February 2012, doi:10.1109/TPDS.2011.143