Issue No. 06 - November/December (2011 vol. 8)
pp. 839-851
Kenichi Kourai, Kyushu Institute of Technology, Fukuoka
Shigeru Chiba, Tokyo Institute of Technology, Tokyo
As servers are consolidated using virtual machines (VMs), software aging of virtual machine monitors (VMMs) is becoming critical. Since a VMM is fundamental software for running VMs, its performance degradation or crash failure affects all VMs running on top of it. To counteract such software aging, a proactive technique called software rejuvenation has been proposed. A simple example of rejuvenation is to reboot a VMM. However, simply rebooting a VMM is undesirable because it requires rebooting the operating systems on all VMs. In this paper, we propose a new technique for fast rejuvenation of VMMs called the warm-VM reboot. The warm-VM reboot efficiently reboots only the VMM by suspending and resuming VMs without saving their memory images to persistent storage. To achieve this, we have developed two mechanisms: on-memory suspend/resume of VMs and quick reload of a VMM. Compared with a normal reboot, the warm-VM reboot reduced the downtime by up to 74 percent. It also prevented the performance degradation due to cache misses after the reboot, which was 52 percent in the case of a normal reboot. In a cluster environment, the warm-VM reboot achieved higher total throughput than a system using VM migration and a normal reboot.
Operating systems, checkpoint/restart, main memory, availability, performance.
Kenichi Kourai, Shigeru Chiba, "Fast Software Rejuvenation of Virtual Machine Monitors", IEEE Transactions on Dependable and Secure Computing, vol. 8, no. 6, pp. 839-851, November/December 2011, doi:10.1109/TDSC.2010.20