Issue No.11 - November (2009 vol.58)
pp: 1525-1538
Luis Moura Silva, University of Coimbra, Portugal
Javier Alonso, Universitat Politècnica de Catalunya, Spain
Jordi Torres, Universitat Politècnica de Catalunya and Barcelona Supercomputing Center, Spain
ABSTRACT
In this paper, we present an approach for software rejuvenation based on automated self-healing techniques that can be easily applied to off-the-shelf application servers. Software aging and transient failures are detected through continuous monitoring of system data and performability metrics of the application server. If anomalous behavior is identified, the system triggers an automatic rejuvenation action. This self-healing scheme is meant to disrupt the running service for a minimal amount of time, achieving zero downtime in most cases. In our scheme, we exploit virtualization to optimize the self-recovery actions. The techniques described in this paper have been tested with a set of open-source Linux tools and the Xen virtualization middleware. We conducted an experimental study with two application benchmarks (Tomcat/Axis and TPC-W). Our results demonstrate that virtualization can be extremely helpful for fail-over and software rejuvenation in the presence of transient failures and software aging.
INDEX TERMS
Software rejuvenation, software aging, virtualization, self-healing.
CITATION
Luis Moura Silva, Javier Alonso, Jordi Torres, "Using Virtualization to Improve Software Rejuvenation", IEEE Transactions on Computers, vol. 58, no. 11, pp. 1525-1538, November 2009, doi:10.1109/TC.2009.119