Issue No.04 - April (2010 vol.21)
pp: 452-465
Paulo Sousa, Ciencias da Univ. Lisboa, Lisboa
Alysson Neves Bessani, Ciencias da Univ. Lisboa, Lisboa
Miguel Correia, Ciencias da Univ. Lisboa, Lisboa
Nuno Ferreira Neves, Ciencias da Univ. Lisboa, Lisboa
Paulo Verissimo, Ciencias da Univ. Lisboa, Lisboa
ABSTRACT
In the past, some research has been done on how to use proactive recovery to build intrusion-tolerant replicated systems that are resilient to any number of faults, as long as recoveries are faster than an upper bound on fault production assumed at system deployment time. In this paper, we propose a complementary approach that enhances proactive recovery with additional reactive mechanisms giving correct replicas the capability of recovering other replicas that are detected or suspected of being compromised. One key feature of our proactive-reactive recovery approach is that, despite recoveries, it guarantees the availability of a minimum number of system replicas necessary to sustain correct operation of the system. We design a proactive-reactive recovery service based on a hybrid distributed system model and show, as a case study, how this service can effectively be used to increase the resilience of an intrusion-tolerant firewall adequate for the protection of critical infrastructures.
INDEX TERMS
Intrusion tolerance, proactive recovery, reactive recovery, firewall.
CITATION
Paulo Sousa, Alysson Neves Bessani, Miguel Correia, Nuno Ferreira Neves, Paulo Verissimo, "Highly Available Intrusion-Tolerant Services with Proactive-Reactive Recovery", IEEE Transactions on Parallel & Distributed Systems, vol.21, no. 4, pp. 452-465, April 2010, doi:10.1109/TPDS.2009.83
REFERENCES
[1] P. Verissimo, N.F. Neves, and M.P. Correia, "Intrusion-Tolerant Architectures: Concepts and Design," Architecting Dependable Systems, Springer, 2003.
[2] R. Ostrovsky and M. Yung, "How to Withstand Mobile Virus Attacks (Extended Abstract)," Proc. 10th ACM Symp. Principles of Distributed Computing, pp. 51-59, 1991.
[3] M. Castro and B. Liskov, "Practical Byzantine Fault-Tolerance and Proactive Recovery," ACM Trans. Computer Systems, vol. 20, no. 4, pp. 398-461, 2002.
[4] L. Zhou, F. Schneider, and R. van Renesse, "COCA: A Secure Distributed Online Certification Authority," ACM Trans. Computer Systems, vol. 20, no. 4, pp. 329-368, Nov. 2002.
[5] M.A. Marsh and F.B. Schneider, "CODEX: A Robust and Secure Secret Distribution System," IEEE Trans. Dependable and Secure Computing, vol. 1, no. 1, pp. 34-47, Jan. 2004.
[6] P. Sousa, N.F. Neves, and P. Verissimo, "Proactive Resilience through Architectural Hybridization," Proc. ACM Symp. Applied Computing (SAC '06), pp. 686-690, Apr. 2006.
[7] A. Doudou, B. Garbinato, R. Guerraoui, and A. Schiper, "Muteness Failure Detectors: Specification and Implementation," Proc. Third European Dependable Computing Conf., pp. 71-87, Sept. 1999.
[8] A. Doudou, B. Garbinato, and R. Guerraoui, "Encapsulating Failure Detection: From Crash to Byzantine Failures," Proc. Seventh Ada Europe Int'l Conf. Reliable Software Technologies (Ada Europe '02), pp. 24-50, 2002.
[9] R. Baldoni, J.-M. Hélary, M. Raynal, and L. Tanguy, "Consensus in Byzantine Asynchronous Systems," J. Discrete Algorithms, vol. 1, no. 2, pp. 185-210, Apr. 2003.
[10] A. Haeberlen, P. Kouznetsov, and P. Druschel, "The Case for Byzantine Fault Detection," Proc. Second Workshop Hot Topics in System Dependability, 2006.
[11] P. Verissimo, N.F. Neves, and M. Correia, "CRUTIAL: The Blueprint of a Reference Critical Information Infrastructure Architecture," Int'l J. System of Systems Eng., vol. 1, nos. 1/2, pp. 78-95, 2008.
[12] A. Bessani, P. Sousa, M. Correia, N.F. Neves, and P. Verissimo, "The CRUTIAL Way of Critical Infrastructure Protection," IEEE Security & Privacy, vol. 6, no. 6, pp. 44-51, Nov./Dec. 2008.
[13] P. Sousa, N.F. Neves, and P. Verissimo, "How Resilient are Distributed $f$ Fault/Intrusion-Tolerant Systems?" Proc. Int'l Conf. Dependable Systems and Networks (DSN '05), pp. 98-107, June 2005.
[14] P. Sousa, N.F. Neves, and P. Verissimo, "Hidden Problems of Asynchronous Proactive Recovery," Proc. Workshop Hot Topics in System Dependability, June 2007.
[15] P. Verissimo, "Travelling through Wormholes: A New Look at Distributed Systems Models," Special Interest Group on Algorithms and Computation Theory News, vol. 37, no. 1, http://www.navigators.di.fc.ul.pt/docs/abstracts ver06travel.html, 2006.
[16] R.R. Obelheiro, A.N. Bessani, L.C. Lung, and M. Correia, "How Practical are Intrusion-Tolerant Distributed Systems?" Technical Report DI-FCUL TR 06-15, Dept. of Informatics, Univ. of Lisbon, 2006.
[17] A.N. Bessani, R.R. Obelheiro, P. Sousa, and I. Gashi, "On the Effects of Diversity on Intrusion Tolerance," Technical Report DI/FCUL TR 08-30, Dept. of Informatics, Univ. of Lisbon, Dec. 2008.
[18] V. Hadzilacos and S. Toueg, "A Modular Approach to the Specification and Implementation of Fault-Tolerant Broadcasts," Technical Report 94-1425, Dept. of Computer Science, Cornell Univ., May 1994.
[19] P. Verissimo and L. Rodrigues, Distributed Systems for System Architects. Kluwer Academic Publishers, 2001.
[20] S. Kent, "Protecting Externally Supplied Software in Small Computers," PhD dissertation, Laboratory of Computer Science, Massachusetts Inst. of Technology, 1980.
[21] P. Cloutier, P. Mantegazza, S. Papacharalambous, I. Soanes, S. Hughes, and K. Yaghmour, "DIAPM-RTAI Position Paper," Proc. Real-Time Linux Workshop, Nov. 2000.
[22] A. Casimiro, P. Martins, and P. Verissimo, "How to Build a Timely Computing Base Using Real-Time Linux," Proc. IEEE Int'l Workshop Factory Comm. Systems, pp. 127-134, Sept. 2000.
[23] P. Sousa, N.F. Neves, P. Verissimo, and W.H. Sanders, "Proactive Resilience Revisited: The Delicate Balance between Resisting Intrusions and Remaining Available," Proc. 25th IEEE Symp. Reliable Distributed Systems (SRDS '06), pp. 71-80, Oct. 2006.
[24] T.D. Chandra and S. Toueg, "Unreliable Failure Detectors for Reliable Distributed Systems," J. ACM, vol. 43, no. 2, pp. 225-267, Mar. 1996.
[25] A. Daidone, F. Di Giandomenico, A. Bondavalli, and S. Chiaradonna, "Hidden Markov Models as a Support for Diagnosis: Formalization of the Problem and Synthesis of the Solution," Proc. 25th IEEE Symp. Reliable Distributed Systems (SRDS '06), pp. 245-256, Oct. 2006.
[26] D.E. Denning, "An Intrusion-Detection Model," IEEE Trans. Software Eng., vol. 13, no. 2, pp. 222-232, Feb. 1987.
[27] B. Mukherjee, L. Heberlein, and K. Levitt, "Network Intrusion Detection," IEEE Network, vol. 8, no. 3, pp. 26-41, May/June 1994.
[28] B. Sprunt, L. Sha, and J. Lehoczky, "Aperiodic Task Scheduling for Hard-Real-Time Systems," Real-Time Systems, vol. 1, no. 1, pp. 27-60, 1989.
[29] A.N. Bessani, P. Sousa, M. Correia, N.F. Neves, and P. Verissimo, "Intrusion-Tolerant Protection for Critical Infrastructures," Technical Report DI/FCUL TR 07-8, Dept. of Informatics, Univ. of Lisbon, Apr. 2007.
[30] C. Wilson, "Terrorist Capabilities for Cyber-Attack," Int'l Critical Information Infrastructure Protection (CIIP) Handbook 2006, M. Dunn and V. Mauer, eds., vol. II, pp. 69-88, Center for Security Studies (CSS), ETH Zurich, 2006.
[31] S. Kent and K. Seo, "Security Architecture for the Internet Protocol," RFC 4301 (Proposed Standard), http://www.ietf.org/rfc/rfc4301.txt, Dec. 2005.
[32] K. Stouffer, J. Falco, and K. Kent, "Guide to Supervisory Control and Data Acquisition (SCADA) and Industrial Control Systems Security," Recommendations of the Nat'l Inst. of Standards and Technology (NIST), Special Publication 800-82 (Initial Public Draft), Sept. 2006.
[33] S. Kent, "IP Authentication Header," RFC 4302 (Proposed Standard), http://www.ietf.org/rfc/rfc4302.txt, Dec. 2005.
[34] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the Art of Virtualization," Proc. 19th ACM Symp. Operating Systems Principles (SOSP '03), Oct. 2003.
[35] Nat'l Inst. of Standards and Technology, "Secure Hash Standard," Federal Information Processing Standards Publication 180-2, Aug. 2002.
[36] H. Shacham, M. Page, B. Pfaff, E.-J. Goh, N. Modadugu, and D. Boneh, "On the Effectiveness of Address-Space Randomization," Proc. 11th ACM Conf. Computer and Comm. Security, pp. 298-307, 2004.
[37] Y. Huang, C.M.R. Kintala, N. Kolettis, and N.D. Fulton, "Software Rejuvenation: Analysis, Module and Applications," Proc. 25th Int'l Symp. Fault Tolerant Computing (FTCS-25), pp. 381-390, June 1995.
[38] S. Garg, A. Puliafito, M. Telek, and K.S. Trivedi, "Analysis of Software Rejuvenation Using Markov Regenerative Stochastic Petri Nets," Proc. Int'l Symp. Software Reliability Eng. (ISSRE '95), Oct. 1995.
[39] Y. Huang and C.M.R. Kintala, "Software Implemented Fault Tolerance: Technologies and Experience," Proc. 23rd Int'l Symp. Fault Tolerant Computing (FTCS-23), pp. 2-9, June 1993.
[40] D. Patterson, A. Brown, P. Broadwell, G. Candea, M. Chen, J. Cutler, P. Enriquez, A. Fox, E. Kiciman, M. Merzbacher, D. Oppenheimer, N. Sastry, W. Tetzlaff, J. Traupman, and N. Treuhaft, "Recovery Oriented Computing (ROC): Motivation, Definition, Techniques and Case Studies," Technical Report UCB/CSD TR 02-1175, Computer Science Dept., Univ. of California at Berkeley, Mar. 2002.
[41] K.R. Joshi, M. Hiltunen, W.H. Sanders, and R. Schlichting, "Automatic Model-Driven Recovery in Distributed Systems," Proc. 24th IEEE Symp. Reliable Distributed Systems (SRDS '05), pp. 26-38, Oct. 2005.
[42] J. Yin, J.-P. Martin, A. Venkataramani, L. Alvisi, and M. Dahlin, "Separating Agreement from Execution for Byzantine Fault Tolerant Services," Proc. 19th ACM Symp. Operating Systems Principles (SOSP '03), pp. 253-267, 2003.
[43] F.B. Schneider, "Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial," ACM Computing Surveys, vol. 22, no. 4, pp. 299-319, Dec. 1990.