Complexity Issues in Automated Synthesis of Failsafe Fault-Tolerance
July-September 2005 (vol. 2 no. 3)
pp. 201-215
We focus on the problem of synthesizing failsafe fault-tolerance, where fault-tolerance is added to an existing (fault-intolerant) program. A failsafe fault-tolerant program satisfies its specification (including safety and liveness) in the absence of faults; in the presence of faults, it satisfies its safety specification. We present a somewhat unexpected result: in general, the problem of synthesizing failsafe fault-tolerant distributed programs from their fault-intolerant versions is NP-complete in the size of the program's state space. We also identify a class of specifications, monotonic specifications, and a class of programs, monotonic programs, for which failsafe fault-tolerance can be synthesized in polynomial time (in the program state space). As an illustration, we show that the monotonicity restrictions are met for commonly encountered problems, such as Byzantine agreement, distributed consensus, and atomic commitment. Furthermore, we evaluate the role of these restrictions in the complexity of synthesizing failsafe fault-tolerance. Specifically, we prove that if only one of the two monotonicity conditions is satisfied, the synthesis of failsafe fault-tolerance is still NP-complete. Finally, we demonstrate the application of the monotonicity property in enhancing the fault-tolerance of (distributed) nonmasking fault-tolerant programs to masking fault-tolerance.
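To make the definition of failsafe fault-tolerance concrete, the following is a minimal sketch and not the synthesis procedure from the paper. It assumes a toy model in which a program and its faults are finite sets of state transitions and the safety specification is a set of forbidden ("bad") transitions; liveness and the read/write restrictions of distributed processes are not modeled. The state names and the helpers `reachable` and `is_failsafe` are hypothetical, introduced only for illustration.

```python
from collections import deque


def reachable(init, transitions):
    """Return every state reachable from `init` via (src, dst) pairs in `transitions`."""
    succ = {}
    for src, dst in transitions:
        succ.setdefault(src, set()).add(dst)
    seen = set(init)
    queue = deque(init)
    while queue:
        state = queue.popleft()
        for nxt in succ.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_failsafe(init, program, faults, bad):
    """True if no bad transition can execute from a state reachable
    under program and fault transitions combined (safety only)."""
    combined = program | faults
    states = reachable(init, combined)
    return not any(src in states and (src, dst) in bad for src, dst in combined)


# Toy system (all names are illustrative assumptions).
init = {"ok"}
faults = {("ok", "hit")}                          # a fault perturbs the state
bad = {("hit", "wrong")}                          # safety forbids this transition
intolerant = {("ok", "done"), ("hit", "wrong")}   # violates safety after a fault
failsafe = {("ok", "done")}                       # simply halts in "hit"

print(is_failsafe(init, intolerant, set(), bad))  # safe while no fault occurs
print(is_failsafe(init, intolerant, faults, bad)) # unsafe once faults are possible
print(is_failsafe(init, failsafe, faults, bad))   # safe even with faults
```

In this toy model the failsafe version stops making progress once the fault has occurred, which is permitted because only safety, not liveness, must hold in the presence of faults.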

Index Terms:
Fault-tolerance, automatic addition of fault-tolerance, formal methods, program synthesis, distributed programs.
Citation:
Sandeep S. Kulkarni, Ali Ebnenasir, "Complexity Issues in Automated Synthesis of Failsafe Fault-Tolerance," IEEE Transactions on Dependable and Secure Computing, vol. 2, no. 3, pp. 201-215, July-Sept. 2005, doi:10.1109/TDSC.2005.29