Component Based Design of Multitolerant Systems
January 1998 (vol. 24 no. 1)
pp. 63-78

Abstract: The concept of multitolerance abstracts problems in system dependability and provides a basis for improved design of dependable systems. In the abstraction, each source of undependability in the system is represented as a class of faults, and the corresponding ability of the system to deal with that undependability source is represented as a type of tolerance. Multitolerance thus refers to the ability of the system to tolerate multiple fault-classes, each in a possibly different way. In this paper, we present a component-based method for designing multitolerance. The method employs two types of components, namely detectors and correctors. We develop a theory of detectors, correctors, and their interference-free composition with intolerant programs; this theory enables stepwise addition of components to provide tolerance to a new fault-class while preserving the tolerances to the previously added fault-classes. We illustrate the method by designing a fully distributed multitolerant program for a token ring.
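
To make the roles of the two component types concrete, here is a minimal sketch in Python. It is not the program designed in the paper: it is a centralized toy model of a token ring, and every name in it (`detector`, `corrector`, `step`, `tolerant_step`), the invariant "exactly one token", and the two fault-classes (token loss and token duplication) are assumptions chosen for illustration. The detector only reports whether the invariant holds, and the corrector restores it when it does not.

```python
from typing import List

# Hypothetical model: ring[i] is True when process i holds a token.
# Assumed invariant for this sketch: exactly one process holds a token.


def detector(ring: List[bool]) -> bool:
    """Witness predicate: True exactly when the assumed invariant holds."""
    return sum(ring) == 1


def corrector(ring: List[bool]) -> List[bool]:
    """Restore the invariant when the detector reports a violation.

    Token loss: regenerate a token at process 0.
    Token duplication: keep the lowest-indexed token and drop the rest.
    """
    if detector(ring):
        return list(ring)
    holders = [i for i, has_token in enumerate(ring) if has_token]
    keep = holders[0] if holders else 0
    return [i == keep for i in range(len(ring))]


def step(ring: List[bool]) -> List[bool]:
    """Fault-intolerant program: every token moves to the next process."""
    n = len(ring)
    return [ring[(i - 1) % n] for i in range(n)]


def tolerant_step(ring: List[bool]) -> List[bool]:
    """Composition: correct first, then run the intolerant step."""
    return step(corrector(ring))
```

In this toy model the corrector changes nothing when the invariant already holds, so adding it does not disturb fault-free executions of `step`. That is only a loose analogue of the interference-freedom the abstract refers to; the paper's fully distributed design and its formal definitions are not reproduced here.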

Index Terms:
Formal methods, compositional design, interference-freedom, stepwise design, detectors, correctors, dependability, fault-tolerance, graceful degradation.
Anish Arora, Sandeep S. Kulkarni, "Component Based Design of Multitolerant Systems," IEEE Transactions on Software Engineering, vol. 24, no. 1, pp. 63-78, Jan. 1998, doi:10.1109/32.663998