Issue No.06 - June (1998 vol.24)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/32.689401
<p><b>Abstract</b>: Masking fault-tolerance guarantees that programs continually satisfy their specification in the presence of faults. By contrast, nonmasking fault-tolerance guarantees less: it guarantees only that, once faults stop occurring, program executions converge to states from which programs continually (re)satisfy their specification. In this paper, we present a component-based method for the design of masking fault-tolerant programs. In this method, components are added to a fault-intolerant program in a stepwise manner: first, to transform the fault-intolerant program into a nonmasking fault-tolerant one, and then to enhance the fault-tolerance from nonmasking to masking. We illustrate the method by designing programs for agreement in the presence of Byzantine faults, data transfer in the presence of message loss, triple modular redundancy in the presence of input corruption, and mutual exclusion in the presence of process fail-stops. These examples also demonstrate that the method accommodates a variety of fault-classes. The method provides alternative designs for programs usually designed with extant design methods, and it offers the potential for improved masking fault-tolerant programs.</p>
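The stepwise idea in the abstract can be illustrated on the triple modular redundancy example. The sketch below is not taken from the paper: the function names (`intolerant`, `corrector`, `detector`, `masking`), the use of three integer inputs, and the assumption that at most one input is corrupted are all hypothetical choices made for illustration. It shows a fault-intolerant step that may output a corrupted value, a corrector that can only repair the output afterwards (nonmasking), and a detector that guards the output so that no wrong value is ever emitted (masking).

```python
from collections import Counter
from typing import Sequence


def majority(inputs: Sequence[int]) -> int:
    """Return the value held by at least two inputs, or raise if none exists."""
    value, count = Counter(inputs).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority: more than one input appears corrupted")
    return value


def intolerant(inputs: Sequence[int]) -> int:
    """Fault-intolerant step: trust the first input unconditionally."""
    return inputs[0]


def corrector(inputs: Sequence[int], out: int) -> int:
    """Nonmasking step (assumed form): repair an output after the fact.

    The output may already have been wrong for a while; this only
    restores it to the majority value.
    """
    good = majority(inputs)
    return out if out == good else good


def detector(inputs: Sequence[int], j: int) -> bool:
    """True when inputs[j] agrees with at least one other input.

    With three inputs and at most one corrupted, agreement with another
    input implies inputs[j] is uncorrupted.
    """
    return any(inputs[j] == inputs[k] for k in range(len(inputs)) if k != j)


def masking(inputs: Sequence[int]) -> int:
    """Masking step (assumed form): emit only a value the detector accepts."""
    for j in range(len(inputs)):
        if detector(inputs, j):
            return inputs[j]
    raise ValueError("fault assumption violated: no two inputs agree")


if __name__ == "__main__":
    corrupted = [9, 7, 7]  # first input corrupted, correct value is 7
    wrong = intolerant(corrupted)
    print("intolerant:", wrong)
    print("after corrector:", corrector(corrupted, wrong))
    print("masking:", masking(corrupted))
```

In this sketch the corrector alone tolerates the fault only eventually, since the wrong value is visible before the repair, while the detector-guarded version never exposes it. That difference mirrors the nonmasking versus masking distinction drawn in the abstract.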
Masking and nonmasking fault-tolerance, component-based design, correctors, detectors, stepwise design, formal methods, distributed systems.
Anish Arora, "Designing Masking Fault-Tolerance via Nonmasking Fault-Tolerance", IEEE Transactions on Software Engineering, vol. 24, no. 6, pp. 435-450, June 1998, doi:10.1109/32.689401