Issue No. 11 - November (1997 vol. 23)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/32.637385
<p><b>Abstract</b>—A reconfigurable fault-tolerant system achieves the attributes of dependability of operations through fault detection, fault isolation, and reconfiguration, typically referred to as the FDIR paradigm. Fault diagnosis is a key component of this approach, requiring an accurate determination of the health and state of the system. An imprecise state assessment can lead to catastrophic failure due to an optimistic diagnosis, or conversely, result in underutilization of resources because of a pessimistic diagnosis. Differing from classical testing and other off-line diagnostic approaches, we develop procedures for maximal utilization of the system state information to provide for continual, on-line diagnosis and reconfiguration capabilities as an integral part of the system operations. Our diagnosis approach, unlike existing techniques, does not require administered testing to gather syndrome information but is based on monitoring the system message traffic among redundant system functions. We present comprehensive on-line diagnosis algorithms capable of handling a continuum of faults of varying severity at the node and link level. Not only are the proposed algorithms on-line in nature, but they are themselves tolerant to faults in the diagnostic process. Formal analysis is presented for all proposed algorithms. These proofs both offer insight into the algorithm operations and facilitate a rigorous formal verification of the developed algorithms.</p>
Index Terms: Fault diagnosis, on-line, fault handling, formal methods.
Chris J. Walter, Patrick Lincoln, Neeraj Suri, "Formally Verified On-Line Diagnosis", IEEE Transactions on Software Engineering, vol. 23, no. 11, pp. 684-721, November 1997, doi:10.1109/32.637385