Issue No. 08 - August (1998 vol. 31)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/2.707619
Although much research has been devoted to the automatic detection of hardware failures, relatively little has been done in the software arena. To date, no method has explicitly and cost-effectively dealt with failure detection in software systems whose specifications are nondeterministic. In such systems, the specification permits multiple outputs for the same input sequence and system state. Nondeterminism in specifications is advantageous because the specification writer can avoid stating irrelevant behavior as mandatory, freeing the software designer to choose a behavioral alternative that would yield a more desirable implementation. Unfortunately, this flexibility comes at a cost to the failure-detection mechanism. It must accommodate all the target system's legal behavioral alternatives and avoid favoring one of them. This article describes a hierarchical supervisor, whose failure-detection mechanism explicitly addresses systems with nondeterministic specifications. The supervisor, a unit separate from the target system, observes the system's external inputs and outputs and reports any failures. Its hierarchical structure results from splitting the task of identifying the behavioral alternative the target system chooses from the task of checking the details of system behavior. This structure makes it possible to efficiently trade off detection accuracy and computational cost. A key element is a model, derived from the target system's specification, that prunes behavioral alternatives. To evaluate their approach, the authors created a prototype supervisor and used it to supervise the execution of the control program of a small telephone exchange. Results indicate that the hierarchical supervisor can significantly reduce the computational cost of considering the target system's behavioral alternatives. However, although the supervisor's computational cost is significantly reduced, it is still higher than that for the target system.
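As a rough illustration of the two-layer idea described in the abstract, the following Python sketch tracks every specification state that remains consistent with the observed inputs and outputs. It is not the authors' implementation: the toy telephone specification, the names `Alternative`, `SPEC`, and `Supervisor`, and all numeric values are hypothetical, chosen only to show how a coarse match can prune alternatives before a detailed check runs.

```python
from typing import Callable, NamedTuple


class Alternative(NamedTuple):
    """One legal behavior the specification permits (hypothetical structure)."""
    signal: str                        # coarse output kind, identifies the alternative
    detail_ok: Callable[[int], bool]   # detailed check on the output's parameter
    next_state: str


# Hypothetical nondeterministic specification: after a digit is dialed,
# the exchange may legally answer with either "ring" or "busy".
SPEC = {
    ("idle", "offhook"): [Alternative("dialtone", lambda hz: hz == 350, "dialing")],
    ("dialing", "digit"): [
        Alternative("ring", lambda ms: 1000 <= ms <= 3000, "ringing"),
        Alternative("busy", lambda ms: ms == 500, "busy"),
    ],
    ("ringing", "onhook"): [Alternative("release", lambda _: True, "idle")],
    ("busy", "onhook"): [Alternative("release", lambda _: True, "idle")],
}


class Supervisor:
    """Observes external inputs and outputs; reports a failure when no
    legal alternative explains what was observed."""

    def __init__(self, spec, initial_state):
        self.spec = spec
        self.candidates = {initial_state}  # spec states still consistent with the trace

    def observe(self, inp, signal, detail):
        # Upper layer: identify which alternatives the target may have chosen,
        # using only the coarse signal name. This prunes the rest cheaply.
        chosen = [
            alt
            for state in self.candidates
            for alt in self.spec.get((state, inp), [])
            if alt.signal == signal
        ]
        # Lower layer: run the detailed check only on the surviving alternatives.
        survivors = {alt.next_state for alt in chosen if alt.detail_ok(detail)}
        if not survivors:
            return False  # no legal alternative matches: report a failure
        self.candidates = survivors
        return True
```

A supervisor built this way accepts either "ring" or "busy" after a digit without favoring one of them, and flags an output only when no alternative permits it. The trade-off mentioned in the abstract shows up in how much work the upper layer does: a coarser match is cheaper but leaves more alternatives for the detailed check.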
R. E. Seviora and T. Savor, "Toward Automatic Detection of Software Failures," in Computer, vol. 31, no. 8, pp. 68-74, 1998.