Toward Systematic Design of Fault-Tolerant Systems
April 1997 (vol. 30 no. 4)
pp. 51-58

The mid-century "space race" was a major impetus for the development of fault-tolerant computing. Over the succeeding 25 years researchers expanded the concept of fault tolerance and refined the techniques for achieving it. Nevertheless, the bottom-up approach, entailing an infrastructure of autonomously fault-tolerant subsystems integrated with global fault tolerance functions, is less common today than the top-down approach, which relies on off-the-shelf (OTS) subsystems and a global monitoring function.

A design paradigm for the systematic treatment of fault tolerance involves four steps: specification, implementation, evaluation, and modification. The paradigm offers a way to minimize the probability of oversights, mistakes, and inconsistencies that may occur during the implementation of fault tolerance. In spite of the long-range merits of this bottom-up approach, time and cost constraints often lead developers to use OTS subsystems when designing systems that are expected to be highly dependable.

Even the Pentium Pro, which appears to have the most complete set of fault tolerance functions among contemporary microprocessors, has major drawbacks. Moreover, systems built from OTS subsystems are difficult to retrofit for fault tolerance. Without hardware support for fault tolerance, the only solution is to build a software monitor subsystem that tries to check all subsystems for indications of failure. But the monitor itself is unprotected because it resides and executes on an OTS processor.
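
As a rough illustration of the software monitor described above, the following minimal sketch flags subsystems that stop reporting. It is not taken from the article: the heartbeat scheme, the class name `SoftwareMonitor`, the subsystem names, and the timeout value are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SoftwareMonitor:
    """Hypothetical global monitor that tracks heartbeats from subsystems."""
    timeout: float                                  # seconds of silence tolerated
    last_seen: dict = field(default_factory=dict)   # subsystem name -> last heartbeat time

    def heartbeat(self, name: str, now: float) -> None:
        # A subsystem reports that it is alive at time `now`.
        self.last_seen[name] = now

    def suspected_failures(self, now: float) -> list:
        # Flag every subsystem that has been silent for longer than `timeout`.
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


# Assumed scenario: two OTS subsystems, one of which goes silent.
monitor = SoftwareMonitor(timeout=2.0)
monitor.heartbeat("cpu_board", now=0.0)
monitor.heartbeat("disk_controller", now=0.0)
monitor.heartbeat("cpu_board", now=3.0)   # disk_controller never reports again
suspects = monitor.suspected_failures(now=3.5)
print(suspects)
```

Nothing in this sketch checks the monitor itself. If the processor running it fails, no failure is ever reported, which is the weakness the paragraph above points out.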

Researchers would do well to consider the human immune system as a model for systems in which fault tolerance is an integral attribute of every hardware element.

Citation:
Algirdas Avizienis, "Toward Systematic Design of Fault-Tolerant Systems," Computer, vol. 30, no. 4, pp. 51-58, April 1997, doi:10.1109/2.585154