An Environment for Developing Fault-Tolerant Software
February 1991 (vol. 17 no. 2)
pp. 153-159

An environment that supports execution of programs using both N-version programming and recovery blocks in a uniform manner is described. For N-version programming, the system offers an easy and flexible way of specifying the target machines for the separate versions. The basic unit of fault tolerance supported by this system is at the procedure or function level. Each such program unit can be packaged as its own task, and different fault tolerance techniques can subsequently be employed, even within the same application. The environment also allows versions to be written in different programming languages and executed on different machines. This enhances the independence between the different versions, making the fault tolerance techniques more effective. This environment has been developed for use on Unix-based hosts and currently runs on a network of Sun and DEC workstations.
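
The two techniques named in the abstract can be illustrated at the function level with a minimal sketch. This is not the environment's actual interface: the names `n_version`, `recovery_block`, and the three integer-square-root versions are hypothetical, and everything runs in one process, whereas the environment described above runs versions as separate tasks, possibly in different languages and on different machines. The sketch also assumes the versions are side-effect free, so no state has to be restored before a recovery block tries its next alternate.

```python
import math
from collections import Counter
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def n_version(versions: Sequence[Callable[..., T]], *args) -> T:
    """N-version programming: run every version, return the strict-majority result.

    Results must be hashable so they can be counted as votes.
    """
    results = []
    for version in versions:
        try:
            results.append(version(*args))
        except Exception:
            # A version that crashes simply contributes no vote.
            continue
    if results:
        value, votes = Counter(results).most_common(1)[0]
        if votes > len(versions) // 2:
            return value
    raise RuntimeError("no majority among versions")


def recovery_block(alternates: Sequence[Callable[..., T]],
                   acceptance_test: Callable[..., bool], *args) -> T:
    """Recovery block: try alternates in order, return the first accepted result."""
    for alternate in alternates:
        try:
            result = alternate(*args)
        except Exception:
            continue
        if acceptance_test(result, *args):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


# Three hypothetical, independently written versions of integer square root.
def isqrt_library(n: int) -> int:
    return math.isqrt(n)


def isqrt_search(n: int) -> int:
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def isqrt_faulty(n: int) -> int:
    return n // 2  # deliberately wrong for most inputs


def accept(result: int, n: int) -> bool:
    """Acceptance test: result is the floor of the square root of n."""
    return result * result <= n < (result + 1) * (result + 1)
```

With these definitions, `n_version([isqrt_library, isqrt_search, isqrt_faulty], 10)` outvotes the faulty version, and `recovery_block([isqrt_faulty, isqrt_search], accept, 10)` rejects the faulty primary and falls back to the second alternate. Both helpers take ordinary callables, which mirrors the abstract's point that different techniques can be applied to different program units within one application.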

Index Terms:
environment; fault-tolerant software; N-version programming; recovery blocks; programming languages; Unix-based hosts; Sun; DEC workstations; fault tolerant computing; programming environments; software reliability; system recovery
J.M. Purtilo, P. Jalote, "An Environment for Developing Fault-Tolerant Software," IEEE Transactions on Software Engineering, vol. 17, no. 2, pp. 153-159, Feb. 1991, doi:10.1109/32.67596