Distributed Execution of Recovery Blocks: An Approach for Uniform Treatment of Hardware and Software Faults in Real-Time Applications
May 1989 (vol. 38 no. 5)
pp. 626-636
The concept of distributed execution of recovery blocks is examined as an approach for uniform treatment of hardware and software faults. A useful characteristic of the approach is the relatively small time cost it requires. The approach is thus suitable for incorporation into real-time computer systems. A specific formulation of the approach that is aimed at minimizing the recovery time is pre

[1] T. Anderson and B. Randell, Eds., Computing System Reliability. Cambridge, England: Cambridge University Press, 1979.
[2] A. Avizienis, M. Lyu, and W. Schutz, "In search of diversity: A six-language study of fault-tolerant flight control software," in Dig. Papers, FTCS-18, Tokyo, Japan, 1988, pp. 15-22.
[3] P. Brinch Hansen, The Architecture of Concurrent Programs. Englewood Cliffs, NJ: Prentice-Hall, 1977.
[4] K. N. Chandy and C. V. Ramamoorthy, "Rollback and recovery strategies for computer programs," IEEE Trans. Comput., pp. 59-65, June 1972.
[5] C. G. Davis and R. L. Couch, "Ballistic missile defense: A supercomputer challenge," IEEE Computer, pp. 37-46, Nov. 1980.
[6] H. Hecht, "Fault-tolerant software for real-time applications," ACM Comput. Surveys, vol. 8, no. 4, pp. 391-407, Dec. 1976.
[7] J. Horning, H. C. Lauer, P. M. Melliar-Smith, and B. Randell, "A program structure for error detection and recovery," Lecture Notes in Computer Science, vol. 16, New York: Springer-Verlag, 1974, pp. 171-187.
[8] K. H. Kim, H. Hecht, J. Huang, and M. Naghibzadeh, "Strategies for structured and fault-tolerant design of recovery programs," in Proc. IEEE Comput. Soc. Int. Comput. Software Appl. Conf. (COMPSAC), Nov. 1978, pp. 651-656.
[9] K. H. Kim, "Evolution of a virtual machine supporting fault-tolerant distributed processes at a research laboratory," in Proc. Int. Conf. Data Eng., Los Angeles, CA, Apr. 1984, pp. 620-628.
[10] K. H. Kim, "Software fault tolerance," in Handbook of Software Engineering. C. R. Vick and C. V. Ramamoorthy, Eds. New York: Van Nostrand Reinhold, 1984, ch. 20.
[11] H. Kopetz and W. Merker, "The architecture of MARS," in Proc. IEEE Comput. Soc. 15th Int. Symp. Fault-Tolerant Comput., June 1985, pp. 274-279.
[12] W. C. McDonald and R. W. Smith, "A flexible distributed testbed for real time applications," IEEE Computer, vol. 15, pp. 25-39, Oct. 1982.
[13] B. Randell, "System structure for software fault tolerance," IEEE Trans. Software Eng., pp. 220-232, June 1975.
[14] J. A. Rohr, "STAREX self-repair routines: Software recovery in the JPL-STAR computer," in Dig. IEEE Comput. Soc. Int. Symp. Fault-Tolerant Comput., 1973, pp. 11-16.
[15] O. Serlin, "Fault-tolerant systems in commercial applications," IEEE Computer, pp. 19-30, Aug. 1984.
[16] N. A. Vosbury, "The process design system," in Proc. IEEE Comput. Soc. Int. Comput. Software Appl. Conf. (COMPSAC), 1979, pp. 374-379.

Index Terms:
hardware faults; recovery blocks; uniform treatment; software faults; real-time applications; distributed execution; time cost; real-time computer systems; distributed recovery blocks scheme; DRB scheme; forward recovery; load-sharing multiprocessing scheme; multimicrocomputer networks; tolerance; distributed processing; fault tolerant computing.
Citation:
K.H. Kim, H.O. Welch, "Distributed Execution of Recovery Blocks: An Approach for Uniform Treatment of Hardware and Software Faults in Real-Time Applications," IEEE Transactions on Computers, vol. 38, no. 5, pp. 626-636, May 1989, doi:10.1109/12.24266