The MAFT Architecture for Distributed Fault Tolerance
April 1988 (vol. 37 no. 4)
pp. 398-405
A description is given of the multicomputer architecture for fault tolerance (MAFT), a distributed system designed to provide extremely reliable computation in real-time control systems. MAFT is based on the physical and functional partitioning of executive functions from applications functions. The implementation of the executive functions in a special-purpose hardware processor allows the fau

Index Terms:
MAFT architecture; distributed fault tolerance; multicomputer architecture for fault tolerance; real-time control systems; functional partitioning; special-purpose hardware processor; application programs; Byzantine agreement; approximate agreement algorithms; computer architecture; distributed processing; fault tolerant computing.
Citation:
R.M. Kieckhafer, C.J. Walter, A.M. Finn, P.M. Thambidurai, "The MAFT Architecture for Distributed Fault Tolerance," IEEE Transactions on Computers, vol. 37, no. 4, pp. 398-405, April 1988, doi:10.1109/12.2183