Efficient Implementation Techniques for Gracefully Degradable Multiprocessor Systems
April 1995 (vol. 44 no. 4)
pp. 503-517

Abstract: We propose a dynamic reconfiguration network (DRN) and a monitoring-at-transmission (MAT) bus to support dynamic reconfiguration of an N-modular redundancy (NMR) multiprocessor system. The reconfiguration process guarantees that the maximal number of processor triads is formed on each processor cluster, which supports gracefully degradable operation. This is made possible by dynamically routing the control and clock signals of processors over the DRN so that fault-free processors are synchronized. The MAT bus is an efficient way to implement a triple modular redundant (TMR) pipelined voter (PV), which is a special case of the proposed voting network. Extensive experimental results support the design concept, and the performance of different cache memory organizations is evaluated through an analytic model.
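
As a rough illustration of two ideas named in the abstract, the sketch below shows a bitwise 2-of-3 majority vote over three replica outputs and a grouping of fault-free processors into as many triads as possible. It is a software analogy under stated assumptions, not the DRN or MAT bus hardware described in the paper; the function names tmr_vote, disagreeing_modules, and form_triads are hypothetical, and the grouping simply takes processors in list order.

```python
from typing import Iterable, List, Tuple


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three equally wide words."""
    return (a & b) | (b & c) | (a & c)


def disagreeing_modules(a: int, b: int, c: int) -> List[int]:
    """Indices (0, 1, 2) of the replicas whose word differs from the voted word."""
    voted = tmr_vote(a, b, c)
    return [i for i, word in enumerate((a, b, c)) if word != voted]


def form_triads(fault_free_ids: Iterable[int]) -> List[Tuple[int, ...]]:
    """Group fault-free processors into floor(n / 3) triads (assumed: list order)."""
    ids = list(fault_free_ids)
    usable = len(ids) - len(ids) % 3  # leftover processors stay unassigned
    return [tuple(ids[i:i + 3]) for i in range(0, usable, 3)]


if __name__ == "__main__":
    # The third replica has two flipped bits; the vote masks them.
    print(bin(tmr_vote(0b1010, 0b1010, 0b0110)))
    print(disagreeing_modules(0b1010, 0b1010, 0b0110))
    # Seven fault-free processors yield two triads and one spare.
    print(form_triads(range(7)))
```

With seven fault-free processors in a cluster, this grouping yields two triads and leaves one processor spare; after a further fault, regrouping the remaining six processors still yields two triads, which is the sense of graceful degradation used here.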

Index Terms:
Clock synchronization, cluster-based multiprocessors, N-modular redundancy (NMR), pipelined voter (PV), reliability, voting.
Kang G. Shin, Jyh-Charn Liu, "Efficient Implementation Techniques for Gracefully Degradable Multiprocessor Systems," IEEE Transactions on Computers, vol. 44, no. 4, pp. 503-517, April 1995, doi:10.1109/12.376166