Optimal Reconfiguration Algorithms for Real-Time Fault-Tolerant Processor Arrays
May 1995 (vol. 6, no. 5), pp. 498-511

Abstract—In this paper we consider the problem of reconfiguring processor arrays subject to computational loads that alternate between two modes. A strict mode is characterized by a heavy computational load and severe constraints on response time while a relaxed mode is characterized by a relatively light computational load and relaxed constraints on response time. In the strict mode, reconfiguration is performed by a distributed local algorithm in order to achieve fast recovery from faults. In the relaxed mode, a global reconfiguration algorithm is used to restore the system to a state that maximizes the probability that future faults occurring in subsequent strict modes will be repairable.

Several new results are given for this problem. Efficient reconfiguration algorithms are described for a number of general classes of architectures. These general algorithms obviate the need for architecture-specific algorithms for architectures in these classes. We show that it is unlikely that similar algorithms can be obtained for related classes of architectures since the reconfiguration problem for these classes is NP-complete. Finally, a general approximation algorithm is described that can be used for any architecture. Experimental results are given, suggesting that our algorithms are very effective.

Ran Libeskind-Hadas, Nimish Shrivastava, Rami G. Melhem, C. L. Liu, "Optimal Reconfiguration Algorithms for Real-Time Fault-Tolerant Processor Arrays," IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 5, pp. 498-511, May 1995, doi:10.1109/71.382318