
D. Gu, D. J. Rosenkrantz, and S. S. Ravi, "Construction of Check Sets for Algorithm-Based Fault Tolerance," IEEE Transactions on Computers, vol. 43, no. 6, pp. 641-650, June 1994. IEEE Computer Society, Los Alamitos, CA, USA. ISSN 0018-9340. DOI: http://doi.ieeecomputersociety.org/10.1109/12.286298

Keywords: multiprocessing systems; computational complexity; error detection; fault tolerant computing; check sets; algorithm-based fault tolerance; multiprocessor systems; ABFT; check set; minimum cardinality; bounded check size assumption; bounded check size model; fault detection; design problem; NP-hard.

Algorithm-based fault tolerance (ABFT) is a popular approach to achieve fault and error detection in multiprocessor systems. The design problem for ABFT is concerned with the construction of a check set of minimum cardinality that detects a specified number of errors or faults. Previous work on this problem has assumed an a priori bound on the size of a check. We motivate and carry out an investigation of the problem without the bounded check size assumption. We establish upper and lower bounds on the number of checks needed to detect a given number of errors. The upper bounds are obtained through new schemes which are easy to implement, and the lower bounds are established using new types of arguments. These bounds are sharply different from those previously established under the bounded check size model. We also show that, unlike error detection, the design problem for fault detection is NP-hard even for detecting only one fault.
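
To make the error-detection design problem concrete, here is a minimal brute-force sketch of what it means for a check set to detect up to t errors. The abstract does not spell out the error model, so the sketch assumes one commonly used in the ABFT literature: a check is a set of data elements, and it is guaranteed to fire only when exactly one of its elements is erroneous (several errors inside one check may mask each other). The function name `detects_up_to` and the example check sets are hypothetical illustrations, not the schemes constructed in the paper.

```python
from itertools import combinations


def detects_up_to(checks, n, t):
    """Return True if every nonempty error pattern of size <= t over the
    data elements 0..n-1 is guaranteed to trigger at least one check.

    Assumed model (not taken from the abstract): a check is guaranteed to
    fire only when exactly one of its elements is erroneous.
    """
    for size in range(1, t + 1):
        for errors in combinations(range(n), size):
            bad = set(errors)
            # The pattern is caught if some check contains exactly one bad element.
            if not any(len(bad & check) == 1 for check in checks):
                return False
    return True


n = 8

# Bounded check size (size 1): one check per data element, n checks in total.
singletons = [frozenset({i}) for i in range(n)]

# Unbounded check size: one check over all elements, plus one check per
# bit position of the element index (3 bit positions cover indices 0..7).
large_checks = [frozenset(range(n))] + [
    frozenset(i for i in range(n) if (i >> bit) & 1) for bit in range(3)
]

print(len(singletons), detects_up_to(singletons, n, 2))      # 8 True
print(len(large_checks), detects_up_to(large_checks, n, 2))  # 4 True
print(detects_up_to(large_checks, n, 3))                     # False
```

Under this assumed model, four large checks suffice for two errors on eight data elements where size-1 checks need eight, which illustrates why dropping the check size bound can change the number of checks required. The exhaustive search is exponential in t and is meant only for small examples.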