Algorithm-Based Fault Tolerant Synthesis for Linear Operations
April 1996 (vol. 45 no. 4)
pp. 425-438

Abstract: High-level synthesis is becoming more important in practical design environments as a way to meet new system requirements. Increasingly, fault tolerance is one of those requirements, especially because of high-speed and low-power demands. This paper explores several basic aspects of low-cost fault tolerant synthesis for practical linear systems. It deals with practical design constraints under which basic operations must be performed by a limited set of processing resources, rather than each operation being assigned its own processing resource. Two core issues for fault tolerant synthesis, partitioning and allocation, are examined. The results demonstrate a high-level abstraction and framework for fault tolerant synthesis that is almost totally independent of the physical hardware implementation. Issues in designing a 1-fault detectable FFT system are considered in detail to illustrate the significance and effects of fault tolerant synthesis schemes. Our ultimate goal is to incorporate these techniques in future automated design tools so that fault tolerance features can be part of the design options.
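
As a rough illustration of what detecting a single fault in a linear operation can look like, the sketch below applies a generic checksum test to a matrix-vector product y = Ax. It is not taken from the paper and does not reproduce its partitioning or allocation schemes. It relies only on the identity sum(y) = (1^T A) x, which holds for any linear operation. The function names, the tolerance `tol`, and the injected error are all assumptions made for the example.

```python
def matvec(a, x):
    """Compute y = A x for a matrix given as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def column_checksum(a):
    """Return the row vector c = 1^T A, i.e. the sum of each column of A."""
    return [sum(col) for col in zip(*a)]


def detects_fault(a, x, y, tol=1e-9):
    """Return True if sum(y) disagrees with the checksum (1^T A) x.

    tol is an assumed tolerance for floating-point round-off.
    """
    expected = sum(c_j * x_j for c_j, x_j in zip(column_checksum(a), x))
    return abs(sum(y) - expected) > tol


# Illustrative data only.
a = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]

y = matvec(a, x)        # fault-free result
y_bad = list(y)
y_bad[1] += 0.5         # hypothetical single faulty output value

print(detects_fault(a, x, y))      # consistent with the checksum
print(detects_fault(a, x, y_bad))  # inconsistent with the checksum
```

A single erroneous output changes sum(y) and is therefore flagged, provided the error exceeds the round-off tolerance. Multiple errors that cancel in the sum would go unnoticed, which is why a check of this kind is described as 1-fault detectable.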

Index Terms:
1-fault detectable (1-FD) system, algorithm-based fault tolerant (ABFT) synthesis, data flow graph (DFG), fast Fourier transform (FFT), gain matrix and error space.
Citation:
Jan-Lung Sung, G. Robert Redinbo, "Algorithm-Based Fault Tolerant Synthesis for Linear Operations," IEEE Transactions on Computers, vol. 45, no. 4, pp. 425-438, April 1996, doi:10.1109/12.494100