Extending Backward Error Assertions to Tolerance of Large Errors in Floating Point Computations
April 1997 (vol. 46 no. 4)
pp. 505-510

Abstract: The use of backward error assertions combined with iterative refinement has been suggested for correcting small fault-induced errors in the floating point solution of linear systems. We extend this approach to the correction of large errors, typically caused by the failure of a single processor (or column of processors) in an array.
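As a rough illustration of the small-error scheme the abstract builds on, the sketch below checks a computed solution of Ax = b with a normwise backward error assertion and, if the assertion fails, applies iterative refinement. It is not taken from the paper: the infinity norm, the tolerance `tol`, the helper names, and the use of a fresh fault-free solve in each refinement step are all assumptions made for the example, and the large-error extension the paper proposes is not shown.

```python
# Minimal sketch (assumed details): backward error assertion plus
# iterative refinement for a small dense system, in pure Python.

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def inf_norm_vec(v):
    return max(abs(t) for t in v)

def inf_norm_mat(A):
    return max(sum(abs(a) for a in row) for row in A)

def residual(A, x, b):
    return [bi - yi for bi, yi in zip(b, matvec(A, x))]

def solve(A, b):
    """Gaussian elimination with partial pivoting (works on copies)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def backward_error(A, x, b):
    """Normwise relative backward error ||b - Ax|| / (||A|| ||x|| + ||b||)."""
    r = residual(A, x, b)
    return inf_norm_vec(r) / (inf_norm_mat(A) * inf_norm_vec(x) + inf_norm_vec(b))

def refine(A, b, x, tol=1e-12, max_steps=5):
    """Refine x until the backward error assertion holds or steps run out."""
    for _ in range(max_steps):
        if backward_error(A, x, b) <= tol:
            break
        d = solve(A, residual(A, x, b))   # correction from the residual
        x = [xi + di for xi, di in zip(x, d)]
    return x, backward_error(A, x, b) <= tol

# Hypothetical example: the true solution is [1, 2, 3]; a small
# fault-induced error is injected into one component.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [6.0, 10.0, 8.0]
x_bad = solve(A, b)
x_bad[1] += 1e-6                          # simulated small fault
assertion_failed = backward_error(A, x_bad, b) > 1e-12
x_fixed, ok = refine(A, b, x_bad)
```

Here the assertion fails on the perturbed solution and holds again after refinement. In a real fault-tolerant setting the correction step would typically reuse the existing (possibly faulty) factorization rather than a fresh solve, which is why this sketch only covers the small-error case.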

Index Terms:
Fault tolerance, algorithm-based fault tolerance, backward error assertions, floating point computation.
Patrick Fitzpatrick, "Extending Backward Error Assertions to Tolerance of Large Errors in Floating Point Computations," IEEE Transactions on Computers, vol. 46, no. 4, pp. 505-510, April 1997, doi:10.1109/12.588072