Issue No.06 - June (1998 vol.47)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/12.689644
<p><b>Abstract</b>: An extension to Algorithm-Based Fault Tolerance (ABFT) methodologies shows how parity values dictated by a real convolutional code can be employed by Kalman estimation techniques to perform real number correction for protecting linear processing systems. Intermittent failures appearing in the output samples are detected and corrected using only the syndromes normally generated in ABFT schemes. The algebraic structure of a real convolutional code provides the separation needed by recursive Kalman state estimators to effect mean-square error correction. State and parity measurement equations model faults and computational noise in both the linear processing and parity generation subassemblies, and, in a departure from previous models, the noise sources are considered time-varying. The Kalman one-step estimator, which makes decisions on all parity values up to the present point, is determined. It separates naturally into detection and correction operations, permitting corrective action only when the detection levels exceed thresholds based on roundoff noise energy. The detector/corrector uses efficient multirate block processing techniques as determined by the real convolutional code.</p><p>A smoothed fixed-lag Kalman estimator, which uses parity values for a fixed amount beyond the point of interest, is needed to complete the correction. It employs one-step estimator quantities, and implementation simplifications are possible. Examples showing the correction behavior and mean-square error performance are presented, and the size of overhead calculations for detection and correction is estimated. A protected processing system is constructed by introducing additional subassemblies, mostly comparators, with the detection and correction parts already described. Under the usual assumption of at most a single subassembly failure, no improperly detected or corrected data leave the overall protected configuration.</p>
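The syndrome-based detection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's construction: the filter `h`, the parity weights `g`, the input data, and the threshold are all hypothetical values chosen for illustration, and the Kalman correction stage is not reproduced. The sketch assumes the linear processing is a convolution `y = h * x`, so that a parity stream computed from the outputs (`g * y`) can be compared against the same parity predicted directly from the inputs (`(g * h) * x`). Their difference is the syndrome, and it is flagged only when it exceeds a threshold standing in for the roundoff noise level.

```python
def convolve(a, b):
    """Full linear convolution of two real-valued sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def syndromes(x, y, h, g):
    """Parity computed from observed outputs minus parity predicted from inputs.

    For fault-free outputs y = h * x the two parity streams agree up to
    roundoff, so every syndrome value is close to zero.
    """
    parity_from_output = convolve(g, y)
    parity_from_input = convolve(convolve(g, h), x)
    return [po - pi for po, pi in zip(parity_from_output, parity_from_input)]


def detect(s, threshold):
    """Indices where the syndrome magnitude exceeds the threshold."""
    return [n for n, v in enumerate(s) if abs(v) > threshold]


# Hypothetical values, chosen only for illustration.
h = [0.5, 0.25, 0.125]           # assumed impulse response of the protected filter
g = [1.0, -1.0]                  # assumed real parity weights
x = [1.0, 2.0, -1.0, 0.5, 3.0]   # input samples
threshold = 1e-9                 # stand-in for a roundoff-noise-based threshold

y = convolve(h, x)               # fault-free output samples
y_faulty = list(y)
y_faulty[3] += 0.75              # inject an intermittent error into one output sample

print(detect(syndromes(x, y, h, g), threshold))         # fault-free case
print(detect(syndromes(x, y_faulty, h, g), threshold))  # faulty case
```

With these values the fault-free run flags nothing, while the injected error at output index 3 raises the syndrome at indices 3 and 4, because the two-tap parity weights spread a single output error over two consecutive parity values. A corrector such as the one the abstract describes would then estimate the error from these syndrome values; that stage is left out here.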
Algorithm-based fault tolerance, fault-tolerant linear processing, Kalman recursive filtering, mean-square error estimation, real convolutional codes, real number error correction, time-varying fault models, totally self-checking comparators.
G. Robert Redinbo, "Generalized Algorithm-Based Fault Tolerance: Error Correction via Kalman Estimation", IEEE Transactions on Computers, vol. 47, no. 6, pp. 639-655, June 1998, doi:10.1109/12.689644