
V.S.S. Nair, J.A. Abraham, "Real-Number Codes for Fault-Tolerant Matrix Operations on Processor Arrays," IEEE Transactions on Computers, vol. 39, no. 4, pp. 426-435, April 1990.
DOI: http://doi.ieeecomputersociety.org/10.1109/12.54836
Index Terms: real-number codes; encoding; fault-tolerant matrix operations; processor arrays; linearity; necessary and sufficient condition; multiplication; transposition; LU decomposition; error detecting; performance overhead; simulation experiments; error detection codes; fault-tolerant computing.
A generalization of existing real-number codes is proposed. It is proven that linearity is a necessary and sufficient condition for codes used for fault-tolerant matrix operations such as matrix addition, multiplication, transposition, and LU decomposition. It is also proven that for every linear code defined over a finite field, there exists a corresponding linear real-number code with similar error-detecting capabilities. Encoding schemes are given for some example codes that fall under the general set of real-number codes. With the help of experiments, a rule is derived for selecting a particular code for a given application. The performance overhead of fault-tolerance schemes using the generalized encoding schemes is shown to be very low, and this is substantiated through simulation experiments.
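To make the role of linearity concrete, the following is a minimal sketch of the simplest linear real-number code, a plain sum checksum, applied to matrix multiplication. It is not taken from the paper's generalized encoding schemes, which this abstract does not spell out. The function names, the 2x2 test matrices, and the tolerance `tol` are illustrative assumptions. Because the checksum is a linear map, the product of a column-checksum matrix and a row-checksum matrix is itself a full-checksum matrix, so a single corrupted element shows up as one inconsistent row and one inconsistent column.

```python
def matmul(a, b):
    """Plain matrix product on lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add_column_checksum(a):
    """Append one row holding the sum of each column."""
    return a + [[sum(col) for col in zip(*a)]]

def add_row_checksum(b):
    """Append one column holding the sum of each row."""
    return [row + [sum(row)] for row in b]

def find_inconsistencies(c_full, tol=1e-9):
    """Return (rows, columns) of the data part whose checksum does not match.

    tol is an assumed round-off allowance; a real application would have to
    choose it to suit the data and the arithmetic in use.
    """
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n)
                if abs(sum(c_full[i][:m]) - c_full[i][m]) > tol]
    bad_cols = [j for j in range(m)
                if abs(sum(c_full[i][j] for i in range(n)) - c_full[n][j]) > tol]
    return bad_rows, bad_cols

# Hypothetical example data.
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

# Encode the operands, multiply, and check the encoded result.
C_full = matmul(add_column_checksum(A), add_row_checksum(B))
clean = find_inconsistencies(C_full)

# Simulate a faulty processor corrupting one element of the result.
C_full[0][1] += 0.5
faulty = find_inconsistencies(C_full)

print(clean)   # ([], [])
print(faulty)  # ([0], [1])
```

The intersection of the flagged row and column locates the corrupted element. With real-number data the comparison cannot be exact, which is why the check uses a tolerance rather than equality.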