Multibit Correcting Data Interface for Fault-Tolerant Systems
April 1993 (vol. 42 no. 4)
pp. 433-446

A fault-detecting, bidirectional data interface between uncoded data from one component, such as a processor, and coded data in the rest of the system is described. The interface can correct a single multibit symbol error or detect the occurrence of two such errors. The device uses a shortened Reed-Solomon code, and two practical symbol sizes are considered: nibble (4-bit) errors are handled by a shortened code whose binary equivalent is (40, 32), and byte errors by one whose binary equivalent is (80, 64). The Reed-Solomon codes retain maximum protection levels even when shortened, and they simplify the design options. A dual orthogonal basis used for the symbols' representations provides significant hardware savings. The interface unit achieves internal fault detection by comparing regenerated parity values in a totally self-checking equality checker. A fault-tolerant ultrareliable memory module is proposed and evaluated. An illustrative design is realized using a single desktop programmable gate array.
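
As a rough illustration of the nibble case, the sketch below builds a shortened Reed-Solomon code with ten 4-bit symbols (eight data, two check), which is (40, 32) in binary terms, and corrects any single-symbol error from two syndromes. Several choices here are assumptions, not details taken from the paper: the field polynomial x^4 + x + 1, the parity-check rows (1, 1, ..., 1) and (1, α, ..., α^9), the placement of the check symbols in positions 0 and 1, and the use of an ordinary polynomial basis instead of the dual basis the abstract mentions. All function names are hypothetical.

```python
# Sketch: shortened Reed-Solomon code over GF(16), 10 nibble symbols.
# Assumed field polynomial: x^4 + x + 1 (not stated in the abstract).
PRIM_POLY = 0b10011
N_SYMBOLS = 10   # 8 data nibbles + 2 check nibbles -> (40, 32) in bits
N_DATA = 8

# Antilog/log tables for GF(16); EXP is doubled to skip a modulo in gf_mul.
EXP = [0] * 30
LOG = [0] * 16
_x = 1
for _i in range(15):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x10:
        _x ^= PRIM_POLY
for _i in range(15, 30):
    EXP[_i] = EXP[_i - 15]


def gf_mul(a, b):
    """Multiply two GF(16) elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_div(a, b):
    """Divide a by b in GF(16); b must be nonzero."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(16)")
    if a == 0:
        return 0
    return EXP[(LOG[a] - LOG[b]) % 15]


def syndromes(word):
    """Return (S0, S1) with S0 = sum c_i and S1 = sum c_i * alpha^i."""
    s0 = 0
    s1 = 0
    for i, c in enumerate(word):
        s0 ^= c
        s1 ^= gf_mul(c, EXP[i])
    return s0, s1


def encode(data):
    """Prepend two check nibbles so that both syndromes are zero."""
    if len(data) != N_DATA or any(not 0 <= d < 16 for d in data):
        raise ValueError("expected 8 nibbles in the range 0..15")
    a, b = syndromes([0, 0] + list(data))
    p1 = gf_div(a ^ b, 1 ^ EXP[1])   # solve p1 * (1 + alpha) = a + b
    p0 = a ^ p1                      # then p0 + p1 + a = 0
    return [p0, p1] + list(data)


def decode(word):
    """Correct one symbol error; return (data, status) or (None, 'detected')."""
    s0, s1 = syndromes(word)
    if s0 == 0 and s1 == 0:
        return list(word[2:]), "ok"
    if s0 == 0 or s1 == 0:
        return None, "detected"      # not consistent with a single error
    pos = (LOG[s1] - LOG[s0]) % 15   # error location: alpha^pos = S1 / S0
    if pos >= N_SYMBOLS:
        return None, "detected"      # points into the shortened-away part
    fixed = list(word)
    fixed[pos] ^= s0                 # error value equals S0
    return fixed[2:], "corrected"


if __name__ == "__main__":
    codeword = encode([1, 2, 3, 4, 5, 6, 7, 8])
    codeword[5] ^= 0b1011            # corrupt all of one nibble's bits but one
    print(decode(codeword))
```

With only two check symbols the code has minimum distance 3, so it can be used either to correct one symbol error, as `decode` does, or to detect up to two by treating any nonzero syndrome as an error; in correcting mode some double-symbol errors would be miscorrected. The byte case would follow the same pattern over GF(256) with ten 8-bit symbols.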

Index Terms:
multibit correcting data interface; nibble errors; fault-tolerant systems; uncoded data; processor; shortened Reed-Solomon code; byte errors; binary-sized code; dual orthogonal basis; totally self-checking equality checker; ultrareliable memory module; single desktop programmable gate array; error correction codes; fault tolerant computing; logic arrays; Reed-Solomon codes.
G.R. Redinbo, L.M. Napolitano, Jr., D.D. Andaleon, "Multibit Correcting Data Interface for Fault-Tolerant Systems," IEEE Transactions on Computers, vol. 42, no. 4, pp. 433-446, April 1993, doi:10.1109/12.214690