Reliable Floating-Point Arithmetic Algorithms for Error-Coded Operands
April 1994 (vol. 43 no. 4)
pp. 400-412

Reliable floating-point arithmetic is vital for dependable computing systems. It is also important for future high-density VLSI realizations, which are vulnerable to soft errors. However, the direct checking of floating-point arithmetic is still an open problem. The author presents a set of reliable floating-point arithmetic algorithms for low-cost residue encoded operands and for Berger encoded operands. Closed-form equations are derived for floating-point addition, subtraction, multiplication, and division. For standard IEEE floating-point numbers, the proposed reliable floating-point multiplication algorithms for low-cost residue encoded operands are extremely inexpensive: they require less than 8% hardware redundancy in all cases. For reliable floating-point addition and subtraction, the author finds that the hardware redundancy ratio with the low-cost residue code is about the same as with the Berger code: less than 40% hardware redundancy for single precision numbers and about 16% for double precision numbers. For reliable floating-point division, Berger encoded operands yield a cost-effective design: about 45% for single precision numbers and about 36% for double precision numbers.
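
For readers unfamiliar with the two codes named in the abstract, the following is a minimal Python sketch of the general checking ideas only. It is not the author's floating-point algorithms or closed-form equations, which the abstract does not give. It assumes a low-cost residue code with check base A = 2^a - 1 (here a = 2, so A = 3) applied to integer significands, and the textbook Berger code whose check symbol counts the 0 bits of the information word. All function names are hypothetical.

```python
def residue(x: int, a: int = 2) -> int:
    """Low-cost residue of x with check base A = 2**a - 1 (assumed a = 2)."""
    return x % ((1 << a) - 1)


def checked_multiply(x: int, y: int, a: int = 2):
    """Multiply two integer significands and verify the product's residue.

    The check relies on (x * y) mod A == ((x mod A) * (y mod A)) mod A.
    Returns the product and a flag that is True when the check passes.
    """
    modulus = (1 << a) - 1
    product = x * y
    predicted = (residue(x, a) * residue(y, a)) % modulus
    return product, residue(product, a) == predicted


def berger_check(x: int, width: int) -> int:
    """Berger check symbol: the number of 0 bits among `width` information bits."""
    ones = bin(x & ((1 << width) - 1)).count("1")
    return width - ones


def berger_valid(x: int, check: int, width: int) -> bool:
    """True when the stored check symbol matches the information bits."""
    return berger_check(x, width) == check


if __name__ == "__main__":
    # Two illustrative 24-bit significands (single precision width).
    x, y = 0xC00000, 0xA00000
    product, ok = checked_multiply(x, y)
    print("residue check passes:", ok)

    # Flip one bit of the product: the residue no longer matches the prediction,
    # because no power of two is divisible by 3.
    corrupted = product ^ (1 << 5)
    predicted = (residue(x) * residue(y)) % 3
    print("single-bit error detected:", residue(corrupted) != predicted)

    # Berger code: a 1 -> 0 (unidirectional) error raises the count of zeros.
    word, width = 0b1011, 4
    check = berger_check(word, width)
    print("unidirectional error detected:", not berger_valid(0b1001, check, width))
```

Both codes are separable: the check symbol is carried alongside the unmodified operand, so the check can be computed in parallel with the main arithmetic. The sketch covers only integer significand products and a single data word; handling exponents, alignment shifts, normalization, and rounding is the subject of the paper itself.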

Index Terms:
redundancy; digital arithmetic; error correction codes; floating-point arithmetic; error-coded operands; high-density VLSI; soft-errors; residue encoded; Berger encoded; reliable floating-point multiplication; redundancy ratios; hardware redundancy; Berger check prediction; computer arithmetic; concurrent error detection; standard IEEE floating-point numbers; low-cost residue codes.
Citation:
Jien-Chung Lo, "Reliable Floating-Point Arithmetic Algorithms for Error-Coded Operands," IEEE Transactions on Computers, vol. 43, no. 4, pp. 400-412, April 1994, doi:10.1109/12.278479