Hardware Starting Approximation Method and Its Application to the Square Root Operation
December 1996 (vol. 45 no. 12)
pp. 1356-1369

Abstract: Quadratically converging algorithms for high-order arithmetic operations are typically accelerated by a starting approximation. The higher the precision of the starting approximation, the fewer iterations are required for convergence. Traditional methods have used look-up tables, polynomial approximations, or a combination of the two called piecewise linear approximations. This paper provides a revision and major extension of our study [1], which proposed a nontraditional method that reuses the hardware of a multiplier. An approximation is described in the form of a partial product array (PPA) composed of Boolean elements. The Boolean elements are chosen such that their sum is a high-precision approximation to a high-order arithmetic operation such as square root, reciprocal, division, logarithm, exponential, or a trigonometric function. This paper derives a PPA that produces, in the worst case, a 16-bit approximation to the square root operation. The implementation of the PPA reuses an existing 53-bit multiplier design, requires approximately 1,000 dedicated logic gates of function plus additional repowering circuits, and has a latency of one multiplication.
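
The relationship between seed precision and iteration count can be illustrated with a minimal software sketch. It is not the paper's hardware method: the seed here is produced by a hypothetical helper, `seed_with_bits`, which deliberately perturbs the library square root to mimic a starting approximation accurate to a chosen number of bits, and `newton_sqrt` applies the textbook Newton iteration in double precision. The function names, the tolerance, and the test value are all assumptions made for illustration.

```python
import math


def seed_with_bits(x, bits):
    """Hypothetical stand-in for a hardware seed: sqrt(x) with a relative
    error of 2**-bits injected on purpose."""
    return math.sqrt(x) * (1.0 + 2.0 ** -bits)


def newton_sqrt(x, seed, tol=2.0 ** -49, max_iter=20):
    """Refine `seed` toward sqrt(x) with Newton's iteration y <- (y + x/y) / 2.

    Stops when the relative residual |y*y - x| / x drops to `tol` or below.
    Returns (approximation, number of iterations performed).
    """
    y = seed
    iterations = 0
    while abs(y * y - x) > tol * x and iterations < max_iter:
        y = 0.5 * (y + x / y)  # the relative error roughly squares on each pass
        iterations += 1
    return y, iterations


if __name__ == "__main__":
    x = 2.0
    for bits in (4, 8, 16):
        root, n = newton_sqrt(x, seed_with_bits(x, bits))
        print(f"{bits:2d}-bit seed: {n} iterations, result {root!r}")
```

Because the error roughly squares on every pass, each iteration about doubles the number of correct bits, so a 16-bit seed should reach double precision in fewer refinement steps than a 4-bit or 8-bit seed.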

[1] E.M. Schwarz and M.J. Flynn, "Hardware Starting Approximation for the Square Root Operation," Proc. IEEE 11th Symp. Computer Arithmetic, pp. 103-11, 1993.
[2] C.T. Fike, Computer Evaluation of Mathematical Functions. Englewood Cliffs, N.J.: Prentice Hall, 1968.
[3] K. Hwang, Computer Arithmetic, Principles, Architecture, and Design. New York: John Wiley & Sons, 1979.
[4] S. Waser and M.J. Flynn, Introduction to Arithmetic for Digital System Designers. New York: CBS College Publishing, 1982.
[5] C.V. Ramamoorthy, J.R. Goodman, and K.H. Kim, "Some Properties of Iterative Square-Rooting Methods Using High-Speed Multiplication," IEEE Trans. Computers, vol. 21, pp. 837-847, Aug. 1972.
[6] "IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Std 754-1985," The Inst. of Electrical and Electronic Engineers, Inc., New York, Aug. 1985.
[7] A.D. Booth, "A Signed Multiplication Technique," Quarterly J. Mechanical and Applied Math., vol. 4, pp. 236-240, 1951.
[8] O.L. MacSorley, "High-Speed Arithmetic in Binary Computers," Proc. IRE, vol. 99, pp. 67-91, Jan. 1961.
[9] S. Vassiliadis, E.M. Schwarz, and B.M. Sung, "Hard-Wired Multipliers with Encoded Partial Products," IEEE Trans. Computers, vol. 40, no. 11, pp. 1,181-1,197, Nov. 1991.
[10] R. Stefanelli, "A Suggestion for a High-Speed Parallel Binary Divider," IEEE Trans. Computers, vol. 21, no. 1, pp. 42-55, Jan. 1972.
[11] D.M. Mandelbaum, "A Systematic Method for Division with High Average Bit Skipping," IEEE Trans. Computers, vol. 39, pp. 127-130, Jan. 1990.
[12] D.M. Mandelbaum, "Some Results on a SRT Type Division Scheme," IEEE Trans. Computers, vol. 42, pp. 102-106, Jan. 1993.
[13] D.M. Mandelbaum, "A Method for Calculation of the Square Root Using Combinatorial Logic," J. VLSI Signal Processing, vol. 6, pp. 233-242, Dec. 1993.
[14] D.M. Mandelbaum and S.G. Mandelbaum, "Fast, Efficient Parallel-Acting Method of Generating Functions Defined by Power Series, Including Logarithm, Exponential, and Sine, Cosine," IEEE Trans. Parallel and Distributed Systems, vol. 7, no. 1, pp. 33-45, Jan. 1996.
[15] E.M. Schwarz and M.J. Flynn, "Cost Efficient High Radix Division," J. VLSI Signal Processing, pp. 293-305, Aug. 1991.
[16] E.M. Schwarz and M.J. Flynn, "Parallel High Radix Nonrestoring Division," IEEE Trans. Computers, vol. 42, no. 10, pp. 1,234-1,246, Oct. 1993.
[17] E.M. Schwarz and M.J. Flynn, "Approximating the Sine Function with Combinational Logic," Proc. 26th Asilomar Conf. Signals, Systems, and Computers, vol. 1, pp. 386-390, Oct. 1992.
[18] E.M. Schwarz and M.J. Flynn, "Direct Combinatorial Methods for Approximating Trigonometric Functions," Technical Report CSL-TR-92-525, Stanford Univ., May 1992.
[19] E.M. Schwarz, "High-Radix Algorithms for High-Order Arithmetic Expressions," doctoral dissertation, Stanford Univ., Jan. 1993.
[20] S. Wolfram, Mathematica: A System for Doing Mathematics by Computer. New York: Addison-Wesley, 1988.
[21] W.G. Schneeweiss, Boolean Functions with Engineering Applications and Computer Programs, chapter 7. New York: Springer-Verlag, 1989.
[22] C.T. Fike, "Starting Approximations for Square Root Calculation on IBM System/360," Comm. ACM, vol. 9, pp. 297-299, Apr. 1966.
[23] J.F. Hart et al., Computer Approximations. New York: John Wiley & Sons, 1968.
[24] N. Anderson, "Minimum Relative Error Approximations for 1/t," Numerische Mathematik, vol. 54, pp. 117-124, 1988.
[25] Numerical Methods, G. Dahlquist, A. Bjorck, and N. Anderson, eds. Englewood Cliffs, N.J.: Prentice Hall, 1974.
[26] M. Ito, N. Takagi, and S. Yajima, "Efficient Initial Approximation and Fast Converging Methods for Division and Square Root," Proc. 12th Symp. Computer Arithmetic (ARITH12), pp. 2-9, 1995.
[27] R.C. Agarwal, F.G. Gustavson, J. McComb, and S. Schmidt, "Engineering and Scientific Subroutine Library Release 3 for IBM ES/3090 Vector Multiprocessors," IBM Systems J., vol. 28, pp. 345-350, 1989.
[28] S. Gal and B. Bachelis, "An Accurate Elementary Mathematical Library for the IEEE Floating Point Standard," ACM Trans. Math. Software, vol. 17, pp. 26-45, Mar. 1991.
[29] P.M. Farmwald, "High Bandwidth Evaluation of Elementary Functions," Proc. Fifth Symp. Computer Arithmetic, pp. 139-142, 1981.
[30] T. Nakayama, "Arithmetic Operation Apparatus for Elementary Function," U.S. Patent No. 5,235,535, Aug. 10, 1993.
[31] W.F. Wong and E. Goto, "Fast Evaluation of the Elementary Functions in Single Precision," IEEE Trans. Computers, vol. 44, no. 9, pp. 453-457, Sept. 1990.
[32] S. Vassiliadis, J. Delgado-Frias, and M. Zhang, "High Performance with Low Implementation Cost Sigmoid Generators," Proc. Int'l Joint Conf. Neural Networks, pp. 1,931-1,934, 1993.
[33] M. Zhang, "Hardwired Elementary Functions for Neural Network Emulators," PhD thesis, Dept. of Advanced Technology and Computer Eng., State Univ. of New York at Binghamton, 1993.
[34] R.K. Montoye, E. Hokenek, and S.L. Runyon, "Design of the IBM RISC System/6000 Floating-Point Execution Unit," IBM J. Research and Development, vol. 34, pp. 59-70, Jan. 1990.
[35] H. Hassler and N. Takagi, "Function Evaluation by Table Look-Up and Addition," Proc. 12th Symp. Computer Arithmetic, pp. 10-16, July 1995.

Index Terms:
Computer arithmetic, approximation theory, square root, multiplication, counter tree, division.
Eric M. Schwarz, Michael J. Flynn, "Hardware Starting Approximation Method and Its Application to the Square Root Operation," IEEE Transactions on Computers, vol. 45, no. 12, pp. 1356-1369, Dec. 1996, doi:10.1109/12.545966