A Unified Architecture for the Accurate and High-Throughput Implementation of Six Key Elementary Functions
April 2010 (vol. 59 no. 4)
pp. 449-456
Amirhossein Alimohammad, Ukalta Engineering, Edmonton
Saeed Fouladi Fard, Ukalta Engineering, Edmonton
Bruce F. Cockburn, University of Alberta, Edmonton
This paper presents a unified architecture for the compact implementation of several key elementary functions, including reciprocal, square root, and logarithm, in single-precision floating-point arithmetic. The proposed high-throughput design is based on uniform domain segmentation and curve fitting techniques. Numerically accurate least-squares regression is utilized to calculate the polynomial coefficients. The architecture is optimized by analyzing the trade-off between the size of the required memory and the precision of intermediate variables to achieve the minimum 23-bit accuracy required for single-precision floating-point representation. The efficiency of the proposed unified data path is demonstrated on a common field-programmable gate array.
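
To make the approach in the abstract concrete, the following is a minimal software sketch of uniform domain segmentation with per-segment least-squares polynomial fitting. It is not the authors' hardware design. The choices here are illustrative assumptions that the abstract does not state: the reduction of the input to a significand in [1, 2), the 128 segments, the degree-2 polynomials, the 64 fitting samples per segment, and the `frac_bits` coefficient word lengths. All function names are hypothetical.

```python
import math

def fit_segment(f, lo, hi, degree=2, samples=64):
    """Least-squares polynomial fit of f on [lo, hi].

    The fit is done in the normalized variable u = (x - lo) / (hi - lo),
    which keeps the normal equations reasonably well conditioned.
    Returns coefficients [c0, c1, ...] so that f(x) ~ c0 + c1*u + c2*u^2.
    """
    width = hi - lo
    us = [i / (samples - 1) for i in range(samples)]
    ys = [f(lo + width * u) for u in us]
    n = degree + 1
    # Normal equations A c = b, with A[j][k] = sum(u^(j+k)), b[j] = sum(y*u^j).
    a = [[sum(u ** (j + k) for u in us) for k in range(n)] for j in range(n)]
    b = [sum(y * u ** j for u, y in zip(us, ys)) for j in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            m = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= m * a[col][k]
            b[r] -= m * b[col]
    # Back substitution.
    c = [0.0] * n
    for j in reversed(range(n)):
        tail = sum(a[j][k] * c[k] for k in range(j + 1, n))
        c[j] = (b[j] - tail) / a[j][j]
    return c

def build_table(f, segments=128, degree=2, frac_bits=None):
    """Coefficient table for f on [1, 2), split into equal-width segments.

    If frac_bits is given, each coefficient is rounded to that many
    fractional bits to mimic a fixed-point coefficient memory (assumption).
    """
    table = []
    for i in range(segments):
        lo = 1.0 + i / segments
        hi = 1.0 + (i + 1) / segments
        coeffs = fit_segment(f, lo, hi, degree)
        if frac_bits is not None:
            scale = 2 ** frac_bits
            coeffs = [round(c * scale) / scale for c in coeffs]
        table.append(coeffs)
    return table

def evaluate(table, x):
    """Approximate f(x) for x in [1, 2): table lookup plus Horner's rule."""
    segments = len(table)
    pos = (x - 1.0) * segments
    index = min(int(pos), segments - 1)   # leading bits select the segment
    u = pos - index                       # remaining bits: offset in [0, 1)
    result = 0.0
    for c in reversed(table[index]):
        result = result * u + c
    return result

def max_abs_error(f, table, points=10007):
    """Largest absolute error over a grid that differs from the fit samples."""
    xs = (1.0 + k / points for k in range(points))
    return max(abs(evaluate(table, x) - f(x)) for x in xs)

if __name__ == "__main__":
    recip = lambda x: 1.0 / x
    for bits in (None, 28, 16):
        err = max_abs_error(recip, build_table(recip, frac_bits=bits))
        print("frac_bits =", bits, "max error below 2^-23:", err < 2 ** -23)
```

Varying `segments` and `frac_bits` in this sketch gives a rough feel for the trade-off the abstract describes: more segments mean a larger coefficient memory but a smaller fitting error, while narrower coefficients shrink the memory at the cost of added rounding error.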

Index Terms:
Floating-point arithmetic, single-precision arithmetic, reciprocal, square root, logarithm, computer arithmetic.
Citation:
Amirhossein Alimohammad, Saeed Fouladi Fard, Bruce F. Cockburn, "A Unified Architecture for the Accurate and High-Throughput Implementation of Six Key Elementary Functions," IEEE Transactions on Computers, vol. 59, no. 4, pp. 449-456, April 2010, doi:10.1109/TC.2009.169