High-Speed Function Approximation Using a Minimax Quadratic Interpolator
March 2005 (vol. 54 no. 3)
pp. 304-318
A table-based method for high-speed function approximation in single-precision floating-point format is presented in this paper. Our focus is the approximation of reciprocal, square root, square root reciprocal, exponentials, logarithms, trigonometric functions, powering (with a fixed exponent p), or special functions. The algorithm presented here combines table look-up, an enhanced minimax quadratic approximation, and an efficient evaluation of the second-degree polynomial (using a specialized squaring unit, redundant arithmetic, and multioperand addition). The execution times and area costs of an architecture implementing our method are estimated, showing the achievement of the fast execution times of linear approximation methods and the reduced area requirements of other second-degree interpolation algorithms. Moreover, the use of an enhanced minimax approximation which, through an iterative process, takes into account the effect of rounding the polynomial coefficients to a finite size allows for a further reduction in the size of the look-up tables to be used, making our method very suitable for the implementation of an elementary function generator in state-of-the-art DSPs or graphics processing units (GPUs).
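The table look-up plus second-degree evaluation described above can be illustrated with a minimal software sketch. It is not the paper's architecture. The names `M`, `build_table`, and `approx`, and the choice of 6 index bits, are hypothetical. The coefficients come from interpolation at Chebyshev nodes, which is near-minimax but stands in for the enhanced minimax fit. The sketch does not model the rounding of coefficients to a finite size, the specialized squaring unit, or the redundant arithmetic.

```python
import math

M = 6  # hypothetical number of index bits (2**M table segments)


def build_table(f, m=M):
    """Per-segment quadratic coefficients (c0, c1, c2) for f on [1, 2).

    Assumption: Chebyshev-node interpolation replaces the paper's
    enhanced minimax approximation; coefficients are kept in full
    double precision rather than rounded to a finite size.
    """
    h = 2.0 ** -m  # segment width
    # Three Chebyshev nodes mapped onto the local interval [0, h]
    t0, t1, t2 = (h / 2 * (1 + math.cos((2 * k + 1) * math.pi / 6))
                  for k in range(3))
    table = []
    for i in range(2 ** m):
        x1 = 1.0 + i * h  # left edge of segment i
        y0, y1, y2 = f(x1 + t0), f(x1 + t1), f(x1 + t2)
        # Newton divided differences of the interpolating quadratic
        d01 = (y1 - y0) / (t1 - t0)
        d12 = (y2 - y1) / (t2 - t1)
        d012 = (d12 - d01) / (t2 - t0)
        # Expand y0 + d01*(t - t0) + d012*(t - t0)*(t - t1) in powers of t
        c2 = d012
        c1 = d01 - d012 * (t0 + t1)
        c0 = y0 - d01 * t0 + d012 * t0 * t1
        table.append((c0, c1, c2))
    return table


def approx(table, x, m=M):
    """Evaluate the tabulated quadratic for x in [1, 2)."""
    h = 2.0 ** -m
    i = int((x - 1.0) / h)  # upper bits of x select the segment
    x2 = x - (1.0 + i * h)  # lower bits are the polynomial argument
    c0, c1, c2 = table[i]
    return c0 + c1 * x2 + c2 * x2 * x2


recip_table = build_table(lambda x: 1.0 / x)
print(approx(recip_table, 1.5), 1.0 / 1.5)
```

Under these assumptions, the 64-entry reciprocal table keeps the approximation error on [1, 2) below about 2^-22. A hardware design would additionally have to budget for coefficient and intermediate rounding, which this sketch ignores.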

Index Terms:
Table-based methods, reciprocal, square root, elementary functions, minimax polynomial approximation, single-precision computations, computer arithmetic.
Jose-Alejandro Piñeiro, Stuart F. Oberman, Jean-Michel Muller, Javier D. Bruguera, "High-Speed Function Approximation Using a Minimax Quadratic Interpolator," IEEE Transactions on Computers, vol. 54, no. 3, pp. 304-318, March 2005, doi:10.1109/TC.2005.52