Issue No.03 - March (2011 vol.60)

pp: 418-432

Davide De Caro, University of Napoli, Napoli, Italy

Antonio Giuseppe Maria Strollo, University of Napoli, Napoli, Italy

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2010.127

ABSTRACT

This paper investigates a novel technique for designing piecewise-polynomial interpolators for the hardware implementation of elementary functions. In the proposed approach, the interval over which the function is approximated is subdivided into equal-length segments, and two adjacent segments are grouped into a segment pair. Suitable constraints are then imposed between the coefficients of the two interpolating polynomials in each segment pair, which reduces the total number of stored coefficients. The increase in approximation error caused by these constraints is found to be easily overcome by increasing the number of fractional bits of the coefficients. Overall, compared with standard unconstrained piecewise-polynomial approximation of the same accuracy, the proposed method yields a considerable reduction in the size of the lookup table needed to store the polynomial coefficients. The calculation of the coefficients of the constrained polynomials and the optimization of the coefficient bit widths are also investigated. Results are presented for several elementary functions and for target precisions ranging from 12 to 42 bits. The paper also presents VLSI implementation results, targeting a 90 nm CMOS technology and using both direct and Horner architectures, for constrained degree-1, degree-2, and degree-3 approximations.
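The trade-off described in the abstract can be illustrated with a minimal sketch. The abstract does not state which constraints the paper imposes, so the sketch assumes a hypothetical one for illustration only: the two degree-1 polynomials of a segment pair share a single slope coefficient. It also swaps in a least-squares fit for the min-max fit named in the index terms, and it ignores coefficient quantization. All function names are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares line y ~ c0 + c1*x for one segment; returns (c0, c1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    return my - c1 * mx, c1


def fit_pair_shared_slope(xa, ya, xb, yb):
    """Least-squares fit of two lines that share one slope (assumed constraint).

    Returns (c0a, c0b, c1): one intercept per segment, one common slope.
    """
    mxa, mya = sum(xa) / len(xa), sum(ya) / len(ya)
    mxb, myb = sum(xb) / len(xb), sum(yb) / len(yb)
    sxx = sum((x - mxa) ** 2 for x in xa) + sum((x - mxb) ** 2 for x in xb)
    sxy = (sum((x - mxa) * (y - mya) for x, y in zip(xa, ya))
           + sum((x - mxb) * (y - myb) for x, y in zip(xb, yb)))
    c1 = sxy / sxx
    return mya - c1 * mxa, myb - c1 * mxb, c1


def sample_segment(f, k, num_segments, points=64):
    """Sample f at evenly spaced points inside segment k of [0, 1)."""
    h = 1.0 / num_segments
    xs = [k * h + h * (i + 0.5) / points for i in range(points)]
    return xs, [f(x) for x in xs]


def max_error(xs, ys, c0, c1):
    """Largest absolute deviation of the line from the samples."""
    return max(abs(y - (c0 + c1 * x)) for x, y in zip(xs, ys))


def compare(f, num_segments=8):
    """Return (unconstrained error, constrained error, stored coefficient counts)."""
    err_unc = err_con = 0.0
    for k in range(0, num_segments, 2):  # walk over segment pairs
        xa, ya = sample_segment(f, k, num_segments)
        xb, yb = sample_segment(f, k + 1, num_segments)
        for xs, ys in ((xa, ya), (xb, yb)):
            c0, c1 = fit_line(xs, ys)
            err_unc = max(err_unc, max_error(xs, ys, c0, c1))
        c0a, c0b, c1 = fit_pair_shared_slope(xa, ya, xb, yb)
        err_con = max(err_con,
                      max_error(xa, ya, c0a, c1),
                      max_error(xb, yb, c0b, c1))
    stored_unc = 2 * num_segments         # (c0, c1) per segment
    stored_con = 3 * (num_segments // 2)  # (c0a, c0b, c1) per segment pair
    return err_unc, err_con, stored_unc, stored_con


err_unc, err_con, stored_unc, stored_con = compare(lambda x: 2.0 ** x)
print(f"unconstrained: {stored_unc} coefficients, max error {err_unc:.2e}")
print(f"constrained:   {stored_con} coefficients, max error {err_con:.2e}")
```

Under this assumed constraint the table holds three coefficients per segment pair instead of four, at the cost of a larger approximation error. The paper's own constraints, its coefficient calculation, and its bit-width optimization are not reproduced here.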

INDEX TERMS

Elementary functions, min-max approximation, polynomial approximation, computer arithmetic, VLSI systems.

CITATION

Davide De Caro, Antonio Giuseppe Maria Strollo, "Elementary Functions Hardware Implementation Using Constrained Piecewise-Polynomial Approximations",

*IEEE Transactions on Computers*, vol. 60, no. 3, pp. 418-432, March 2011, doi:10.1109/TC.2010.127