
M.J. Schulte and E.E. Swartzlander, Jr., "Hardware Designs for Exactly Rounded Elementary Functions," IEEE Transactions on Computers, vol. 43, no. 8, pp. 964-973, August 1994.
DOI: 10.1109/12.295858. ISSN: 0018-9340. Publisher: IEEE Computer Society, Los Alamitos, CA, USA.
Keywords: CMOS integrated circuits; digital arithmetic; Chebyshev approximation; approximation theory; polynomials; summing circuits; multiplying circuits; hardware designs; exactly rounded elementary functions; reciprocal; square root; polynomial approximation; multi-operand adder; Chebyshev series approximation; single-precision floating-point numbers; chip area; 1.0-micron CMOS technology; computational delay; computer arithmetic; exact rounding; parallel multiplier; argument reduction; special-purpose hardware; 1 micron.
This paper presents hardware designs that produce exactly rounded results for the functions of reciprocal, square root, 2^x, and log_2(x). These designs use polynomial approximation in which the terms in the approximation are generated in parallel and then summed by using a multi-operand adder. To reduce the number of terms in the approximation, the input interval is partitioned into subintervals of equal size, and different coefficients are used for each subinterval. The coefficients used in the approximation are initially determined based on the Chebyshev series approximation. They are then adjusted to obtain exactly rounded results for all inputs. Hardware designs are presented, and delay and area comparisons are made based on the degree of the approximating polynomial and the accuracy of the final result. For single-precision floating-point numbers, a design that produces exactly rounded results for all four functions has an estimated delay of 80 ns and a total chip area of 98 mm^2 in a 1.0-micron CMOS technology. Allowing the results to have a maximum error of one unit in the last place reduces the computational delay by 5% to 30% and the area requirements by 33% to 77%.
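The piecewise approach described in the abstract can be illustrated with a small software sketch. This is not the paper's hardware design or its coefficient procedure: it assumes the reciprocal on the interval [1, 2), uses Chebyshev-node interpolation as a stand-in for the Chebyshev series coefficients, works in double precision, and omits the coefficient adjustment that yields exactly rounded results. All function names here are hypothetical.

```python
import math

def chebyshev_coeffs(f, a, b, degree):
    """Coefficients c_0..c_degree of the Chebyshev interpolant of f on [a, b]."""
    n = degree + 1
    # Angles of the Chebyshev nodes; cos(theta) lies in [-1, 1].
    thetas = [math.pi * (k + 0.5) / n for k in range(n)]
    # Sample f at the nodes mapped from [-1, 1] to [a, b].
    values = [f(0.5 * (b - a) * math.cos(t) + 0.5 * (b + a)) for t in thetas]
    coeffs = []
    for j in range(n):
        s = sum(v * math.cos(j * t) for v, t in zip(values, thetas))
        coeffs.append((1.0 if j == 0 else 2.0) * s / n)
    return coeffs

def eval_chebyshev(coeffs, a, b, x):
    """Evaluate sum_j c_j T_j(t) with Clenshaw's recurrence, t mapped from x."""
    t = (2.0 * x - a - b) / (b - a)
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + coeffs[0]

def build_table(f, lo, hi, num_intervals, degree):
    """One coefficient set per equal-size subinterval of [lo, hi)."""
    width = (hi - lo) / num_intervals
    return [
        chebyshev_coeffs(f, lo + i * width, lo + (i + 1) * width, degree)
        for i in range(num_intervals)
    ]

def approximate(table, lo, hi, x):
    """Select the subinterval containing x and evaluate its polynomial."""
    n = len(table)
    width = (hi - lo) / n
    i = min(int((x - lo) / width), n - 1)
    a = lo + i * width
    return eval_chebyshev(table[i], a, a + width, x)

def max_error(f, table, lo, hi, samples=10000):
    """Largest absolute error over evenly spaced sample points in [lo, hi)."""
    worst = 0.0
    for k in range(samples):
        x = lo + (hi - lo) * k / samples
        worst = max(worst, abs(approximate(table, lo, hi, x) - f(x)))
    return worst

def reciprocal(x):
    return 1.0 / x

if __name__ == "__main__":
    for intervals in (16, 64):
        table = build_table(reciprocal, 1.0, 2.0, intervals, degree=2)
        print(intervals, max_error(reciprocal, table, 1.0, 2.0))
```

Increasing the number of subintervals lowers the error for a fixed polynomial degree, which mirrors the trade-off the abstract describes between the number of terms and the size of the coefficient table.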