Optimized Custom Precision Function Evaluation for Embedded Processors
January 2009 (vol. 58 no. 1)
pp. 46-59
Dong-U Lee, Mojix, Los Angeles
John D. Villasenor, University of California, Los Angeles, Los Angeles
Fixed-point processors are utilized in an enormous variety of applications, often for tasks that require the evaluation of mathematical functions. We present an automated method for mapping functions to such processors via polynomials that explicitly targets the native word-length of the processor, thereby significantly reducing the execution time relative to commonly used floating-point emulation approaches based on traditional mathematical libraries. The methods presented here also contrast with hand-tuned processor-specific code, which has the potential to deliver efficient implementations but at the cost of significant design time. We describe an automated design flow utilizing multi-word arithmetic to provide overflow protection and precision accurate to one unit in the last place (ulp). Analytical approaches are used to minimize the number of fixed-width operands required for each operation and to ensure that precision requirements are met. This allows automated generation of processor-optimized code and characterization of a design space representing a rich range of tradeoffs among precision, latency, and memory cost.
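As a rough illustration of the kind of computation the abstract describes, the sketch below evaluates a polynomial with Horner's rule on integers that stand in for fixed-point words. It is not the authors' design flow. The 16-bit word length, the Q1.14 format, the helper names (to_fixed, from_fixed, horner_fixed), and the example polynomial 1 + x + 0.5x^2 are all assumptions chosen for illustration. Each product is held at double width before being rounded and rescaled, which is the simplest case of using more than one word to protect an intermediate result.

```python
WORD_BITS = 16   # assumed native word length (hypothetical target)
FRAC_BITS = 14   # assumed Q1.14 format: sign bit, 1 integer bit, 14 fraction bits


def to_fixed(value: float) -> int:
    """Quantize a real value to the assumed format, rounding to nearest."""
    return int(round(value * (1 << FRAC_BITS)))


def from_fixed(word: int) -> float:
    """Convert a fixed-point integer back to a real value."""
    return word / (1 << FRAC_BITS)


def fits_in_word(word: int) -> bool:
    """True if the value is representable as a signed WORD_BITS integer."""
    limit = 1 << (WORD_BITS - 1)
    return -limit <= word < limit


def horner_fixed(coeffs, x):
    """Evaluate a polynomial by Horner's rule on fixed-point integers.

    coeffs: fixed-point coefficients, highest degree first.
    x:      fixed-point argument.
    """
    acc = coeffs[0]
    for c in coeffs[1:]:
        product = acc * x                       # double-width intermediate
        product += 1 << (FRAC_BITS - 1)         # add half an ulp to round
        acc = (product >> FRAC_BITS) + c        # rescale, then accumulate
        if not fits_in_word(acc):
            raise OverflowError("intermediate exceeds the native word")
    return acc


# Illustrative polynomial 1 + x + 0.5*x^2, evaluated at x = 0.25.
coeffs = [to_fixed(0.5), to_fixed(1.0), to_fixed(1.0)]
y = horner_fixed(coeffs, to_fixed(0.25))
print(from_fixed(y))  # 1.28125
```

In this sketch an out-of-range intermediate raises an error instead of being handled. A flow like the one described in the abstract would instead determine analytically how many words each operation needs so that overflow cannot occur and the precision target is still met.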

Index Terms:
Cost/performance, High-Speed Arithmetic, Spline and piecewise polynomial interpolation, Elementary function approximation
Dong-U Lee, John D. Villasenor, "Optimized Custom Precision Function Evaluation for Embedded Processors," IEEE Transactions on Computers, vol. 58, no. 1, pp. 46-59, Jan. 2009, doi:10.1109/TC.2008.124