Zhen Luo, Margaret Martonosi, "Accelerating Pipelined Integer and Floating-Point Accumulations in Configurable Hardware with Delayed Addition Techniques," IEEE Transactions on Computers, vol. 49, no. 3, pp. 208-218, March 2000.
ISSN: 0018-9340. DOI: http://doi.ieeecomputersociety.org/10.1109/12.841125. Publisher: IEEE Computer Society, Los Alamitos, CA, USA.
Keywords: Delayed addition, accumulation, multiply-accumulate, MAC, FPGA.
Abstract—The speed of arithmetic calculations in configurable hardware is limited by carry propagation, even with the dedicated hardware found in recent FPGAs. This paper proposes and evaluates an approach called

