A Comparison of Three Rounding Algorithms for IEEE Floating-Point Multiplication
July 2000 (vol. 49 no. 7)
pp. 638-650

Abstract: A new IEEE-compliant floating-point rounding algorithm is presented that computes the rounded product from a carry-save representation of the product. The new rounding algorithm is compared with the rounding algorithms of Yu and Zyner [26] and of Quach et al. [17]. For each rounding algorithm, a logical description and a block diagram are given, correctness is proven, and the latency is analyzed. We conclude that the new rounding algorithm is the fastest of the three, provided that an injection (which depends only on the rounding mode and the sign) can be added during the reduction of the partial products into a carry-save encoded digit string. In double-precision format, the latency of the new rounding algorithm is 12 logic levels, compared to 14 logic levels for the algorithm of Quach et al. and 16 logic levels for the algorithm of Yu and Zyner.
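The role of the injection can be illustrated with a small sketch on plain integers. This is not the paper's algorithm: it ignores the carry-save representation, the normalization shift, and exponent handling, and the names `injection` and `round_magnitude` as well as the mode labels are hypothetical choices made for this illustration. It only shows the general idea that adding a constant chosen from the rounding mode and the sign reduces every mode to truncation, with one extra fix for ties in round-to-nearest-even.

```python
def injection(mode: str, sign: int, k: int) -> int:
    """Constant added before truncating the k low bits (assumes k >= 1).

    mode: "RZ" (toward zero), "RNE" (nearest, ties to even),
          "RP" (toward +infinity), "RM" (toward -infinity).
    sign: 0 for a positive value, 1 for a negative value.
    """
    if mode == "RNE":
        return 1 << (k - 1)          # half of one unit in the last kept place
    # Directed modes round the magnitude away from zero only when the
    # rounding direction matches the sign; otherwise they truncate.
    away = (mode == "RP" and sign == 0) or (mode == "RM" and sign == 1)
    return (1 << k) - 1 if away else 0


def round_magnitude(x: int, k: int, mode: str, sign: int) -> int:
    """Round the magnitude x by discarding its k low bits."""
    y = (x + injection(mode, sign, k)) >> k   # inject, then truncate
    if mode == "RNE":
        # Injection alone rounds ties upward; on an exact tie, clear the
        # least significant bit so the result is even.
        tie = (x & ((1 << k) - 1)) == (1 << (k - 1))
        if tie:
            y &= ~1
    return y


print(round_magnitude(0b10110, 2, "RNE", 0))  # 5.5 is a tie
print(round_magnitude(0b10010, 2, "RNE", 0))  # 4.5 is a tie
```

Because the injection depends only on the mode and the sign, it is known before the product is, which is why it can be folded into an earlier addition step rather than applied afterwards.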

[1] H. Al-Twaijry, “Area and Performance Optimized CMOS,” PhD thesis, Stanford Univ., Aug. 1997.
[2] G.W. Bewick, “Fast Multiplication: Algorithms and Implementation,” PhD thesis, Stanford Univ., Mar. 1994.
[3] R.P. Brent and H.T. Kung, “A Regular Layout for Parallel Adders,” IEEE Trans. Computers, vol. 31, no. 3, pp. 260-264, Mar. 1982.
[4] J.T. Coonen, “Specification for a Proposed Standard for Floating-Point Arithmetic,” Memorandum ERL M78/72, Univ. of California, Berkeley, 1978.
[5] L. Dadda, “Some Schemes for Parallel Multipliers,” Alta Frequenza, vol. 34, pp. 349-356, 1965.
[6] M. Daumas and D.W. Matula, “Recoders for Partial Compression and Rounding,” Technical Report 97-01, Laboratoire de l'Informatique du Parallelisme, Lyon, France, 1997, / LIP/Rapports/RR/
[7] G. Even, S.M. Mueller, and P.M. Seidel, “A Dual Mode IEEE Multiplier,” Proc. Second IEEE Int'l Conf. Innovative Systems in Silicon, pp. 282-289, 1997.
[8] C.N. Hinds, E.V. Fiene, D.T. Marquette, and E.E. Quintana, “Parallel Method and Apparatus for Detecting and Completing Floating-Point Operations Involving Special Operands,” US patent 5339266, 1994.
[9] IEEE Standard for Binary Floating-Point Arithmetic. New York: ANSI/IEEE 754-1985, 1985.
[10] C. Lee, “Multistep Gradual Rounding,” IEEE Trans. Computers, vol. 32, no. 4, pp. 595-600, Apr. 1989.
[11] C. Martel, V.G. Oklobdzija, R. Ravi, and P.F. Stelling, “Design Strategies for Optimal Multiplier Circuits,” Proc. 12th IEEE Symp. Computer Arithmetic, pp. 42-49, 1995.
[12] Z.-J. Mou and F. Jutand, “Overturned-Stairs Adder Trees and Multiplier Design,” IEEE Trans. Computers, vol. 41, no. 8, pp. 940-948, Aug. 1992.
[13] S. Oberman, “Design Issues in High Performance Floating Point Arithmetic Units,” PhD thesis, Stanford Univ., Nov. 1996.
[14] S. Oberman, H. Al-Twaijry, and M. Flynn, “The SNAP Project: Design of Floating Point Arithmetic Units,” Proc. 13th IEEE Symp. Computer Arithmetic, pp. 156-165, 1997.
[15] V.G. Oklobdzija, D. Villeger, and S.S. Liu, “A Method for Speed Optimized Partial Product Reduction and Generation of Fast Parallel Multipliers Using an Algorithmic Approach,” IEEE Trans. Computers, vol. 45, no. 3, pp. 294-305, Mar. 1996.
[16] R.M. Owens, R.S. Bajwa, and M.J. Irwin, “Reducing the Number of Counters Needed for Integer Multiplication,” Proc. 12th Symp. Computer Arithmetic, pp. 38-41, 1995.
[17] N. Quach, N. Takagi, and M. Flynn, “On Fast IEEE Rounding,” Technical Report CSL-TR-91-459, Stanford Univ., Jan. 1991.
[18] M.R. Santoro, G. Bewick, and M.A. Horowitz, “Rounding Algorithms for IEEE Multipliers,” Proc. Ninth Symp. Computer Arithmetic, pp. 176-183, 1989.
[19] P.-M. Seidel, “How to Half the Latency of IEEE Compliant Floating-Point Multiplication,” Proc. 24th Euromicro Conf., 1998.
[20] P.-M. Seidel, “The Design of IEEE Compliant Floating-Point Units and Their Quantitative Analysis,” PhD thesis, Univ. of Saarland, Dec. 1999.
[21] N. Takagi, H. Yasuura, and S. Yajima, “High-Speed VLSI Multiplication Algorithm with a Redundant Binary Addition Tree,” IEEE Trans. Computers, vol. 34, no. 9, pp. 789-796, Sept. 1985.
[22] A. Tyagi, “A Reduced-Area Scheme for Carry-Select Adders,” IEEE Trans. Computers, vol. 42, no. 10, Oct. 1993.
[23] J. Vuillemin, “A Very Fast Multiplication Algorithm for VLSI Implementation,” Integration, the VLSI J., vol. 1, pp. 39-52, 1983.
[24] C.S. Wallace, “A Suggestion for Parallel Multipliers,” IEEE Trans. Electronic Computers, vol. 13, pp. 14-17, 1964.
[25] Z. Wang, G.A. Jullien, and W.C. Miller, “A New Design Technique for Column Compression Multipliers,” IEEE Trans. Computers, vol. 44, no. 8, pp. 962-970, Aug. 1995.
[26] R.K. Yu and G.B. Zyner, “167 MHz Radix-4 Floating Point Multiplier,” Proc. 12th Symp. Computer Arithmetic, vol. 12, pp. 149-154, 1995.
[27] R.K. Yu and G.B. Zyner, “Method and Apparatus for Partially Supporting Subnormal Operands in Floating-Point Multiplication,” US patent 5602769, 1997.
[28] G. Zyner, “Circuitry for Rounding in a Floating-Point Multiplier,” US patent 5150319, 1992.

Index Terms:
Floating-point arithmetic, IEEE 754 Standard, floating-point multiplication, IEEE rounding.
Guy Even, Peter-Michael Seidel, "A Comparison of Three Rounding Algorithms for IEEE Floating-Point Multiplication," IEEE Transactions on Computers, vol. 49, no. 7, pp. 638-650, July 2000, doi:10.1109/12.863033