Issue No.05 - May (2010 vol.59)
pp: 679-693
Alvaro Vazquez, ENS-Lyon, Lyon
Elisardo Antelo, University of Santiago de Compostela, Santiago de Compostela
Paolo Montuschi, Politecnico di Torino, Torino
ABSTRACT
The new generation of high-performance decimal floating-point units (DFUs) demands efficient implementations of parallel decimal multipliers. In this paper, we describe the architectures of two parallel decimal multipliers. Partial products are generated in parallel using signed-digit radix-10 or radix-5 recodings of the multiplier and a simplified set of multiplicand multiples. The partial products are reduced in a tree structure based on a decimal multioperand carry-save addition algorithm that uses unconventional (non-BCD) decimal-coded number systems. We further detail these techniques and present new improvements that reduce the latency of the previous designs: optimized digit recoders for the generation of 2^n-tuples (and 5-tuples), decimal carry-save adders (CSAs) that combine different decimal-coded operands, and carry-free adders implemented with specially designed bit counters. Moreover, we detail a design methodology that combines all these techniques to obtain efficient reduction trees with different area and delay trade-offs for any number of generated partial products. Evaluation results for 16-digit operands show that the proposed architectures have interesting area-delay figures compared to conventional Booth radix-4 and radix-8 parallel binary multipliers, and that they outperform previous alternatives for decimal multiplication.
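As a rough illustration of the signed-digit radix-10 recoding mentioned above, the following Python sketch models the idea behaviorally. It assumes the commonly used digit set {-5, ..., 5}, in which each decimal digit of the multiplier emits a transfer to the next position when it is 5 or larger, so that only the multiplicand multiples X, 2X, 3X, 4X, and 5X (and their negations) are needed. The function names `sd_radix10_recode` and `multiply_via_recoding` are hypothetical, and the sketch makes no attempt to reflect the hardware structure or the decimal codings of the actual design.

```python
def sd_radix10_recode(multiplier: int) -> list[int]:
    """Recode a non-negative integer into signed radix-10 digits in [-5, 5].

    Digits are returned least significant first. This is an assumed,
    behavioral version of the recoding, not the paper's circuit.
    """
    # Decimal digits of the multiplier, least significant first.
    digits = [int(c) for c in reversed(str(multiplier))]
    # transfers[i] is the transfer entering position i. It depends only on
    # the digit one position below, so no carry propagates along the word.
    transfers = [0] + [1 if d >= 5 else 0 for d in digits]
    recoded = [d - 10 * transfers[i + 1] + transfers[i]
               for i, d in enumerate(digits)]
    # The last transfer becomes an extra most significant digit (0 or 1).
    recoded.append(transfers[-1])
    return recoded


def multiply_via_recoding(x: int, y: int) -> int:
    """Multiply x by y using one partial product per recoded digit of y."""
    # Simplified set of multiplicand multiples: 0, X, 2X, 3X, 4X, 5X.
    multiples = [k * x for k in range(6)]
    total = 0
    for position, digit in enumerate(sd_radix10_recode(y)):
        partial = multiples[abs(digit)]
        if digit < 0:
            partial = -partial  # negative digits select a negated multiple
        total += partial * 10 ** position
    return total
```

For example, `sd_radix10_recode(5678)` yields the digits -2, -2, -3, -4, 1 (least significant first), which sum to 5678 when weighted by powers of ten. In this sketch the partial products are simply accumulated with ordinary integer addition; a parallel multiplier would instead reduce them in a carry-save tree.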
INDEX TERMS
Decimal multiplication, parallel multiplication, decimal carry-save addition, decimal codings.
CITATION
Alvaro Vazquez, Elisardo Antelo, Paolo Montuschi, "Improved Design of High-Performance Parallel Decimal Multipliers," IEEE Transactions on Computers, vol. 59, no. 5, pp. 679-693, May 2010, doi:10.1109/TC.2009.167