Issue No.11 - November (2009 vol.58)

pp: 1539-1552

Ghassem Jaberipur, Shahid Beheshti University and Institute for Research in Fundamental Sciences, Tehran

Amir Kaivani, Shahid Beheshti University, Tehran

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2009.110

ABSTRACT

Hardware support for decimal computer arithmetic is regaining popularity. One reason is the recent growth of decimal computations in commercial, scientific, financial, and Internet-based computer applications. Newly commercialized decimal arithmetic hardware units use radix-10 sequential multipliers that are rather slow for multiplication-intensive applications. Therefore, the future relevant processors are likely to host fast parallel decimal multiplication circuits. The corresponding hardware algorithms are normally composed of three steps: partial product generation (PPG), partial product reduction (PPR), and final carry-propagating addition. The state of the art is represented by two recent full solutions with alternative designs for all the three aforementioned steps. In addition, PPR by itself has been the focus of other recent studies. In this paper, we examine both of the full solutions and the impact of a PPR-only design on the appropriate one. In order to improve the speed of parallel decimal multiplication, we present a new PPG method, fine-tune the PPR method of one of the full solutions and the final addition scheme of the other; thus, assembling a new full solution. Logical Effort analysis and 0.13 μm synthesis show at least 13 percent speed advantage, but at a cost of at most 36 percent additional area consumption.
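The three-step structure named in the abstract (PPG, PPR, final carry-propagating addition) can be illustrated with a small software model. This is only a sketch of the general scheme, not the paper's design: the function names `to_digits`, `ppg`, `ppr`, `final_add`, and `multiply` are hypothetical, digits are held as plain Python integers rather than BCD-encoded signals, and the column sums stand in for what a hardware carry-save reduction tree would produce.

```python
def to_digits(n, width):
    """Return the decimal digits of n, least significant first, padded to width."""
    return [(n // 10**i) % 10 for i in range(width)]


def ppg(x_digits, y_digits):
    """Partial product generation: one row per multiplier digit.

    Each entry is a raw digit product (0..81) placed in the column
    matching its decimal weight; nothing is reduced yet.
    """
    rows = []
    for i, yd in enumerate(y_digits):
        row = [0] * (len(x_digits) + len(y_digits))
        for j, xd in enumerate(x_digits):
            row[i + j] = xd * yd
        rows.append(row)
    return rows


def ppr(rows):
    """Partial product reduction: sum each column independently.

    No carry moves between columns here, which is what allows the
    columns to be reduced in parallel.
    """
    return [sum(column) for column in zip(*rows)]


def final_add(columns):
    """Final carry-propagating addition: resolve column sums into digits."""
    digits = []
    carry = 0
    for value in columns:
        carry, digit = divmod(value + carry, 10)
        digits.append(digit)
    return digits


def multiply(x, y, width):
    """Multiply two non-negative integers of at most `width` decimal digits."""
    columns = ppr(ppg(to_digits(x, width), to_digits(y, width)))
    return sum(d * 10**i for i, d in enumerate(final_add(columns)))
```

In this model only `final_add` contains a dependency chain from one column to the next, which is why that step is the carry-propagating one.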

INDEX TERMS

Decimal computer arithmetic, parallel decimal multiplication, partial product generation and reduction, logic design.

CITATION

Ghassem Jaberipur, Amir Kaivani, "Improving the Speed of Parallel Decimal Multiplication",

*IEEE Transactions on Computers*, vol. 58, no. 11, pp. 1539-1552, November 2009, doi:10.1109/TC.2009.110