A Simple High-Speed Multiplier Design
October 2006 (vol. 55 no. 10)
pp. 1253-1258
The performance of multiplication is crucial for multimedia applications such as 3D graphics and signal processing systems, which depend on the execution of large numbers of multiplications. Previously reported algorithms mainly focused on rapidly reducing the partial product rows down to the final sums and carries used for the final accumulation. These techniques mostly rely on circuit optimization and minimization of the critical paths. In this paper, an algorithm to achieve fast multiplication in two's complement representation is presented. Rather than focusing on reducing the partial product rows down to final sums and carries, our approach strives to generate fewer partial product rows in the first place. Producing fewer rows lowers the overall operation time even before partial product reduction techniques are applied. In addition to the speed improvement, our algorithm results in a true diamond shape for the partial product tree, which is more efficient in terms of implementation. The synthesis results of our multiplication algorithm using the Artisan TSMC 0.13 µm 1.2-Volt standard-cell library show a 13 percent improvement in speed and a 14 percent improvement in power savings for 8-bit × 8-bit multiplications (10 percent and 3 percent, respectively, for 16-bit × 16-bit multiplications) when compared to conventional multiplication algorithms.
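To make the notion of partial product rows concrete, the following is a minimal sketch of conventional radix-4 modified Booth recoding (one of the index terms), not the algorithm proposed in the paper. The function names `booth_radix4_digits`, `to_signed`, and `booth_multiply` are hypothetical and chosen for illustration. The sketch works on integers and shows that an n-bit two's complement multiplier is recoded into n/2 digits, each selecting one partial product row. It does not model the hardware detail that negative rows are typically formed by inverting the multiplicand and adding a correction bit, which is commonly what introduces an additional row in conventional designs.

```python
def to_signed(v, n):
    """Interpret the low n bits of v as an n-bit two's complement value."""
    v &= (1 << n) - 1
    return v - (1 << n) if (v >> (n - 1)) & 1 else v


def booth_radix4_digits(y, n):
    """Recode an n-bit two's complement multiplier into n/2 digits in {-2..2}.

    Digit k is y[2k-1] + y[2k] - 2*y[2k+1], with y[-1] taken as 0.
    """
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even number")
    y &= (1 << n) - 1
    # bits[0] is the implicit y[-1] = 0; bits[j + 1] is y[j].
    bits = [0] + [(y >> j) & 1 for j in range(n)]
    return [bits[i] + bits[i + 1] - 2 * bits[i + 2] for i in range(0, n, 2)]


def booth_multiply(x, y, n):
    """Return (product, rows): one weighted partial product per Booth digit."""
    xs = to_signed(x, n)
    digits = booth_radix4_digits(y, n)
    # Row k is digit_k * x, weighted by 4**k (a left shift of 2k bits).
    rows = [d * xs * (4 ** k) for k, d in enumerate(digits)]
    return sum(rows), rows


if __name__ == "__main__":
    product, rows = booth_multiply(-7, 13, 8)
    print(product, len(rows))
```

Under these assumptions, an 8-bit × 8-bit multiplication yields four digit-selected rows in this integer model; the abstract's claim concerns reducing the number of rows that a hardware implementation must actually sum.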

Index Terms:
Multiplier, Booth, modified Booth, partial products.
Jung-Yup Kang, Jean-Luc Gaudiot, "A Simple High-Speed Multiplier Design," IEEE Transactions on Computers, vol. 55, no. 10, pp. 1253-1258, Oct. 2006, doi:10.1109/TC.2006.156