Issue No. 10 - Oct. 2013 (vol. 62)
pp. 2013-2025
Javier Hormigo, University of Malaga, Malaga
Julio Villalba, University of Malaga, Malaga
Emilio L. Zapata, University of Malaga, Malaga
ABSTRACT
Although redundant addition is widely used to design parallel multioperand adders for ASIC implementations, redundant adders have generally been avoided on Field Programmable Gate Arrays (FPGAs). The main reasons are the efficient implementation of carry-propagate adders (CPAs) on these devices, thanks to their specialized carry-chain resources, and the area overhead that redundant adders incur when implemented on FPGAs. This paper presents different approaches to the efficient implementation of generic carry-save compressor trees on FPGAs. These designs have a fast critical path that is independent of bit width, with practically no area overhead compared to CPA trees. Along with the classic carry-save compressor tree, we present a novel linear array structure that makes efficient use of the fast carry-chain resources. This approach is defined in parameterizable HDL code based on CPAs, which makes it compatible with any FPGA family or vendor. A detailed study is provided for a wide range of bit widths and large numbers of operands. Compared to binary and ternary CPA trees, speedups of up to 2.29 and 2.14, respectively, are achieved for 16-bit width, and of up to 3.81 and 3.11 for 64-bit width.
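To make the idea of a carry-save compressor tree concrete, the following is a minimal software sketch of the classic 3:2 reduction the abstract refers to. It is not the authors' HDL or their linear array structure. It is a bit-level model in Python, and the function names (`csa`, `compressor_tree`, `multioperand_add`) are hypothetical, chosen only for illustration. Operands are assumed to be non-negative integers.

```python
def csa(a, b, c):
    """One 3:2 carry-save step: three words in, a (sum, carry) pair out.

    Each bit position is handled independently, so no carry ripples
    across the word. This is why the delay of such a step does not
    depend on the bit width.
    """
    s = a ^ b ^ c                                # per-bit sum
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, weight 2
    return s, carry


def compressor_tree(operands):
    """Reduce many operands to two words using layers of 3:2 steps.

    Returns (sum_word, carry_word, levels), where levels counts the
    number of carry-save layers that were needed.
    """
    words = list(operands)
    levels = 0
    while len(words) > 2:
        full = len(words) - len(words) % 3       # words that form complete triples
        nxt = []
        for i in range(0, full, 3):
            nxt.extend(csa(words[i], words[i + 1], words[i + 2]))
        nxt.extend(words[full:])                 # leftovers pass to the next layer
        words = nxt
        levels += 1
    while len(words) < 2:                        # pad for 0 or 1 operand
        words.append(0)
    return words[0], words[1], levels


def multioperand_add(operands):
    """Add all operands: carry-save reduction, then one carry-propagate add."""
    s, c, _ = compressor_tree(operands)
    return s + c                                 # the only carry-propagating addition


if __name__ == "__main__":
    data = [13, 7, 200, 41, 99, 5, 64, 18, 255]
    print(multioperand_add(data) == sum(data))
```

In this model only the final `s + c` propagates carries. A tree of CPAs, by contrast, propagates carries at every level, which is the cost the redundant representation is meant to avoid.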
INDEX TERMS
Field programmable gate arrays, Delay, Adders, Routing, Radiation detectors, Hardware design languages, Hardware, carry-save adders, Computer arithmetic, reconfigurable hardware, multioperand addition, redundant representation
CITATION
Javier Hormigo, Julio Villalba, Emilio L. Zapata, "Multioperand Redundant Adders on FPGAs", IEEE Transactions on Computers, vol. 62, no. 10, pp. 2013-2025, Oct. 2013, doi:10.1109/TC.2012.168