loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
18th International Parallel and Distributed Processing Symposium (IPDPS'04) - Papers
Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on FPGAs
Santa Fe, New Mexico
April 26-April 30
ISBN: 0-7695-2132-0
Ling Zhuo, University of Southern California
Viktor K. Prasanna, University of Southern California
The abundant hardware resources on current FPGAs provide new opportunities to improve the performance of hardware implementations of scientific computations. In this paper, we propose two FPGA-based algorithms for floating-point matrix multiplication, a fundamental kernel in a number of scientific applications. We analyze the design tradeoffs in implementing this kernel on FPGAs. Our algorithms employ a linear array architecture with a small control logic. This architecture effectively utilizes the hardware resources on the entire FPGA and reduces the routing complexity. The processing elements (PEs) used in our algorithms are modular so that floating-point units can be easily embedded into them. In our designs, the floating-point units are optimized to maximize the number of PEs integrated on the FPGA as well as the clock speed. Experimental results show that our algorithms achieve high clock speeds and provide good scalability. Our algorithms achieve superior sustained floating-point performance compared with existing FPGA-based implementations and state-of-the-art processors.
Citation:
Ling Zhuo, Viktor K. Prasanna, "Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on FPGAs," ipdps, vol. 1, pp.92a, 18th International Parallel and Distributed Processing Symposium (IPDPS'04) - Papers, 2004
Usage of this product signifies your acceptance of the Terms of Use.