2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines
Blocking LU Decomposition for FPGAs
Charlotte, North Carolina, USA
May 02-May 04
ISBN: 978-0-7695-4056-6
BibTeX:
@article{ 10.1109/FCCM.2010.25,
  author = {Guiming Wu and Yong Dou and Gregory D. Peterson},
  title = {Blocking LU Decomposition for FPGAs},
  journal = {Field-Programmable Custom Computing Machines, Annual IEEE Symposium on},
  volume = {0},
  year = {2010},
  isbn = {978-0-7695-4056-6},
  pages = {109-112},
  doi = {http://doi.ieeecomputersociety.org/10.1109/FCCM.2010.25},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/FCCM.2010.25
Performing LU decomposition of large matrices efficiently on FPGAs with limited local memory requires blocking the original algorithm. In this paper, we propose a block LU decomposition algorithm for FPGAs that is applicable to matrices of arbitrary size. We introduce a high-performance hardware design, consisting mainly of a linear array of processing elements (PEs), to implement the algorithm. A total of 36 PEs can be integrated into a Xilinx Virtex-5 xc5vlx330 FPGA on our self-designed PCI-Express card, reaching a sustained performance of 8.50 GFLOPS at 133 MHz, which outperforms previous work.
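For readers unfamiliar with blocking, the following is a minimal software sketch of a generic right-looking block LU decomposition. It is not the hardware algorithm proposed in the paper, whose details are not given on this page. It assumes no pivoting (so the input must have nonzero pivots, e.g. a diagonally dominant matrix), and the names `blocked_lu`, `lu_unblocked`, and the block size `b` are illustrative; `b` stands in for whatever tile size would fit in local memory.

```python
def lu_unblocked(a, lo, hi):
    """In-place LU (no pivoting) of the diagonal block a[lo:hi][lo:hi]."""
    for k in range(lo, hi):
        for i in range(k + 1, hi):
            a[i][k] /= a[k][k]                  # multiplier, stored as L[i][k]
            for j in range(k + 1, hi):
                a[i][j] -= a[i][k] * a[k][j]


def blocked_lu(a, b):
    """Right-looking block LU without pivoting (illustrative sketch).

    Overwrites the square matrix `a` (list of lists of floats) so that the
    strict lower triangle holds L (unit diagonal implied) and the upper
    triangle holds U. Works for any size n, including n not divisible by b.
    """
    n = len(a)
    for lo in range(0, n, b):
        hi = min(lo + b, n)
        # Step 1: factor the diagonal block A11 = L11 * U11.
        lu_unblocked(a, lo, hi)
        # Step 2: row panel, U12 = inv(L11) * A12 (forward substitution).
        for j in range(hi, n):
            for k in range(lo, hi):
                for i in range(k + 1, hi):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 3: column panel, L21 = A21 * inv(U11).
        for i in range(hi, n):
            for k in range(lo, hi):
                a[i][k] /= a[k][k]
                for j in range(k + 1, hi):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 4: trailing update, A22 -= L21 * U12 (a matrix multiply).
        for i in range(hi, n):
            for j in range(hi, n):
                s = 0.0
                for k in range(lo, hi):
                    s += a[i][k] * a[k][j]
                a[i][j] -= s
    return a


print(blocked_lu([[4.0, 3.0], [6.0, 3.0]], 1))  # [[4.0, 3.0], [1.5, -1.5]]
```

In this formulation each outer iteration touches only one block row, one block column, and the trailing submatrix, and most of the arithmetic falls in the step 4 multiply-accumulate loop, which is the kind of regular work that maps naturally onto an array of processing elements.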
Citation:
Guiming Wu, Yong Dou, Gregory D. Peterson, "Blocking LU Decomposition for FPGAs," FCCM, pp. 109-112, 2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2010.
