Issue No. 01 - Jan. (2013 vol. 24)
ISSN: 1045-9219
pp: 85-91
G. R. Morris, Eng. R&D Center, US Army, Vicksburg, MS, USA
K. H. Abed, Dept. of Comput. Eng., Jackson State Univ., Jackson, MS, USA
ABSTRACT
High-performance heterogeneous computers that employ field programmable gate arrays (FPGAs) as computational elements are known as high-performance reconfigurable computers (HPRCs). For floating-point applications, these FPGA-based processors must satisfy a variety of heuristics and rules of thumb to achieve a speedup compared with their software counterparts. By way of a simple sparse matrix Jacobi iterative solver, this paper illustrates some of the issues associated with mapping floating-point kernels onto HPRCs. The Jacobi method was chosen based on heuristics developed from earlier research. Furthermore, Jacobi is relatively easy to understand, yet is complex enough to illustrate the mapping issues. This paper is not trying to demonstrate the speedup of a particular application nor is it suggesting that Jacobi is the best way to solve equations. The results demonstrate a nearly threefold wall clock runtime speedup when compared with a software implementation. A formal analysis shows that these results are reasonable. The purpose of this paper is to illuminate the challenging floating-point mapping process while simultaneously showing that such mappings can result in significant speedups. The ideas revealed by research such as this have already been and should continue to be used to facilitate a more automated mapping process.
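For readers unfamiliar with the kernel named in the abstract, the following is a minimal software sketch of a sparse Jacobi iteration. It is not the authors' FPGA mapping or their reference code. The compressed sparse row (CSR) layout, the function name `jacobi_csr`, the tolerance, and the small test system are all assumptions chosen for illustration. Each sweep computes x_i = (b_i - sum over j != i of a_ij * x_j) / a_ii using only values from the previous sweep.

```python
def jacobi_csr(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Approximate the solution of A x = b by Jacobi iteration.

    A is supplied in CSR form (an assumed layout, not taken from the paper):
    values holds the nonzeros, col_idx their column indices, and
    row_ptr[i]:row_ptr[i + 1] delimits the nonzeros of row i.
    Returns (x, sweeps_used).
    """
    n = len(b)
    x = [0.0] * n  # initial guess
    for sweep in range(1, max_iter + 1):
        x_new = [0.0] * n
        for i in range(n):
            diag = 0.0
            off_diag_sum = 0.0
            for k in range(row_ptr[i], row_ptr[i + 1]):
                j = col_idx[k]
                if j == i:
                    diag = values[k]
                else:
                    # Uses only the previous sweep's x, so rows are independent.
                    off_diag_sum += values[k] * x[j]
            if diag == 0.0:
                raise ZeroDivisionError(f"row {i} has no nonzero diagonal entry")
            x_new[i] = (b[i] - off_diag_sum) / diag
        max_change = max(abs(x_new[i] - x[i]) for i in range(n))
        x = x_new
        if max_change < tol:
            return x, sweep
    return x, max_iter


# Hypothetical 3x3 diagonally dominant system:
#   [ 4 -1  0 ]       [ 2 ]
#   [-1  4 -1 ] x  =  [ 4 ]
#   [ 0 -1  4 ]       [10 ]
values = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
b = [2.0, 4.0, 10.0]

solution, sweeps = jacobi_csr(values, col_idx, row_ptr, b)
print(solution, sweeps)
```

Because every row update within a sweep depends only on the previous iterate, the inner multiply-accumulate work is independent across rows, which is one reason a kernel of this shape is a plausible candidate for hardware mapping.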
INDEX TERMS
Jacobian matrices, Field programmable gate arrays, Reconfigurable architectures, Iterative methods, Jacobi iterative method, Field programmable gate array (FPGA), reconfigurable computer (RC), high-performance reconfigurable computer (HPRC), high-performance heterogeneous computer (HPHC)
CITATION
G. R. Morris, K. H. Abed, "Mapping a Jacobi Iterative Solver onto a High-Performance Heterogeneous Computer", IEEE Transactions on Parallel & Distributed Systems, vol. 24, no. 1, pp. 85-91, Jan. 2013, doi:10.1109/TPDS.2012.121