Issue No.12 - December (2008 vol.57)

pp: 1614-1623

Junqing Sun, University of Tennessee, Knoxville

Gregory D. Peterson, University of Tennessee, Knoxville

Olaf O. Storaasli, Oak Ridge National Laboratory, Knoxville

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2008.89

ABSTRACT

Compared to higher-precision data formats, lower-precision data formats yield higher performance for computationally intensive applications on FPGAs because of their lower resource cost, reduced memory bandwidth requirements, and higher circuit frequency. On the other hand, scientific computations usually demand highly accurate solutions. This paper seeks to use lower-precision data formats wherever possible for higher performance, without losing the accuracy of higher-precision data formats, by using mixed-precision algorithms and architectures. First, we analyze the floating-point performance of different data formats on FPGAs. Second, we introduce mixed-precision iterative refinement algorithms for linear solvers and give an error analysis. Finally, we propose an innovative architecture for a mixed-precision direct solver for reconfigurable computing. Our results show that our mixed-precision algorithm and architecture significantly improve the performance of linear solvers on FPGAs.
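
The mixed-precision iterative refinement idea mentioned in the abstract can be illustrated with a minimal software sketch. This is not the authors' FPGA architecture, and the abstract does not specify which data formats the paper pairs. The sketch assumes IEEE single precision (emulated here by rounding through `struct`) for the expensive LU factorization and triangular solves, and double precision for the residual and solution update. All function names (`f32`, `lu_factor_f32`, `lu_solve_f32`, `refine_solve`) and the `tol` and `max_iter` parameters are hypothetical, chosen only for illustration.

```python
import struct


def f32(x):
    """Round a Python float (double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU factorization with partial pivoting, rounding each result to single precision."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))  # perm[k] = original index of the row now in position k
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular in single precision")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])  # multiplier, stored in L
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Solve with the stored factors; all arithmetic rounded to single precision."""
    n = len(lu)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):  # forward substitution (L has a unit diagonal)
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):  # back substitution with U
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def residual(a, x, b):
    """Residual b - A x, computed in double precision."""
    n = len(a)
    return [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]


def refine_solve(a, b, tol=1e-12, max_iter=20):
    """Low-precision factorization followed by double-precision refinement."""
    lu, perm = lu_factor_f32(a)      # costly step, done once in low precision
    x = lu_solve_f32(lu, perm, b)    # initial low-precision solution
    for _ in range(max_iter):
        r = residual(a, x, b)
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            break
        d = lu_solve_f32(lu, perm, r)               # correction reuses the factors
        x = [xi + di for xi, di in zip(x, d)]       # update kept in double precision
    return x
```

A call such as `refine_solve([[4.0, 1.0, 0.3], [1.0, 3.0, 0.7], [0.3, 0.7, 5.0]], [1.0, 2.0, 3.0])` factors the matrix once in low precision and then reuses those factors in every refinement step. Only the residual and the update need the higher precision, which is why the approach suits hardware where low-precision units are cheaper. Convergence is only expected for matrices that are not too ill-conditioned relative to the lower precision.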

INDEX TERMS

Cost/performance, VLSI, Computer arithmetic, Multiple precision arithmetic, Mixed precision arithmetic in reconfigurable computing

CITATION

Junqing Sun, Gregory D. Peterson, Olaf O. Storaasli, "High-Performance Mixed-Precision Linear Solver for FPGAs",

*IEEE Transactions on Computers*, vol. 57, no. 12, pp. 1614-1623, December 2008, doi:10.1109/TC.2008.89