Issue No.12 - December (2008 vol.57)
pp: 1614-1623
Junqing Sun , University of Tennessee, Knoxville
Gregory D. Peterson , University of Tennessee, Knoxville
Olaf O. Storaasli , Oak Ridge National Laboratory, Knoxville
Compared to higher-precision data formats, lower-precision data formats deliver higher performance for computationally intensive applications on FPGAs because of their lower resource cost, reduced memory bandwidth requirements, and higher circuit frequency. On the other hand, scientific computations usually demand highly accurate solutions. This paper seeks to use lower-precision data formats wherever possible for higher performance, without losing the accuracy of higher-precision data formats, by using mixed-precision algorithms and architectures. First, we analyze the floating-point performance of different data formats on FPGAs. Second, we introduce mixed-precision iterative refinement algorithms for linear solvers and give an error analysis. Finally, we propose an innovative architecture for a mixed-precision direct solver for reconfigurable computing. Our results show that our mixed-precision algorithm and architecture significantly improve the performance of linear solvers on FPGAs.
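The idea behind mixed-precision iterative refinement can be illustrated with a minimal software sketch. It is not the authors' FPGA architecture: it assumes IEEE single precision as the low-precision format (emulated by rounding through `struct`) and double precision for residuals and updates, and the function names `f32`, `lu_factor_f32`, `lu_solve_f32`, and `refine_solve` are hypothetical. The expensive LU factorization runs once in low precision, and each cheap refinement step reuses it to correct the solution.

```python
import struct


def f32(x):
    """Round a Python float (double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(A):
    """LU factorization with partial pivoting; every result is rounded to single."""
    n = len(A)
    LU = [[f32(v) for v in row] for row in A]
    perm = list(range(n))  # perm[i] = original index of the row now at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular to working precision")
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = f32(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = f32(LU[i][j] - f32(LU[i][k] * LU[k][j]))
    return LU, perm


def lu_solve_f32(LU, perm, b):
    """Forward and back substitution in emulated single precision."""
    n = len(LU)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):  # solve L y = P b (L has a unit diagonal)
        for j in range(i):
            y[i] = f32(y[i] - f32(LU[i][j] * y[j]))
    for i in reversed(range(n)):  # solve U x = y, overwriting y with x
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(LU[i][j] * y[j]))
        y[i] = f32(y[i] / LU[i][i])
    return y


def refine_solve(A, b, tol=1e-12, max_iter=20):
    """Solve A x = b: low-precision LU, double-precision residual and update."""
    n = len(A)
    LU, perm = lu_factor_f32(A)      # O(n^3) work, done once in low precision
    x = lu_solve_f32(LU, perm, b)    # initial low-precision solution
    b_norm = max(abs(v) for v in b)
    for _ in range(max_iter):
        # Residual in double precision against the original (unrounded) A.
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) <= tol * b_norm:
            break
        d = lu_solve_f32(LU, perm, r)  # O(n^2) correction, reusing the factors
        x = [x[i] + d[i] for i in range(n)]
    return x
```

Calling `refine_solve(A, b)` on a small, well-conditioned system returns a solution close to double-precision accuracy even though the factorization itself only carries single precision. The scheme is only expected to converge when the matrix is not too ill-conditioned relative to the low-precision format.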
Cost/performance, VLSI, Computer arithmetic, Multiple precision arithmetic, mixed precision arithmetic in reconfigurable computing
Junqing Sun, Gregory D. Peterson, Olaf O. Storaasli, "High-Performance Mixed-Precision Linear Solver for FPGAs", IEEE Transactions on Computers, vol. 57, no. 12, pp. 1614-1623, December 2008, doi:10.1109/TC.2008.89