Los Angeles, CA
March 31, 2009 to April 2, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CSIE.2009.814
Applying blocking techniques to computations on large dense matrices makes better use of a computer's memory hierarchy and increases computing efficiency. This paper studies the blocked algorithm for LU factorization. Efficient algorithms are designed for the different matrix operations involved in the blocked LU factorization. Optimization techniques, including matrix transposition and loop unrolling, are used in implementing these matrix computations. Experimental results show that the blocked LU factorization runs much faster than the standard LU factorization, achieving a speedup of more than 50%.
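As an illustration of the general idea, a right-looking blocked LU factorization can be sketched as below. This is a minimal sketch and not the authors' implementation: it assumes a square matrix stored as a list of lists, uses no pivoting (so it only suits matrices such as diagonally dominant ones), and leaves out the transposition and loop-unrolling optimizations mentioned in the abstract. The names `lu_unblocked`, `blocked_lu`, `multiply_lu`, and the block-size parameter `bs` are hypothetical.

```python
def lu_unblocked(a, lo, hi):
    """Standard in-place LU (no pivoting) of the diagonal block a[lo:hi][lo:hi]."""
    for k in range(lo, hi):
        pivot = a[k][k]
        for i in range(k + 1, hi):
            a[i][k] /= pivot              # multiplier, stored in the L part
            l_ik = a[i][k]
            for j in range(k + 1, hi):
                a[i][j] -= l_ik * a[k][j]


def blocked_lu(a, bs):
    """In-place blocked LU without pivoting (assumption: no pivoting needed).

    On return, the strict lower triangle of `a` holds L (unit diagonal implied)
    and the upper triangle holds U.
    """
    n = len(a)
    for lo in range(0, n, bs):
        hi = min(lo + bs, n)
        # Step 1: factor the diagonal block, A11 = L11 * U11.
        lu_unblocked(a, lo, hi)
        # Step 2: block row of U, solve L11 * U12 = A12 by forward substitution.
        for i in range(lo, hi):
            for k in range(lo, i):
                l_ik = a[i][k]
                for j in range(hi, n):
                    a[i][j] -= l_ik * a[k][j]
        # Step 3: block column of L, solve L21 * U11 = A21.
        for i in range(hi, n):
            for k in range(lo, hi):
                s = a[i][k]
                for p in range(lo, k):
                    s -= a[i][p] * a[p][k]
                a[i][k] = s / a[k][k]
        # Step 4: update the trailing submatrix, A22 -= L21 * U12.
        for i in range(hi, n):
            for k in range(lo, hi):
                l_ik = a[i][k]
                for j in range(hi, n):
                    a[i][j] -= l_ik * a[k][j]
    return a


def multiply_lu(lu):
    """Rebuild A = L * U from the packed factors, for checking the result."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]   # unit diagonal of L
                s += l_ik * lu[k][j]
            out[i][j] = s
    return out
```

Calling `blocked_lu(copy_of_a, bs)` on a copy of the matrix and comparing `multiply_lu` of the result against the original is one way to check the factorization. In a tuned implementation most of the time is spent in step 4, a matrix-matrix multiplication on blocks small enough to stay in cache, which is where blocking is expected to help.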
Jianping Chen, Zhenguo Shi, Weifu Liu, "Implementation of Block Algorithm for LU Factorization", 2009 WRI World Congress on Computer Science and Information Engineering (CSIE 2009), pp. 569-573, doi:10.1109/CSIE.2009.814