Issue No. 03 - March (2013 vol. 24)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2012.161
Chi-Yuan Yeh, National Sun Yat-Sen University, Kaohsiung
Yu-Ting Peng, National Sun Yat-Sen University, Kaohsiung
Shie-Jue Lee, National Sun Yat-Sen University, Kaohsiung
Singular value decomposition (SVD) is a popular decomposition method for solving least squares estimation (LSE) problems. However, for large data sets, applying SVD directly to the coefficient matrix is very time-consuming and memory-demanding. In this paper, we propose an iterative divide-and-merge-based estimator for solving large-scale LSE problems. The LSE problem is iteratively transformed into equivalent but smaller LSE problems. In each iteration, the input matrices are subdivided into a set of small submatrices. Each submatrix is decomposed by SVD, and the results are merged to form the input matrices of the next iteration. The process is repeated until the resulting matrices are small enough to be solved directly and efficiently by SVD. The number of iterations required is determined dynamically according to the size of the input data set. As a result, the time and space required to find least squares solutions are greatly reduced. Furthermore, the decomposition and merging of the submatrices in each iteration can be done independently in parallel. The idea can be easily implemented in MapReduce, and experimental results show that the proposed approach can solve large-scale LSE problems effectively.
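The divide-and-merge idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' published algorithm; it is one plausible reading, resting on the assumption that each row block of the augmented matrix [A | b] can be replaced by the product of its singular values and right singular vectors without changing the least squares objective (the left singular vectors are orthonormal and do not affect the residual norm). The function names `compress_block` and `divide_and_merge_lstsq` and the parameter `block_rows` are hypothetical, introduced only for illustration.

```python
import numpy as np


def compress_block(block):
    """Replace a row block of [A | b] by diag(s) @ Vt from its thin SVD.

    Assumption of this sketch: since U has orthonormal columns,
    ||block @ z|| == ||diag(s) @ Vt @ z|| for every vector z, so the
    block's contribution to the squared residual is preserved.
    """
    _, s, vt = np.linalg.svd(block, full_matrices=False)
    return s[:, None] * vt  # at most min(rows, cols) rows


def divide_and_merge_lstsq(A, b, block_rows=1000):
    """Solve min ||A x - b||_2 by repeatedly dividing and merging row blocks."""
    M = np.hstack([A, np.asarray(b).reshape(-1, 1)])  # augmented matrix [A | b]
    n_cols = M.shape[1]
    # Blocks of at least 2 * n_cols rows guarantee each pass shrinks M.
    block_rows = max(block_rows, 2 * n_cols)

    # The number of passes is not fixed in advance; it depends on the row count.
    while M.shape[0] > block_rows:
        # Divide: each block is compressed independently, so this step
        # could be distributed (for example, as a map phase).
        pieces = [
            compress_block(M[start:start + block_rows])
            for start in range(0, M.shape[0], block_rows)
        ]
        # Merge: stack the compressed blocks as input to the next pass.
        M = np.vstack(pieces)

    # The remaining problem is small enough for a direct SVD-based solve.
    x, *_ = np.linalg.lstsq(M[:, :-1], M[:, -1], rcond=None)
    return x
```

For a tall matrix `A` and right-hand side `b`, `divide_and_merge_lstsq(A, b, block_rows=500)` should agree with `np.linalg.lstsq(A, b, rcond=None)[0]` up to floating-point error, while never taking the SVD of more than `block_rows` rows at once.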
Matrix decomposition, Least squares approximation, Complexity theory, Iterative methods, Approximation algorithms, Equations, Educational institutions, MapReduce, Linear system, matrix decomposition, error minimization, least squares solution, large-scale data set, batch SVD
Y. Peng, C. Yeh and S. Lee, "An Iterative Divide-and-Merge-Based Approach for Solving Large-Scale Least Squares Problems," in IEEE Transactions on Parallel & Distributed Systems, vol. 24, no. 3, pp. 428-438, 2013.