2012 41st International Conference on Parallel Processing (2012)
Pittsburgh, PA, USA
Sept. 10, 2012 to Sept. 13, 2012
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICPP.2012.43
We discuss efficient shared-memory parallelization of sparse matrix computations whose main traits resemble those of the sparse matrix-vector multiply operation. Such computations are difficult to parallelize because of their relatively small computational granularity, characterized by a small number of operations per data access. Our main application is a sparse matrix scaling algorithm which is more memory bound than the sparse matrix-vector multiplication operation. We take the application and parallelize it using standard OpenMP programming principles. Apart from the common constructs for avoiding race conditions, we do not reorganize the algorithm. Rather, we identify associated performance metrics and describe models to optimize them. By using these models, we implement parallel matrix scaling algorithms for two well-known sparse matrix storage formats. Experimental results show that simple parallelization attempts which leave data/work partitioning to the runtime scheduler can suffer from the overhead of avoiding race conditions, especially when the number of threads increases. The proposed algorithms outperform these simple attempts by optimizing the identified performance metrics and reducing this overhead.
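To make the race-condition overhead mentioned in the abstract concrete, the following is a minimal sketch of one sweep of a Sinkhorn-Knopp-style 1-norm scaling on a matrix stored in compressed sparse row (CSR) form. The abstract does not name the scaling algorithm or the two storage formats, so the choice of algorithm, the CSR layout, and every identifier here (`csr_t`, `scale_sweep`, `colsum`) are assumptions made for illustration, not the authors' implementation. The sketch shows the "simple parallelization attempt" pattern: the loop over rows is handed to the OpenMP runtime, the row sums are race-free, and the column sums need an atomic update because different rows touch the same column.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical CSR container; the names are illustrative only. */
typedef struct {
    int n;                /* number of rows and columns (square matrix) */
    const int *rowptr;    /* n + 1 offsets into colind/val */
    const int *colind;    /* column index of each nonzero */
    const double *val;    /* value of each nonzero */
} csr_t;

/* One sweep: update column scaling c from row scaling r, then update r from c.
 * colsum is caller-provided scratch space of length n. */
void scale_sweep(const csr_t *A, double *r, double *c, double *colsum)
{
    assert(A->n > 0);
    const int n = A->n;

    for (int j = 0; j < n; j++)
        colsum[j] = 0.0;

    /* Column sums computed with a row-wise loop: several rows may update
     * the same colsum entry, so each update is protected by an atomic. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
        for (int k = A->rowptr[i]; k < A->rowptr[i + 1]; k++) {
            double contrib = r[i] * fabs(A->val[k]);
            #pragma omp atomic
            colsum[A->colind[k]] += contrib;
        }
    }

    /* Each column scaling factor is written by exactly one iteration. */
    #pragma omp parallel for schedule(static)
    for (int j = 0; j < n; j++)
        if (colsum[j] > 0.0)
            c[j] = 1.0 / colsum[j];

    /* Row sums: each iteration owns its row, so no synchronization is needed. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int k = A->rowptr[i]; k < A->rowptr[i + 1]; k++)
            s += fabs(A->val[k]) * c[A->colind[k]];
        if (s > 0.0)
            r[i] = 1.0 / s;
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and the sweep runs serially with the same result. The per-nonzero atomic in the first loop is the kind of overhead that grows with the thread count; assigning rows and columns to threads explicitly, as the abstract's models aim to do, is one way to reduce how often such shared updates occur.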
Sparse matrices, Partitioning algorithms, Instruction sets, Arrays, Measurement, Vectors, Computational modeling, matrix scaling, Shared-memory parallelization, sparse matrices, hypergraphs
U. V. Catalyurek, K. Kaya and B. Ucar, "On Shared-Memory Parallelization of a Sparse Matrix Scaling Algorithm," 2012 41st International Conference on Parallel Processing (ICPP), Pittsburgh, PA, USA, 2012, pp. 68-77.