Issue No. 10 - Oct. (2013 vol. 24)
ISSN: 1045-9219
pp: 1930-1940
Vasileios Karakasis, National Technical University of Athens, Zografou
Theodoros Gkountouvas, National Technical University of Athens, Zografou
Kornilios Kourtis, ETH Zürich, Zürich
Georgios Goumas, National Technical University of Athens, Zografou
Nectarios Koziris, National Technical University of Athens, Zografou
Sparse matrix-vector multiplication (${\rm SpM}\times{\rm V}$) has been characterized as one of the most significant computational scientific kernels. The key algorithmic characteristic of the ${\rm SpM}\times{\rm V}$ kernel that inhibits it from achieving high performance is its very low flop:byte ratio. In this paper, we present a compressed storage format, called Compressed Sparse eXtended (CSX), that is able to detect and simultaneously encode multiple commonly encountered substructures inside a sparse matrix. Relying on aggressive compression of the sparse matrix's indexing structure, CSX considerably reduces the memory footprint of a sparse matrix, alleviating the pressure on the memory subsystem. On a diverse set of sparse matrices, CSX provided an average performance improvement of more than 40 percent over the standard CSR format on SMP architectures and more than 20 percent on NUMA systems, significantly outperforming other CSR alternatives. Additionally, it adapted successfully to the nonzero element structure of the considered matrices, exhibiting very stable performance. Finally, in the context of a "real-life" multiphysics simulation software, CSX accelerated the ${\rm SpM}\times{\rm V}$ component by nearly 40 percent and reduced the total solver time by approximately 15 percent.
Sparse matrices, Kernel, Encoding, Indexes, Optimization, Vectors, Computer architecture, data compression, Sparse Matrix-Vector Multiplication, multicore optimizations
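
For orientation, the baseline CSR kernel that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard CSR format and of why its flop:byte ratio is low, not an implementation of CSX. The function names `csr_spmv` and `flop_byte_ratio` are hypothetical, and the 8-byte value and 4-byte index sizes are assumptions chosen for the estimate.

```python
def csr_spmv(row_ptr, col_ind, values, x):
    """Compute y = A @ x for a matrix A stored in standard CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_ind[k] and values[k] give the column and value of nonzero k.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # One multiply and one add per nonzero, but each nonzero
            # also requires loading a value and a column index.
            acc += values[k] * x[col_ind[k]]
        y[i] = acc
    return y


def flop_byte_ratio(n_rows, nnz, value_bytes=8, index_bytes=4):
    """Rough flop:byte estimate counting only the matrix data streamed.

    Assumes (hypothetically) double-precision values and 32-bit indices;
    traffic for the x and y vectors is ignored, so this is an upper bound.
    """
    flops = 2 * nnz
    matrix_bytes = nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes
    return flops / matrix_bytes


# Small 3x3 example:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_ind = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_ind, values, x)
ratio = flop_byte_ratio(n_rows=3, nnz=5)
```

Under these assumptions each nonzero costs at least 12 bytes of matrix data for 2 floating-point operations, so the ratio stays below roughly 0.17 flops per byte. Shrinking the index data, which is the part the abstract says CSX compresses, is the available lever for raising that ratio.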

G. Goumas, V. Karakasis, T. Gkountouvas, N. Koziris and K. Kourtis, "An Extended Compression Format for the Optimization of Sparse Matrix-Vector Multiplication," in IEEE Transactions on Parallel & Distributed Systems, vol. 24, no. 10, pp. 1930-1940, 2013.