2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (2016)

Uppsala, Sweden

April 17, 2016 to April 19, 2016

ISBN: 978-1-5090-1952-6

pp: 46-56

Heiner Giefers, IBM Research - Zurich, Säumerstrasse 4, CH-8803 Rüschlikon, Switzerland

Peter Staar, IBM Research - Zurich, Säumerstrasse 4, CH-8803 Rüschlikon, Switzerland

Costas Bekas, IBM Research - Zurich, Säumerstrasse 4, CH-8803 Rüschlikon, Switzerland

Christoph Hagleitner, IBM Research - Zurich, Säumerstrasse 4, CH-8803 Rüschlikon, Switzerland

ABSTRACT

Hardware accelerators have evolved as the most prominent vehicle to meet the demanding performance and energy-efficiency constraints of modern computer systems. The prevalent type of hardware accelerator in the high-performance computing domain is the PCIe-attached co-processor, to which the CPU can offload compute-intensive tasks. In this paper, we analyze the performance, power, and energy efficiency of such accelerators for sparse matrix multiplication kernels. Improving the efficiency of sparse matrix operations is of eminent importance since they lie at the core of graph analytics algorithms, which are in turn key to many big data knowledge discovery workloads. Our study involves GPU, Xeon Phi, and FPGA co-processors to cover the vast majority of hardware accelerators applied in modern HPC systems. In order to compare the devices at the same level of implementation quality, we apply vendor-optimized libraries for which published results exist. From our experiments we deduce that none of the compared devices generally dominates in terms of energy efficiency and that the optimal solution depends on the actual sparse matrix data, the data transfer requirements, and the applied efficiency metric. We also show that a combined use of multiple accelerators can further improve the system's performance and efficiency by up to 11% and 18%, respectively.
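For readers unfamiliar with the kernel class discussed in the abstract, the following is a minimal plain-Python sketch of a sparse matrix multiplication and of one possible efficiency metric. It is an illustration only, not the vendor-optimized libraries used in the study. The CSR storage format, the sparse-times-dense product, the function names `csr_spmm` and `gflops_per_watt`, and the flop-counting convention are all assumptions; the abstract does not specify them.

```python
from typing import List, Sequence


def csr_spmm(
    values: Sequence[float],
    col_idx: Sequence[int],
    row_ptr: Sequence[int],
    dense: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Multiply a CSR sparse matrix by a dense matrix (illustrative sketch)."""
    n_rows = len(row_ptr) - 1
    n_cols_out = len(dense[0]) if dense else 0
    result = [[0.0] * n_cols_out for _ in range(n_rows)]
    for i in range(n_rows):
        # The nonzeros of row i are stored in positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a = values[k]
            b_row = dense[col_idx[k]]
            for j in range(n_cols_out):
                result[i][j] += a * b_row[j]
    return result


def gflops_per_watt(nnz: int, n_cols_out: int, seconds: float, avg_watts: float) -> float:
    """Performance per watt, assuming 2 flops (multiply and add) per nonzero per output column."""
    flops = 2.0 * nnz * n_cols_out
    return flops / seconds / 1e9 / avg_watts


# Hypothetical 3x3 example: A = [[1, 0, 2], [0, 0, 3], [4, 5, 0]] in CSR form.
A_VALUES = [1.0, 2.0, 3.0, 4.0, 5.0]
A_COL_IDX = [0, 2, 2, 0, 1]
A_ROW_PTR = [0, 2, 3, 5]
B_DENSE = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

C = csr_spmm(A_VALUES, A_COL_IDX, A_ROW_PTR, B_DENSE)
```

Performance per watt is only one candidate metric. Others, such as energy to solution or energy-delay product, weight runtime and power differently, which is one way in which the choice of metric can change which device looks best.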

INDEX TERMS

Sparse matrices, Field programmable gate arrays, Energy efficiency, Power demand, Power measurement, Graphics processing units

CITATION

H. Giefers, P. Staar, C. Bekas and C. Hagleitner, "Analyzing the energy-efficiency of sparse matrix multiplication on heterogeneous systems: A comparative study of GPU, Xeon Phi and FPGA," *2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)*, Uppsala, Sweden, 2016, pp. 46-56.

doi:10.1109/ISPASS.2016.7482073
