Analyzing communication models for distributed thread-collaborative processors in terms of energy and time
2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (2015)
Philadelphia, PA, USA
March 29, 2015 to March 31, 2015
Benjamin Klenk, University of Heidelberg, Institute of Computer Engineering, Heidelberg, Germany
Lena Oden, Fraunhofer Institute for Industrial Mathematics, Competence Center High Performance Computing, Kaiserslautern, Germany
Holger Froning, University of Heidelberg, Institute of Computer Engineering, Heidelberg, Germany
Accelerated computing has become pervasive for increasing computational power and energy efficiency in terms of GFLOPs/Watt. For application areas with the highest demands, for instance high-performance computing, data warehousing, and high-performance analytics, accelerators like GPUs or Intel's MICs are distributed throughout the cluster. Since current analyses and predictions show that data movement will be the main contributor to energy consumption, we are entering an era of communication-centric heterogeneous systems that operate under hard power constraints. In this work, we analyze data movement optimizations for distributed heterogeneous systems based on CPUs and GPUs. Thread-collaborative processors like GPUs differ significantly in their execution model from general-purpose processors like CPUs, but available communication models are still designed and optimized for CPUs. Similar to heterogeneity in processing, heterogeneity in communication can have a huge impact on energy and time. To analyze this impact, we use multiple workloads with distinct properties regarding computational intensity and communication characteristics. We show for which workloads tailored communication models are essential, not only reducing execution time but also saving energy. Exposing the impact in terms of energy and time for communication-centric heterogeneous systems is crucial for future optimizations, and this work is a first step in this direction.
Graphics processing units, Computational modeling, Instruction sets, Data transfer, Benchmark testing, Bandwidth
B. Klenk, L. Oden and H. Froning, "Analyzing communication models for distributed thread-collaborative processors in terms of energy and time," 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Philadelphia, PA, USA, 2015, pp. 318-327.