2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) (2016)
May 16, 2016 to May 19, 2016
The time spent in communication operations is a major factor in determining the scalability of parallel applications. Tuning the parameters of a communication library can adapt its characteristics to a particular platform and thereby minimize the communication time of an application. The goal of this paper is to improve the theoretical and practical understanding of how performance improvements of point-to-point operations propagate to collective communication operations. We derive formulas to determine the expected improvement of a collective operation based on the improvement observed for a point-to-point communication, using Hockney's model and the LogGP model. Our results indicate that many collective algorithms will inherently see a lower performance improvement than the one observed for point-to-point operations. For most test cases, our evaluation shows a good match between the predictions made by our models and the observed data, but it also identifies multiple reasons for potential disparity between theory and practice.
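To illustrate the kind of reasoning the abstract describes, the sketch below uses Hockney's model, in which a point-to-point transfer of m bytes costs alpha + beta * m. It is not the paper's derivation: the broadcast and allgather cost formulas are the common textbook ones, and every name and parameter value (`alpha`, `beta_old`, `beta_new`, `m`, `p`) is an assumption chosen for illustration. It assumes tuning halves only the per-byte cost beta and compares the relative improvement of a point-to-point message with that of two collective algorithms.

```python
import math

def hockney_p2p(m, alpha, beta):
    """Hockney's model: latency alpha plus per-byte cost beta times m bytes."""
    return alpha + beta * m

def binomial_bcast(m, p, alpha, beta):
    """Textbook binomial-tree broadcast: ceil(log2 p) rounds of full-size messages."""
    return math.ceil(math.log2(p)) * hockney_p2p(m, alpha, beta)

def ring_allgather(m, p, alpha, beta):
    """Textbook ring allgather: p - 1 steps, each forwarding a block of m / p bytes."""
    return (p - 1) * hockney_p2p(m / p, alpha, beta)

def improvement(before, after):
    """Relative reduction in time, e.g. 0.25 means 25% faster."""
    return 1.0 - after / before

def compare(m, p, alpha, beta_old, beta_new):
    """Relative improvement of each operation when beta_old is tuned to beta_new."""
    return {
        "p2p": improvement(hockney_p2p(m, alpha, beta_old),
                           hockney_p2p(m, alpha, beta_new)),
        "bcast": improvement(binomial_bcast(m, p, alpha, beta_old),
                             binomial_bcast(m, p, alpha, beta_new)),
        "allgather": improvement(ring_allgather(m, p, alpha, beta_old),
                                 ring_allgather(m, p, alpha, beta_new)),
    }

# Hypothetical values: 2 microseconds latency, per-byte cost halved by tuning,
# 1 MiB message, 16 processes.
result = compare(m=1 << 20, p=16, alpha=2e-6, beta_old=1e-9, beta_new=0.5e-9)
for name, gain in result.items():
    print(f"{name}: {gain:.4f}")
```

Under these assumptions the broadcast inherits the point-to-point improvement unchanged, because every round sends a full-size message, while the ring allgather gains less: its m / p byte blocks are more latency-bound, so a bandwidth-only improvement has a smaller effect on each step.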
Tuning, Libraries, Benchmark testing, Binary trees, Mathematical model, Algorithm design and analysis, Context modeling
S. Jha and E. Gabriel, "Impact and Limitations of Point-to-Point Performance on Collective Algorithms," 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Cartagena, Colombia, 2016, pp. 261-266.