High Performance Computing and Communication & IEEE International Conference on Embedded Software and Systems, IEEE International Conference on (2012)
Liverpool, United Kingdom
June 25, 2012 to June 27, 2012
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/HPCC.2012.24
UPC is designed to improve user productivity when programming distributed-memory machines. Yet its shared-memory abstraction also makes performance analysis hard, because it introduces extra overhead on local accesses and implicit communication on remote ones. As far as we know, there are no mature software utilities for systematic analysis and tuning of shared-memory access performance in UPC programs. We develop a mechanism to track shared-memory accesses and correlate them with UPC source lines, functions, and data structures. We then apply tool-assisted analysis to a set of UPC programs. For the NAS UPC benchmark, we achieve dramatic performance improvement over the unoptimized implementation, as well as speedups of up to two times over the fully hand-tuned implementation. We expect our approach to be effective in tuning a wide range of UPC programs.
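To make the idea of tracking shared-memory accesses per source line concrete, the following is a minimal sketch in plain C, not the authors' tool. It assumes UPC's standard block-cyclic layout, in which element i of an array declared with block size B has affinity to thread (i / B) mod THREADS, and it tallies local versus remote accesses for each access site. The names `access_profile`, `site_stats`, `record_access`, and `scan_all` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SITES 16

/* Hypothetical per-site counters; not the paper's data structures. */
typedef struct {
    int line;     /* source line of the access site */
    long local;   /* accesses to elements owned by the calling thread */
    long remote;  /* accesses to elements owned by another thread */
} site_stats;

typedef struct {
    site_stats sites[MAX_SITES];
    int count;
} access_profile;

/* Owner thread of element i under a block-cyclic layout with the
 * given block size, as in a UPC declaration "shared [block] T a[n]". */
static int owner_of(size_t i, size_t block, int threads) {
    assert(block > 0 && threads > 0);
    return (int)((i / block) % (size_t)threads);
}

/* Find the counters for a source line, creating them on first use.
 * Returns NULL when the fixed-size table is full. */
static site_stats *site_for_line(access_profile *p, int line) {
    for (int k = 0; k < p->count; k++)
        if (p->sites[k].line == line)
            return &p->sites[k];
    if (p->count == MAX_SITES)
        return NULL;
    site_stats *s = &p->sites[p->count++];
    s->line = line;
    s->local = 0;
    s->remote = 0;
    return s;
}

/* Classify one access to element i, made by thread mythread at the
 * given source line, as local or remote. */
static void record_access(access_profile *p, int line, size_t i,
                          size_t block, int threads, int mythread) {
    site_stats *s = site_for_line(p, line);
    if (s == NULL)
        return;
    if (owner_of(i, block, threads) == mythread)
        s->local++;
    else
        s->remote++;
}

/* Simulate one thread reading all n elements at a single source line. */
static void scan_all(access_profile *p, int line, size_t n,
                     size_t block, int threads, int mythread) {
    for (size_t i = 0; i < n; i++)
        record_access(p, line, i, block, threads, mythread);
}
```

With 16 elements, a block size of 4, and 4 threads, a full scan by thread 0 records 4 local and 12 remote accesses at that line. In a profile of this kind, a site dominated by remote accesses would suggest restructuring the communication, while a site with only local accesses would be a candidate for privatization, that is, accessing the data through an ordinary C pointer to avoid shared-pointer overhead.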
Runtime, Optimization, Arrays, Privatization, Electronics packaging, Tuning, performance tools, PGAS, UPC
Guojing Cong, Huifang Wen, Hiroki Murata, Yasushi Negishi, "Tool-assisted Optimization of Shared-memory Accesses in UPC Applications", High Performance Computing and Communication & IEEE International Conference on Embedded Software and Systems, IEEE International Conference on, pp. 104-111, 2012, doi:10.1109/HPCC.2012.24