16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007)
Sept. 15, 2007 to Sept. 19, 2007
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PACT.2007.53
Costin Iancu, Lawrence Berkeley National Laboratory, USA
Wei Chen, University of California at Berkeley, USA
Katherine Yelick, University of California at Berkeley, USA
As high-end computing systems continue to scale in CPU computational power and overall node count, optimization techniques that reduce communication overhead have proven important. Communication optimizations have been explored in the context of parallelizing compilers and data-parallel languages. Most of these studies were performed using MPI as the communication library, at a time when networks had relatively high latency and low bandwidth. As a result, most techniques concentrate on eliminating redundant messages and reducing message count through aggregation. Research on recent networks has shown that significant performance improvements can be achieved using fine-grained communication decomposition and overlap.
K. Yelick, C. Iancu and W. Chen, "Performance Portable Optimizations for Loops Containing Communication Operations," 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), Brasov, Romania, 2007, pp. 411.