Two important aspects have to be addressed when automatically parallelizing loop nests for massively parallel distributed-memory computers: maximizing parallelism and minimizing the communication overhead due to non-local data accesses. This paper studies the problem of finding a computation mapping and data distributions that minimize the number of remote data accesses for a given degree of parallelism. This problem is called the constant-degree parallelism alignment problem and is shown to be NP-hard. The algorithm presented uses a linear algebra framework and assumes affine data access functions. It proceeds by enumerating all interesting bases of the set of vectors representing the alignments between computation and data accesses that should be satisfied. A comparison with related work shows how the approach allows previous results to be expressed as special cases. The algorithm is applied to benchmark programs and shown to be superior to more basic mappings.
Citation:
Claude G. Diderich, Marc Gengler, "The Alignment Problem in a Linear Algebra Framework," 30th Hawaii International Conference on System Sciences (HICSS), Volume 1: Software Technology and Architecture, p. 586, 1997