Issue No. 05 - May (1998 vol. 9)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.679213
<p><b>Abstract</b>: With the objective of minimizing the total execution time of a parallel program on a distributed memory parallel computer, this paper discusses how to find an optimal supernode size and optimal supernode relative side lengths of a supernode transformation (also known as tiling). We identify three parameters of supernode transformation: supernode size, relative side lengths, and cutting hyperplane directions. For algorithms with perfectly nested loops and uniform dependencies, for sufficiently large supernodes and number of processors, and for the case where multiple supernodes are mapped to a single processor, we give an order <i>n</i> polynomial whose real positive roots include the optimal supernode size. For two special cases, 1) two-dimensional algorithm problems and 2) <i>n</i>-dimensional algorithm problems, where the communication cost is dominated by the startup penalty and, therefore, can be approximated by a constant, we give a closed form expression for the optimal supernode size, which is independent of the supernode relative side lengths and cutting hyperplanes. For the case where the algorithm iteration index space and the supernodes are hyperrectangular, we give closed form expressions for the optimal supernode relative side lengths. Our experiment shows a good match of the closed form expressions with experimental data.</p>
Index Terms: Supernode partitioning, tiling, parallelizing compilers, distributed memory multicomputer, minimizing running time.
E. Hodzic and W. Shang, "On Supernode Transformation with Minimized Total Running Time," in IEEE Transactions on Parallel & Distributed Systems, vol. 9, no. 5, pp. 417-428, 1998.