Issue No.09 - September (2003 vol.14)
Sanjay Rajopadhye, IEEE Computer Society
<p><b>Abstract</b>—For 2D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time on a parallel machine. We consider uniform dependency computations tiled so that (at least) one of the tile boundaries is parallel to the domain boundaries. We determine the optimal tile size as a <it>closed form solution</it>. In addition, we determine the <it>optimal number of processors</it> and also the <it>optimal slope</it> of the oblique tile boundary. Our results are based on the <scp>bsp</scp> model, which assures the portability of the results. Our predictions are justified on a sequence global alignment problem specialized to <it>similar</it> sequences using Fickett's k-band algorithm, for which our optimal semi-oblique tiling yields an improvement of a <it>factor</it> of 2.5 over orthogonal tiling. Our optimal solution requires a block-cyclic distribution of tiles to processors. The best one can obtain with only block distribution (as many authors require) is three <it>times</it> slower. Furthermore, our best running time is within 10 percent of the "predicted theoretical peak" performance of the machine!</p>
2D uniform recurrences, biological sequence alignment, BSP model, communication-computation granularity, distributed memory machines, locality, loop blocking, MPI, perfect loop nests, SPMD.
Rumen Andonov, Sanjay Rajopadhye, Nicola Yanev, "Optimal Semi-Oblique Tiling", IEEE Transactions on Parallel & Distributed Systems, vol. 14, no. 9, pp. 944-960, September 2003, doi:10.1109/TPDS.2003.1233716