Issue No.03 - March (2003 vol.14)
Larry Carter, IEEE
Karin Högstedt, IEEE
<p><b>Abstract</b>—Many computationally intensive programs, such as those for differential equations, spatial interpolation, and dynamic programming, spend a large portion of their execution time in multiply-nested loops that have a regular stencil of data dependences. Tiling is a well-known compiler optimization that improves performance on such loops, particularly for computers with a multileveled hierarchy of parallelism and memory. Most previous work on tiling is limited in at least one of the following ways: it handles only nested loops of depth two, only orthogonal tiling, or only rectangular tiles. In our work, we tile loop nests of arbitrary depth using polyhedral tiles. We derive a prediction formula for the execution time of such tiled loops, which can be used by a compiler to automatically determine the tiling parameters that minimize the execution time. We also explain the notion of <it>rise</it>, a measure of the relationship between the shape of the tiles and the shape of the iteration space generated by the loop nest. The rise is a powerful tool for predicting the execution time of a tiled loop. It allows us to reason about how the tiling affects the <it>length of the longest path of dependent tiles</it>, which is a measure of the execution time of a tiling. We use a model of the tiled iteration space that allows us to determine the length of the longest path of dependent tiles using linear programming. Using the rise, we derive a simple formula for the length of the longest path of dependent tiles in <it>rectilinear</it> iteration spaces, a subclass of the convex iteration spaces, and show how to choose the optimal tile shape.</p>
Tiling, blocking, compiler optimization, parallel compilers.
Larry Carter, Karin Högstedt, "On the Parallel Execution Time of Tiled Loops", IEEE Transactions on Parallel & Distributed Systems, vol.14, no. 3, pp. 307-321, March 2003, doi:10.1109/TPDS.2003.1189587