Issue No. 05 - May (2002 vol. 13)
<p>This paper introduces and evaluates an efficient algorithm for <i>loop partitioning</i>. We start from the results of Agarwal et al., whose aim is to minimize the number of data elements accessed during the computation of a tile; this number is called the cumulative footprint of the tile. We improve on these results in several directions. First, we derive a new formulation of the cumulative footprint that allows for an analytical solution of the corresponding optimization problem. Second, we deal with arbitrary parallelepiped-shaped tiles, as opposed to rectangular tiles only. We design an efficient heuristic to determine the optimal tile shape in this general setting, and we show its usefulness using both examples from the literature and a large collection of randomly generated data.</p>
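To make the notion of cumulative footprint concrete, the following is a minimal brute-force sketch for a two-dimensional loop nest. It is not the analytical formulation or the heuristic of the paper; it simply enumerates the iterations of one tile and counts the distinct array elements they touch. The function names, the restriction to references of the form `A[i + a][j + b]`, and the half-open tile convention are all assumptions made for illustration.

```python
from itertools import product


def tile_points(u, v):
    """Integer iteration points p = x*u + y*v with 0 <= x, y < 1.

    u and v are the integer edge vectors of a 2-D parallelepiped tile
    (hypothetical representation, chosen for this sketch).
    """
    det = u[0] * v[1] - u[1] * v[0]
    if det == 0:
        raise ValueError("tile edges must be linearly independent")
    sign = 1 if det > 0 else -1
    # Bounding box of the four tile corners.
    corners = [(0, 0), u, v, (u[0] + v[0], u[1] + v[1])]
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    points = []
    for p0, p1 in product(range(min(xs), max(xs) + 1),
                          range(min(ys), max(ys) + 1)):
        # Cramer's rule in integer arithmetic: x = x_num / |det|, y = y_num / |det|.
        x_num = sign * (p0 * v[1] - p1 * v[0])
        y_num = sign * (u[0] * p1 - u[1] * p0)
        if 0 <= x_num < abs(det) and 0 <= y_num < abs(det):
            points.append((p0, p1))
    return points


def cumulative_footprint(u, v, offsets):
    """Number of distinct elements of one array touched by a tile.

    Each (a, b) in offsets stands for a reference A[i + a][j + b]
    in the loop body (assumed form of the references).
    """
    touched = set()
    for i, j in tile_points(u, v):
        for a, b in offsets:
            touched.add((i + a, j + b))
    return len(touched)


# Loop body reading A[i][j] and A[i + 1][j + 1].
refs = [(0, 0), (1, 1)]
rectangular = cumulative_footprint((2, 0), (0, 2), refs)  # 7
skewed = cumulative_footprint((2, 0), (1, 2), refs)       # 6
```

Both tiles contain four iterations, yet the skewed tile touches fewer distinct elements for this pair of references, which illustrates why the tile shape, and not only its size, affects the footprint.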
Compilation technique, hierarchical memory systems, loop partitioning, tiling, cache, data locality, footprint, out-of-core algorithms.
Fabrice Rastello, Yves Robert, "Automatic Partitioning of Parallel Loops with Parallelepiped-Shaped Tiles", IEEE Transactions on Parallel & Distributed Systems, vol. 13, no. 5, pp. 460-470, May 2002, doi:10.1109/TPDS.2002.1003856