Issue No.11 - November (1996 vol.7)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.544356
<p><b>Abstract</b>: Most scientific and Digital Signal Processing (DSP) applications are recursive or iterative. Transformation techniques are usually applied to obtain optimal execution rates in parallel and/or pipelined systems. The retiming technique is a common and valuable transformation tool for one-dimensional problems, in which loops are represented by data flow graphs (DFGs). In this paper, uniform nested loops are modeled as multidimensional data flow graphs (MDFGs). Full parallelism of the loop body, i.e., all nodes in the MDFG executed in parallel, substantially decreases the overall computation time. It is well known that, for one-dimensional DFGs, retiming cannot always achieve full parallelism. Other existing optimization techniques for nested loops cannot always achieve full parallelism either. This paper proves an important and counterintuitive result: full parallelism can always be obtained for MDFGs with more than one dimension. This result is obtained by transforming the MDFG into a new structure. The restructuring process is based on a multidimensional retiming technique. The theory and two algorithms for obtaining full parallelism are presented in this paper. Examples of the optimization of nested loops and digital signal processing designs demonstrate the effectiveness of the algorithms.</p>
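To make the idea of multidimensional retiming concrete, the following minimal sketch applies a retiming to a small two-dimensional MDFG and checks that every edge ends up with a nonzero delay vector, which is the condition for all nodes of the loop body to run in parallel. The graph, the retiming values, and the function names are hypothetical and chosen only for illustration. The sketch assumes the convention that an edge u -> v with delay d(e) has retimed delay d(e) + r(u) - r(v), and the brute-force search for a schedule vector is a stand-in for a legality check, not one of the two algorithms presented in the paper.

```python
from itertools import product

# Hypothetical MDFG: each edge is (source, target, 2-D delay vector).
# Zero-delay edges are dependencies inside one iteration of the loop body.
EDGES = [
    ("A", "B", (0, 0)),
    ("B", "C", (0, 0)),
    ("C", "A", (0, 1)),
    ("A", "C", (1, 0)),
]

# Hypothetical retiming: one 2-D vector per node.
RETIMING = {"A": (2, 0), "B": (1, 0), "C": (0, 0)}


def retime(edges, r):
    """Return edges with delays d(e) + r(u) - r(v) (assumed convention)."""
    return [
        (u, v, tuple(d_k + r[u][k] - r[v][k] for k, d_k in enumerate(d)))
        for u, v, d in edges
    ]


def fully_parallel(edges):
    """True when no edge has an all-zero delay vector."""
    return all(any(c != 0 for c in d) for _, _, d in edges)


def find_schedule_vector(edges, bound=4):
    """Brute-force search for an integer s with s . d > 0 on every edge.

    Such an s orders the iterations so that every dependency points
    forward in time; returns None if nothing is found within the bound.
    """
    for s in product(range(-bound, bound + 1), repeat=2):
        if all(sum(a * b for a, b in zip(s, d)) > 0 for _, _, d in edges):
            return s
    return None


retimed_edges = retime(EDGES, RETIMING)
schedule = find_schedule_vector(retimed_edges)
```

In this example the original graph is not fully parallel because of its two zero-delay edges, while `fully_parallel(retimed_edges)` holds and `find_schedule_vector` returns a vector, so the retimed loop nest can still be executed in a legal order. The total delay around the cycle A -> B -> C -> A is the same before and after retiming, as expected of any retiming.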
Retiming, multidimensional data-flow graphs, instruction level parallelism, loop transformation, nested loops, VLIW, superscalar.
Nelson Luiz Passos, Edwin Hsing-Mean Sha, "Achieving Full Parallelism Using Multidimensional Retiming", IEEE Transactions on Parallel & Distributed Systems, vol. 7, no. 11, pp. 1150-1163, November 1996, doi:10.1109/71.544356