Issue No. 05, May 2001 (vol. 12)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.926166
<p><b>Abstract</b>: This paper presents a time stamp algorithm for runtime parallelization of general DOACROSS loops that have indirect access patterns. The algorithm follows the INSPECTOR/EXECUTOR scheme and exploits parallelism at a fine-grained memory reference level. It features a parallel inspector and improves upon previous algorithms of the same generality by exploiting parallelism among consecutive reads of the same memory element. Two variants of the algorithm are considered: One allows partially concurrent reads (PCR) and the other allows fully concurrent reads (FCR). Analyses of their time complexities derive a necessary condition with respect to the iteration workload for runtime parallelization. Experimental results for a Gaussian elimination loop, as well as an extensive set of synthetic loops on a 12-way SMP server, show that the time stamp algorithms outperform iteration-level parallelization techniques in most test cases and gain speedups over sequential execution for loops that have heavy iteration workloads. The PCR algorithm performs best because it makes a better trade-off between maximizing the parallelism and minimizing the analysis overhead. For loops with light or unknown iteration loads, an alternative speculative runtime parallelization technique is preferred.</p>
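The inspector/executor idea summarized in the abstract can be illustrated with a small sketch. This is not the paper's PCR or FCR algorithm: it is a simplified, sequential inspector that stamps whole iterations rather than individual memory references, written only to show how time stamps can let consecutive reads of the same element share a stage. The loop body `a[writes[i]] = a[reads[i]] + 1`, the function names `inspect`, `execute`, and `run_sequential`, and the index arrays are all assumptions made for this illustration.

```python
from collections import defaultdict


def inspect(reads, writes):
    """Assign a time stamp (stage number) to each iteration.

    Hypothetical simplified inspector: iteration i reads element reads[i]
    and writes element writes[i]. Reads of the same element with no
    write in between receive no ordering constraint from each other,
    so they may share a stage.
    """
    last_write = defaultdict(int)  # stage of the latest write to each element
    last_read = defaultdict(int)   # latest stage that read it since that write
    stamps = []
    for r, w in zip(reads, writes):
        # Flow dependence: read after the last write to r.
        # Output dependence: write after the last write to w.
        # Anti dependence: write after every pending read of w.
        stamp = 1 + max(last_write[r], last_write[w], last_read[w])
        stamps.append(stamp)
        last_read[r] = max(last_read[r], stamp)
        last_write[w] = stamp
        last_read[w] = 0  # the new write supersedes earlier reads of w
    return stamps


def execute(a, reads, writes, stamps):
    """Run the loop stage by stage; iterations within a stage are independent."""
    stages = defaultdict(list)
    for i, s in enumerate(stamps):
        stages[s].append(i)
    for s in sorted(stages):
        # Reverse order inside a stage stands in for parallel execution:
        # any order within a stage must give the same result.
        for i in reversed(stages[s]):
            a[writes[i]] = a[reads[i]] + 1
    return a


def run_sequential(a, reads, writes):
    """Reference result: the original loop in program order."""
    for r, w in zip(reads, writes):
        a[w] = a[r] + 1
    return a


reads = [0, 0, 1, 2, 2]
writes = [1, 2, 3, 0, 4]
stamps = inspect(reads, writes)
staged = execute([10, 20, 30, 40, 50], reads, writes, stamps)
sequential = run_sequential([10, 20, 30, 40, 50], reads, writes)
```

In this example, iterations 0 and 1 both read element 0 and are placed in the same stage, while iteration 3, which overwrites element 0, is pushed to a later stage. A real runtime system would dispatch each stage to multiple processors and, as the abstract notes, would also need the per-iteration workload to be heavy enough to amortize the inspection cost.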
Keywords: compiler, parallelizing compiler, runtime support, inspector-executor, doacross loop, dynamic dependence.
Cheng-Zhong Xu, Vipin Chaudhary, "Time Stamp Algorithms for Runtime Parallelization of DOACROSS Loops with Dynamic Dependences", IEEE Transactions on Parallel & Distributed Systems, vol. 12, no. 5, pp. 433-450, May 2001, doi:10.1109/71.926166