<p><b>Abstract</b>: It is extremely difficult to parallelize <it>DOACROSS</it> loops with nonuniform loop-carried dependences. In this paper, we present a static scheduling scheme, with an accompanying synchronization strategy, that executes such <it>DOACROSS</it> loops effectively and efficiently. Our approach builds on a parallelization technique called <it>Dependence Uniformization</it>, which finds a small set of uniform dependence vectors that covers all possible nonuniform dependences in a <it>DOACROSS</it> loop. It differs from previous schemes in how the uniform dependence vectors are selected: we demonstrate a better way to choose them. When used with the <it>Static Strip Scheduling</it> scheme, the proposed uniform dependence vector set allows dependences to be enforced with more locality, which considerably reduces the need for explicit synchronization while retaining most of the parallelism. This paper describes the uniform dependence vector selection strategy and the static strip scheduling scheme, and presents a performance analysis and examples.</p>
Compiler transformation, data dependence, loop parallelization, parallelism, scheduling, synchronization.
Ding-Kai Chen, Pen-Chung Yew, "On Effective Execution of Nonuniform DOACROSS Loops", IEEE Transactions on Parallel & Distributed Systems, vol. 7, pp. 463-476, May 1996, doi:10.1109/71.503771