Issue No. 06 - June (2002 vol. 13)
This paper presents a new compiler approach to minimizing the number of barriers executed in parallelized programs. A simple procedure is developed that reduces the complexity of barrier placement by eliminating certain data dependences without affecting optimality. An algorithm is presented that provably places the minimal number of barriers in perfect loop nests and in certain imperfect loop nest structures. This scheme is generalized to accept entire, well-structured control-flow programs containing arbitrary nesting of IF constructs, loops, and subroutines. It has been implemented in a prototype parallelizing compiler and applied to several well-known benchmarks, where it places significantly fewer synchronization points than existing techniques. Experiments on a 32-processor SGI Origin 2000 indicate that, on average, the number of barriers executed is reduced by 70 percent and execution time improves threefold.
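To make the barrier placement problem concrete, the following is a minimal sketch for the simplest possible case, a straight-line sequence of statements. It is not the algorithm from the paper, which handles loop nests and control flow. It uses the standard greedy interval-cover technique, and the function name `place_barriers` and the dependence encoding are assumptions made for illustration only.

```python
def place_barriers(dependences):
    """Return a minimal list of barrier positions for straight-line code.

    Hypothetical encoding (not from the paper): each dependence is a pair
    (src, dst) of statement indices with src < dst. A barrier at position p
    sits immediately before statement p and satisfies the dependence when
    src < p <= dst.
    """
    barriers = []
    last = None  # position of the most recently placed barrier
    # Process dependences in order of increasing sink statement.
    for src, dst in sorted(dependences, key=lambda d: d[1]):
        if last is None or last <= src:
            # No existing barrier lies between src and dst, so place one
            # as late as possible: immediately before the sink.
            last = dst
            barriers.append(dst)
    return barriers


if __name__ == "__main__":
    deps = [(0, 2), (1, 3), (4, 5), (2, 6)]
    print(place_barriers(deps))
```

Placing each barrier as late as possible lets it satisfy as many later dependences as possible, which is why the greedy choice is minimal in this restricted setting.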
Compiler optimization, synchronization reduction, efficient parallelization, barrier minimization, graph algorithms.
M. O'Boyle and E. Stöhr, "Compile Time Barrier Synchronization Minimization," in IEEE Transactions on Parallel & Distributed Systems, vol. 13, no. 6, pp. 529-543, 2002.