FASTEST: A Practical Low-Complexity Algorithm for Compile-Time Assignment of Parallel Programs to Multiprocessors
Issue No. 02 - February (1999 vol. 10)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.752781
<p><b>Abstract</b>: In the area of parallelizing compilers, considerable research has been carried out on data dependency analysis, parallelism extraction, and program and data partitioning. However, designing a practical, low-complexity scheduling algorithm without sacrificing performance remains a challenging problem. A variety of heuristics have been proposed to generate efficient solutions, but they take prohibitively long to run on moderate-size or large problems. In this paper, we propose an algorithm called FASTEST (<it>Fast Assignment and Scheduling of Tasks using an Efficient Search Technique</it>) that has <it>O</it>(<it>e</it>) time complexity, where <it>e</it> is the number of edges in the task graph. The algorithm first generates an initial solution in a short time and then refines it using a simple but robust random neighborhood search. We have also parallelized the search to further lower the time complexity. We are using the algorithm in a prototype automatic parallelization and scheduling tool that compiles sequential code and generates parallel code optimized with judicious scheduling. The proposed algorithm is evaluated with several application programs and outperforms a number of previous algorithms: it generates parallelized code with shorter execution times while requiring dramatically shorter scheduling times. The FASTEST algorithm generates optimal solutions for a majority of the test cases and close-to-optimal solutions for the rest.</p>
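The two-phase idea described in the abstract (a quick initial solution, then refinement by random neighborhood search) can be illustrated with a minimal sketch. This is not the FASTEST algorithm itself: the abstract does not give its details, and the sketch does not achieve O(e) time. The function names, the cost model (a communication cost is paid only when two dependent tasks sit on different processors), the greedy initial placement, and the accept-only-if-shorter rule are all assumptions made for illustration.

```python
import random


def topo_order(tasks, edges):
    """Return one topological order of the task graph (Kahn's algorithm)."""
    indeg = {t: 0 for t in tasks}
    succ = {t: [] for t in tasks}
    for (u, v) in edges:
        indeg[v] += 1
        succ[u].append(v)
    ready = [t for t in tasks if indeg[t] == 0]
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order


def schedule_length(order, cost, edges, assign, num_procs):
    """Schedule length when tasks run in `order` on their assigned processors.

    Assumed cost model: an edge (u, v) delays v by its communication cost
    only if u and v are placed on different processors.
    """
    preds = {}
    for (u, v), c in edges.items():
        preds.setdefault(v, []).append((u, c))
    proc_free = [0] * num_procs
    finish = {}
    for t in order:
        p = assign[t]
        data_ready = max(
            (finish[u] + (0 if assign[u] == p else c) for u, c in preds.get(t, [])),
            default=0,
        )
        finish[t] = max(proc_free[p], data_ready) + cost[t]
        proc_free[p] = finish[t]
    return max(finish.values())


def initial_assignment(order, cost, edges, num_procs):
    """Greedy initial solution: place each task where the partial schedule is shortest."""
    assign = {}
    for i, t in enumerate(order):
        prefix = order[: i + 1]
        assign[t] = min(
            range(num_procs),
            key=lambda p: schedule_length(prefix, cost, edges, {**assign, t: p}, num_procs),
        )
    return assign


def random_neighborhood_search(order, cost, edges, assign, num_procs, steps=200, seed=0):
    """Move one random task to a random processor; keep the move only if it helps."""
    rng = random.Random(seed)
    best = dict(assign)
    best_len = schedule_length(order, cost, edges, best, num_procs)
    for _ in range(steps):
        t = rng.choice(order)
        p = rng.randrange(num_procs)
        if p == best[t]:
            continue
        trial = {**best, t: p}
        trial_len = schedule_length(order, cost, edges, trial, num_procs)
        if trial_len < best_len:
            best, best_len = trial, trial_len
    return best, best_len


if __name__ == "__main__":
    # Hypothetical fork-join task graph: a -> {b, c} -> d, unit communication costs.
    cost = {"a": 2, "b": 3, "c": 3, "d": 2}
    edges = {("a", "b"): 1, ("a", "c"): 1, ("b", "d"): 1, ("c", "d"): 1}
    order = topo_order(list(cost), edges)
    start = initial_assignment(order, cost, edges, num_procs=2)
    best, best_len = random_neighborhood_search(order, cost, edges, start, num_procs=2)
    print(best, best_len)
```

Each search step here re-evaluates the whole schedule, which is simple but costs O(v + e) per move. A low-complexity method such as the one the abstract claims would need a cheaper way to evaluate moves.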
Automatic parallelization, compile-time scheduling, task graphs, multiprocessors, parallel processing, parallel programming tool, parallel algorithm, random neighborhood search.
Y. Kwok and I. Ahmad, "FASTEST: A Practical Low-Complexity Algorithm for Compile-Time Assignment of Parallel Programs to Multiprocessors," in IEEE Transactions on Parallel & Distributed Systems, vol. 10, no. 2, pp. 147-159, 1999.