Adaptively Scheduling Parallel Loops in Distributed Shared-Memory Systems
January 1997 (vol. 8 no. 1)
pp. 70-81

Abstract—Using runtime information about load distribution and processor affinity, we propose an adaptive scheduling algorithm and several variations of it that use different control mechanisms. The algorithm adjusts loop scheduling granularity with varying degrees of aggressiveness, aiming to improve the execution performance of parallel loops by making scheduling decisions that match the actual workload distribution at runtime. We experimentally compared our algorithm and its variations with several existing scheduling algorithms on two parallel machines: the KSR-1 and the Convex Exemplar. The kernel application programs used for performance evaluation were carefully selected to represent different classes of parallel loops. Our results show that using runtime information to adaptively adjust scheduling granularity is an effective way to handle loops with a wide range of load distributions when no prior knowledge of the execution can be used. The overhead of collecting runtime information is insignificant in comparison with the performance improvement. In our experiments, the adaptive algorithm and its five variations outperformed the existing scheduling algorithms.
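
The idea of adjusting scheduling granularity from runtime load information can be illustrated with a small simulation. The sketch below is not the algorithm from the article: it assumes an affinity-style scheme in which each processor owns a local block of iterations, takes 1/k of what remains locally, and steals from the most loaded processor when its own block is empty. The control rule in `adjust_k`, its bounds `k_min` and `k_max`, and all function names are hypothetical choices made for illustration only.

```python
import math


def chunk_size(remaining: int, k: int) -> int:
    """Take 1/k of the remaining iterations, and at least one if any remain."""
    if remaining <= 0:
        return 0
    return max(1, math.ceil(remaining / k))


def adjust_k(k: int, done: int, mean_done: float,
             k_min: int = 2, k_max: int = 64) -> int:
    """Hypothetical control rule (an assumption, not the paper's rule).

    A processor that has executed fewer iterations than average is treated
    as heavily loaded and takes smaller chunks (larger k), leaving more work
    available to steal. A processor ahead of average takes larger chunks.
    """
    if done < mean_done:
        return min(k * 2, k_max)
    if done > mean_done:
        return max(k // 2, k_min)
    return k


def simulate(costs, num_procs, adaptive=True):
    """Simulate one parallel loop; return (makespan, sorted executed indices)."""
    n = len(costs)
    bounds = [n * p // num_procs for p in range(num_procs + 1)]
    # Each local queue is a half-open range [lo, hi) of iteration indices.
    queues = [[bounds[p], bounds[p + 1]] for p in range(num_procs)]
    clock = [0.0] * num_procs      # simulated time per processor
    done = [0] * num_procs         # iterations executed per processor
    k = [num_procs] * num_procs    # per-processor granularity control
    executed = []

    while any(lo < hi for lo, hi in queues):
        # The processor that becomes idle first makes the next decision.
        p = min(range(num_procs), key=lambda i: clock[i])
        lo, hi = queues[p]
        if lo < hi:
            # Local work: take a chunk from the front of the own queue.
            size = chunk_size(hi - lo, k[p])
            start, stop = lo, lo + size
            queues[p][0] = stop
        else:
            # No local work: steal from the back of the most loaded queue.
            victim = max(range(num_procs),
                         key=lambda i: queues[i][1] - queues[i][0])
            vlo, vhi = queues[victim]
            size = chunk_size(vhi - vlo, num_procs)
            start, stop = vhi - size, vhi
            queues[victim][1] = start
        clock[p] += sum(costs[start:stop])
        done[p] += stop - start
        executed.extend(range(start, stop))
        if adaptive:
            k[p] = adjust_k(k[p], done[p], sum(done) / num_procs)

    return max(clock), sorted(executed)


if __name__ == "__main__":
    # Triangular workload: iteration i costs i + 1 time units.
    costs = [i + 1 for i in range(200)]
    for adaptive in (False, True):
        makespan, _ = simulate(costs, num_procs=4, adaptive=adaptive)
        print(f"adaptive={adaptive}: makespan={makespan}")
```

Running the script prints the simulated completion time with a fixed k and with the hypothetical adaptive rule. The numbers depend entirely on the assumed cost model and control rule, so they say nothing about the measurements reported in the article.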

Index Terms:
Adaptive scheduling algorithms, dynamic information, load balancing, parallel loops, processor affinity, shared-memory systems.
Yong Yan, Canming Jin, Xiaodong Zhang, "Adaptively Scheduling Parallel Loops in Distributed Shared-Memory Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 8, no. 1, pp. 70-81, Jan. 1997, doi:10.1109/71.569656