Low-Cost Task Scheduling for Distributed-Memory Machines
June 2002 (vol. 13, no. 6)
pp. 648-658

In compile-time task scheduling for distributed-memory systems, list scheduling is generally accepted as an attractive approach, since it pairs low cost with good results. List scheduling algorithms schedule tasks in order of their priority. This priority can be computed either 1) statically, before scheduling, or 2) dynamically, during scheduling. In this paper, we show that list scheduling with statically computed priorities can be performed at a significantly lower cost than existing approaches, without sacrificing performance. Our approach is general, i.e., it can be applied to any list scheduling algorithm with static priorities. The low complexity is achieved by using low-complexity methods for the most time-consuming parts of list scheduling algorithms, namely processor selection and task selection, while preserving the criteria used in the original algorithms. We exemplify our method by applying it to the MCP algorithm. Using an extension of this method, we can also reduce the time complexity of a particular class of list scheduling algorithms with dynamic priorities (including algorithms such as DLS, ETF, and ERT). Our results confirm that the modified versions of the list scheduling algorithms obtain performance comparable to their original versions, yet at a significantly lower cost. We also show that the modified algorithms consistently outperform multistep algorithms such as DSC-LLB, which have a higher complexity, and clearly outperform algorithms in the same complexity class, such as CPM.
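To make the scheme concrete, the following is a minimal Python sketch of list scheduling with static priorities on a task DAG. It uses bottom levels (the longest path from a task to an exit task, counting computation and communication costs) as the static priority, picks the ready task with the highest priority, and selects the processor giving the earliest finish time by scanning all processors. The exhaustive scans in task and processor selection are precisely the per-step costs that the paper's low-complexity methods reduce; the graph, cost values, and function names below are illustrative assumptions, not the authors' MCP implementation.

from collections import defaultdict

def bottom_levels(costs, succs, comm):
    # Static priority: longest path from each task to an exit task,
    # counting both computation and communication costs.
    bl = {}
    def visit(t):
        if t not in bl:
            bl[t] = costs[t] + max(
                (comm.get((t, s), 0) + visit(s) for s in succs[t]), default=0)
        return bl[t]
    for t in costs:
        visit(t)
    return bl

def list_schedule(costs, succs, comm, num_procs):
    preds = defaultdict(list)
    for t, ss in succs.items():
        for s in ss:
            preds[s].append(t)
    bl = bottom_levels(costs, succs, comm)
    ready = sorted((t for t in costs if not preds[t]), key=lambda u: -bl[u])
    proc_free = [0.0] * num_procs   # time at which each processor becomes idle
    start, finish, placed = {}, {}, {}
    while ready:
        # Task selection: the ready task with the highest static priority.
        t = ready.pop(0)
        # Processor selection: scan all processors for the earliest finish time.
        best = None
        for p in range(num_procs):
            # A message from a predecessor on another processor pays the edge cost.
            data_ready = max(
                (finish[q] + (comm.get((q, t), 0) if placed[q] != p else 0)
                 for q in preds[t]), default=0.0)
            st = max(proc_free[p], data_ready)
            if best is None or st + costs[t] < best[0]:
                best = (st + costs[t], p, st)
        finish[t], p, start[t] = best
        placed[t], proc_free[p] = p, finish[t]
        # A successor becomes ready once all of its predecessors are scheduled.
        for s in succs[t]:
            if all(q in finish for q in preds[s]):
                ready.append(s)
        ready.sort(key=lambda u: -bl[u])
    return placed, start, finish

# Example: a small fork-join graph on two processors.
costs = {"a": 2, "b": 3, "c": 1, "d": 2}                    # computation cost per task
succs = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}  # DAG edges
comm  = {("a", "b"): 1, ("a", "c"): 1, ("b", "d"): 1, ("c", "d"): 1}
print(list_schedule(costs, succs, comm, 2))

In this sketch, each scheduling step costs O(P) for the processor scan and up to O(T log T) for reordering the ready list; lowering these per-step costs while keeping the same selection criteria is the subject of the paper.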

Index Terms:
Compile-time task scheduling, list scheduling, dataflow graphs, distributed-memory multiprocessors.
Citation:
Andrei Radulescu, Arjan J.C. van Gemund, "Low-Cost Task Scheduling for Distributed-Memory Machines," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 6, pp. 648-658, June 2002, doi:10.1109/TPDS.2002.1011417