DSC: Scheduling Parallel Tasks on an Unbounded Number of Processors
September 1994 (vol. 5 no. 9)
pp. 951-967

We present a low-complexity heuristic, named the dominant sequence clustering algorithm (DSC), for scheduling parallel tasks on an unbounded number of completely connected processors. The performance of DSC is, on average, comparable to, or even better than, other higher-complexity algorithms. We assume no task duplication and nonzero communication overhead between processors. Finding the optimum solution for arbitrary directed acyclic task graphs (DAG's) is NP-complete. DSC finds optimal schedules for special classes of DAG's, such as fork, join, coarse-grain trees, and some fine-grain trees. It guarantees a performance within a factor of 2 of the optimum for general coarse-grain DAG's. We compare DSC with three higher-complexity general scheduling algorithms: the ETF by J.J. Hwang, Y.C. Chow, F.D. Anger, and C.Y. Lee (1989); V. Sarkar's (1989) clustering algorithm; and the MD by M.Y. Wu and D. Gajski (1990). We also give a sample of important practical applications where DSC has been found useful.

[1] M. A. Al-Mouhamed, "Lower bound on the number of processors and time for scheduling precedence graphs with communication costs," IEEE Trans. Software Eng., vol. 16, pp. 1390-1401, 1990.
[2] F. D. Anger, J. Hwang, and Y. Chow, "Scheduling with sufficient loosely coupled processors," J. Parallel and Distributed Comput., vol. 9, pp. 87-92, 1990.
[3] P. Chretienne, "Task scheduling over distributed memory machines," in Proc. Int. Workshop Parallel Distrib. Algorithms, 1989.
[4] P. Chretienne, "A polynomial algorithm to optimally schedule tasks over an ideal distributed system under tree-like precedence constraints," European J. Oper. Res., vol. 2, pp. 225-230, 1989.
[5] P. Chretienne, "Complexity of tree scheduling with interprocessor communication delays," Tech. Rep. M.A.S.I. 90.5, Universite Pierre et Marie Curie, 1990.
[6] J. Y. Colin and P. Chretienne, "C. P. M. scheduling with small communication delays and task duplication," Oper. Res., vol. 39, no. 4, pp. 680-684, 1991.
[7] M. Cosnard, M. Marrakchi, Y. Robert, and D. Trystram, "Parallel Gaussian elimination on an MIMD computer," Parallel Computing, vol. 6, pp. 275-296, 1988.
[8] A. Gerasoulis and T. Yang, "On the granularity and clustering of directed acyclic task graphs," IEEE Trans. Parallel Distrib. Syst., vol. 4, pp. 686-701, June 1993.
[9] A. Gerasoulis and T. Yang, "A comparison of clustering heuristics for scheduling DAG's on multiprocessors," J. Parallel Distrib. Computing, vol. 16, pp. 276-291, Dec. 1992.
[10] M. Girkar and C. Polychronopoulos, "Partitioning programs for parallel execution," in Proc. 1988 ACM Int. Conf. Supercomputing, 1988.
[11] J.-J. Hwang, Y.-C. Chow, F. D. Anger, and C. Y. Lee, "Scheduling precedence graphs in systems with interprocessor communication times," SIAM Computing, pp. 244-257, Apr. 1988.
[12] S. J. Kim and J. C. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Int. Conf. Parallel Processing, vol. 3, pp. 1-8, 1988.
[13] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, Jan. 1988, pp. 23-32.
[14] C. McCreary and H. Gill, "Automatic determination of grain size for efficient parallel processing," Commun. ACM, vol. 32, pp. 1073-1078, Sept. 1989.
[15] C. Papadimitriou and M. Yannakakis, "Toward an architecture-independent analysis of parallel algorithms," SIAM J. Comput., vol. 19, no. 2, pp. 322-328, Apr. 1990.
[16] R. Pozo, "Performance modeling of sparse matrix methods for distributed memory architectures," in Lecture Notes in Computer Science 634, Parallel Processing: CONPAR 92-VAPP V. New York: Springer-Verlag, 1992, pp. 677-688.
[17] V. Sarkar, Partitioning and Scheduling Parallel Programs for Multiprocessing, MIT Press, 1989.
[18] R. Wolski and J. Feo, "Program partitioning for NUMA multiprocessor computer systems," J. Parallel Distrib. Computing (special issue on performance of supercomputers), vol. 19, pp. 203-218, 1993.
[19] M. Y. Wu and D. Gajski, "Hypertool: A programming aid for message-passing systems," IEEE Trans. Parallel Distrib. Syst., vol. 1, pp. 330-343, 1990.
[20] M. Y. Wu, personal commun., Feb. 1993.
[21] J. Yang, L. Bic, and A. Nicolau, "A mapping strategy for MIMD computers," Proc. 1991 Int. Conf. Parallel Processing, vol. I, pp. 102-109.
[22] T. Yang and A. Gerasoulis, "A fast static scheduling algorithm for DAG's on an unbounded number of processors," in Proc. IEEE Supercomputing '91, IEEE, Albuquerque, NM, Nov. 1991, pp. 633-642.
[23] T. Yang and A. Gerasoulis, "Pyrros: Static Task Scheduling and Code Generation for Message-Passing Multiprocessors," Proc. 6th ACM Int'l Conf. Supercomputing, ACM Press, New York, 1992, pp. 428-443.

Index Terms:
scheduling; directed graphs; computational complexity; parallel algorithms; trees (mathematics); parallel programming; DSC; parallel task scheduling; low-complexity heuristic; dominant sequence clustering algorithm; completely connected processor; unbounded number; performance; nonzero communication overhead; arbitrary directed acyclic task graphs; DAGs; NP-complete; optimal schedules; special classes; fork; join; coarse-grain trees; fine-grain trees; general scheduling algorithms; ETF; MD
T. Yang, A. Gerasoulis, "DSC: Scheduling Parallel Tasks on an Unbounded Number of Processors," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 9, pp. 951-967, Sept. 1994, doi:10.1109/71.308533