An Improved Duplication Strategy for Scheduling Precedence Constrained Graphs in Multiprocessor Systems
June 2003 (vol. 14 no. 6)
pp. 533-544

Abstract: Scheduling precedence-constrained task graphs, with or without duplication, is one of the most challenging NP-complete problems in parallel and distributed computing systems. Duplication heuristics are, in general, more effective for fine-grain task graphs and for networks with high communication latencies. However, most available duplication algorithms assume an unbounded supply of fully connected processors and have high complexity. Low-complexity optimal duplication algorithms work only under restricted cost and/or shape parameters for the task graphs. Further, the number of processors they require grows significantly with the task-graph size. An improved duplication strategy is proposed that works for arbitrary task graphs with a limited number of interconnection-constrained processors. Unlike most other algorithms, which replicate all possible parents/ancestors of a given task, the proposed algorithm tends to avoid redundant duplications and duplicates nodes selectively, only when doing so improves performance. This results in fewer duplications and lower time and space complexity. Simulation results are presented for a clique and an interconnection-constrained network topology, using random and regular benchmark task-graph suites that represent a variety of parallel numerical applications. Performance, in terms of normalized schedule length and efficiency, is compared with that of several well-known and recently proposed algorithms. The suggested algorithm turns out to be the most efficient: it generates better or comparable schedules with remarkably lower processor consumption.
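The idea of duplicating a parent task only when it helps can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the abstract does not give. It assumes a single parent, a simple cost model, and hypothetical names (`proc_ready`, `parent_finish`, `comm_cost`, `parent_cost`, `parent_inputs_ready`) chosen for illustration only.

```python
def start_via_message(proc_ready, parent_finish, comm_cost):
    """Child start time if the parent's result arrives from another processor."""
    return max(proc_ready, parent_finish + comm_cost)


def start_via_duplicate(proc_ready, parent_cost, parent_inputs_ready=0):
    """Child start time if the parent is re-executed on the child's processor.

    Assumption: the duplicate can start once the processor is free and the
    parent's own inputs are available locally (time `parent_inputs_ready`).
    """
    return max(proc_ready, parent_inputs_ready) + parent_cost


def should_duplicate(proc_ready, parent_finish, comm_cost, parent_cost,
                     parent_inputs_ready=0):
    """Duplicate only if it strictly lowers the child's start time."""
    remote = start_via_message(proc_ready, parent_finish, comm_cost)
    local = start_via_duplicate(proc_ready, parent_cost, parent_inputs_ready)
    return local < remote


# High communication latency relative to computation: duplication pays off.
print(should_duplicate(proc_ready=0, parent_finish=2, comm_cost=5, parent_cost=2))

# Busy processor and cheap communication: the duplicate would be redundant.
print(should_duplicate(proc_ready=4, parent_finish=2, comm_cost=1, parent_cost=2))
```

Under this toy model the first case favors duplication because the message would arrive later than a local re-execution finishes, which matches the abstract's remark that duplication suits fine-grain graphs and high-latency networks. The second case skips it, since the copy would only occupy the processor without letting the child start earlier.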


Index Terms:
Algorithm, distributed computing, interconnection network, multiprocessor scheduling.
Savina Bansal, Padam Kumar, Kuldip Singh, "An Improved Duplication Strategy for Scheduling Precedence Constrained Graphs in Multiprocessor Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 6, pp. 533-544, June 2003, doi:10.1109/TPDS.2003.1206502