Issue No.06 - June (2009 vol.20)
pp: 857-871
Füsun Özgüner, The Ohio State University, Columbus
Doruk Bozdağ, The Ohio State University, Columbus
ABSTRACT
Many DAG scheduling algorithms generate schedules that require a prohibitively large number of processors. To address this problem, we propose a generic algorithm, SC, to minimize the processor requirement of any given valid schedule. SC preserves the length of the original schedule and reduces the processor count by merging processor schedules and removing redundant duplicate tasks. To the best of our knowledge, this is the first algorithm to address this largely unexplored aspect of DAG scheduling. On average, SC reduced the processor requirement by 91, 82, and 72 percent for schedules generated by the PLW, TCSD, and CPFD algorithms, respectively. The SC algorithm has a low complexity, $O(\vert\mathcal{N}\vert^3)$, compared to most duplication-based algorithms. Moreover, it decouples processor economization from the schedule length minimization problem. To take advantage of these features of SC, we also propose a scheduling algorithm, SDS, which has the same time complexity as SC. Our experiments demonstrate that schedules generated by SDS are only 3 percent longer than those of CPFD, an $O(\vert\mathcal{N}\vert^4)$ algorithm that is one of the best in that respect. SDS and SC together form a two-stage scheduling algorithm that produces high-quality schedules with a low processor requirement, at lower complexity than comparable algorithms that produce similarly high-quality results.
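The compaction idea in the abstract (merge processor schedules, drop redundant duplicates, keep the schedule length) can be illustrated with a small sketch. This is not the SC algorithm itself, whose details are not given here; it is a greedy first-fit stand-in written for illustration. The data layout, the names `overlaps`, `compact`, and `length`, and the toy schedule are all hypothetical. The sketch also assumes the usual model in which communication between tasks on the same processor is free, so that merging two processors without changing any start time cannot invalidate the schedule.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (task name, start time, finish time).
Slot = Tuple[str, float, float]


def overlaps(a: List[Slot], b: List[Slot]) -> bool:
    """True if any slot in a intersects any slot in b in time."""
    return any(s1 < f2 and s2 < f1 for _, s1, f1 in a for _, s2, f2 in b)


def compact(schedule: Dict[str, List[Slot]]) -> List[List[Slot]]:
    """Greedy first-fit merge of processor schedules (illustrative only).

    Start times are never changed, so the schedule length is preserved.
    """
    merged: List[List[Slot]] = []
    for proc in sorted(schedule):
        slots = schedule[proc]
        for target in merged:
            if not overlaps(target, slots):
                target.extend(slots)  # reuse an existing processor
                break
        else:
            merged.append(list(slots))  # no fit: keep a separate processor

    # After merging, a later copy of a task on the same processor is
    # redundant: the earlier copy already makes its result available.
    result: List[List[Slot]] = []
    for slots in merged:
        seen = set()
        kept: List[Slot] = []
        for slot in sorted(slots, key=lambda s: s[1]):
            if slot[0] not in seen:
                seen.add(slot[0])
                kept.append(slot)
        result.append(kept)
    return result


def length(procs) -> float:
    """Schedule length: the latest finish time over all processors."""
    return max(f for slots in procs for _, _, f in slots)


# Toy schedule with task "a" duplicated on three processors.
schedule = {
    "P0": [("a", 0, 2), ("b", 2, 5)],
    "P1": [("a", 0, 2), ("c", 2, 4)],
    "P2": [("d", 5, 8)],
    "P3": [("a", 4, 6), ("e", 6, 8)],
}
compacted = compact(schedule)
```

In this toy input, four processors collapse to two and the copy of `a` at time 4 is dropped, while the latest finish time stays at 8. A real compaction pass would also need to reason about which duplicates remain necessary across processors, which this sketch does not attempt.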
INDEX TERMS
Scheduling and task partitioning, task duplication, algorithms, multiprocessor systems.
CITATION
Füsun Özgüner, Doruk Bozdağ, "Compaction of Schedules and a Two-Stage Approach for Duplication-Based DAG Scheduling", IEEE Transactions on Parallel & Distributed Systems, vol.20, no. 6, pp. 857-871, June 2009, doi:10.1109/TPDS.2008.260