Toward a Theory for Scheduling Dags in Internet-Based Computing
June 2006 (vol. 55 no. 6)
pp. 757-768
Conceptual and algorithmic tools are developed as a foundation for a theory of scheduling complex computation-dags for Internet-based computing. The goal of the schedules produced is to render tasks eligible for allocation to remote clients (hence, for execution) at the maximum possible rate. This allows one to utilize remote clients well and to lessen the likelihood of the “gridlock” that ensues when a computation stalls for lack of eligible tasks. Earlier work has introduced a formalism for studying this optimization problem and has identified optimal schedules for several significant families of structurally uniform dags. The current paper extends this work via a methodology for devising optimal schedules for a much broader class of complex dags, which are obtained via composition from a prespecified collection of simple building-block dags. The paper provides a suite of algorithms that decompose a given dag $\mathcal{G}$ to expose its building blocks and an execution-priority relation $\triangleright$ on building blocks. When the building blocks are appropriately interrelated under $\triangleright$, the algorithms specify an optimal schedule for $\mathcal{G}$.
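
The scheduling goal stated in the abstract can be made concrete with a small sketch. It is not the paper's decomposition algorithm. It assumes one reading of "eligible at the maximum possible rate": a schedule is optimal if, after every step, the number of eligible, not-yet-executed tasks is as large as any valid schedule could achieve at that step. The function names, the `parents` dictionary encoding, and the example dag are all hypothetical, and the exhaustive search is only practical for very small dags.

```python
from itertools import permutations


def eligibility_profile(parents, schedule):
    """Count the eligible, unexecuted tasks after each step of a schedule.

    `parents` maps each task to the list of tasks it depends on
    (an assumed encoding of the dag, not taken from the paper).
    """
    executed = set()
    profile = []
    for task in schedule:
        if any(p not in executed for p in parents[task]):
            raise ValueError(f"{task!r} executed before its parents")
        executed.add(task)
        eligible = [
            t for t in parents
            if t not in executed and all(p in executed for p in parents[t])
        ]
        profile.append(len(eligible))
    return profile


def valid_schedules(parents):
    """Yield every ordering of the tasks that respects the dependencies."""
    for order in permutations(parents):
        seen = set()
        for task in order:
            if any(p not in seen for p in parents[task]):
                break
            seen.add(task)
        else:
            yield list(order)


def stepwise_optimal_schedule(parents):
    """Return a schedule that maximizes eligibility at every step, or None.

    Brute force: compares each schedule's profile with the per-step maxima
    taken over all valid schedules. Such a schedule need not exist.
    """
    schedules = list(valid_schedules(parents))
    profiles = [eligibility_profile(parents, s) for s in schedules]
    best = [max(column) for column in zip(*profiles)]
    for schedule, profile in zip(schedules, profiles):
        if profile == best:
            return schedule
    return None


# Hypothetical five-task dag: a -> c, a -> d, b -> e.
example = {"a": [], "b": [], "c": ["a"], "d": ["a"], "e": ["b"]}
print(eligibility_profile(example, ["a", "b", "c", "d", "e"]))
print(eligibility_profile(example, ["b", "a", "e", "c", "d"]))
print(stepwise_optimal_schedule(example))
```

In this example, executing `a` first exposes two new tasks while `b` exposes only one, so the schedule that starts with `a` keeps more tasks available for allocation early on. The paper's contribution is to obtain such schedules from the dag's building blocks rather than by exhaustive search.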


Index Terms:
Internet-based computing, grid computing, global computing, Web computing, scheduling dags, dag decomposition, theory.
Grzegorz Malewicz, Arnold L. Rosenberg, Matthew Yurkewych, "Toward a Theory for Scheduling Dags in Internet-Based Computing," IEEE Transactions on Computers, vol. 55, no. 6, pp. 757-768, June 2006, doi:10.1109/TC.2006.91