Matching and Scheduling Algorithms for Minimizing Execution Time and Failure Probability of Applications in Heterogeneous Computing
March 2002 (vol. 13 no. 3)
pp. 308-323

In a heterogeneous distributed computing system, machine and network failures are inevitable and can adversely affect the applications executing on it. To reduce the effect of failures on an application running on a failure-prone system, matching and scheduling algorithms must be devised that minimize not only the execution time but also the probability of failure of the application. However, because these requirements conflict, both objectives cannot be minimized at the same time. The goal of this paper is therefore to develop matching and scheduling algorithms that account for both the execution time and the reliability of the application. This goal is achieved by modifying an existing matching and scheduling algorithm: the reliability of resources is taken into account through an incremental cost function proposed in this paper, and the new algorithm is referred to as the reliable dynamic level scheduling algorithm. The incremental cost function can be defined based on any one of the three cost functions developed here. These cost functions are unique in that they are restricted neither to tree-based networks nor to a specific matching and scheduling algorithm. The simulation results confirm that the proposed incremental cost function can be incorporated into matching and scheduling algorithms to produce schedules in which the effect of machine and network failures on the execution of the application is reduced and the execution time of the application is minimized as well.
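The conflict between execution time and failure probability described in the abstract can be illustrated with a small sketch. It is not the paper's cost function or the reliable dynamic level scheduling algorithm itself. It assumes, purely for illustration, that each machine fails at a constant rate (an exponential failure model) and that a weighted reliability penalty is added to the execution time when choosing a machine. The names `failure_probability`, `pick_machine`, `weight`, and the machine labels are hypothetical.

```python
import math


def failure_probability(assignments, failure_rate):
    """Probability that at least one machine fails while it is busy.

    Assumption (not from the abstract): machine m fails at a constant
    rate failure_rate[m], so it survives a busy period t with
    probability exp(-failure_rate[m] * t), independently of the others.
    assignments is a list of (machine, busy_time) pairs.
    """
    cost = sum(failure_rate[m] * t for m, t in assignments)
    return 1.0 - math.exp(-cost)


def pick_machine(exec_time, failure_rate, weight):
    """Pick the machine with the lowest time-plus-reliability score.

    The score is exec_time[m] + weight * failure_rate[m] * exec_time[m].
    With weight = 0 only execution time matters; a larger weight favors
    more reliable machines. This scoring rule is a hypothetical stand-in
    for an incremental cost term.
    """
    def score(m):
        return exec_time[m] + weight * failure_rate[m] * exec_time[m]

    return min(exec_time, key=score)


# Hypothetical data: "fast" finishes sooner but fails more often.
exec_time = {"fast": 10.0, "safe": 12.0}
failure_rate = {"fast": 0.05, "safe": 0.001}

time_only = pick_machine(exec_time, failure_rate, weight=0.0)
reliability_aware = pick_machine(exec_time, failure_rate, weight=10.0)
```

With these made-up numbers, the time-only choice is the faster machine, while the reliability-aware choice accepts a longer execution time in exchange for a lower failure probability, which is the trade-off the abstract refers to.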


Index Terms:
matching and scheduling, precedence-constrained tasks, heterogeneous computing, reliability, articulation points and bridges, DLS algorithm
Citation:
A. Dogan, F. Özgüner, "Matching and Scheduling Algorithms for Minimizing Execution Time and Failure Probability of Applications in Heterogeneous Computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 308-323, March 2002, doi:10.1109/71.993209