Managing Faults for Distributed Workflows over Grids
March/April 2010 (vol. 14 no. 2)
pp. 84-88
Onyeka Ezenwoye, South Dakota State University
M. Brian Blake, University of Notre Dame
Gargi Dasgupta, IBM Research
S. Masoud Sadjadi, Florida International University
Selim Kalayci, Florida International University
Liana L. Fong, IBM T.J. Watson Research Center
Grid applications composed of multiple, distributed jobs are a common target for Web-scale workflows. Workflows over grid infrastructures are inherently complicated because they must both functionally assure the entire process and coordinate the underlying tasks. These applications are often long-running, so fault tolerance becomes a significant concern. Transparency is a vital aspect of understanding fault tolerance in these environments.

Index Terms:
grid applications, scientific workflow, service-oriented computing, Web-Scale Workflow
Citation:
Onyeka Ezenwoye, M. Brian Blake, Gargi Dasgupta, S. Masoud Sadjadi, Selim Kalayci, Liana L. Fong, "Managing Faults for Distributed Workflows over Grids," IEEE Internet Computing, vol. 14, no. 2, pp. 84-88, March-April 2010, doi:10.1109/MIC.2010.47