2010 13th IEEE International Conference on Computational Science and Engineering (2010)
Hong Kong, China
Dec. 11, 2010 to Dec. 13, 2010
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CSE.2010.59
Scientific workflows generally involve the distribution of tasks to distributed resources, which may exist in different administrative domains. The use of distributed resources in this way may lead to faults, and detecting, identifying and subsequently correcting them remains an important research challenge. We introduce a fault taxonomy for scientific workflows that may help in conducting a systematic analysis of faults, so that the potential faults that may arise at execution time can be corrected (recovered from). The presented taxonomy is motivated by previous work, but has a particular focus on workflow environments (compared to previous work, which focused on Grid-based resource management), and is demonstrated through its use in Weka4WS.
Scientific Workflows, Fault Tolerance
M. Lackovic, J. A. Bañares, O. F. Rana, R. Tolosana-Calasanz and D. Talia, "A Taxonomy for the Analysis of Scientific Workflow Faults," 2010 13th IEEE International Conference on Computational Science and Engineering (CSE), Hong Kong, China, 2010, pp. 398-403.