Cluster Computing and the Grid, IEEE International Symposium on (2009)
Shanghai, China
May 18, 2009 to May 21, 2009
ISBN: 978-0-7695-3622-4
pp: 580-585
Workflow management systems are widely used in wide area network environments, especially in e-Science application scenarios, to coordinate the operation of different functional components and to provide more powerful functions. The error-prone nature of wide area networks makes fault tolerance an increasingly urgent requirement for workflow management. In this paper, we propose Cesar-FD, a stateful fault detection mechanism that builds up state about the runtime and external environments of a workflow management system by aggregating multiple messages, and that delivers more accurate notifications asynchronously. We demonstrate the mechanism in the Drug Discovery Grid environment through two use cases, and show that it detects faulty situations more accurately.
Stateful fault detection, Complex Event Processing, Compensation mechanism, Drug Discovery Grid

