2008 Seventh IEEE International Symposium on Network Computing and Applications
Adaptive Checkpoint Replication for Supporting the Fault Tolerance of Applications in the Grid
July 10-July 12
ISBN: 978-0-7695-3192-2
A major challenge in a dynamic Grid with thousands of machines connected to each other is fault tolerance. The more resources and components involved, the more complicated and error-prone the system becomes. Migol is an adaptive Grid middleware, which addresses the fault tolerance of Grid applications and services by providing the capability to recover applications from checkpoint files automatically. A critical aspect for an automatic recovery is the availability of checkpoint files: if a resource becomes unavailable, it is very likely that the associated storage is also unreachable, e.g. due to a network partition. A strategy to increase the availability of checkpoints is replication. In this paper, we present the Checkpoint Replication Service. A key feature of this service is the ability to automatically replicate and monitor checkpoints in the Grid.
Index Terms:
Grid Computing, Checkpointing, Replication
Citation:
Andre Luckow, Bettina Schnor, "Adaptive Checkpoint Replication for Supporting the Fault Tolerance of Applications in the Grid," nca, pp.299-306, 2008 Seventh IEEE International Symposium on Network Computing and Applications, 2008