<p><b>Abstract</b>—In this paper, we consider the problem of constructing consistent global checkpoints that contain a given set of checkpoints. We address three important issues related to this problem. First, we define the maximum and minimum consistent global checkpoints containing a set <i>S</i>, and give algorithms to construct them. These algorithms are based on reachability analysis on a rollback-dependency graph. Second, we introduce a concept called "rollback-dependency trackability" that enables this analysis to be performed efficiently for a certain class of checkpoint and communication models. We define the least stringent of these models ("FDAS"), and put it in context with other models defined in the literature. Significant in this is a way to use FDAS to provide efficient rollback recovery for applications that do not satisfy perfect piecewise determinism. Finally, we describe several applications of the theorems and algorithms derived in this paper to demonstrate the capability of our approach to unify, generalize, and extend many previous works.</p>
Algorithms, distributed systems, consistent global states, distributed debugging, deadlock recovery, fault tolerance, checkpointing, rollback recovery, message logging, vector timestamps.

Y. Wang, "Consistent Global Checkpoints that Contain a Given Set of Local Checkpoints," in IEEE Transactions on Computers, vol. 46, no. , pp. 456-468, 1997.