Theoretical Analysis for Communication-Induced Checkpointing Protocols with Rollback-Dependency Trackability
Issue No. 10 - October (1998 vol. 9)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.730526
<p><b>Abstract</b>: <i>Rollback-Dependency Trackability</i> (RDT) is a property that states that all rollback dependencies between local checkpoints are on-line trackable by using a transitive dependency vector. In this paper, we address three fundamental issues in the design of communication-induced checkpointing protocols that ensure RDT. First, we prove that the following intuition commonly assumed in the literature is in fact false: If a protocol forces a checkpoint only at a stronger condition, then it must take, at most, as many forced checkpoints as a protocol based on a weaker condition. This result implies that the common approach of sharpening the checkpoint-inducing condition by piggybacking more control information on each message may not always yield a more efficient protocol. Next, we prove that there is no optimal on-line RDT protocol that takes fewer forced checkpoints than any other RDT protocol for all possible communication patterns. Finally, since comparing checkpoint-inducing conditions is not sufficient for comparing protocol performance, we present some formal techniques for comparing the performance of several existing RDT protocols.</p>
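To make the terms in the abstract concrete, the following is a minimal sketch of a transitive dependency vector together with one possible checkpoint-inducing condition. It is an illustration only and is not taken from the paper: the class `Process`, its method names, and the three-process scenario are all hypothetical. The condition used here (force a checkpoint before delivery when the process has already sent a message in the current interval and the incoming vector carries a new dependency) is assumed to follow the general idea of the fixed-dependency-after-send style of protocol from the checkpointing literature; the abstract itself does not specify any particular condition.

```python
class Process:
    """Hypothetical process that tracks dependencies with a transitive vector."""

    def __init__(self, pid, n):
        self.pid = pid
        self.dep = [0] * n          # dep[j]: highest interval of process j we depend on
        self.sent_in_interval = False
        self.checkpoints = []       # dependency vector saved with each checkpoint
        self.forced = 0             # number of forced checkpoints taken
        self.take_checkpoint()      # initial checkpoint

    def take_checkpoint(self, forced=False):
        # Save the current vector with the checkpoint, then open a new interval.
        self.checkpoints.append(list(self.dep))
        self.dep[self.pid] += 1
        self.sent_in_interval = False
        if forced:
            self.forced += 1

    def send(self):
        # Piggyback a copy of the dependency vector on the outgoing message.
        self.sent_in_interval = True
        return list(self.dep)

    def receive(self, piggyback):
        # Assumed checkpoint-inducing condition (illustrative, see lead-in).
        brings_new_dep = any(m > d for m, d in zip(piggyback, self.dep))
        if self.sent_in_interval and brings_new_dep:
            self.take_checkpoint(forced=True)
        # Transitive update: component-wise maximum.
        self.dep = [max(m, d) for m, d in zip(piggyback, self.dep)]


# Hypothetical scenario: messages travel p0 -> p1 -> p2 -> p0.
p0, p1, p2 = (Process(i, 3) for i in range(3))
p1.receive(p0.send())
p2.receive(p1.send())
p0.receive(p2.send())   # p0 already sent in this interval, so a checkpoint is forced
```

In this scenario only `p0` takes a forced checkpoint, because it is the only process that receives a new dependency after having sent a message in the same interval. Counting forced checkpoints per communication pattern in this way is the kind of comparison the abstract argues cannot be settled by comparing conditions alone.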
Rollback-dependency trackability, on-line algorithms, checkpointing, rollback recovery, communication-induced protocols.
Jichiang Tsai, Sy-Yen Kuo, Yi-Min Wang, "Theoretical Analysis for Communication-Induced Checkpointing Protocols with Rollback-Dependency Trackability", IEEE Transactions on Parallel & Distributed Systems, vol. 9, no. 10, pp. 963-971, October 1998, doi:10.1109/71.730526