Issue No.02 - February (1999 vol.10)
pp: 181-192
ABSTRACT
This paper presents an index-based checkpointing algorithm for distributed systems that aims to reduce the total number of checkpoints while ensuring that each checkpoint belongs to at least one consistent global checkpoint (or recovery line). The algorithm is based on an equivalence relation defined between pairs of successive checkpoints of a process, which allows us, in some cases, to advance the recovery line of the computation without forcing checkpoints in other processes. The algorithm is well suited to autonomous and heterogeneous environments, where no process knows any private information about other processes, and where private information of the same type (e.g., clock granularity or local checkpointing strategy) is unrelated across distinct processes. We also present a simulation study that compares the checkpointing-recovery overhead of this algorithm with that of previous solutions.
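For readers unfamiliar with index-based checkpointing, the following is a minimal sketch of the classic baseline rule that this family of algorithms builds on, not the algorithm of this paper. Each process keeps an index for its latest checkpoint, piggybacks it on outgoing messages, and takes a forced checkpoint before delivering a message that carries a larger index. The equivalence relation described in the abstract, which lets a process skip some of these forced checkpoints, is not reproduced here. The names `Process`, `sn`, `basic_checkpoint`, `send`, and `receive` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical process that tags each checkpoint with an index."""
    pid: int
    sn: int = 0  # index (sequence number) of the latest local checkpoint
    checkpoints: list = field(default_factory=list)  # (index, kind) pairs

    def __post_init__(self):
        # Every process starts from an initial checkpoint with index 0.
        self.checkpoints.append((self.sn, "initial"))

    def basic_checkpoint(self):
        """Checkpoint taken by the local (private) strategy: bump the index."""
        self.sn += 1
        self.checkpoints.append((self.sn, "basic"))

    def send(self):
        """Piggyback the current index on every outgoing message."""
        return {"src": self.pid, "sn": self.sn}

    def receive(self, msg):
        """Take a forced checkpoint before delivery if the sender is ahead."""
        if msg["sn"] > self.sn:
            self.sn = msg["sn"]
            self.checkpoints.append((self.sn, "forced"))
        # The message would be delivered to the application here.


p, q = Process(0), Process(1)
p.basic_checkpoint()   # p's index becomes 1
q.receive(p.send())    # q is behind, so it takes a forced checkpoint
q.receive(p.send())    # indices now match, so no extra checkpoint is taken
```

In this sketch every basic checkpoint in one process can force a checkpoint in each process it later communicates with; reducing that cost is the motivation stated in the abstract.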
INDEX TERMS
Checkpointing, causal dependency, protocols, timestamp management, global snapshot, fault tolerance, rollback-recovery, distributed systems, performance evaluation.
CITATION
Roberto Baldoni, Francesco Quaglia, Paolo Fornara, "An Index-Based Checkpointing Algorithm for Autonomous Distributed Systems", IEEE Transactions on Parallel & Distributed Systems, vol. 10, no. 2, pp. 181-192, February 1999, doi:10.1109/71.752783