ABSTRACT
<p>An approach to checkpointing and rollback recovery in a distributed computing system using a common time base is proposed. A common time base is established in the system using a hardware clock synchronization algorithm. This common time base is coupled with the idea of pseudo-recovery points to develop a checkpointing algorithm that has the following advantages: a shorter wait for commitment when establishing recovery lines, fewer messages exchanged, and lower memory requirements. These advantages are assessed quantitatively by developing a probabilistic model.</p>
INDEX TERMS
message exchange; distributed system; checkpointing; rollback recovery; common time base; hardware clock synchronization algorithm; pseudo-recovery points; recovery lines; memory requirement; probabilistic model; distributed processing; fault tolerant computing; system recovery
CITATION
P. Ramanathan, K.G. Shin, "Use of Common Time Base for Checkpointing and Rollback Recovery in a Distributed System", IEEE Transactions on Software Engineering, vol. 19, no. , pp. 571-583, June 1993, doi:10.1109/32.232022