Issue No. 02 - March (1992 vol. 3)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.127264
A rollback recovery scheme for distributed systems is proposed. The state-save synchronization among processes is implemented by bounding clock drifts such that no state-save synchronization messages are required. Since the clocks are only loosely synchronized, the synchronization overhead can be negligible in many applications. An interprocess communication protocol which encodes state-save progress information within message frames is introduced to checkpoint consistent system states. A rollback recovery algorithm that will force a minimum number of nodes to roll back after failures is developed.
Index Terms: distributed systems; loosely synchronized clocks; clock drifts; state-save synchronization messages; interprocess communication protocol; encodes; state-save progress information; message frames; consistent system states; rollback recovery algorithm; distributed processing; programming theory; protocols
Z. Tong, W. Tsai and R. Kain, "Rollback Recovery in Distributed Systems Using Loosely Synchronized Clocks," in IEEE Transactions on Parallel & Distributed Systems, vol. 3, no. 2, pp. 246-251, 1992.
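
The abstract describes encoding state-save progress information within message frames so that no separate synchronization messages are needed. The sketch below illustrates that general idea only. It is not the paper's protocol, and every name in it (Frame, Process, tick, send, receive, the period parameter) is hypothetical. It assumes each process saves state once per fixed period of its own local clock, stamps every outgoing frame with its current state-save interval number, and, on receiving a frame stamped with a later interval, saves its state before delivering the payload.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    payload: str
    interval: int  # sender's state-save interval at send time (piggybacked)


@dataclass
class Process:
    name: str
    period: float                 # assumed state-save period, in local clock units
    interval: int = 0             # index of the current state-save interval
    state: list = field(default_factory=list)
    checkpoints: dict = field(default_factory=dict)

    def __post_init__(self):
        self.checkpoints[0] = []  # the initial state counts as checkpoint 0

    def _save_until(self, target):
        # Take state saves until the local interval index reaches `target`.
        while self.interval < target:
            self.interval += 1
            self.checkpoints[self.interval] = list(self.state)

    def tick(self, local_time):
        # Periodic state save driven only by this process's own clock.
        self._save_until(int(local_time // self.period))

    def send(self, payload):
        # Encode state-save progress in the frame itself.
        return Frame(payload, self.interval)

    def receive(self, frame):
        # If the sender is already in a later interval, save state first,
        # so the payload is not recorded in an earlier checkpoint.
        self._save_until(frame.interval)
        self.state.append(frame.payload)


p = Process("P", period=10.0)
q = Process("Q", period=10.0)
p.tick(10.2)            # P's clock has crossed the first period boundary
m = p.send("x")         # frame carries interval 1
q.receive(m)            # Q's clock lags, so the frame triggers Q's state save
q.tick(10.1)            # Q's own timer fires later and finds nothing left to do
```

In this toy run, Q's checkpoint 1 is taken before "x" is delivered, so a message sent after P's checkpoint 1 is never recorded as received in Q's checkpoint 1. Rollback itself, message logging, and the clock-drift bound the paper relies on are outside the scope of this sketch.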