Fourth International Workshop on Assurance in Distributed Systems and Networks (ADSN) (ICDCSW'05)
State Checksum and Its Role in System Stabilization
Columbus, Ohio, USA
June 6-10, 2005
ISBN: 0-7695-2328-5
Although a self-stabilizing system that suffers a transient fault is guaranteed to converge to a legitimate state in a finite number of steps, convergence can be slow if the harmful effects of the fault are allowed to propagate to many processes in the system. Moreover, some safety properties of the system may be violated during convergence. To address these problems, this paper proposes the concept of a state checksum: redundancy added to the state of a self-stabilizing system so that some classes of faults become visible to the system. The system can then limit the propagation of their harmful effects and maintain its safety properties during convergence. To make these concepts concrete, we present a case study of a token ring and show how fault-detecting and fault-correcting checksums can be used to detect visible faults, limit the propagation of their harmful effects, and ensure that the safety properties of the ring are maintained during convergence from these faults.
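The abstract does not give the checksum construction, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a process state that is a single integer, a hypothetical checksum based on CRC-32, and a hypothetical rule that a process ignores a neighbor whose state fails verification. The names ProcessState, checksum, and read_neighbor are all assumptions introduced for illustration.

```python
import zlib
from dataclasses import dataclass


def checksum(value: int) -> int:
    """Hypothetical checksum: CRC-32 over the decimal encoding of the value."""
    return zlib.crc32(str(value).encode("ascii"))


@dataclass
class ProcessState:
    value: int   # the ordinary state of the process (assumed to be one integer)
    check: int   # redundancy stored alongside the state

    @classmethod
    def make(cls, value: int) -> "ProcessState":
        """Build a state whose checksum matches its value."""
        return cls(value, checksum(value))

    def is_visible_fault(self) -> bool:
        """A fault is visible when the stored checksum no longer matches the value."""
        return self.check != checksum(self.value)


def read_neighbor(own: ProcessState, neighbor: ProcessState) -> ProcessState:
    """Adopt the neighbor's value only if it verifies (assumed containment rule).

    If the neighbor's state is visibly faulty, keep our own state so the
    corrupted value does not propagate further.
    """
    if neighbor.is_visible_fault():
        return own
    return ProcessState.make(neighbor.value)


if __name__ == "__main__":
    left = ProcessState.make(3)
    right = ProcessState.make(2)

    left.value = 9  # simulated transient fault: value changes, checksum does not
    print(left.is_visible_fault())           # the mismatch exposes the fault
    print(read_neighbor(right, left).value)  # right keeps its own value
```

In this sketch, a fault that corrupts the value and the checksum consistently would go undetected, which matches the abstract's wording that only some classes of faults become visible.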
Citation:
Chin-Tser Huang, Mohamed G. Gouda, "State Checksum and Its Role in System Stabilization," ICDCSW, vol. 1, pp. 29-34, Fourth International Workshop on Assurance in Distributed Systems and Networks (ADSN) (ICDCSW'05), 2005.