2011 IEEE Workshops of International Conference on Advanced Information Networking and Applications (2011)
Mar. 22, 2011 to Mar. 25, 2011
Self-stabilizing systems, intended to run for a long time, commonly have to cope with transient faults during their mission. We model the behavior of a distributed self-stabilizing system under such a fault model as a Markov chain. Adding fault detection to a self-correcting non-masking fault-tolerant system is required to progress from non-masking systems towards their masking fault-tolerant functional equivalents. We introduce a novel measure, called limiting window availability (LWA), and apply it to self-stabilizing systems in order to quantify the probability of (masked) stabilization against the time that is needed for stabilization. We show how to calculate LWA based on Markov chains: first, by straightforward Markov chain modeling and second, by using a suitable abstraction that results in a space-reduced Markov chain. The proposed abstraction can in particular be applied to spot fault tolerance bottlenecks in the system design.
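The trade-off between stabilization probability and stabilization time can be illustrated with a toy Markov chain. The sketch below is not the paper's definition of LWA or its model: the three-state transition matrix `P`, the legal-state set, and the helper names `stationary` and `window_probability` are all assumptions made for illustration. It takes the limiting (stationary) distribution of the chain under recurring transient faults and asks how likely the system is to have reached a legal state within a window of `w` steps.

```python
# Illustrative sketch only: the chain, the legal set, and the "window"
# quantity computed here are assumptions, not the paper's exact LWA.

def step(dist, P):
    """Advance a probability distribution by one step of the chain P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def stationary(P, iters=10_000):
    """Approximate the limiting distribution by power iteration."""
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(iters):
        dist = step(dist, P)
    return dist

def window_probability(P, legal, w):
    """Probability of having visited a legal state within w steps,
    starting from the limiting distribution of P."""
    n = len(P)
    # Make legal states absorbing so that probability mass which has
    # reached them once is counted as "stabilized within the window".
    Q = [
        [1.0 if i == j else 0.0 for j in range(n)] if i in legal else list(P[i])
        for i in range(n)
    ]
    dist = stationary(P)
    for _ in range(w):
        dist = step(dist, Q)
    return sum(dist[s] for s in legal)

# Hypothetical chain: state 0 is legal, states 1 and 2 are one and two
# faults away from it. Rows are current states, columns are next states.
P = [
    [0.9, 0.1, 0.0],
    [0.5, 0.4, 0.1],
    [0.0, 0.5, 0.5],
]

if __name__ == "__main__":
    for w in range(6):
        print(w, round(window_probability(P, {0}, w), 4))
```

Under these assumptions the printed probability grows with the window size `w` and approaches 1, which mirrors the idea of trading temporal redundancy for a higher degree of masking.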
Distributed Algorithms, Fault Tolerance, Masking, Non-masking, Self-Stabilization
O. Theel and N. Müllner, "The Degree of Masking Fault Tolerance vs. Temporal Redundancy," 2011 IEEE Workshops of International Conference on Advanced Information Networking and Applications (WAINA), Biopolis, Singapore, 2011, pp. 21-28.