14th International Conference on Distributed Computing Systems (1994)
Poznan, Poland
June 21, 1994 to June 24, 1994
ISBN: 0-8186-5840-1
pp: 536-543
J.-F. Paris, Dept. of Comput. Sci., Houston Univ., TX, USA
We propose a highly available replication control protocol tailored to environments where network partitions are always the result of a gateway failure. Our protocol divides nodes holding replicas into local nodes, which can communicate directly with each other, and non-local nodes, which communicate with other nodes through one or more gateways. While local nodes are assumed to remain up to date as long as they do not crash, non-local nodes are required to maintain a volatile witness on the same network segment as the local nodes and must poll this witness before answering any user request. To speed up recovery from a total failure, each site maintains a list of replicas that were available the last time the data were updated or a replica recovered from a crash. Markov models are used to compare the performance of our protocol with that of the dynamic-linear voting protocol (DLV), which is the best replication control protocol tolerating communication failures. We also observe that volatile witness placement has a strong impact on data availability and that gateway nodes are the best location for these witnesses.
distributed databases, software reliability, system recovery, fault tolerant computing, protocols, storage management, network operating systems, data integrity
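
The polling rule stated in the abstract (local nodes answer directly, non-local nodes must consult a volatile witness first) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not say what the witness stores, so the sketch assumes it holds only the latest version number in volatile memory. The names `VolatileWitness`, `Replica`, `read`, and the `reachable` flag are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VolatileWitness:
    """Hypothetical witness: assumed to keep only a version number, in volatile memory."""
    version: int = 0
    reachable: bool = True  # False models a failed gateway or a crashed witness

    def poll(self) -> Optional[int]:
        # Return the latest version seen, or None when the witness cannot be reached.
        return self.version if self.reachable else None


@dataclass
class Replica:
    value: str
    version: int
    is_local: bool  # True if on the same network segment as the other local nodes


def read(replica: Replica, witness: VolatileWitness) -> Optional[str]:
    """Answer a read request, or return None when the replica must refuse."""
    if replica.is_local:
        # Local nodes are assumed up to date as long as they have not crashed.
        return replica.value
    latest = witness.poll()
    if latest is None or latest != replica.version:
        # Witness unreachable or copy stale (assumed check): do not answer.
        return None
    return replica.value
```

In this sketch a non-local replica refuses to answer whenever the witness is unreachable, which mirrors the abstract's requirement that the witness be polled before any user request is served. How the actual protocol handles writes, recovery, and the list of last-available replicas is not covered here.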

