VELOS: A New Approach for Efficiently Achieving High Availability in Partitioned Distributed Systems
April 1996 (vol. 8 no. 2)
pp. 305-321

Abstract: This work presents VELOS, a new protocol for tolerating partitionings in distributed systems with replicated data. Our primary goals were shaped by efficiency and availability constraints. The protocol achieves optimal availability, according to a well-known metric, while ensuring one-copy serializability. VELOS is also designed to reduce the cost of achieving high availability. We have developed mechanisms through which transactions, in the absence of failures, can access replicated data objects with shorter delays than under related protocols, while imposing smaller loads on the network and the servers. Furthermore, VELOS offers high availability without relying on system transactions that must execute to restore availability when failures and recoveries occur. Such system transactions typically access all (replicas of all) data objects, and thus introduce significant delays to user transactions and consume large quantities of resources such as network bandwidth and CPU cycles. We therefore offer our protocol as proof that high availability can be achieved inexpensively.

Index Terms:
Availability, concurrency control, distributed computing, partitionings, recovery, replication, transactions.
Peter Triantafillou, David J. Taylor, "VELOS: A New Approach for Efficiently Achieving High Availability in Partitioned Distributed Systems," IEEE Transactions on Knowledge and Data Engineering, vol. 8, no. 2, pp. 305-321, April 1996, doi:10.1109/69.494168