Epidemic Algorithms for Replicated Databases
September/October 2003 (vol. 15 no. 5)
pp. 1218-1238

Abstract—We present a family of epidemic algorithms for maintaining replicated database systems. The algorithms are based on the causal delivery of log records where each record corresponds to one transaction instead of one operation. The first algorithm in this family is a pessimistic protocol that ensures serializability and guarantees strict executions. Since we expect the epidemic algorithms to be used in environments with low probability of conflicts among transactions, we develop a variant of the pessimistic algorithm which is optimistic in that transactions commit as soon as they terminate locally and inconsistencies are detected asynchronously as the effects of committed transactions propagate through the system. The last member of the family of epidemic algorithms is pessimistic and uses voting with quorums to resolve conflicts and improve transaction response time. A simulation study evaluates the performance of the protocols.
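
To make the central idea of the abstract concrete, here is a minimal sketch of causal delivery of log records in which each record stands for a whole transaction rather than a single operation. It is an illustration under stated assumptions, not the protocol from the paper: the names `LogRecord`, `Site`, `run_transaction`, and `receive` are hypothetical, causality is tracked with plain vector timestamps, and conflict detection, serializability checks, and quorum voting are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogRecord:
    """One record per transaction (hypothetical layout, not the paper's)."""
    origin: int    # site where the transaction executed
    ts: tuple      # vector timestamp taken when the record was created
    writes: tuple  # ((key, value), ...) write set of the whole transaction


@dataclass
class Site:
    site_id: int
    n_sites: int
    clock: list = field(default_factory=list)  # vector clock of applied records
    log: list = field(default_factory=list)    # records applied at this site
    data: dict = field(default_factory=dict)   # local database copy

    def __post_init__(self):
        self.clock = [0] * self.n_sites

    def run_transaction(self, writes):
        """Execute a local transaction and log it as a single record."""
        self.clock[self.site_id] += 1
        rec = LogRecord(self.site_id, tuple(self.clock), tuple(writes.items()))
        self.log.append(rec)
        self.data.update(writes)
        return rec

    def _deliverable(self, rec):
        # Standard causal-delivery test: the record is the next one from its
        # origin, and everything it depends on has already been applied here.
        return all(
            t == self.clock[k] + 1 if k == rec.origin else t <= self.clock[k]
            for k, t in enumerate(rec.ts)
        )

    def receive(self, records):
        """Apply gossiped records in causal order; return those still blocked."""
        # Skip records this site has already applied.
        pending = [r for r in records if r.ts[r.origin] > self.clock[r.origin]]
        progress = True
        while pending and progress:
            progress = False
            for rec in list(pending):
                if self._deliverable(rec):
                    self.data.update(dict(rec.writes))
                    self.clock[rec.origin] = rec.ts[rec.origin]
                    self.log.append(rec)
                    pending.remove(rec)
                    progress = True
        return pending
```

In this sketch, an epidemic exchange is simply `receiver.receive(sender.log)`. A record that arrives before the records it causally depends on is returned as blocked and is applied only once a later exchange supplies the missing predecessors.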

Index Terms:
Database replication, distributed databases, epidemic communication.
JoAnne Holliday, Robert Steinke, Divyakant Agrawal, Amr El Abbadi, "Epidemic Algorithms for Replicated Databases," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 5, pp. 1218-1238, Sept.-Oct. 2003, doi:10.1109/TKDE.2003.1232274