Performance Modeling of Distributed and Replicated Databases
July/August 2000 (vol. 12 no. 4)
pp. 645-672

Abstract—This paper surveys performance models for distributed and replicated database systems. Over the last 20 years, a variety of such performance models have been developed and they differ in 1) which aspects of a real system are or are not captured in the model (e.g., replication, communication, nonuniform data access, etc.) and 2) how these aspects are modeled. We classify the different alternatives and modeling assumptions and discuss their interdependencies and expressiveness for the representation of distributed databases. This leads to a set of building blocks for analytical performance models. To illustrate the work that is surveyed, we select a combination of these proven modeling concepts and give an example of how to compose a balanced analytical model of a replicated database. We use this example to show how to derive meaningful performance values and to discuss the applicability and expressiveness of performance models for distributed and replicated databases. Finally, we compare the analytical results to measurements in a distributed database system.

Index Terms:
Performance models, distributed databases, replication, interdatabase communication, modeling assumptions, queueing theory, measurements, benchmarks.
Citation:
Matthias Nicola, Matthias Jarke, "Performance Modeling of Distributed and Replicated Databases," IEEE Transactions on Knowledge and Data Engineering, vol. 12, no. 4, pp. 645-672, July-Aug. 2000, doi:10.1109/69.868912