Measuring the Effects of Data Distribution Models on Performance Evaluation of Distributed Database Systems
December 1989 (vol. 1 no. 4)
pp. 494-507

This paper investigates how simplistic assumptions about data distribution and replication in a system affect performance measures, and how they affect the computational complexity and accuracy of evaluating those measures. The size of the participating node set of a transaction is chosen as the performance measure. A data distribution and replication model is represented by four key parameters, and probabilistic analysis is used to evaluate six of these models. The conclusion is that even though some of the data distribution and replication models appear simplistic, the results obtained from them are very close to those from complex models. In addition, the drastically reduced execution times strongly suggest using simple models, at least in the early stages of the design process.
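
To make the performance measure concrete, the following is a minimal sketch of the kind of comparison the abstract describes: a closed-form probabilistic estimate of the participating node set size set against a slower simulation of the same quantity. It is not one of the paper's six models. It assumes, purely for illustration, that each of the `k_items` data items a transaction touches resides on one node chosen independently and uniformly at random from `n_nodes` nodes, with no replication; the function and parameter names are hypothetical.

```python
import random


def expected_nodes_closed_form(n_nodes, k_items):
    """Expected number of distinct nodes touched by a transaction.

    Illustrative assumption (not from the paper): each of the k_items
    accessed items lives on a single node picked independently and
    uniformly at random. A given node is missed by all k_items items with
    probability (1 - 1/n_nodes) ** k_items, so by linearity of expectation
    the expected number of touched nodes is
    n_nodes * (1 - (1 - 1/n_nodes) ** k_items).
    """
    return n_nodes * (1.0 - (1.0 - 1.0 / n_nodes) ** k_items)


def expected_nodes_simulated(n_nodes, k_items, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity under the same assumption."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Set of nodes holding the items accessed by one transaction.
        touched = {rng.randrange(n_nodes) for _ in range(k_items)}
        total += len(touched)
    return total / trials


if __name__ == "__main__":
    n_nodes, k_items = 10, 5
    print("closed form:", expected_nodes_closed_form(n_nodes, k_items))
    print("simulated:  ", expected_nodes_simulated(n_nodes, k_items))
```

Under this toy assumption the closed form costs a single arithmetic expression, while the simulation needs thousands of sampled transactions to approach the same value. That contrast is only meant to suggest why a simple analytic model can be attractive early in a design; it does not reproduce the paper's analysis.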

Index Terms:
data distribution models; performance evaluation; distributed database systems; replication; computational complexity; participating node set; execution times; concurrency control; database theory; distributed databases
R. Mukkamala, "Measuring the Effects of Data Distribution Models on Performance Evaluation of Distributed Database Systems," IEEE Transactions on Knowledge and Data Engineering, vol. 1, no. 4, pp. 494-507, Dec. 1989, doi:10.1109/69.43424