Block Access Estimation for Clustered Data
August 1993 (vol. 5 no. 4)
pp. 712-718

A method is proposed for dealing with nonuniform data distributions in database organizations in order to estimate the expected number of blocks containing the tuples requested by a query. When tuples with equal attribute value are not uniformly distributed over the blocks of secondary memory that store the relation, a clustering effect is observed. This can be detected by means of a single parameter, the clustering factor, which can be stored in the system catalog. The method can be applied to uniform data distributions as well, since it is shown that a uniform distribution can be viewed as a particular instance of a class of clustered distributions. In this case the proposed method allows considerable reduction of the number of computational steps needed to compute the estimated result.
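
For context, the uniform case mentioned above is commonly estimated with two classical formulas: the Cardenas approximation (tuples chosen with replacement) and Yao's formula (tuples chosen without replacement). The sketch below illustrates only those uniform baselines. It is not the clustered-data method of this paper, whose clustering factor is not defined in the abstract. The function and parameter names are hypothetical, and equal-sized blocks are assumed.

```python
from math import comb


def cardenas_blocks(num_blocks: int, k: int) -> float:
    """Cardenas approximation: k tuples drawn with replacement,
    each equally likely to fall in any of num_blocks blocks."""
    if num_blocks <= 0 or k < 0:
        raise ValueError("num_blocks must be positive and k non-negative")
    return num_blocks * (1.0 - (1.0 - 1.0 / num_blocks) ** k)


def yao_blocks(num_tuples: int, num_blocks: int, k: int) -> float:
    """Yao's formula: k distinct tuples drawn without replacement from
    num_tuples tuples stored in num_blocks equal-sized blocks."""
    if num_blocks <= 0 or num_tuples % num_blocks != 0:
        raise ValueError("assumes equal-sized blocks (num_tuples divisible by num_blocks)")
    if not 0 <= k <= num_tuples:
        raise ValueError("k must be between 0 and num_tuples")
    per_block = num_tuples // num_blocks
    # Probability that a given block holds none of the k selected tuples.
    # math.comb returns 0 when k exceeds num_tuples - per_block.
    miss = comb(num_tuples - per_block, k) / comb(num_tuples, k)
    return num_blocks * (1.0 - miss)


if __name__ == "__main__":
    # Hypothetical relation: 10,000 tuples in 100 blocks, 50 tuples requested.
    print(cardenas_blocks(100, 50))
    print(yao_blocks(10_000, 100, 50))
```

Both estimates assume that qualifying tuples are spread uniformly over the blocks. When tuples with equal attribute values are clustered, fewer blocks are touched than these formulas predict, which is the situation the proposed method addresses.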

Index Terms:
relational databases; clustered data; nonuniform data distributions; database organizations; tuples; query; clustering effect; clustering factor; system catalog
Citation:
P. Ciaccia, "Block Access Estimation for Clustered Data," IEEE Transactions on Knowledge and Data Engineering, vol. 5, no. 4, pp. 712-718, Aug. 1993, doi:10.1109/69.234782