Partial Indexing for Nonuniform Data Distributions in Relational DBMS's
June 1994 (vol. 6 no. 3)
pp. 420-429

It is well known that the effectiveness of relational database systems depends greatly on the efficiency of their data access strategies. For this reason, much work has been devoted to developing new access techniques, supported by suitable access structures such as B+-trees. The effectiveness of the B+-tree also depends on the characteristics of the data distribution; in particular, performance is poor when the distribution of key values is strongly unbalanced. This paper presents the partial index, a new access structure that is useful in such cases of unbalancing as an alternative to unclustered B+-tree indexes. The access structures are built in the physical design phase, and at execution (or compilation) time the optimizer chooses the most efficient access path. The integration of the partial indexing technique into the design and optimization processes is therefore also described.
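The abstract does not describe the structure itself, so the following is only a minimal sketch of one plausible reading: keys that occur very often are left out of the index, because an unclustered index offers little benefit over a scan for them, while rare keys are indexed. The names build_partial_index and lookup, the max_fraction threshold, and the in-memory list of rows are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def build_partial_index(rows, key, max_fraction=0.1):
    """Index only key values that occur in at most max_fraction of the rows.

    Assumption: frequent values are excluded because a scan is expected
    to be at least as cheap as an unclustered index probe for them.
    """
    counts = Counter(row[key] for row in rows)
    limit = max_fraction * len(rows)
    index = defaultdict(list)
    for position, row in enumerate(rows):
        if counts[row[key]] <= limit:
            index[row[key]].append(position)
    return dict(index)


def lookup(rows, key, value, index):
    """Toy access-path choice: probe the index if the value is covered,
    otherwise fall back to a sequential scan.

    Returns the chosen path name together with the matching rows.
    """
    if value in index:
        return "index", [rows[position] for position in index[value]]
    return "scan", [row for row in rows if row[key] == value]
```

With a table in which 95 of 100 rows share the same key value, this sketch would index only the remaining five rows and would answer queries on the dominant value by scanning. A real optimizer would base the choice on estimated access costs rather than on simple membership in the index.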


Index Terms:
indexing; relational databases; tree data structures; optimisation; query processing; partial indexing; nonuniform data distributions; relational DBMS; relational database; data access strategies; access structures; B+ tree; data distribution characteristics; performance results; key value distribution unbalancing; access structure; unclustered indexes; physical design phase; execution time; compilation time; optimizer; access path; partial indexing technique; optimization process
C. Sartori, M.R. Scalas, "Partial Indexing for Nonuniform Data Distributions in Relational DBMS's," IEEE Transactions on Knowledge and Data Engineering, vol. 6, no. 3, pp. 420-429, June 1994, doi:10.1109/69.334860