Proceedings 18th International Conference on Data Engineering (2002)
San Jose, California
Feb. 26, 2002 to Mar. 1, 2002
Francesco Buccafurri, University of Reggio Calabria
Domenico Rosaci, University of Reggio Calabria
Luigi Pontieri, DEIS-UNICAL & ISI-CNR
Domenico Saccà, DEIS-UNICAL & ISI-CNR
Histograms summarize the contents of relations into a number of buckets for the estimation of query result sizes. Several techniques (e.g., MaxDiff and V-Optimal) have been proposed in the past for determining bucket boundaries that provide better estimations. This paper proposes to use 32 bits of additional information per bucket (a 4-level tree index) for storing approximated cumulative frequencies at 7 internal intervals of the bucket. Both theoretical analysis and experimental results show that the 4-level tree index provides the best frequency estimation inside a bucket. The index is then added to two well-known techniques for constructing histograms, MaxDiff and V-Optimal, yielding large improvements in frequency estimation over inter-bucket ranges with respect to the original methods.
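As a rough illustration of how 7 approximated cumulative frequencies could fit into 32 bits per bucket, the following Python sketch splits a bucket into 8 equal-width sub-intervals and arranges them as a binary tree. Each of the 7 internal nodes stores the quantized share of its sum that falls in its left child. The 6/5/4-bit allocation per tree level, the equal-width split, and the names `encode` and `decode` are assumptions made for this sketch; the abstract does not state them, and the encoding in the paper may differ.

```python
from typing import List

# Assumed bit budget per internal node, breadth-first: 6 + 2*5 + 4*4 = 32 bits.
BITS = (6, 5, 5, 4, 4, 4, 4)

# Sub-interval ranges (start, end) covered by each internal node, breadth-first.
NODES = [(0, 8), (0, 4), (4, 8), (0, 2), (2, 4), (4, 6), (6, 8)]


def _quantize(part: float, whole: float, bits: int) -> int:
    """Map the ratio part/whole to an integer in [0, 2**bits - 1]."""
    if whole <= 0:
        return 0
    return round(part / whole * ((1 << bits) - 1))


def encode(counts: List[float]) -> int:
    """Pack the 7 left-child ratios of an 8-leaf tree into one 32-bit word."""
    assert len(counts) == 8
    word = 0
    for (lo, hi), bits in zip(NODES, BITS):
        mid = (lo + hi) // 2
        q = _quantize(sum(counts[lo:mid]), sum(counts[lo:hi]), bits)
        word = (word << bits) | q
    return word


def decode(word: int, total: float) -> List[float]:
    """Recover approximate sub-interval frequencies from the word and bucket sum."""
    ratios = []
    for bits in reversed(BITS):
        mask = (1 << bits) - 1
        ratios.append((word & mask) / mask)
        word >>= bits
    ratios.reverse()
    # est[i] is the estimated sum of tree node i; children of i are 2i+1, 2i+2.
    est = [0.0] * 15
    est[0] = total
    for i, r in enumerate(ratios):
        est[2 * i + 1] = est[i] * r
        est[2 * i + 2] = est[i] - est[2 * i + 1]
    return est[7:]  # the 8 leaves, in left-to-right order


counts = [40, 0, 0, 0, 10, 5, 5, 4]
approx = decode(encode(counts), sum(counts))
```

In this sketch the decoded values always add up to the bucket total, because each right child is computed as the remainder of its parent; only the distribution inside the bucket is approximated.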
histograms, range query estimation, OLAP queries
D. Saccà, F. Buccafurri, L. Pontieri and D. Rosaci, "Improving Range Query Estimation on Histograms," Proceedings 18th International Conference on Data Engineering (ICDE), San Jose, California, 2002, pp. 0628.