Proceedings 18th International Conference on Data Engineering (2002)
San Jose, California
Feb. 26, 2002 to Mar. 1, 2002
ISBN: 0-7695-1531-2
pp: 0628
Francesco Buccafurri , University of Reggio Calabria
Domenico Rosaci , University of Reggio Calabria
Luigi Pontieri , DEIS-UNICAL & ISI-CNR
Domenico Saccà , DEIS-UNICAL & ISI-CNR
ABSTRACT
Histograms summarize the contents of relations into a number of buckets in order to estimate query result sizes. Several techniques (e.g., MaxDiff and V-Optimal) have been proposed for determining bucket boundaries that yield better estimates. This paper proposes attaching 32 bits of additional information (a 4-level tree index) to each bucket, storing approximate cumulative frequencies at 7 internal intervals of the bucket. Both theoretical analysis and experimental results show that the 4-level tree index provides the best frequency estimation inside a bucket. The index is then added to two well-known histogram construction techniques, MaxDiff and V-Optimal, yielding substantial improvements over the original methods in frequency estimation on inter-bucket ranges.
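As a rough illustration of how a per-bucket index of this kind could work, the Python sketch below packs seven quantized ratios into one 32-bit integer and uses them to recover approximate frequencies inside a bucket. It is not the authors' implementation. The abstract does not say how the 32 bits are divided, so the 6/5/4 bits-per-level split (6 + 2*5 + 4*4 = 32), the division of the bucket into eight equal-width sub-intervals, the rounding scheme, and the names `build_index` and `decode_index` are all assumptions made here.

```python
# Assumed bits per stored value at each tree level below the root:
# 1 value x 6 bits + 2 values x 5 bits + 4 values x 4 bits = 32 bits.
# This split is an assumption; the abstract does not specify it.
LEVEL_BITS = (6, 5, 4)


def _quantize(part, total, bits):
    """Encode the ratio part/total as an integer in [0, 2**bits - 1]."""
    if total == 0:
        return 0
    return round(part / total * ((1 << bits) - 1))


def _dequantize(code, total, bits):
    """Turn a stored code back into an approximate partial sum."""
    return code / ((1 << bits) - 1) * total


def build_index(freqs):
    """Pack a bucket's 8 sub-interval frequencies into a 32-bit integer.

    Each tree node stores the (quantized) share of its sum that falls
    in its left half; nodes are written level by level, left to right.
    """
    assert len(freqs) == 8
    index = 0
    span = 8  # number of sub-intervals covered by a node at this level
    for bits in LEVEL_BITS:
        for start in range(0, 8, span):
            total = sum(freqs[start:start + span])
            left = sum(freqs[start:start + span // 2])
            index = (index << bits) | _quantize(left, total, bits)
        span //= 2
    return index


def decode_index(index, bucket_sum):
    """Return approximate frequencies of the 8 sub-intervals."""
    # Unpack the seven codes in the order they were packed.
    codes = []
    shift = 32
    for level, bits in enumerate(LEVEL_BITS):
        for _ in range(1 << level):
            shift -= bits
            codes.append((index >> shift) & ((1 << bits) - 1))

    # Walk down the tree, splitting each node's sum into left and right.
    sums = [float(bucket_sum)]
    pos = 0
    for bits in LEVEL_BITS:
        next_sums = []
        for s in sums:
            left = _dequantize(codes[pos], s, bits)
            pos += 1
            next_sums += [left, s - left]
        sums = next_sums
    return sums


def estimate_prefix(index, bucket_sum, k):
    """Approximate cumulative frequency of the first k sub-intervals."""
    return sum(decode_index(index, bucket_sum)[:k])
```

With only the bucket's total stored, a histogram would typically have to assume a uniform spread inside the bucket; in this sketch the extra 32 bits let a query that ends partway through a bucket use `estimate_prefix` instead. Only the exact bucket sum and the packed integer need to be kept per bucket.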
INDEX TERMS
histograms, range query estimation, OLAP queries
CITATION

D. Saccà, F. Buccafurri, L. Pontieri and D. Rosaci, "Improving Range Query Estimation on Histograms," Proceedings 18th International Conference on Data Engineering (ICDE), San Jose, California, 2002, pp. 0628.
doi:10.1109/ICDE.2002.994780