Proceedings 18th International Conference on Data Engineering (ICDE) (2002)
San Jose, California
Feb. 26, 2002 to Mar. 1, 2002
ISBN: 0-7695-1531-2
pp: 0628
Domenico Saccà, DEIS-UNICAL & ISI-CNR
Francesco Buccafurri, University of Reggio Calabria
Luigi Pontieri, DEIS-UNICAL & ISI-CNR
Domenico Rosaci, University of Reggio Calabria
ABSTRACT
Histograms summarize the contents of relations into a number of buckets in order to estimate query result sizes. Several techniques (e.g., MaxDiff and V-Optimal) have been proposed for determining bucket boundaries that yield better estimates. This paper proposes storing 32 bits of additional information per bucket (a 4-level tree index) that encodes approximate cumulative frequencies at 7 internal intervals of the bucket. Both theoretical analysis and experimental results show that the 4-level tree index provides the best frequency estimation inside a bucket. The index is then added to two well-known histogram construction techniques, MaxDiff and V-Optimal, yielding substantial improvements over the original methods in frequency estimation over inter-bucket ranges.
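The idea of a per-bucket 32-bit index can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the bit layout, so the split of 6, 5 and 4 bits per node at the three lower tree levels (6 + 2*5 + 4*4 = 32) is an assumption, as are the function names `build_index`, `decode_eighths` and `estimate_range`. Each of the 7 stored codes is taken to be the quantized share of a node's frequency sum that falls in its left half, and the bucket's total sum is assumed to be kept by the histogram itself.

```python
# Assumed layout: bits per node at tree levels 1..3 (6 + 2*5 + 4*4 = 32 bits).
BITS = (6, 5, 4)


def _quantize(part, whole, bits):
    """Encode part/whole as an integer code using the given number of bits."""
    if whole == 0:
        return 0
    return round(part / whole * ((1 << bits) - 1))


def build_index(freqs):
    """Pack 7 left-half ratios of a bucket into one 32-bit integer.

    The bucket length is assumed to be a multiple of 8 for simplicity.
    """
    assert len(freqs) % 8 == 0
    index = 0
    segments = [freqs]
    for bits in BITS:
        next_segments = []
        for seg in segments:
            half = len(seg) // 2
            left, right = seg[:half], seg[half:]
            index = (index << bits) | _quantize(sum(left), sum(seg), bits)
            next_segments += [left, right]
        segments = next_segments
    return index


def decode_eighths(index, total):
    """Recover approximate frequency sums of the 8 sub-intervals of a bucket."""
    codes = []
    shift = 32
    for level, bits in enumerate(BITS):
        for _ in range(1 << level):
            shift -= bits
            codes.append((index >> shift) & ((1 << bits) - 1))
    sums = [float(total)]
    code_iter = iter(codes)
    for bits in BITS:
        next_sums = []
        for s in sums:
            left = s * next(code_iter) / ((1 << bits) - 1)
            next_sums += [left, s - left]
        sums = next_sums
    return sums


def estimate_range(index, total, width, lo, hi):
    """Estimate the frequency sum over positions lo..hi-1 inside a bucket.

    Frequencies are assumed uniform within each of the 8 sub-intervals.
    """
    step = width / 8
    estimate = 0.0
    for i, s in enumerate(decode_eighths(index, total)):
        a, b = i * step, (i + 1) * step
        overlap = max(0.0, min(hi, b) - max(lo, a))
        estimate += s * overlap / step
    return estimate


# Usage: a skewed bucket whose first half is empty.
freqs = [0] * 8 + [10] * 8
idx = build_index(freqs)
first_half = estimate_range(idx, sum(freqs), len(freqs), 0, 8)
```

In this example a plain uniform assumption over the whole bucket would attribute half of the total to the empty first half, whereas the index records that the left half holds nothing. Quantization still introduces small errors in the non-empty sub-intervals.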
INDEX TERMS
histograms, range query estimation, OLAP queries
CITATION
Domenico Saccà, Francesco Buccafurri, Luigi Pontieri, Domenico Rosaci, "Improving Range Query Estimation on Histograms", Proceedings 18th International Conference on Data Engineering (ICDE), pp. 0628, 2002, doi:10.1109/ICDE.2002.994780