2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)
Long Beach, CA, USA
Mar. 1, 2010 to Mar. 6, 2010
ISBN: 978-1-4244-5445-7
pp: 441-444
Ravishankar Ramamurthy , Microsoft Corp, USA
Stratos Idreos , CWI Amsterdam, Netherlands
Raghav Kaushik , Microsoft Corp, USA
Vivek Narasayya , Microsoft Corp, USA
Data compression techniques such as null suppression and dictionary compression are commonly used in today's database systems. To leverage compression effectively, it is necessary to estimate, efficiently and accurately, the size an index would have if it were compressed. Such estimation is critical if automated physical design tools are to be extended to handle compression. Several database systems today provide estimators for this problem based on random sampling. While this approach is efficient, there is no previous work that analyses its accuracy. In this paper, we analyse the problem of estimating the compressed size of an index from the point of view of worst-case guarantees. We show that the simple estimator implemented by several database systems has several "good" cases even though the estimator itself is agnostic to the internals of the specific compression algorithm.
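The sampling-based estimator described above can be illustrated with a short sketch: draw a uniform random sample of the index's rows, compress the sample, and take the compressed-to-uncompressed size ratio as the estimate of the compression fraction. This is a hypothetical illustration, not the paper's implementation; `zlib` stands in for whatever compression scheme (e.g., null suppression or dictionary compression) the database actually applies, and all names are the sketch's own.

```python
import random
import zlib


def estimate_compression_fraction(rows, sample_frac=0.1, seed=42):
    """Estimate the compression fraction of an index by compressing a
    uniform random sample of its rows (illustrative sketch only).

    rows        -- list of strings, one per index entry
    sample_frac -- fraction of rows to sample
    Returns compressed_size / uncompressed_size for the sample,
    used as an estimate for the whole index.
    """
    rng = random.Random(seed)
    n = max(1, int(len(rows) * sample_frac))
    sample = rng.sample(rows, n)
    raw = b"".join(r.encode("utf-8") for r in sample)
    # zlib is a stand-in for the system's real compression algorithm;
    # the estimator itself is agnostic to the algorithm's internals.
    compressed = zlib.compress(raw)
    return len(compressed) / len(raw)
```

Because the estimator only compresses the sample, its cost is independent of the full index size; the paper's contribution is analysing when this simple scheme does or does not carry worst-case accuracy guarantees.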
Ravishankar Ramamurthy, Stratos Idreos, Raghav Kaushik, Vivek Narasayya, "Estimating the compression fraction of an index using sampling", 2010 IEEE 26th International Conference on Data Engineering (ICDE), pp. 441-444, 2010, doi:10.1109/ICDE.2010.5447871