15th International Conference on Scientific and Statistical Database Management (2003)
Cambridge, MA, USA
July 9, 2003 to July 11, 2003
F. Buccafurri, Dept. of DIMET, Reggio Calabria Univ., Italy
In many application contexts, such as statistical databases, scientific databases, query optimizers, and OLAP, data are often summarized into synopses of aggregate values. Summarization has the great advantage of saving space, but querying aggregate data rather than the original data introduces estimation errors that cannot, in general, be avoided, since summarization is a lossy compression. A central problem in designing summarization techniques is to retain a certain degree of accuracy in reconstructing query answers. In this paper we restrict our attention to two-dimensional data, which are relevant for a number of applications, and propose a hierarchical summarization technique, which is combined with the use of indices, i.e. compact structures providing an approximate description of portions of the original data. Experimental results show that the technique gives approximation errors much smaller than other "general purpose" techniques, such as wavelets and various types of multi-dimensional histograms.
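As a rough illustration of hierarchical summarization of two-dimensional data, the sketch below builds a plain quad-tree of block sums and estimates a range-sum query from it. It is not the technique proposed in the paper: it omits the indices mentioned in the abstract, and it assumes values are spread uniformly inside any block that is not split further. The names `Node`, `build`, and `estimate`, and the `depth` parameter, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Half-open block [r0, r1) x [c0, c1) and the sum of its cells.
    r0: int
    r1: int
    c0: int
    c1: int
    total: float
    children: List["Node"] = field(default_factory=list)


def build(data, r0, r1, c0, c1, depth):
    """Summarize data[r0:r1][c0:c1] as a quad-tree with at most `depth` splits."""
    total = sum(data[r][c] for r in range(r0, r1) for c in range(c0, c1))
    node = Node(r0, r1, c0, c1, total)
    # Split only while depth remains, the block is non-empty in value,
    # and the block covers more than one cell.
    if depth > 0 and total != 0 and (r1 - r0 > 1 or c1 - c0 > 1):
        rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
        for a, b in ((r0, rm), (rm, r1)):
            for c, d in ((c0, cm), (cm, c1)):
                if a < b and c < d:  # skip degenerate quadrants
                    node.children.append(build(data, a, b, c, d, depth - 1))
    return node


def estimate(node, r0, r1, c0, c1):
    """Estimate the sum over the half-open query range [r0, r1) x [c0, c1)."""
    rows = min(node.r1, r1) - max(node.r0, r0)
    cols = min(node.c1, c1) - max(node.c0, c0)
    if rows <= 0 or cols <= 0:
        return 0.0  # query does not touch this block
    area = (node.r1 - node.r0) * (node.c1 - node.c0)
    if rows * cols == area:
        return float(node.total)  # block fully inside the query: exact
    if not node.children:
        # Assumption of this sketch: uniform distribution inside a leaf block.
        return node.total * rows * cols / area
    return sum(estimate(ch, r0, r1, c0, c1) for ch in node.children)
```

A deeper tree uses more space and gives smaller estimation errors; with `depth` large enough to reach single cells, every range sum is answered exactly, while a shallow tree answers partially overlapped blocks only by interpolation.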
Data mining, Frequency estimation, Estimation error, Approximation error, Intrusion detection, Transaction databases, Internet
F. Buccafurri, F. Furfaro, D. Sacca and C. Sirangelo, "A quad-tree based multiresolution approach for two-dimensional summary data," 15th International Conference on Scientific and Statistical Database Management (SSDM), Cambridge, MA, USA, 2003, pp. 127-137.