Issue No. 02 - February (2008 vol. 20)
We propose a novel approach, based on predictive quantization (PQ), for online summarization of multiple time-varying data streams. A synopsis over a sliding window of the most recent entries is computed in one pass and dynamically updated in constant time. The correlation between consecutive data elements is effectively taken into account without the need for preprocessing. We extend PQ to multiple streams and propose structures for real-time summarization and querying of a massive number of streams. Queries on any subsequence of a sliding window over multiple streams are processed in real time. We examine each component of the proposed approach, prediction and quantization, separately and investigate the space-accuracy trade-off for synopsis generation. Complementing the theoretical optimality of PQ-based approaches, we show that the proposed technique, even for very short prediction windows, significantly outperforms current techniques for a wide variety of query types on both synthetic and real data sets.
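As a rough illustration of the general idea of predictive quantization over a sliding window, the following minimal sketch may help. It is not the authors' method: the class name `PQSummary`, the previous-value predictor, and the uniform quantizer with a fixed `step` are all assumptions chosen for brevity. Each new value is predicted from the previous reconstruction, only the quantized prediction error is stored, and the window slides in constant time per update.

```python
from collections import deque


class PQSummary:
    """Hypothetical single-stream synopsis (illustrative only).

    Assumptions: the predictor is simply the previous reconstructed value,
    and the residual is quantized uniformly with a fixed step size.
    """

    def __init__(self, window, step):
        self.window = window    # number of most recent entries to keep
        self.step = step        # quantizer step size
        self.anchor = 0.0       # reconstruction just before the oldest kept entry
        self.last = 0.0         # reconstruction of the newest entry
        self.codes = deque()    # integer residual codes inside the window

    def update(self, x):
        """Absorb one new value in O(1) time."""
        residual = x - self.last                 # prediction error
        code = round(residual / self.step)       # uniform quantization
        self.last += code * self.step            # closed-loop reconstruction
        self.codes.append(code)
        if len(self.codes) > self.window:
            # Slide the window: fold the expired code into the anchor.
            self.anchor += self.codes.popleft() * self.step

    def reconstruct(self):
        """Approximate values of the entries currently in the window."""
        out, value = [], self.anchor
        for code in self.codes:
            value += code * self.step
            out.append(value)
        return out


summary = PQSummary(window=4, step=0.5)
stream = [1.0, 1.2, 1.9, 2.4, 2.2, 3.1]
for x in stream:
    summary.update(x)
approx = summary.reconstruct()
```

Because the prediction uses the reconstructed value rather than the true previous value, the quantization error does not accumulate: each reconstructed entry stays within half a step of the original. A smaller `step` gives a more accurate synopsis at the cost of larger codes, which is one simple way to picture the space-accuracy trade-off.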
Multiple streams, prediction, quantization, summarization, online update
F. Altiparmak, H. Ferhatosmanoglu and E. Tuncel, "Incremental Maintenance of Online Summaries Over Multiple Streams," in IEEE Transactions on Knowledge & Data Engineering, vol. 20, no. 2, pp. 216-229, 2007.