Dec. 18, 2006 to Dec. 22, 2006
Mohamed G. Elfeky , Google Inc.
Walid G. Aref , Purdue University
Ahmed K. Elmagarmid , Purdue University
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.153
Sensor devices are becoming ubiquitous, especially in measurement and monitoring applications. Because of the real-time, append-only and semi-infinite natures of the generated sensor data streams, an online incremental approach is a necessity for mining stream data types. In this paper, we propose STAGGER: a one-pass, online and incremental algorithm for mining periodic patterns in data streams. STAGGER does not require that the user pre-specify the periodicity rate of the data. Instead, STAGGER discovers the potential periodicity rates. STAGGER maintains multiple expanding sliding windows staggered over the stream, where computations are shared among the multiple overlapping windows. Small-length sliding windows are imperative for early and real-time output, yet are limited to discover short periodicity rates. As streamed data arrives continuously, the sliding windows expand in length in order to cover the whole stream. Larger-length sliding windows are able to discover longer periodicity rates. STAGGER incrementally maintains a tree-like data structure for the frequent periodic patterns of each discovered potential periodicity rate. In contrast to the Fourier/Wavelet-based approaches used for discovering periodicity rates, STAGGER not only discovers a wider, more accurate set of periodicities, but also discovers the periodic patterns themselves. In fact, experimental results with real and synthetic data sets show that STAGGER outperforms Fourier/Wavelet-based approaches by an order of magnitude in terms of the accuracy of the discovered periodicity rates. Moreover, realdata experiments demonstrate the practicality of the discovered periodic patterns.
Mohamed G. Elfeky, Walid G. Aref, Ahmed K. Elmagarmid, "STAGGER: Periodicity Mining of Data Streams Using Expanding Sliding Windows", ICDM, 2006, Sixth International Conference on Data Mining (ICDM'06), Sixth International Conference on Data Mining (ICDM'06) 2006, pp. 188-199, doi:10.1109/ICDM.2006.153