Issue No.02 - February (2010 vol.22)
pp: 193-206
Ken-Hao Liu, National Taiwan University, Taipei
Wei-Guang Teng, National Cheng Kung University, Tainan
Ming-Syan Chen, National Taiwan University, Taipei
Due to the dynamic nature of data streams, a sliding window is used to generate synopses that approximate the most recent data within the retrospective horizon to answer queries or discover patterns. In this paper, we propose a dynamic scheme for managing wavelet synopses in sensor networks. We define a data structure, the Sliding Dual Tree (SDT), to generate dynamic synopses that adapt to the insertions and deletions in the most recent sliding window. By exploiting the properties of the Haar wavelet transform, we develop several operations that incrementally maintain the SDT over consecutive time windows in a time- and space-efficient manner. These operations work directly in the transformed time-frequency domain, without storing or reconstructing the original data. As shown in our analysis, the SDT-based approach greatly reduces the resources required for synopsis generation and maximizes the storage utilization of wavelet synopses in terms of the window length and quality measures. We also show that the approximation error of the dynamic wavelet synopses, i.e., the L²-norm error, can be incrementally updated, and we derive a bound on how much the incremental thresholding scheme overestimates this error. Furthermore, the synopses can answer various kinds of numerical queries, such as point and distance queries. In addition, we show that the SDT can adapt to resource allocation to further enhance overall storage utilization over time. As demonstrated by our experimental results, the proposed framework can outperform current techniques on both real and synthetic data.
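To make the terms in the abstract concrete, the following is a minimal sketch of a static Haar wavelet synopsis for a single window. It is not the SDT and does not perform any incremental maintenance; it only illustrates the Haar transform, thresholding to the k largest coefficients, the L²-norm error, and a point query. The function names (`haar_transform`, `inverse_haar`, `threshold`) and the sample window are assumptions introduced for illustration, and the window length is assumed to be a power of two.

```python
import math

def haar_transform(data):
    """Orthonormal Haar decomposition; len(data) must be a power of two.

    Returns [overall average term, coarsest detail, ..., finest details].
    """
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    approx = list(data)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = detail + details  # coarser levels are placed in front
    return approx + details

def inverse_haar(coeffs):
    """Rebuild the signal from coefficients ordered as in haar_transform."""
    approx = list(coeffs[:1])
    pos = 1
    while len(approx) < len(coeffs):
        detail = coeffs[pos:pos + len(approx)]
        pos += len(approx)
        finer = []
        for a, d in zip(approx, detail):
            finer.append((a + d) / math.sqrt(2))
            finer.append((a - d) / math.sqrt(2))
        approx = finer
    return approx

def threshold(coeffs, k):
    """Keep the k largest-magnitude coefficients and zero out the rest.

    Because the transform is orthonormal, the L2-norm error of the
    reconstruction equals the norm of the dropped coefficients, so it
    can be computed without rebuilding the original data.
    """
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    synopsis = [c if i in keep else 0.0 for i, c in enumerate(coeffs)]
    l2_error = math.sqrt(sum(c * c for i, c in enumerate(coeffs) if i not in keep))
    return synopsis, l2_error

# Hypothetical window of eight sensor readings.
window = [2.0, 2.0, 0.0, 2.0, 3.0, 5.0, 4.0, 4.0]
synopsis, l2_error = threshold(haar_transform(window), k=3)
approx_window = inverse_haar(synopsis)
point_estimate = approx_window[5]  # approximate answer to a point query at index 5
```

The error returned by `threshold` is computed purely in the coefficient domain, which is the property that makes it plausible to track the error without access to the raw window. A distance query between two windows could be approximated in the same way, by taking the Euclidean distance between their retained coefficients.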
Data stream, wavelet transform, dynamic synopses.
Ken-Hao Liu, Wei-Guang Teng, Ming-Syan Chen, "Dynamic Wavelet Synopses Management over Sliding Windows in Sensor Networks", IEEE Transactions on Knowledge & Data Engineering, vol.22, no. 2, pp. 193-206, February 2010, doi:10.1109/TKDE.2009.51