Issue No.04 - April (2009 vol.21)
pp: 568-581
Xiang Lian, Hong Kong University of Science and Technology, Hong Kong
Lei Chen, Hong Kong University of Science and Technology, Hong Kong
Jeffrey Xu Yu, The Chinese University of Hong Kong, Hong Kong
Jinsong Han, Hong Kong University of Science and Technology, Hong Kong
Jian Ma, Nokia Research Center, Beijing
Similarity-based time series retrieval has long been studied because of its wide use in applications such as financial data analysis and weather forecasting. The original task was to find the time series in a data set that are similar to a given pattern time series, where both the pattern and the data are static. Recently, with the increasing demand for stream data management, similarity-based retrieval over stream time series has raised new research issues, because stream processing imposes its own requirements, such as one-pass search and fast response. In this paper, we address the problem of matching both static and dynamic patterns over stream time series data. We develop a novel multi-scale representation for stream time series data, called multi-scale segment mean (MSM), which can be computed incrementally and is therefore well suited to stream characteristics. Most importantly, we propose a novel multi-step filtering mechanism, SS, over the multi-scale representation. Analysis indicates that the mechanism can greatly prune the search space and thus offer fast response. We also discuss a batch processing optimization, the dynamic case in which the patterns themselves come from stream time series, and pattern matching over future stream time series. Extensive experiments show that the proposed scheme can efficiently filter out false candidates and detect patterns.
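To make the idea of filtering over segment means at several scales concrete, the following is a minimal Python sketch. It is not the MSM representation or the SS mechanism as defined in the paper, whose details are not given in this abstract. It assumes Euclidean distance, a window length that is a power of two, and level j splitting the window into 2^(j-1) equal segments. It relies on the standard fact that the distance between segment means, scaled by the segment length, never exceeds the true Euclidean distance. All function names and parameters (`segment_means`, `lower_bound`, `matches`, `eps`, `max_level`) are hypothetical.

```python
import math


def segment_means(window, level):
    """Means of 2**(level-1) equal-length segments (assumed level layout)."""
    n_seg = 2 ** (level - 1)
    seg_len = len(window) // n_seg
    return [
        sum(window[i * seg_len:(i + 1) * seg_len]) / seg_len
        for i in range(n_seg)
    ]


def lower_bound(window, pattern, level):
    """Lower bound on the Euclidean distance using only segment means."""
    seg_len = len(window) // 2 ** (level - 1)
    a = segment_means(window, level)
    b = segment_means(pattern, level)
    return math.sqrt(seg_len * sum((x - y) ** 2 for x, y in zip(a, b)))


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def matches(window, pattern, eps, max_level):
    """Check coarse levels first; prune as soon as a bound exceeds eps."""
    for level in range(1, max_level + 1):
        if lower_bound(window, pattern, level) > eps:
            return False  # safe to discard: true distance is at least the bound
    # Candidates that survive every level are verified exactly.
    return euclidean(window, pattern) <= eps


window = [1, 2, 3, 4, 5, 6, 7, 8]
reversed_pattern = [8, 7, 6, 5, 4, 3, 2, 1]
print(matches(window, reversed_pattern, eps=12.0, max_level=4))
print(matches(window, reversed_pattern, eps=13.0, max_level=4))
```

In this sketch a coarse level costs only a few comparisons, so a window that is far from the pattern is usually rejected before the full distance is ever computed. A streaming version would also update the segment means incrementally as new values arrive instead of recomputing them for each window.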
Information Storage and Retrieval, Temporal databases
Xiang Lian, Lei Chen, Jeffrey Xu Yu, Jinsong Han, Jian Ma, "Multiscale Representations for Fast Pattern Matching in Stream Time Series", IEEE Transactions on Knowledge & Data Engineering, vol. 21, no. 4, pp. 568-581, April 2009, doi:10.1109/TKDE.2008.184