Issue No.02 - February (2012 vol.24)

pp: 265-278

Yueguo Chen, Renmin University of China, Beijing

Ke Chen, Zhejiang University, Hangzhou

Mario A. Nascimento, University of Alberta, Edmonton

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2010.223

ABSTRACT

Existing distance measures for time series, such as the Euclidean distance, DTW, and EDR, are inadequate for handling certain degrees of amplitude shifting and scaling in data items. We propose a novel distance measure for time series, Spatial Assembling Distance (SpADe), that can handle noise, shifting, and scaling in both the temporal and amplitude dimensions. We further apply SpADe to streaming pattern detection, which is very useful in trend-related analysis, sensor networks, and video surveillance. Our experimental results on real time series data sets show that SpADe is an effective distance measure for time series. Moreover, SpADe achieves high accuracy and efficiency for continuous pattern detection in streaming time series.
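The limitation described in the abstract can be illustrated with a minimal sketch. It is not the paper's method and does not implement SpADe; it only compares a plain Euclidean distance with a textbook dynamic time warping (DTW) on a synthetic sine pattern. The function names, the series, and the shift and scale amounts are all assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Point-wise Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Textbook DTW with squared point cost, O(len(a) * len(b)) time."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest warping cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] matched again
                                 cost[i][j - 1],      # b[j-1] matched again
                                 cost[i - 1][j - 1])  # both advance
    return math.sqrt(cost[n][m])


# Synthetic pattern: two periods of a sine wave (illustrative data only).
pattern = [math.sin(2 * math.pi * t / 32) for t in range(64)]

# Three hypothetical variants of the same shape.
time_shifted = pattern[4:] + pattern[:4]   # shifted along the time axis
amp_shifted = [x + 2.0 for x in pattern]   # shifted along the amplitude axis
amp_scaled = [3.0 * x for x in pattern]    # scaled along the amplitude axis

for name, series in [("time shift", time_shifted),
                     ("amplitude shift", amp_shifted),
                     ("amplitude scale", amp_scaled)]:
    print(f"{name:16s} euclidean={euclidean(pattern, series):.2f} "
          f"dtw={dtw(pattern, series):.2f}")
```

In this sketch, warping absorbs much of the temporal shift, so DTW reports a smaller distance than the Euclidean distance for the time-shifted copy. Neither measure compensates for the amplitude shift or scaling, even though the shape is unchanged, which is the gap the abstract says SpADe targets.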

INDEX TERMS

Distance measure, time series, shifting and scaling, pattern detection.

CITATION

Yueguo Chen, Ke Chen, Mario A. Nascimento, "Effective and Efficient Shape-Based Pattern Detection over Streaming Time Series",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 24, no. 2, pp. 265-278, February 2012, doi:10.1109/TKDE.2010.223