Issue No.02 - February (2009 vol.31)
pp: 306-318
Pierre-François Marteau, Université de Bretagne Sud, Vannes
ABSTRACT
In a way similar to the string-to-string correction problem, we address discrete time series similarity as a time-series-to-time-series correction problem, in which the similarity between two time series is measured as the minimum-cost sequence of edit operations needed to transform one time series into the other. To define the edit operations, we use the paradigm of a graphical editing process and end up with a dynamic programming algorithm that we call Time Warp Edit Distance (TWED). TWED is slightly different in form from the Dynamic Time Warping (DTW), Longest Common Subsequence (LCSS), and Edit Distance with Real Penalty (ERP) algorithms. In particular, it highlights a parameter that controls a kind of stiffness of the elastic measure along the time axis. We show that the similarity provided by TWED is a potentially useful metric in time series retrieval applications, since it could benefit from the triangle inequality property to speed up the retrieval process while tuning the parameters of the elastic measure. In that context, a lower bound is derived that links the matching of time series in downsampled representation spaces to the matching in the original space. The empirical quality of the TWED distance is evaluated on a simple classification task. Compared to Edit Distance, DTW, LCSS, and ERP, TWED has proved to be quite effective on the considered experimental task.
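The abstract does not spell out the recurrence, so the following is only a minimal sketch of a TWED-style dynamic program, assuming the commonly published form with two "delete" operations and one "match" operation. The function name `twed`, the parameter names `nu` (stiffness) and `lam` (constant edit penalty), their default values, the absolute difference used as the sample distance, and the zero sample padded at time 0 are all assumptions made for illustration, not details taken from this page.

```python
import math

def twed(a, ta, b, tb, nu=0.001, lam=1.0):
    """Sketch of a TWED-style distance between scalar series a and b.

    ta and tb are the (increasing) timestamps of a and b.
    nu  : assumed stiffness weight on timestamp differences (nu >= 0).
    lam : assumed constant penalty for each delete operation (lam >= 0).
    """
    # Assumption: pad each series with a zero sample at time 0 (index 0).
    a = [0.0] + list(a)
    ta = [0.0] + list(ta)
    b = [0.0] + list(b)
    tb = [0.0] + list(tb)
    n, m = len(a), len(b)

    # D[i][j] holds the cost of aligning a[1..i] with b[1..j].
    D = [[math.inf] * m for _ in range(n)]
    D[0][0] = 0.0

    for i in range(1, n):
        for j in range(1, m):
            # Delete a sample in a: pay the jump inside a plus the penalty.
            delete_a = (D[i - 1][j] + abs(a[i] - a[i - 1])
                        + nu * (ta[i] - ta[i - 1]) + lam)
            # Delete a sample in b: symmetric to the case above.
            delete_b = (D[i][j - 1] + abs(b[j] - b[j - 1])
                        + nu * (tb[j] - tb[j - 1]) + lam)
            # Match: compare current and previous samples, and weight
            # the timestamp gaps by the stiffness parameter.
            match = (D[i - 1][j - 1]
                     + abs(a[i] - b[j]) + abs(a[i - 1] - b[j - 1])
                     + nu * (abs(ta[i] - tb[j]) + abs(ta[i - 1] - tb[j - 1])))
            D[i][j] = min(delete_a, delete_b, match)

    return D[n - 1][m - 1]
```

In this sketch, a larger `nu` makes matching samples with distant timestamps more expensive, which is the stiffness effect the abstract refers to; with `nu = 0` only the sample values and the delete penalty contribute to the cost.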
INDEX TERMS
Pattern recognition, time series, algorithms, similarity measures.
CITATION
Pierre-François Marteau, "Time Warp Edit Distance with Stiffness Adjustment for Time Series Matching", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 2, pp. 306-318, February 2009, doi:10.1109/TPAMI.2008.76