Issue No. 11, Nov. 2013 (vol. 35), pp. 2796-2802
M. G. Baydogan , Security & Defense Syst. Initiative, Tempe, AZ, USA
G. Runger , Sch. of Comput., Inf. & Decision Syst. Eng., Arizona State Univ., Tempe, AZ, USA
E. Tuv , Logic Technol. Dev., Intel, Chandler, AZ, USA
ABSTRACT
Time series classification is an important task with many challenging applications. A nearest neighbor (NN) classifier with dynamic time warping (DTW) distance is a strong solution in this context. Feature-based approaches, by contrast, have been proposed both as classifiers and as a way to provide insight into the series, but they have difficulty handling translations and dilations in local patterns. Considering these shortcomings, we present a framework to classify time series based on a bag-of-features representation (TSBF). Multiple subsequences, selected at random locations and with random lengths, are partitioned into shorter intervals to capture local information. Consequently, features computed from these subsequences measure properties at different locations and dilations when viewed from the original series. This provides a feature-based approach that can handle warping (although differently from DTW). Moreover, a supervised learner (one that handles mixed data types, different units, etc.) integrates location information into a compact codebook through class probability estimates. Additionally, relevant global features can easily supplement the codebook. TSBF is compared to NN classifiers and other alternatives (bag-of-words strategies, sparse spatial sample kernels, shapelets). Our experimental results show that TSBF outperforms these competitive methods on benchmark datasets from the UCR time series database.
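To make the pipeline in the abstract concrete, here is a minimal Python sketch of the bag-of-features idea: random subsequences are split into intervals, simple interval features (mean, standard deviation, slope) plus the start location are computed, a subsequence-level learner produces class probability estimates, and each series is summarized into a compact codebook of those probabilities. This is not the authors' implementation; the feature set, histogram codebook, toy data, and all parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def subsequence_features(series, n_sub=20, min_len=8, d=4):
    """Features from random subsequences, each split into d intervals.

    Per interval: mean, std, slope; plus the subsequence start location,
    so the learner can integrate location information (illustrative
    choices, not the paper's exact feature set).
    """
    T = len(series)
    rows = []
    for _ in range(n_sub):
        length = rng.integers(min_len, T // 2 + 1)   # random length
        start = rng.integers(0, T - length + 1)      # random location
        sub = series[start:start + length]
        feats = [start / T]                          # location feature
        for interval in np.array_split(sub, d):
            x = np.arange(len(interval))
            slope = np.polyfit(x, interval, 1)[0]
            feats += [interval.mean(), interval.std(), slope]
        rows.append(feats)
    return np.array(rows)

# Toy two-class problem: noisy sines vs. noisy ramps.
X_series = [np.sin(np.linspace(0, 6, 100)) + 0.1 * rng.normal(size=100)
            for _ in range(20)]
X_series += [np.linspace(-1, 1, 100) + 0.1 * rng.normal(size=100)
             for _ in range(20)]
y = np.array([0] * 20 + [1] * 20)

# Step 1: subsequence-level learner gives class probability estimates.
X_sub = np.vstack([subsequence_features(s) for s in X_series])
y_sub = np.repeat(y, 20)                 # each series contributes 20 rows
rf_sub = RandomForestClassifier(n_estimators=50, random_state=0)
rf_sub.fit(X_sub, y_sub)
P = rf_sub.predict_proba(X_sub)

# Step 2: summarize each series' subsequence probabilities into a
# compact codebook (here: a 5-bin histogram of class-1 probabilities).
codebook = np.array([np.histogram(P[i * 20:(i + 1) * 20, 1],
                                  bins=5, range=(0, 1))[0]
                     for i in range(len(X_series))])
rf_final = RandomForestClassifier(n_estimators=50, random_state=0)
rf_final.fit(codebook, y)
print(rf_final.score(codebook, y))       # training accuracy on toy data
```

Because subsequences are drawn at many locations and lengths, the histogram summary is insensitive to where a discriminative local pattern occurs, which is how a feature-based method tolerates warping without computing a DTW alignment.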
INDEX TERMS
Time series analysis, Feature extraction, Radio frequency, Error analysis, Support vector machines, Training, Histograms, Codebook, Supervised learning
CITATION
M. G. Baydogan, G. Runger, E. Tuv, "A Bag-of-Features Framework to Classify Time Series," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 11, pp. 2796-2802, Nov. 2013, doi:10.1109/TPAMI.2013.72