Issue No.12 - December (2008 vol.20)

pp: 1616-1626

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2008.29

ABSTRACT

To mine massive amounts of time series data efficiently and effectively, approximate representation of the data is one of the most commonly used strategies. Piecewise Linear Approximation is one such approach: it represents a time series by dividing it into segments and approximating each segment with a straight line. In this paper, we first propose a new segmentation criterion that improves computing efficiency. Based on this criterion, we develop two novel online piecewise linear segmentation methods: the feasible space window method and the stepwise feasible space window method. The former usually produces far fewer segments and is faster and more reliable in running time than other methods. The latter can reduce the representation error with fewer segments, and it achieves the best overall segmentation performance among the compared methods. Extensive experiments on a variety of real-world time series demonstrate the advantages of our methods.
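To make the idea of online piecewise linear approximation concrete, here is a minimal Python sketch. It is not the paper's feasible space window method, whose details the abstract does not give. It is a generic illustration that assumes a maximum vertical error bound (`max_error`, a hypothetical parameter) and keeps, for each segment, the interval of slopes that stays within that bound for every point seen so far. The function name `segment_online` and the `Segment` layout are likewise assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# (start index, end index, start value, slope) -- an illustrative layout,
# not a format defined by the paper.
Segment = Tuple[int, int, float, float]


def segment_online(values: Sequence[float], max_error: float) -> List[Segment]:
    """Split `values` into line segments in a single left-to-right pass.

    Each segment is a line anchored at its first data point. The interval
    [lo, hi] holds every slope that keeps the line within `max_error`
    (assumed >= 0) of all points covered so far. When a new point empties
    the interval, the segment is closed at the previous point and a new
    one starts there.
    """
    segments: List[Segment] = []
    if not values:
        return segments

    start = 0
    lo, hi = float("-inf"), float("inf")
    for i in range(1, len(values)):
        dt = i - start
        new_lo = max(lo, (values[i] - max_error - values[start]) / dt)
        new_hi = min(hi, (values[i] + max_error - values[start]) / dt)
        if new_lo <= new_hi:
            # The point still fits: narrow the slope interval and continue.
            lo, hi = new_lo, new_hi
            continue
        # No single slope fits any more: close the segment at point i - 1,
        # using the midpoint of the last non-empty slope interval.
        segments.append((start, i - 1, values[start], (lo + hi) / 2))
        start = i - 1
        # Restart the interval from the new anchor and the current point
        # (they are one step apart, so dt == 1).
        lo = values[i] - max_error - values[start]
        hi = values[i] + max_error - values[start]

    # Close the final segment; a single-point series gets slope 0.
    slope = (lo + hi) / 2 if len(values) > 1 else 0.0
    segments.append((start, len(values) - 1, values[start], slope))
    return segments


series = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
result = segment_online(series, max_error=0.1)
```

With this input the series rises and then falls, so the sketch yields one segment per trend. Because each point is examined once and only the slope interval is kept in memory, the approach suits streaming data, which is the setting the abstract calls online segmentation.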

INDEX TERMS

Temporal databases, Data mining, Mining methods and algorithms, Information Storage

CITATION

Xiaoyan Liu, Zhenjiang Lin, Huaiqing Wang, "Novel Online Methods for Time Series Segmentation",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 20, no. 12, pp. 1616-1626, December 2008, doi:10.1109/TKDE.2008.29