Issue No. 10 - October (2005 vol. 17)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2005.161
Xiaoyang Sean Wang , IEEE Computer Society
Like Gao , IEEE
In many applications, local or remote sensors send in streams of data, and the system needs to monitor the streams to discover relevant events/patterns and deliver instant reaction correspondingly. An important scenario is that the incoming stream is a continually appended time series, and the patterns are time series in a database. At each time when a new value arrives (called a time position), the system needs to find, from the database, the nearest or near neighbors of the incoming time series up to the time position. This paper attacks the problem by using Fast Fourier Transform (FFT) to efficiently find the cross correlations of time series, which yields, in a batch mode, the nearest and near neighbors of the incoming time series at many time positions. To take advantage of this batch processing in achieving fast response time, this paper uses prediction methods to predict future values. When the prediction length is long, FFT is used to compute the cross correlations of the predicted series (with the values that have already arrived) and the database patterns, and to obtain predicted distances between the incoming time series at many future time positions and the database patterns. If the prediction length is short, the direct computation method is used to obtain these predicted distances to avoid the overhead of using FFT. When the actual data value arrives, the prediction error together with the predicted distances is used to filter out patterns that are not possible to be the nearest or near neighbors, which provides fast responses. Experiments show that with reasonable prediction errors, the performance gain is significant. Especially, when the long term predictions are available, the proposed method can handle incoming data at a very fast streaming rate.
Index Terms- Similarity search, data mining, data stream processing, time series analysis, nearest neighbor search, continuous query.
Xiaoyang Sean Wang, Like Gao, "Continuous Similarity-Based Queries on Streaming Time Series", IEEE Transactions on Knowledge & Data Engineering, vol. 17, no. , pp. 1320-1332, October 2005, doi:10.1109/TKDE.2005.161