Issue No. 01 - January (2008 vol. 20)
With advances in hardware and communication technologies, stream time series are gaining ever-increasing attention due to their importance in many applications, such as financial data processing, network monitoring, web click-stream analysis, sensor data mining, and anomaly detection. For all of these applications, efficient and effective similarity search over stream data is essential. Although many approaches have been proposed for searching archived data, traditional methods may not work in stream scenarios because of the unique characteristics of streams: data are frequently updated and real-time responses are required. In particular, when the arrival of data is delayed, for example by communication congestion or batch processing, queries over such incomplete time series, or even over future time series, may return inaccurate results under traditional approaches. Therefore, in this paper we propose three approaches, polynomial, DFT, and probabilistic, to predict the unknown values that have not yet arrived at the system and to answer queries based on the predicted data. We also present efficient indexes, namely a multidimensional hash index and a B+-tree, to facilitate prediction and similarity search on future time series, respectively. Extensive experiments demonstrate the efficiency and effectiveness of our methods in terms of I/O, prediction accuracy, and query accuracy.
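The abstract does not give the details of the prediction methods, so the following is only a minimal sketch of the general idea behind polynomial prediction, under assumptions of our own: a degree-1 polynomial is fitted by least squares to the values observed so far, the missing values are extrapolated, and a query pattern is matched against the predicted values by Euclidean distance. The names `predict_linear`, `euclidean`, and `matches`, the threshold `eps`, and the choice of a linear fit are hypothetical and are not taken from the paper; the indexes mentioned in the abstract are not modeled.

```python
import math


def predict_linear(history, steps):
    """Fit y = a*t + b to `history` by least squares, then extrapolate.

    Assumption for illustration: a degree-1 polynomial over all observed
    values. Timestamps are taken to be 0, 1, ..., len(history) - 1.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observed values")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    # Closed-form least-squares slope and intercept.
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    a = sxy / sxx
    b = y_mean - a * t_mean
    # Predicted values for the next `steps` timestamps n, n+1, ...
    return [a * (n + i) + b for i in range(steps)]


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def matches(history, query, eps):
    """Return True if the predicted future values lie within `eps` of `query`."""
    predicted = predict_linear(history, len(query))
    return euclidean(predicted, query) <= eps
```

For instance, `matches(observed_values, pattern, eps)` asks whether the not-yet-arrived portion of a stream is expected to resemble `pattern`. A higher-degree polynomial or a DFT-based extrapolation could replace `predict_linear` without changing the matching step.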
Information Search and Retrieval, Search process, Multimedia databases, Query processing
X. Lian and L. Chen, "Efficient Similarity Search over Future Stream Time Series," in IEEE Transactions on Knowledge & Data Engineering, vol. 20, no. 1, pp. 40-54, 2007.