
Davood Rafiei, Alberto O. Mendelzon, "Querying Time Series Data Based on Similarity," IEEE Transactions on Knowledge and Data Engineering, vol. 12, no. 5, pp. 675-693, September/October 2000. IEEE Computer Society, Los Alamitos, CA, USA. ISSN 1041-4347. doi:10.1109/69.877502
Keywords: similarity queries, time series retrieval, indexing time series, Fourier transform.
Abstract—We study similarity queries for time series data where similarity is defined, in a fairly general way, in terms of a distance function and a set of affine transformations on the Fourier series representation of a sequence. We identify a safe set of transformations that supports a wide variety of comparisons and show that this set is rich enough to formulate operations such as moving average and time scaling. We also show that queries expressed using safe transformations can be computed efficiently without prior knowledge of the transformations. We present a query processing algorithm that uses the underlying multidimensional index built over the data set to answer similarity queries efficiently. Our experiments show that the performance of this algorithm is competitive with that of processing ordinary (exact match) queries using the index, and much faster than sequential scanning. We propose a generalization of this algorithm that handles multiple transformations simultaneously, and give experimental results on the performance of the generalized algorithm.
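The claim that a moving average can be formulated as a transformation on the Fourier representation can be illustrated with a small sketch. It is not taken from the paper: it assumes a transformation is a pair of vectors (a, b) applied elementwise as a * X + b to the Fourier coefficients X, uses an unnormalized DFT, and treats the moving average as circular (wrapping around the ends of the sequence). All function names and the sample series are hypothetical.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform (unnormalized forward transform)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def idft(X):
    """Inverse of dft() above; carries the 1/n normalization factor."""
    n = len(X)
    return [sum(X[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]

def circular_moving_average(x, m):
    """m-point moving average computed directly in the time domain, wrapping around."""
    n = len(x)
    return [sum(x[(t - k) % n] for k in range(m)) / m for t in range(n)]

def moving_average_multiplier(n, m):
    """Vector a for the assumed transformation (a, b): the DFT of the averaging window."""
    window = [1.0 / m if k < m else 0.0 for k in range(n)]
    return dft(window)

def apply_transformation(X, a, b):
    """Apply the assumed affine map a * X + b elementwise to Fourier coefficients."""
    return [a_f * X_f + b_f for a_f, X_f, b_f in zip(a, X, b)]

# Hypothetical sample series, for illustration only.
series = [20.0, 21.0, 19.5, 22.0, 23.5, 22.5, 24.0, 25.0]

a = moving_average_multiplier(len(series), 3)
b = [0.0] * len(series)  # a moving average needs no additive term

# Smooth in the frequency domain, then transform back to the time domain.
smoothed_freq = [v.real for v in idft(apply_transformation(dft(series), a, b))]
# Smooth directly in the time domain for comparison.
smoothed_time = circular_moving_average(series, 3)

max_error = max(abs(p - q) for p, q in zip(smoothed_freq, smoothed_time))
```

The two results agree up to floating-point rounding because circular convolution in the time domain corresponds to elementwise multiplication of DFT coefficients. That correspondence is what lets a smoothing operation be written as a multiplier on the Fourier representation.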