A Unifying Framework for Detecting Outliers and Change Points from Time Series
April 2006 (vol. 18 no. 4)
pp. 482-492
We are concerned with the issue of detecting outliers and change points in time series. In the area of data mining, there has been increased interest in these issues, since outlier detection is related to fraud detection, rare event discovery, etc., while change point detection is related to event/trend change detection, activity monitoring, etc. Although most previous work has not related outlier detection and change point detection explicitly, this paper presents a unifying framework for dealing with both of them. In this framework, a probabilistic model of the time series is learned incrementally using an online discounting learning algorithm, which tracks a drifting data source adaptively by gradually forgetting out-of-date statistics. A score for each data point is calculated in terms of its deviation from the learned model, with a higher score indicating a higher possibility of being an outlier. By averaging the scores over a window of fixed length and sliding the window, we obtain a new time series consisting of moving-averaged scores. Change point detection is then reduced to the issue of detecting outliers in that time series. We compare the performance of our framework with that of conventional methods to demonstrate its validity through simulation and experimental applications to incident detection in network security.
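
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: for brevity it assumes a plain Gaussian with exponentially discounted mean and variance in place of the AR model, and the names `DiscountedGaussian`, `two_stage_scores`, the discount rate `r`, and the `window` length are all hypothetical choices made for illustration.

```python
import math
from collections import deque


class DiscountedGaussian:
    """Online Gaussian estimate that gradually forgets old data.

    Assumption: a stand-in for the paper's time series model, not the
    authors' discounting learning algorithm for AR models.
    """

    def __init__(self, r=0.05, mean=0.0, var=1.0):
        self.r = r        # discount rate: larger r forgets faster
        self.mean = mean
        self.var = var

    def score_then_update(self, x):
        # Score first: negative log-likelihood under the model learned
        # so far, so a point cannot "explain itself".
        var = max(self.var, 1e-6)  # guard against a collapsing variance
        score = 0.5 * math.log(2 * math.pi * var) + (x - self.mean) ** 2 / (2 * var)
        # Discounted update: old statistics decay by (1 - r).
        delta = x - self.mean
        self.mean += self.r * delta
        self.var = (1 - self.r) * (self.var + self.r * delta * delta)
        return score


def two_stage_scores(series, r=0.05, window=5):
    """Return (outlier_scores, change_scores) for a sequence of floats."""
    stage1 = DiscountedGaussian(r)   # models the raw series
    stage2 = DiscountedGaussian(r)   # models the moving-averaged scores
    recent = deque(maxlen=window)    # sliding window of first-stage scores
    outlier_scores, change_scores = [], []
    for x in series:
        s = stage1.score_then_update(x)
        outlier_scores.append(s)
        recent.append(s)
        averaged = sum(recent) / len(recent)
        # Change points are treated as outliers in the averaged-score series.
        change_scores.append(stage2.score_then_update(averaged))
    return outlier_scores, change_scores
```

Calling `two_stage_scores(series)` on a list of floats returns one outlier score and one change point score per input point. Scoring each point before the model is updated is a design choice of this sketch; thresholds for flagging a point are left to the user.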

Index Terms:
Time series, change point, data mining, network security, AR model.
Citation:
Jun-ichi Takeuchi, Kenji Yamanishi, "A Unifying Framework for Detecting Outliers and Change Points from Time Series," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 4, pp. 482-492, April 2006, doi:10.1109/TKDE.2006.54