
Varun Chandola, Arindam Banerjee, and Vipin Kumar, "Anomaly Detection for Discrete Sequences: A Survey," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 5, pp. 823-839, May 2012, doi: 10.1109/TKDE.2010.235.