Hypergraph-Based Anomaly Detection of High-Dimensional Co-Occurrences

IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 3, March 2009, pp. 563-569



Jorge Silva , Duke University, Durham

ABSTRACT

This paper addresses the problem of detecting anomalous multivariate co-occurrences from a limited number of unlabeled training observations. A novel method based on a hypergraph representation of the data is proposed to deal with this very high-dimensional problem. Hypergraphs are an important generalization of graphs in which an edge may connect more than two vertices simultaneously. A variational Expectation-Maximization algorithm is presented that detects anomalies directly in the hypergraph domain, without any feature selection or dimensionality reduction. The resulting estimate can be used to compute a measure of anomalousness based on the False Discovery Rate. The algorithm has $O(np)$ computational complexity, where $n$ is the number of training observations and $p$ is the number of potential participants in each co-occurrence event, and it requires no tuning, bandwidth, or regularization parameters. This efficiency makes the method ideally suited for very high-dimensional settings. The proposed approach is validated on high-dimensional synthetic data and on the Enron email database, where $p > 75,000$, and is shown to outperform other state-of-the-art methods.
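To make the data representation concrete: each co-occurrence event can be viewed as a hyperedge over $p$ vertices, i.e., a binary incidence vector of length $p$. The sketch below is only an illustration of that representation and of why one pass over an $n \times p$ incidence matrix gives $O(np)$ cost; it scores events under a simple independent-Bernoulli baseline with negative log-likelihood, which is an assumption for illustration and not the paper's variational EM mixture model or its FDR calibration. All function names here are hypothetical.

```python
# Illustrative sketch only: hyperedges as binary incidence vectors and an
# O(np) anomaly score under an independent-Bernoulli baseline (an
# assumption for illustration, not the paper's variational EM model).
import numpy as np

def fit_bernoulli(X, alpha=1.0):
    """Per-vertex participation rates with Laplace smoothing.

    X: (n, p) binary incidence matrix; row i is one co-occurrence event
    (hyperedge), with X[i, j] = 1 if vertex j participates in event i.
    """
    n = X.shape[0]
    return (X.sum(axis=0) + alpha) / (n + 2.0 * alpha)

def anomaly_score(X, theta):
    """Negative log-likelihood of each hyperedge; higher = more anomalous.

    One pass over the (n, p) incidence matrix, hence O(np) cost.
    """
    return -(X * np.log(theta) + (1 - X) * np.log(1 - theta)).sum(axis=1)

# Synthetic co-occurrence data: sparse hyperedges over p vertices.
rng = np.random.default_rng(0)
p = 50
theta_true = rng.uniform(0.05, 0.2, size=p)
X = (rng.random((200, p)) < theta_true).astype(float)

theta = fit_bernoulli(X)
scores = anomaly_score(X, theta)

# An unusually dense hyperedge (all p vertices co-occurring at once)
# scores far above typical sparse training events.
dense = np.ones((1, p))
assert anomaly_score(dense, theta)[0] > scores.max()
```

The incidence-matrix view also makes the scaling argument transparent: both fitting and scoring touch each of the $np$ entries exactly once, so no feature selection or dimensionality reduction is needed before scoring.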

INDEX TERMS

Anomaly detection, Co-occurrence data, Unsupervised learning, Variational methods, False Discovery Rate

CITATION

Jorge Silva, "Hypergraph-Based Anomaly Detection of High-Dimensional Co-Occurrences",

*IEEE Transactions on Pattern Analysis & Machine Intelligence*, vol. 31, no. 3, pp. 563-569, March 2009, doi:10.1109/TPAMI.2008.232
