loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
2007 Seventh IEEE International Conference on Data Mining
Clustering Needles in a Haystack: An Information Theoretic Analysis of Minority and Outlier Detection
Omaha, Nebraska, USA
October 28-October 31
ISBN: 0-7695-3018-4
Identifying atypical objects is one of the traditional topics in machine learning. Recently, novel approaches, e.g., Minority Detection and One-class clustering, have explored further to identify clusters of atypical objects which strongly contrast from the rest of the data in terms of their distribution or density. This paper analyzes such tasks from an information theoretic perspective. Based on Information Bottleneck formalization, these tasks interpret to increasing the averaged atypicalness of the clusters while reducing the complexity of the clustering. This formalization yields a unifying view of the new approaches as well as the classic outlier detection. We also present a scalable minimization algorithm which exploits the localized form of the cost function over individual clusters. The proposed algorithm is evaluated using simulated datasets and a text classification benchmark, in comparison with an existing method.
Citation:
Shin Ando, "Clustering Needles in a Haystack: An Information Theoretic Analysis of Minority and Outlier Detection," icdm, pp.13-22, 2007 Seventh IEEE International Conference on Data Mining, 2007
Usage of this product signifies your acceptance of the Terms of Use.