Third IEEE International Conference on Data Mining (ICDM'03)
Tractable Group Detection on Large Link Data Sets
Melbourne, Florida
November 19-November 22
ISBN: 0-7695-1978-4
Jeremy Kubica, Carnegie Mellon University, Pittsburgh, PA
Andrew Moore, Carnegie Mellon University, Pittsburgh, PA
Jeff Schneider, Carnegie Mellon University, Pittsburgh, PA
Discovering underlying structure from co-occurrence data is an important task in a variety of fields, including insurance, intelligence, criminal investigation, epidemiology, human resources, and marketing. Previously, Kubica et al. presented the group detection algorithm (GDA), an algorithm for finding underlying groupings of entities from co-occurrence data. This algorithm is based on a probabilistic generative model and produces coherent groups that are consistent with prior knowledge. Unfortunately, the optimization used in GDA is slow, potentially making it infeasible for many large data sets. To address this, we present k-groups, an algorithm that uses an approach similar to that of k-means to significantly accelerate the discovery of groups while retaining GDA's probabilistic model. We compare the performance of GDA and k-groups on a variety of data, showing that k-groups' sacrifice in solution quality is significantly offset by its increase in speed.
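The abstract does not spell out the steps of k-groups, so the following is only a rough illustration of the k-means-style alternation it alludes to, applied to co-occurrence (link) data. It is not the authors' algorithm and does not use GDA's probabilistic model: the majority-vote scoring, the function name `kmeans_style_groups`, and the `init` argument are all assumptions made for this sketch.

```python
from collections import Counter

def kmeans_style_groups(links, k, init, max_iters=20):
    """Hypothetical hard-assignment grouping of entities from link data.

    links: list of sets of entity ids (each set is one co-occurrence).
    init:  dict mapping every entity id to a starting group index in [0, k).
    Returns a dict mapping each entity id to its final group index.
    """
    membership = dict(init)
    for _ in range(max_iters):
        # Assignment step: give each link to the group that currently
        # holds most of its entities (ties go to the lowest group index).
        link_group = []
        for link in links:
            votes = Counter(membership[e] for e in link)
            link_group.append(min(range(k), key=lambda g: (-votes[g], g)))

        # Update step: move each entity to the group that owns most of
        # the links it appears in; entities with no links stay put.
        new_membership = {}
        for e in membership:
            votes = Counter(g for link, g in zip(links, link_group) if e in link)
            if votes:
                new_membership[e] = min(range(k), key=lambda g: (-votes[g], g))
            else:
                new_membership[e] = membership[e]

        # Stop once an iteration changes nothing, as k-means does.
        if new_membership == membership:
            break
        membership = new_membership
    return membership
```

As with k-means, the result of such an alternation depends on the starting assignment, and a poor start can collapse everything into one group. The appeal is that each pass costs time roughly proportional to the size of the link data.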
Citation:
Jeremy Kubica, Andrew Moore, Jeff Schneider, "Tractable Group Detection on Large Link Data Sets," Third IEEE International Conference on Data Mining (ICDM'03), p. 573, 2003