Sixth International Conference on Data Mining (ICDM'06) (2006)
Hong Kong
Dec. 18, 2006 to Dec. 22, 2006
ISSN: 1550-4786
ISBN: 0-7695-2701-9
pp: 63-74
Ziv Bar-Yossef , Technion and Google Inc., Israel
Ido Guy , Technion and IBM Research Lab, Israel
Ronny Lempel , IBM Research Lab, Israel
Yoelle S. Maarek , Google Inc., Israel
Vladimir Soroka , IBM Research Lab, Israel
ABSTRACT
We initiate the study of a new clustering framework, called cluster ranking. Rather than simply partitioning a network into clusters, a cluster ranking algorithm also orders the clusters by their strength. To this end, we introduce a novel strength measure for clusters, the integrated cohesion, which is applicable to arbitrary weighted networks.
We then present C-Rank: a new cluster ranking algorithm. Given a network with arbitrary pairwise similarity weights, C-Rank creates a list of overlapping clusters and ranks them by their integrated cohesion. We provide extensive theoretical and empirical analysis of C-Rank and show that it is likely to have high precision and recall.
Our experiments focus on mining mailbox networks. A mailbox network is an egocentric social network, consisting of contacts with whom an individual exchanges email. Ties among contacts are represented by the frequency of their co-occurrence in message headers. C-Rank is well suited to mining such networks, since they abound in overlapping communities of highly variable strengths. We demonstrate the effectiveness of C-Rank on the Enron data set, consisting of 130 mailbox networks.
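The mailbox network described in the abstract can be illustrated with a minimal sketch. It assumes each message is given as a plain list of addresses (sender plus recipients) and uses raw co-occurrence counts as tie weights; the function name `cooccurrence_weights`, the `owner` parameter, and the sample addresses are hypothetical, and the weighting actually used in the paper may differ. C-Rank itself and the integrated cohesion measure are not implemented here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(messages, owner):
    """Count how often each pair of contacts shares a message header.

    `messages` is an iterable of address lists, one per message.
    The mailbox owner is dropped, since an egocentric network has the
    owner's contacts as its nodes (an assumption of this sketch).
    """
    weights = Counter()
    for header in messages:
        # Deduplicate addresses and sort so each pair has one canonical key.
        contacts = sorted(set(header) - {owner})
        for a, b in combinations(contacts, 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical sample mailbox with three messages.
messages = [
    ["me@example.com", "ann@example.com", "bob@example.com"],
    ["me@example.com", "ann@example.com", "bob@example.com", "cat@example.com"],
    ["me@example.com", "cat@example.com"],
]
weights = cooccurrence_weights(messages, owner="me@example.com")
```

The resulting `weights` mapping is a weighted edge list over contacts, which is the kind of input a cluster ranking algorithm over pairwise similarity weights would consume.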
CITATION

Z. Bar-Yossef, I. Guy, R. Lempel, Y. S. Maarek and V. Soroka, "Cluster Ranking with an Application to Mining Mailbox Networks," Sixth International Conference on Data Mining (ICDM'06), Hong Kong, 2006, pp. 63-74.
doi:10.1109/ICDM.2006.35