Issue No.02 - February (2012 vol.24)
pp: 326-337
Bo Yang , Jilin University, Changchun
Jiming Liu , Hong Kong Baptist University, Hong Kong
Jianfeng Feng , Warwick University, Coventry
ABSTRACT
Network communities are groups of vertices whose internal links are dense and whose links to other groups are sparse. The network community mining problem (NCMP) is the problem of finding all such communities in a given network. A wide variety of applications can be formulated as NCMPs, ranging from social and biological network analysis to web mining and searching. Many algorithms addressing NCMPs have been developed so far, and most of them are either optimization-based or heuristic methods. In contrast to existing studies, this paper explores the notion of network communities and their properties through the dynamics of a naturally introduced stochastic model. Using large-deviation theory, it establishes a relationship between the hierarchical community structure of a network and the local mixing properties of that stochastic model. Topological information about the community structures hidden in a network can then be inferred from the network's spectral signatures. Based on this relationship, the paper proposes a general framework for characterizing, analyzing, and mining network communities. By exploiting the two basic properties of metastability, namely being locally uniform and temporarily fixed, the authors develop an efficient implementation of the framework, called the LM algorithm, that scales to mining communities hidden in large-scale networks. The effectiveness and efficiency of the LM algorithm are analyzed theoretically and validated experimentally.
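The link between spectral signatures and community structure can be illustrated with a small sketch. This is not the LM algorithm from the paper; it is a generic illustration, assuming a simple random walk on an undirected graph as the stochastic model. The toy graph (two 4-vertex cliques joined by one edge) and all function names are hypothetical. The idea shown is that eigenvalues of the walk's transition matrix close to 1 correspond to slowly mixing (metastable) groups, so the largest gap in the sorted spectrum hints at the number of communities.

```python
import numpy as np


def random_walk_spectrum(adj):
    """Eigenvalues (descending) and right eigenvectors of P = D^-1 A.

    Uses the symmetric matrix D^-1/2 A D^-1/2, which has the same
    eigenvalues as P, so that a symmetric eigensolver can be applied.
    Assumes an undirected graph with no isolated vertices.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    sym = adj * inv_sqrt[:, None] * inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(sym)
    order = np.argsort(vals)[::-1]
    # Rescale rows to turn eigenvectors of sym into right eigenvectors of P.
    return vals[order], vecs[:, order] * inv_sqrt[:, None]


def count_metastable_states(vals):
    """Number of eigenvalues above the largest gap in the sorted spectrum."""
    gaps = vals[:-1] - vals[1:]
    return int(np.argmax(gaps)) + 1


def two_cliques(size):
    """Hypothetical toy network: two cliques joined by a single edge."""
    n = 2 * size
    adj = np.zeros((n, n))
    adj[:size, :size] = 1.0
    adj[size:, size:] = 1.0
    np.fill_diagonal(adj, 0.0)
    adj[size - 1, size] = adj[size, size - 1] = 1.0
    return adj


adj = two_cliques(4)
vals, vecs = random_walk_spectrum(adj)
k = count_metastable_states(vals)
# The sign pattern of the second eigenvector separates the two groups.
labels = (vecs[:, 1] > 0).astype(int)
print("estimated communities:", k)
print("labels:", labels.tolist())
```

On this toy graph the second eigenvalue stays close to 1 while the rest are far below it, which is the kind of spectral signature the abstract refers to. Which group receives label 0 or 1 depends on the arbitrary sign of the eigenvector returned by the solver.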
INDEX TERMS
Social network, community structure, Markov chain, local mixing, large-deviation theory.
CITATION
Bo Yang, Jiming Liu, Jianfeng Feng, "On the Spectral Characterization and Scalable Mining of Network Communities," IEEE Transactions on Knowledge & Data Engineering, vol. 24, no. 2, pp. 326-337, February 2012, doi:10.1109/TKDE.2010.233