A Fast Hierarchical Clustering Algorithm for Functional Modules Discovery in Protein Interaction Networks
May/June 2011 (vol. 8 no. 3)
pp. 607-620
Jianxin Wang, Central South University, Changsha
Min Li, Central South University, Changsha
Jianer Chen, Texas A&M University, College Station
Yi Pan, Georgia State University, Atlanta
With advances in technologies for predicting protein interactions, huge data sets portrayed as networks have become available. Identifying functional modules in such networks is crucial for understanding the principles of cellular organization and function. However, protein interaction data produced by high-throughput experiments generally contain many false positives, which makes it difficult to identify functional modules accurately. In this paper, we propose a fast hierarchical clustering algorithm, HC-PIN, based on a local metric, the edge clustering value, which can be used in both unweighted and weighted networks. HC-PIN is applied to the yeast protein interaction network, and the identified modules are validated against all three types of Gene Ontology (GO) terms: Biological Process, Molecular Function, and Cellular Component. The experimental results show that HC-PIN is not only robust to false positives but can also discover functional modules with low density. The identified modules are statistically significant in terms of the three types of GO annotations. Moreover, by varying the value of its parameter, HC-PIN can uncover the hierarchical organization of functional modules, which corresponds approximately to the hierarchical structure of GO annotations. Compared with previous competing algorithms, HC-PIN is faster and more accurate.
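
To make the idea of a local edge metric concrete, the sketch below scores each edge of an unweighted network by how strongly the neighborhoods of its two endpoints overlap, then ranks edges from most to least tightly embedded. The abstract does not define the edge clustering value, so the formula used here (squared size of the shared neighborhood divided by the product of the two neighborhood sizes, with each endpoint counted in its own neighborhood) is an assumption for illustration only, and the function names are hypothetical. It should not be read as the HC-PIN implementation.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def build_adjacency(edges: Iterable[Edge]) -> Dict[Hashable, Set[Hashable]]:
    """Build an undirected adjacency map from a list of interactions."""
    adj: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_clustering_value(adj: Dict[Hashable, Set[Hashable]],
                          u: Hashable, v: Hashable) -> float:
    """Assumed local score: |Nu & Nv|^2 / (|Nu| * |Nv|).

    Nu and Nv are closed neighborhoods (each vertex plus its neighbors).
    This definition is an illustrative assumption, not taken from the abstract.
    """
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    common = len(nu & nv)
    return common * common / (len(nu) * len(nv))


def rank_edges(edges: List[Edge]) -> List[Tuple[float, Edge]]:
    """Score every edge and sort from highest to lowest score."""
    adj = build_adjacency(edges)
    scored = [(edge_clustering_value(adj, u, v), (u, v)) for u, v in edges]
    return sorted(scored, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    # Toy network: a triangle A-B-C with a pendant vertex D attached to C.
    toy = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
    for score, edge in rank_edges(toy):
        print(edge, round(score, 2))
```

An agglomerative procedure could visit edges in this order and merge the clusters of their endpoints, so that edges inside dense regions are considered before edges that bridge them. Because the score depends only on the two endpoints' neighborhoods, it can be computed without any global pass over the network.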

Index Terms:
Protein interaction network, functional module, hierarchical clustering algorithm, Gene Ontology.
Jianxin Wang, Min Li, Jianer Chen, Yi Pan, "A Fast Hierarchical Clustering Algorithm for Functional Modules Discovery in Protein Interaction Networks," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 3, pp. 607-620, May-June 2011, doi:10.1109/TCBB.2010.75