A Coclustering Approach for Mining Large Protein-Protein Interaction Networks
May-June 2012 (vol. 9 no. 3)
pp. 717-730
C. Pizzuti, Inst. for High Performance Comput. & Networking (ICAR), Nat. Res. Council of Italy (CNR), Rende, Italy
S. E. Rombo, Dept. of Electron., Comput. Sci. & Syst. (DEIS), Univ. of Calabria, Rende, Italy
Several approaches have been presented in the literature to cluster Protein-Protein Interaction (PPI) networks. They fall into two main categories: those that allow a protein to participate in multiple clusters and those that generate only nonoverlapping clusters. In both cases, a challenging task is to find a suitable compromise between the biological relevance of the results and a comprehensive coverage of the analyzed networks. Indeed, methods that return highly accurate results can often cover only small parts of the input PPI network, especially when poorly characterized networks are considered. We present a coclustering-based technique able to generate both overlapping and nonoverlapping clusters. The user can also set the density of the clusters to search for. We tested our method on two networks, yeast and human, and compared it with five other well-known techniques on the same interaction data sets. The results showed that, for all the examples considered, our approach reaches a good compromise between accuracy and network coverage. Furthermore, unlike all the techniques considered in the comparison, which returned very good results on the yeast network but rather poor ones on the human network, the behavior of our algorithm is not influenced by the structure of the input network.
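The two quantities the abstract trades off, cluster density and network coverage, can be illustrated with a small Python sketch. This is not the authors' algorithm or their quality function, which the abstract does not specify. It assumes the common graph-theoretic definitions: density as the fraction of possible protein pairs in a cluster that actually interact, and coverage as the fraction of proteins assigned to at least one cluster. The function names `cluster_density` and `network_coverage` and the toy network are hypothetical.

```python
from itertools import combinations


def cluster_density(edges, cluster):
    """Fraction of possible protein pairs in `cluster` that interact.

    Assumed definition (not taken from the paper): edges present / edges possible.
    """
    members = set(cluster)
    n = len(members)
    if n < 2:
        return 0.0
    # Store interactions as unordered pairs so (A, B) and (B, A) match.
    edge_set = {frozenset(e) for e in edges}
    present = sum(
        1 for pair in combinations(sorted(members), 2)
        if frozenset(pair) in edge_set
    )
    return present / (n * (n - 1) / 2)


def network_coverage(proteins, clusters):
    """Fraction of proteins that belong to at least one cluster."""
    proteins = set(proteins)
    if not proteins:
        return 0.0
    covered = set().union(*clusters) & proteins
    return len(covered) / len(proteins)


# Hypothetical toy network: six proteins, five interactions.
proteins = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("E", "F")]

# Two overlapping clusters: protein C participates in both.
clusters = [{"A", "B", "C"}, {"C", "D"}]

dense = cluster_density(edges, {"A", "B", "C"})        # triangle: all 3 pairs interact
looser = cluster_density(edges, {"A", "B", "C", "D"})  # 4 of 6 possible pairs interact
coverage = network_coverage(proteins, clusters)        # E and F are left uncovered
```

In this toy case, requiring fully dense clusters leaves proteins E and F out of the result, which is the accuracy-versus-coverage tension described above.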

Index Terms:
proteins, biology computing, cellular biophysics, data mining, microorganisms, molecular biophysics, human network, coclustering approach, protein-protein interaction networks, PPI network, nonoverlapping clusters, interaction data sets, yeast network, Proteins, Databases, Bioinformatics, Clustering algorithms, Humans, Computational biology, hub proteins, Coclustering, biological networks, protein complexes
Citation:
C. Pizzuti, S. E. Rombo, "A Coclustering Approach for Mining Large Protein-Protein Interaction Networks," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 3, pp. 717-730, May-June 2012, doi:10.1109/TCBB.2011.158