Identification of Essential Proteins Based on Edge Clustering Coefficient
July-Aug. 2012 (vol. 9 no. 4)
pp. 1070-1080
Min Li, Sch. of Inf. Sci. & Eng., Central South Univ., Changsha, China
Jianxin Wang, Sch. of Inf. Sci. & Eng., Central South Univ., Changsha, China
Huan Wang, Sch. of Inf. Sci. & Eng., Central South Univ., Changsha, China
Yi Pan, Dept. of Comput. Sci., Georgia State Univ., Atlanta, GA, USA
Identification of essential proteins is key to understanding the minimal requirements for cellular life and is important for drug design. The rapid increase in available protein-protein interaction (PPI) data has made it possible to detect protein essentiality at the network level. A series of centrality measures has been proposed to discover essential proteins based on network topology. However, most of them focus only on the topological position of a single protein and ignore the relationship between interactions and protein essentiality. In this paper, a new centrality measure for identifying essential proteins based on the edge clustering coefficient, named NC, is proposed. Unlike previous centrality measures, NC considers both the centrality of a node and the relationship between the node and its neighbors. For each interaction in the network, we calculate its edge clustering coefficient. A node's essentiality is then determined by the sum of the edge clustering coefficients of the interactions connecting it to its neighbors. In this way, NC takes into account the modular nature of protein essentiality. NC is applied to three different yeast protein-protein interaction networks, obtained from the DIP, MIPS, and BioGRID databases, respectively. The experimental results show that, on all three networks, the number of essential proteins discovered by NC exceeds that discovered by each of six other centrality measures: DC, BC, CC, SC, EC, and IC. Moreover, the essential proteins discovered by NC show a significant clustering effect.
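The scoring step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the formula for the edge clustering coefficient, so the sketch assumes a common Radicchi-style form: the number of triangles containing an edge divided by min(d_u - 1, d_v - 1), taken as 0 when that denominator is 0. The function names, the toy edge list, and the tie-breaking rule are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, skipping self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def edge_clustering_coefficient(adj, u, v):
    """Assumed definition: triangles through (u, v) / min(d_u - 1, d_v - 1)."""
    triangles = len(adj[u] & adj[v])  # common neighbors close a triangle
    max_possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if max_possible <= 0:
        return 0.0  # assumption: an edge that cannot lie in any triangle scores 0
    return triangles / max_possible


def nc_scores(edges):
    """Score each node by the sum of the coefficients of its incident edges."""
    adj = build_adjacency(edges)
    return {
        u: sum(edge_clustering_coefficient(adj, u, v) for v in adj[u])
        for u in adj
    }


def top_k(scores, k):
    """Return the k highest-scoring nodes; ties are broken by node name."""
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


# Hypothetical toy network: a triangle A-B-C with a pendant node D attached to A.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")]
scores = nc_scores(toy_edges)
```

In this toy network A has the highest degree, yet A, B, and C receive the same score, because the pendant edge A-D lies in no triangle and contributes nothing. Under the assumed definition, this is one way the measure can rank nodes differently from plain degree centrality.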


Index Terms:
proteins, bioinformatics, cellular biophysics, grid computing, molecular biophysics, network topology, pattern clustering, BioGRID database, edge clustering coefficient, cellular life, drug design, protein-protein interaction, protein essentiality, centrality measure, yeast, DIP database, MIPS database, Electronics packaging, Databases, Accuracy, Tin, Sensitivity, Essential proteins, protein interaction network, topology
Min Li, Jianxin Wang, Huan Wang, Yi Pan, "Identification of Essential Proteins Based on Edge Clustering Coefficient," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 4, pp. 1070-1080, July-Aug. 2012, doi:10.1109/TCBB.2011.147