Protein Complexes Discovery Based on Protein-Protein Interaction Data via a Regularized Sparse Generative Network Model
Issue No. 03 - May-June (2012 vol. 9)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.20
Dao-Qing Dai , Center for Comput. Vision & Dept. of Math., Sun Yat-Sen Univ., Guangzhou, China
Xiao-Fei Zhang , Center for Comput. Vision & Dept. of Math., Sun Yat-Sen Univ., Guangzhou, China
Xiao-Xin Li , Center for Comput. Vision & Dept. of Math., Sun Yat-Sen Univ., Guangzhou, China
Detecting protein complexes from protein interaction networks is one major task in the postgenome era. Previously developed computational algorithms for identifying complexes mainly focus on graph partitioning or dense-region finding. Most of these traditional algorithms cannot discover the overlapping complexes that really exist in protein-protein interaction (PPI) networks. Although some density-based methods have been developed to identify overlapping complexes, they are not able to discover complexes that include peripheral proteins. In this study, motivated by the recent successful application of generative network models to describe the generation process of PPI networks and to detect communities in social networks, we develop a regularized sparse generative network model (RSGNM) for protein complex identification. RSGNM extends an existing generative network model by adding another process that generates propensities using an exponential distribution and by incorporating a Laplacian regularizer. Because the propensities are assumed to be generated from an exponential distribution, their estimators will be sparse, which not only has a good biological interpretation but also helps to control the overlapping rate among detected complexes. The Laplacian regularizer makes the estimators of the propensities smoother over the interaction network. Experimental results on three yeast PPI networks show that RSGNM outperforms six previous competing algorithms in terms of the quality of detected complexes. In addition, RSGNM is able to detect overlapping complexes and complexes including peripheral proteins simultaneously. These results give new insights into the importance of generative network models in protein complex identification.
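The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Poisson link model, the function names, the parameters `rate` and `smooth`, and the toy network are all assumptions made for illustration, since the abstract does not give the exact formulation. The sketch relies on two standard facts: the negative log-density of an exponential prior on a nonnegative propensity is linear in that propensity (an L1-type penalty that favors sparsity), and a graph-Laplacian regularizer penalizes differences between the propensity vectors of interacting proteins.

```python
import math

def neg_log_likelihood(adj, theta):
    """Assumed link model: adj[i][j] ~ Poisson(sum_k theta[i][k] * theta[j][k]).
    Constant terms (log of factorials) are omitted."""
    n = len(adj)
    nll = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            mean = sum(a * b for a, b in zip(theta[i], theta[j])) + 1e-12
            nll += mean - adj[i][j] * math.log(mean)
    return nll

def sparsity_penalty(theta, rate):
    """Negative log of an exponential prior, up to a constant: rate * sum(theta)."""
    return rate * sum(sum(row) for row in theta)

def laplacian_penalty(adj, theta):
    """0.5 * sum_ij adj[i][j] * ||theta[i] - theta[j]||^2 (smoothness over the network)."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                diff = sum((a - b) ** 2 for a, b in zip(theta[i], theta[j]))
                total += adj[i][j] * diff
    return 0.5 * total

def objective(adj, theta, rate=1.0, smooth=1.0):
    """Penalized negative log-likelihood; rate and smooth are hypothetical weights."""
    return (neg_log_likelihood(adj, theta)
            + sparsity_penalty(theta, rate)
            + smooth * laplacian_penalty(adj, theta))

# Toy network (hypothetical): proteins 0-1 interact, proteins 2-3 interact.
adj = [[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]
# Rows are proteins, columns are complexes; entries are propensities.
theta_good = [[1, 0], [1, 0], [0, 1], [0, 1]]  # interacting proteins share a complex
theta_bad = [[1, 0], [0, 1], [1, 0], [0, 1]]   # interacting proteins are split apart
```

On this toy network, the assignment that places interacting proteins in the same complex has zero Laplacian penalty and a lower overall objective than the assignment that splits them, which is the behavior the regularizer is meant to encourage.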
proteins, biochemistry, biology computing, exponential distribution, genomics, molecular biophysics, physiological models, competing algorithms, protein complexes discovery, protein-protein interaction data, regularized sparse generative network model, detecting protein complexes, postgenome era, computational algorithms, traditional algorithms, protein-protein interaction networks, density-based methods, peripheral proteins, generative network model, generation processing, Laplacian regularizer, protein complexes identification, Communities, Biological system modeling, RNA, Polymers, peripheral protein, Protein complex, protein-protein interaction network, regularization method, overlapping complex
Dao-Qing Dai, Xiao-Fei Zhang and Xiao-Xin Li, "Protein Complexes Discovery Based on Protein-Protein Interaction Data via a Regularized Sparse Generative Network Model," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 3, pp. 857-870, 2012.