Designing Template-Free Predictor for Targeting Protein-Ligand Binding Sites with Classifier Ensemble and Spatial Clustering
July-Aug. 2013 (vol. 10 no. 4)
pp. 994-1008
Dong-Jun Yu, Sch. of Comput. Sci. & Eng., Nanjing Univ. of Sci. & Technol., Nanjing, China
Jun Hu, Sch. of Comput. Sci. & Eng., Nanjing Univ. of Sci. & Technol., Nanjing, China
Jing Yang, Key Lab. of Syst. Control & Inf. Process., Shanghai Jiao Tong Univ., Shanghai, China
Hong-Bin Shen, Key Lab. of Syst. Control & Inf. Process., Shanghai Jiao Tong Univ., Shanghai, China
Jinhui Tang, Sch. of Comput. Sci. & Eng., Nanjing Univ. of Sci. & Technol., Nanjing, China
Jing-Yu Yang, Sch. of Comput. Sci. & Eng., Nanjing Univ. of Sci. & Technol., Nanjing, China
Accurately identifying protein-ligand binding sites, or pockets, is important for both protein function analysis and drug design. Although much progress has been made, challenges remain, especially when the 3D structure of the target protein is not available or no homology templates can be found in the library, in which case template-based methods are hard to apply. In this paper, we report a new ligand-specific, template-free predictor called TargetS for targeting protein-ligand binding sites from primary sequences. TargetS first predicts the binding residues along the sequence with a ligand-specific strategy and then identifies the binding sites from the predicted binding residues through a recursive spatial clustering algorithm. Protein evolutionary information, predicted protein secondary structure, and ligand-specific binding propensities of residues are combined to construct discriminative features; an improved AdaBoost classifier ensemble scheme based on random undersampling is proposed to deal with the serious imbalance between positive (binding) and negative (nonbinding) samples. Experimental results demonstrate that TargetS achieves high performance and outperforms many existing predictors. The TargetS web server and data sets are freely available at http://www.csbio.sjtu.edu.cn/bioinf/TargetS/ for academic use.
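The random-undersampling idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the TargetS implementation: the abstract gives no algorithmic details, so everything below is an assumption made for illustration. In particular, a one-feature threshold classifier (a decision stump) stands in for the AdaBoost base learner, the members are combined by a simple majority vote, and the function names and synthetic scores are hypothetical.

```python
import random

def undersample(positives, negatives, rng):
    """Build a balanced subset: all positives plus an equal-size random draw of negatives."""
    k = min(len(positives), len(negatives))
    return positives + rng.sample(negatives, k)

def train_stump(samples):
    """Fit a threshold on a single score (hypothetical stand-in for a boosted learner).

    samples is a list of (score, label) pairs with label in {0, 1}. Returns the
    threshold t minimizing training error for the rule "predict 1 if score >= t".
    """
    candidates = sorted({s for s, _ in samples})
    best_t, best_err = candidates[0], len(samples) + 1
    for t in candidates:
        err = sum((s >= t) != bool(y) for s, y in samples)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def train_ensemble(positives, negatives, n_members=11, seed=0):
    """Train one stump per independently undersampled, balanced subset."""
    rng = random.Random(seed)
    return [train_stump(undersample(positives, negatives, rng))
            for _ in range(n_members)]

def predict(ensemble, score):
    """Majority vote over the ensemble members (an assumed combination rule)."""
    votes = sum(score >= t for t in ensemble)
    return int(2 * votes > len(ensemble))

# Synthetic, imbalanced toy data: 10 binding vs. 200 nonbinding residues,
# each described by a single made-up score.
positives = [(0.70 + 0.025 * i, 1) for i in range(10)]
negatives = [(0.5 * i / 200, 0) for i in range(200)]

ensemble = train_ensemble(positives, negatives)
print(predict(ensemble, 0.9), predict(ensemble, 0.1))
```

Each member sees every positive sample but only a small random slice of the negatives, so no single classifier is trained on the skewed class ratio, while the ensemble as a whole still covers much of the negative set.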
Index Terms:
sequences, bioinformatics, bonds (chemical), learning (artificial intelligence), molecular biophysics, molecular configurations, proteins, sampling methods, binding sample-nonbinding sample imbalance problem, template-free predictor design, targeting protein-ligand binding site, accurate protein-ligand binding site identification, accurate pocket identification, protein function analysis, drug design, target protein 3D structure, homology template, template-based method application, ligand-specific template-free predictor, TargetS predictor, primary sequence, sequence binding residue prediction, ligand-specific strategy, recursive spatial clustering algorithm, protein evolutionary information, protein secondary structure prediction, residue ligand-specific binding propensity, discriminative feature construction, improved AdaBoost classifier ensemble scheme, random undersampling, positive sample-negative sample imbalance problem, Training, Feature extraction, Protein sequence, Metals, Bioinformatics, spatial clustering, Protein-ligand binding sites, ligand-specific prediction model, template-free, classifier ensemble
Citation:
Dong-Jun Yu, Jun Hu, Jing Yang, Hong-Bin Shen, Jinhui Tang, Jing-Yu Yang, "Designing Template-Free Predictor for Targeting Protein-Ligand Binding Sites with Classifier Ensemble and Spatial Clustering," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 4, pp. 994-1008, July-Aug. 2013, doi:10.1109/TCBB.2013.104