Issue No. 02 - April-June (2005 vol. 2)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2005.24
<p><b>Abstract</b>: A key step in protein structure prediction by protein threading is choosing the best overall template for a given target sequence once all the optimal sequence-template alignments have been generated. The chosen template should have the best alignment with the target sequence, because the three-dimensional structure of the target is built from the sequence-template alignment. The traditional template selection method, the Z-score, uses a statistical test to rank all the sequence-template alignments and then chooses the first-ranked template for the sequence. However, calculating Z-scores is time-consuming and not suitable for genome-scale structure prediction. Z-scores are also hard to interpret when the threading scoring function is a weighted sum of several energy terms with different physical meanings. This paper presents a Support Vector Machine (SVM) regression approach that directly predicts the alignment accuracy of a sequence-template alignment; the predicted accuracy is then used to rank all the templates for a specific target sequence. Experimental results on a large-scale benchmark demonstrate that SVM regression performs much better than the composition-corrected Z-score method. SVM regression also runs much faster than the Z-score method.</p>
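To make the cost of Z-score ranking concrete, the following is a minimal sketch of a shuffle-based Z-score, not the paper's implementation. The scoring function `toy_score`, the helper names `z_score` and `rank_templates`, and the parameters `n_shuffles` and `seed` are all assumptions introduced for illustration; a real threading program would re-align each shuffled sequence with its full energy function, which is why the procedure is slow. Shuffling the target keeps its residue composition fixed, which is one common way to build a composition-preserving background.

```python
import random
import statistics


def toy_score(target, template):
    """Hypothetical stand-in for a threading score: number of identical positions."""
    return sum(1 for a, b in zip(target, template) if a == b)


def z_score(target, template, score_fn=toy_score, n_shuffles=100, seed=0):
    """Z-score of the raw score against scores of shuffled versions of the target."""
    rng = random.Random(seed)
    raw = score_fn(target, template)
    residues = list(target)
    background = []
    for _ in range(n_shuffles):
        # Each shuffle needs a fresh scoring pass; in real threading this is
        # a full re-alignment, so cost grows with n_shuffles per template.
        rng.shuffle(residues)
        background.append(score_fn("".join(residues), template))
    mean = statistics.mean(background)
    std = statistics.pstdev(background)
    if std == 0:
        return 0.0  # degenerate background: no basis for a Z-score
    return (raw - mean) / std


def rank_templates(target, templates, **kwargs):
    """Return (template name, Z-score) pairs sorted from best to worst."""
    scored = [(name, z_score(target, seq, **kwargs)) for name, seq in templates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    target = "ACDEFGHIKLMNPQRSTVWY"
    templates = {"same": target, "reversed": target[::-1]}
    for name, z in rank_templates(target, templates):
        print(name, round(z, 2))
```

A regression-based ranker replaces the inner shuffle loop with a single prediction per alignment, which is where the speed advantage described in the abstract would come from.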
Index Terms: Protein structure prediction, protein threading, protein fold recognition, SVM regression.
J. Xu, "Fold Recognition by Predicted Alignment Accuracy," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 2, no. 2, pp. 157-165, 2005.