Issue No.02 - March/April (2011 vol.8)
pp: 395-409
Yucel Altunbasak , Georgia Institute of Technology, Atlanta
Hakan Erdogan , Sabanci University, Istanbul
ABSTRACT
Prediction of a protein's 3D structure benefits greatly from information about secondary structure, solvent accessibility, and the nonlocal contacts that stabilize the structure. We address the problem of β-sheet prediction, defined as the prediction of β-strand pairings, interaction types (parallel or antiparallel), and β-residue interactions (or contact maps). We introduce a Bayesian approach for proteins with six or fewer β-strands, in which we model the conformational features in a probabilistic framework by combining amino acid pairing potentials with a priori knowledge of β-strand organizations. To select the optimum β-sheet architecture, we significantly reduce the search space using heuristics that enforce the amino acid pairs with strong interaction potentials. In addition, we find the optimum pairwise alignment between β-strands using dynamic programming, allowing any number of gaps in an alignment to model β-bulges more effectively. For proteins with more than six β-strands, we first compute β-strand pairings using the BetaPro method. Then, we compute gapped alignments of the paired β-strands and choose the interaction types and β-residue pairings with maximum alignment scores. We performed a 10-fold cross-validation experiment on the BetaSheet916 set and obtained significant improvements in prediction accuracy.
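The gapped strand alignment step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard Needleman-Wunsch-style recurrence with a linear gap penalty, and the names `toy_pair_score`, `align_strands`, and `best_interaction`, the hydrophobic residue set, and all score values are hypothetical stand-ins for the paper's amino acid pairing potentials. The antiparallel case is handled by aligning against the reversed second strand and keeping whichever interaction type scores higher.

```python
from typing import Callable, List, Tuple

# Hypothetical residue class, used only for this toy potential.
HYDROPHOBIC = set("AVILMFWC")


def toy_pair_score(a: str, b: str) -> float:
    """Toy pairing potential (an assumption, not the paper's potentials)."""
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return 2.0
    if a in HYDROPHOBIC or b in HYDROPHOBIC:
        return -1.0
    return 0.5


def align_strands(
    s1: str,
    s2: str,
    score: Callable[[str, str], float] = toy_pair_score,
    gap: float = -1.5,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Global gapped alignment of two strands by dynamic programming.

    Returns the best score and the list of paired residue indices (i, j).
    Unpaired residues (gaps) stand in for beta-bulges.
    """
    n, m = len(s1), len(s2)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]),  # pair i with j
                dp[i - 1][j] + gap,  # residue of s1 left unpaired
                dp[i][j - 1] + gap,  # residue of s2 left unpaired
            )
    # Trace back from the bottom-right cell to recover the residue pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if dp[i][j] == dp[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


def best_interaction(s1: str, s2: str) -> Tuple[str, float, List[Tuple[int, int]]]:
    """Pick the interaction type with the higher alignment score."""
    par_score, par_pairs = align_strands(s1, s2)
    anti_score, anti_pairs = align_strands(s1, s2[::-1])
    if par_score >= anti_score:
        return "parallel", par_score, par_pairs
    # Map indices of the reversed strand back to the original numbering.
    last = len(s2) - 1
    return "antiparallel", anti_score, [(i, last - j) for i, j in anti_pairs]


if __name__ == "__main__":
    print(best_interaction("VVKK", "KKVV"))
```

The returned pairs correspond to predicted β-residue contacts for one strand pair; a linear gap penalty is used here only for brevity, and any gap model could be substituted in the recurrence.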
INDEX TERMS
Protein β-sheets, open β-sheets, β-sheet prediction, contact map prediction, Bayesian modeling.
CITATION
Yucel Altunbasak, Hakan Erdogan, "Bayesian Models and Algorithms for Protein β-Sheet Prediction," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 2, pp. 395-409, March/April 2011, doi:10.1109/TCBB.2008.140