Bayesian Models and Algorithms for Protein β-Sheet Prediction
March/April 2011 (vol. 8 no. 2)
pp. 395-409
Zafer Aydin, Georgia Institute of Technology, Atlanta
Yucel Altunbasak, Georgia Institute of Technology, Atlanta
Hakan Erdogan, Sabanci University, Istanbul
Prediction of a protein's 3D structure benefits greatly from information on secondary structure, solvent accessibility, and the nonlocal contacts that stabilize the structure. We address the problem of β-sheet prediction, defined as the prediction of β-strand pairings, interaction types (parallel or antiparallel), and β-residue interactions (or contact maps). We introduce a Bayesian approach for proteins with six or fewer β-strands, in which we model the conformational features in a probabilistic framework by combining amino acid pairing potentials with a priori knowledge of β-strand organizations. To select the optimum β-sheet architecture, we significantly reduce the search space using heuristics that enforce the amino acid pairs with strong interaction potentials. In addition, we find the optimum pairwise alignment between β-strands using dynamic programming, allowing any number of gaps in an alignment so that β-bulges are modeled more effectively. For proteins with more than six β-strands, we first compute β-strand pairings using the BetaPro method. We then compute gapped alignments of the paired β-strands and choose the interaction types and β-residue pairings with the maximum alignment scores. We performed a 10-fold cross-validation experiment on the BetaSheet916 set and obtained significant improvements in prediction accuracy.
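The gapped strand alignment and the choice of interaction type described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a standard Needleman-Wunsch global alignment with a linear gap penalty, and the function names, the toy pairing potential, and the gap value of -1.0 are all hypothetical placeholders chosen only for illustration.

```python
from typing import Callable, Tuple

# Hypothetical toy pairing potential (an assumption, not the paper's potentials):
# reward hydrophobic-hydrophobic pairs, mildly penalize mixed pairs.
HYDROPHOBIC = set("VILFYWMC")


def toy_pair_score(a: str, b: str) -> float:
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return 2.0
    if a in HYDROPHOBIC or b in HYDROPHOBIC:
        return -0.5
    return 0.5


def align_score(s: str, t: str,
                score: Callable[[str, str], float] = toy_pair_score,
                gap: float = -1.0) -> float:
    """Global alignment score of two strands; any number of gaps is allowed."""
    n, m = len(s), len(t)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap          # prefix of s aligned only to gaps
    for j in range(1, m + 1):
        dp[0][j] = j * gap          # prefix of t aligned only to gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(s[i - 1], t[j - 1]),  # residue pair
                dp[i - 1][j] + gap,                            # gap in t
                dp[i][j - 1] + gap,                            # gap in s
            )
    return dp[n][m]


def best_orientation(s: str, t: str) -> Tuple[str, float]:
    """Score both interaction types and keep the higher-scoring one."""
    parallel = align_score(s, t)
    antiparallel = align_score(s, t[::-1])  # reverse one strand
    if antiparallel >= parallel:
        return "antiparallel", antiparallel
    return "parallel", parallel


if __name__ == "__main__":
    print(best_orientation("VGG", "GGV"))
```

Reversing one strand before alignment is a simple way to score the antiparallel case with the same routine; a more faithful model would use separate pairing potentials for parallel and antiparallel interactions.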

Index Terms:
Protein β-sheets, open β-sheets, β-sheet prediction, contact map prediction, Bayesian modeling.
Zafer Aydin, Yucel Altunbasak, Hakan Erdogan, "Bayesian Models and Algorithms for Protein β-Sheet Prediction," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 2, pp. 395-409, March-April 2011, doi:10.1109/TCBB.2008.140