Issue No.05 - Sept.-Oct. (2013 vol.10)
pp: 1162-1175
Kevin Molloy, Dept. of Comput. Sci., George Mason Univ., Fairfax, VA, USA
Sameh Saleh, Dept. of Comput. Sci., George Mason Univ., Fairfax, VA, USA
Amarda Shehu, Dept. of Comput. Sci., George Mason Univ., Fairfax, VA, USA
ABSTRACT
Adequate sampling of the conformational space is a central challenge in ab initio protein structure prediction. In the absence of a template structure, a conformational search procedure guided by an energy function explores the conformational space, gathering an ensemble of low-energy decoy conformations. If the sampling is inadequate, the native structure may be missed altogether. Even if it is reproduced, a subsequent stage that selects a subset of decoys for further structural detail and energetic refinement may discard near-native decoys if they are high in energy or insufficiently represented in the ensemble. Sampling should therefore produce a decoy ensemble that facilitates the subsequent selection of near-native decoys. In this paper, we investigate a robotics-inspired framework that allows the role of energy in guiding sampling to be measured directly. Testing demonstrates that a soft energy bias steers sampling toward a diverse decoy ensemble that is less prone to exploiting energetic artifacts and thus more likely to help selection techniques retain near-native conformations. We employ two different energy functions: the associative memory Hamiltonian with water (AMW) and the Rosetta energy function. Results show that enhanced sampling provides rigorous testing of energy functions and exposes different deficiencies in them, thus promising to guide the development of more accurate representations and energy functions.
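As a rough illustration of what a "soft" energy bias can mean in a probabilistic search, the sketch below weights candidate decoys with a Boltzmann-like function, so that lower-energy decoys are favored but higher-energy ones keep a nonzero chance of selection. This is not the authors' framework: the weighting function, the `temperature` parameter, and the names `soft_bias_weights` and `pick_decoy` are assumptions made purely for illustration.

```python
import math
import random

def soft_bias_weights(energies, temperature):
    """Return Boltzmann-like selection weights (hypothetical scheme).

    Lower energies get larger weights, but every weight stays positive,
    so no decoy is excluded outright. A larger temperature flattens the
    weights toward uniform (a softer bias).
    """
    e_min = min(energies)  # shift by the minimum for numerical stability
    return [math.exp(-(e - e_min) / temperature) for e in energies]

def pick_decoy(energies, temperature, rng):
    """Pick the index of one decoy at random, biased by the soft weights."""
    weights = soft_bias_weights(energies, temperature)
    return rng.choices(range(len(energies)), weights=weights, k=1)[0]

# Illustrative energies only (arbitrary units, not data from the paper).
example_energies = [-120.0, -115.0, -90.0]
chosen = pick_decoy(example_energies, temperature=10.0, rng=random.Random(0))
```

A greedy search would always expand the lowest-energy decoy; the soft weighting instead trades some energetic preference for diversity, which is the general effect the abstract attributes to a soft bias.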
INDEX TERMS
Proteins, Trajectory, Probability distribution, Energy resolution, Energy states, energy bias, Protein structure prediction, probabilistic conformational search, near-native conformations
CITATION
Kevin Molloy, Sameh Saleh, Amarda Shehu, "Probabilistic Search and Energy Guidance for Biased Decoy Sampling in Ab Initio Protein Structure Prediction", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.10, no. 5, pp. 1162-1175, Sept.-Oct. 2013, doi:10.1109/TCBB.2013.29