Predictor@Home: A "Protein Structure Prediction Supercomputer" Based on Global Computing
August 2006 (vol. 17 no. 8)
pp. 786-796

Abstract: Predicting the structure of a protein from its amino acid sequence is a complex process; understanding it could yield new insight into protein function and provide targets for structure-based design of drugs to treat new and existing diseases. While protein structures can be accurately modeled using computational methods based on all-atom, physics-based force fields that include implicit solvation, these methods require extensive sampling of native-like protein conformations for successful prediction and, consequently, are often limited by inadequate computing power. To address this problem, we developed Predictor@Home, a "structure prediction supercomputer" powered by the Berkeley Open Infrastructure for Network Computing (BOINC) framework and based on the global computing paradigm (i.e., volunteered computing resources connected to the Internet and owned by the public). In this paper, we describe the protocol we employed for protein structure prediction and its integration into a global computing architecture based on public resources. We show how Predictor@Home significantly improved our ability to predict protein structures by increasing our sampling capacity by one to two orders of magnitude.

Index Terms:
Global computing paradigm, public resources, protein conformational sampling, Monte Carlo simulations, molecular dynamics.
Michela Taufer, Chahm An, Andreas Kerstens, Charles L. Brooks III, "Predictor@Home: A 'Protein Structure Prediction Supercomputer' Based on Global Computing," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 8, pp. 786-796, Aug. 2006, doi:10.1109/TPDS.2006.110