Molecular Dynamics Trajectory Compression with a Coarse-Grained Model
March/April 2012 (vol. 9 no. 2)
pp. 476-486
Yi-Ming Cheng, Michigan State University, East Lansing
Srinivasa Murthy Gopal, Michigan State University, East Lansing
Sean M. Law, Michigan State University, East Lansing
Michael Feig, Michigan State University, East Lansing
Molecular dynamics trajectories are very data intensive, which limits the sharing and archival of such data. One possible solution is to compress the trajectory data. Here, trajectory compression based on conversion to the coarse-grained model PRIMO is proposed. The compressed data are about one third the size of the original data, and fast decompression is possible with an analytical reconstruction procedure from PRIMO to all-atom representations. This protocol largely preserves the structural features and, to a more limited extent, the energetic features of the original trajectory.
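
To make the size argument concrete, the following is a minimal sketch of the general idea behind coarse-graining as compression: each frame stores one site per group of atoms instead of every atom. It assumes a hypothetical grouping and a simple mean-position mapping; neither is taken from the paper, and the actual PRIMO mapping and its analytical all-atom reconstruction are not reproduced here. The names `coarse_grain_frame` and `stored_fraction` are illustrative only.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def coarse_grain_frame(frame: Sequence[Coord],
                       groups: Sequence[Sequence[int]]) -> List[Coord]:
    """Replace each group of atoms by the mean position of its members.

    ASSUMPTION: a mean-position mapping with a hypothetical grouping,
    used only for illustration. It is not the PRIMO site definition.
    """
    sites: List[Coord] = []
    for group in groups:
        n = len(group)
        site = tuple(sum(frame[i][axis] for i in group) / n for axis in range(3))
        sites.append(site)
    return sites


def stored_fraction(n_atoms: int, n_sites: int) -> float:
    """Fraction of coordinates kept per frame after coarse-graining."""
    return n_sites / n_atoms


# Toy frame with six atoms, mapped onto two hypothetical sites of three atoms each.
frame = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
    (0.0, 3.0, 0.0), (0.0, 3.0, 3.0), (0.0, 3.0, 6.0),
]
groups = [[0, 1, 2], [3, 4, 5]]

sites = coarse_grain_frame(frame, groups)
fraction = stored_fraction(len(frame), len(sites))
print(sites, fraction)
```

In this toy case one site stands in for three atoms, so each frame keeps one third of the coordinates. The ratio for a real trajectory depends on the model's actual site definitions and on how the sites are encoded on disk.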


Index Terms:
Proteins, all-atom reconstruction, PRIMO, molecular dynamics simulation, compression, coarse-grained model.
Yi-Ming Cheng, Srinivasa Murthy Gopal, Sean M. Law, Michael Feig, "Molecular Dynamics Trajectory Compression with a Coarse-Grained Model," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 2, pp. 476-486, March-April 2012, doi:10.1109/TCBB.2011.141