Issue No.04 - July/August (2011 vol.8)
pp: 976-986
Sebastian Böcker, Friedrich-Schiller-Universität Jena, Jena
Birte Kehr, Institute for Computer Science, Takustraße, Berlin
Florian Rasche, Friedrich-Schiller-Universität Jena, Jena
ABSTRACT
Glycans are molecules made from simple sugars that form complex tree structures. Glycans constitute one of the most important protein modifications, and identification of glycans remains a pressing problem in biology. Unfortunately, the structure of glycans is hard to predict from the genome sequence of an organism. In this paper, we consider the problem of deriving the topology of a glycan solely from tandem mass spectrometry (MS) data. We study how to generate glycan tree candidates that sufficiently match the sample mass spectrum while avoiding the combinatorial explosion of glycan structures. The resulting problem is known to be computationally hard. We present an efficient exact algorithm for this problem, based on fixed-parameter algorithmics, that can process a spectrum in a matter of seconds. We also report preliminary results of our method on experimental data, combining it with a preliminary candidate evaluation scheme. We show that our approach is fast in applications and that it achieves very good de novo identification results. Finally, we show how to count the number of glycan topologies for a fixed size or a fixed mass. We generalize this result to count the number of (labeled) trees with bounded out-degree, improving on results obtained using Pólya's enumeration theorem.
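To give a feel for the counting question raised at the end of the abstract, the following is a minimal sketch of a textbook dynamic program that counts rooted, unordered trees of a fixed size with bounded out-degree. It is not the method of the paper. The function name `count_bounded_trees`, the parameter names, and the reading of "labeled" as "each node carries one of `labels` node types" (for example, monosaccharide classes) are assumptions made for illustration only.

```python
from math import comb


def multichoose(kinds, picks):
    """Number of multisets of size `picks` drawn from `kinds` distinct items."""
    if picks == 0:
        return 1
    return comb(kinds + picks - 1, picks)


def count_bounded_trees(n, max_children, labels=1):
    """Count rooted unordered trees with n >= 1 nodes.

    Every node has at most `max_children` children and carries one of
    `labels` node types (assumption: labels=1 means unlabeled nodes).
    """
    # forests[k][m]: number of multisets of k trees with m nodes in total,
    # built only from the tree sizes processed so far.
    forests = [[0] * n for _ in range(max_children + 1)]
    forests[0][0] = 1
    trees = [0] * (n + 1)

    for size in range(1, n + 1):
        # A tree of this size is a labeled root plus a forest of size - 1 nodes.
        trees[size] = labels * sum(
            forests[k][size - 1] for k in range(max_children + 1)
        )
        # Allow forests to contain j copies (as a multiset) of trees of this size.
        updated = [[0] * n for _ in range(max_children + 1)]
        for k in range(max_children + 1):
            for m in range(n):
                total = 0
                j = 0
                while j <= k and j * size <= m:
                    total += multichoose(trees[size], j) * forests[k - j][m - j * size]
                    j += 1
                updated[k][m] = total
        forests = updated

    return trees[n]


if __name__ == "__main__":
    # Unlabeled trees with at most two children per node, sizes 1..6.
    print([count_bounded_trees(n, 2) for n in range(1, 7)])
```

With `labels=1` and a bound of two children, the sketch counts the familiar unary-binary rooted trees; raising `labels` multiplies in the choice of node type at every vertex, which is one simple way to model typed monosaccharide nodes under the stated assumption.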
INDEX TERMS
Computational mass spectrometry, glycans, parameterized algorithms, exact algorithms, counting trees.
CITATION
Sebastian Böcker, Birte Kehr, Florian Rasche, "Determination of Glycan Structure from Tandem Mass Spectra", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.8, no. 4, pp. 976-986, July/August 2011, doi:10.1109/TCBB.2010.129