Subscribe
Issue No.01 - January-March (2008 vol.5)
pp: 91-100
ABSTRACT
Mass spectrometry has become one of the most popular analysis techniques in Proteomics and Systems Biology. With the creation of larger datasets, the automated recalibration of mass spectra becomes important to ensure that every peak in the sample spectrum is correctly assigned to some peptide and protein. Algorithms for recalibrating mass spectra have to be robust with respect to wrongly assigned peaks, as well as efficient due to the amount of mass spectrometry data. The recalibration of mass spectra leads us to the problem of finding an optimal matching between mass spectra under measurement errors.We have developed two deterministic methods that allow robust computation of such a matching: The first approach uses a computational geometry interpretation of the problem, and tries to find two parallel lines with constant distance that stab a maximal number of points in the plane. The second approach is based on finding a maximal common approximate subsequence, and improves existing algorithms by one order of magnitude exploiting the sequential nature of the matching problem. We compare our results to a computational geometry algorithm using a topological line-sweep.
INDEX TERMS
biotechnology, mass spectrometry, combinatorial pattern matching, computational geometry
CITATION
Sebastian B?cker, Veli M?kinen, "Combinatorial Approaches for Mass Spectra Recalibration", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.5, no. 1, pp. 91-100, January-March 2008, doi:10.1109/tcbb.2007.1077
REFERENCES
 [1] W.J. Henzel, C. Watanabe, and J.T. Stults, “Protein Identification: The Origins of Peptide Mass Fingerprints,” J. Am. Soc. Mass Spectrometry, vol. 14, pp. 931-942, 2003. [2] R. Matthiesen, M.B. Trelle, P. Hojrup, J. Bunkenborg, and O.N. Jensen, “VEMS 3.0: Algorithms and Computational Tools for Tandem Mass Spectrometry Based Identification of Post_Translational Modifications in Proteins,” J Proteome Research, vol. 4, no. 6, pp. 2338-2347, http://dx.doi.org/10.1021pr050264q, 2005. [3] B. Ma, K. Zhang, C. Hendrie, C. Liang, M. Li, A. Doherty-Kirby, and G. Lajoie, “PEAKS: Powerful Software for Peptide de Novo Sequencing by Tandem Mass Spectrometry,” Rapid Comm. Mass Spectrometry, vol. 17, no. 20, pp. 2337-2342, 2003. [4] B.-L. Adam, Y. Qu, J.W. Davis, M.D. Ward, M.A. Clements, L.H. Cazares, O.J. Semmes, P.F. Schellhammer, Y. Yasui, Z. Feng, and G.L. Wright Jr., “Serum Protein Fingerprinting Coupled with a Pattern-Matching Algorithm Distinguishes Prostate Cancer from Benign Prostate Hyperplasia and Healthy Men,” Cancer Research, vol. 62, pp. 3609-3614, 2002. [5] J. Gobom, M. Mueller, V. Egelhofer, D. Theiss, H. Lehrach, and E. Nordhoff, “A Calibration Method that Simplifies and Improves Accurate Determination of Peptide Molecular Masses by MALDI-TOF MS,” Analytical Chemistry, vol. 74, no. 15, pp. 3915-3923, 2002. [6] K.A. Baggerly, J.S. Morris, and K.R. Coombes, “Reproducibility of SELDI-TOF Protein Patterns in Serum: Comparing Datasets from Different Experiments,” Bioinformatics, vol. 20, no. 5, pp. 777-785, 2004. [7] O.J. Semmes, Z. Feng, B.-L. Adam, L.L. Banez, W.L. Bigbee, D. Campos, L.H. Cazares, D.W. Chan, W.E. Grizzle, E. Izbicka, J. Kagan, G. Malik, D. McLerran, J.W. Moul, A. Partin, P. Prasanna, J. Rosenzweig, L.J. Sokoll, S. Srivastava, S. Srivastava, I. Thompson, M.J. Welsh, N. White, M. Winget, Y. Yasui, Z. Zhang, and L. Zhu, “Evaluation of Serum Protein Profiling by Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometry for the Detection of Prostate Cancer: I. Assessment of Platform Reproducibility,” Clinical Chemistry, vol. 51, pp. 102-112, 2005. [8] M.W. Bern and D. Goldberg, “EigenMS: De Novo Analysis of Peptide Tandem Mass Spectra by Spectral Graph Partitioning,” Proc. Ann. Int'l Conf. Research in Computational and Molecular Biology (RECOMB '05), vol. 3500, pp. 357-372, 2005. [9] E. Cheney, An Introduction to Approximation Theory, second ed., reprint of 1982 ed. Am. Math. Soc., 2000. [10] J.W. Wong, G. Cagney, and H.M. Cartwright, “SpecAlign-Processing and Alignment of Mass Spectra Datasets,” Bioinformatics, vol. 21, no. 9, pp. 2088-2090, 2005. [11] W.E. Wolski, M. Lalowski, P. Jungblut, and K. Reinert, “Calibration of Mass Spectrometric Peptide Mass Fingerprint Data without Specific External or Internal Calibrants,” BMC Bioinformatics, vol. 6, p. 203, 2005. [12] N. Jeffries, “Algorithms for Alignment of Mass Spectrometry Proteomic Data,” Bioinformatics, vol. 21, no. 14, pp. 3066-3073, 2005. [13] R. Matthiesen, J. Bunkenborg, A. Stensballe, O.N. Jensen, K.G. Welinder, and G. Bauw, “Database-Independent, Database-Dependent, and Extended Interpretation of Peptide Mass Spectra in VEMS V2.0,” Proteomics, vol. 4, no. 9, pp. 2583-2593, , Sept. 2004. [14] A. Wool and Z. Smilansky, “Precalibration of Matrix-Assisted Laser Desorption/Ionization-Time of Flight Spectra for Peptide Mass Fingerprinting,” Proteomics, vol. 2, no. 10, pp. 1365-1373, , 2002. [15] K.R. Clauser, P. Baker, and A.L. Burlingame, “Role of Accurate Mass Measurement ($+/-$ 10 ppm) in Protein Identification Strategies Employing MS or MS/MS and Database Searching,” Analytical Chemistry, vol. 71, no. 14, pp. 2871-2882, July 1999. [16] V. Egelhofer, K. Büssow, C. Luebbert, H. Lehrach, and E. Nordhoff, “Improvements in Protein Identification by MALDI-TOF-MS Peptide Mapping,” Analytical Chemistry, vol. 72, no. 13, pp.2741-2750, July 2000. [17] T.J. Rivlin, An Introduction to the Approximation of Functions, reprint of 1969 ed. Dover, 1981. [18] S. Chattopadhyay and P. Das, “The $K{\hbox{-}}{\rm{Dense}}$ Corridor Problems,” Pattern Recognition Letters, vol. 11, no. 7, pp. 463-469, 1990. [19] F.Y. Chin, C.A. Wang, and F.L. Wang, “Maximum Stabbing Line in 2D Plane,” Proc. Ann. Int'l Conf. Computing and Combinatorics (COCOON '99), vol. 1627, pp. 379-388, 1999. [20] K.Q. Brown, “Geometric Transforms for Fast Geometric Algorithms,” Report CMUCS-80-101, Dept. of Computer Science, Carnegie Mellon Univ., 1980. [21] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf, Computational Geometry: Algorithms and Applications, second ed. Springer, 2000. [22] D.L. Souvaine and J.M. Steele, “Time- and Space-Efficient Algorithms for Least Median of Squares Regression,” J. Am. Statistical Assoc., vol. 82, no. 399, pp. 794-801, 1987. [23] H. Edelsbrunner and D.L. Souvaine, “Computing Least Median of Squares Regression Lines and Guided Topological Sweep,” J. Am. Statistical Assoc., vol. 85, no. 409, pp. 115-119, 1990. [24] H. Edelsbrunner and L.J. Guibas, “Topologically Sweeping an Arrangement,” J. Computer and System Sciences, vol. 38, no. 1, pp.165-194, 1989. [25] E. Rafalin, S. Souvaine, and I. Streinu, “Topological Sweep in Degenerate Cases,” Proc. Fourth Workshop Algorithm Eng. and Experiments (ALENEX '02), vol. 2409, pp. 577-588, 2002. [26] P.J. Rousseeuw, “Least Median of Squares Regression,” J. Am. Statistical Assoc., 1984. [27] H. Alt, K. Mehlhorn, H. Wagener, and E. Welzl, “Congruence, Similarity and Symmetries of Geometric Objects,” Discrete and Computational Geometry, vol. 3, no. 3, pp. 237-256, 1988. [28] R. Karp and S. Li, “Two Special Cases of the Assignment Problem,” Discrete Math., vol. 13, no. 2, pp. 129-142, 1975. [29] J. Colannino, M. Damian, F. Hurtado, J. Iacono, H. Meijer, S. Ramaswami, and G. Toussaint, “An $O(n\log n)\hbox{-}{\rm Time}$ Algorithm for the Restriction Scaffold Assignment Problem,” J. Computational Biology, vol. 13, no. 4, pp. 979-989, http://dx.doi.org/10.1002/pmic.200300792http:/ /dx.doi.org/3.0.CO;2-9http://www.liebertonline.com/ doi/abs/10.1089cmb.2006.13.979 , 2006. [30] Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, D. Sankoff and J.B. Kruskal, eds. Addison-Wesley, 1983. [31] E. Rafalin,“LMS Regression Using Guided Topological Sweep in Degenerate Cases,” http://www.cs.tufts.edu/research/geometry lms/, 2002. [32] M. Lefmann, C. Honisch, S. Boecker, N. Storm, F. von Wintzingerode, C. Schloetelburg, A. Moter, D. van den Boom, and U.B. Goebel, “A Novel Mass Spectrometry Based Tool for Genotypic Identification of Mycobacteria,” J. Clinical Microbiology, vol. 42, no. 1, pp. 339-346, 2004. [33] T.J. Rivlin, Chebyshev Polynomials: From Approximation Theory to Algebra and Number Theory. Wiley-Interscience, 1990. [34] C.A. Taatjes, N. Hansen, A. McIlroy, J.A. Miller, J.P. Senosiain, S.J. Klippenstein, F. Qi, L. Sheng, Y. Zhang, T.A. Cool, J. Wang, P.R. Westmoreland, M.E. Law, T. Kasper, and K. Kohse-Höinghaus, “Enols Are Common Intermediates in Hydrocarbon Oxidation,” Science, vol. 308, no. 5730, pp. 1887-1889, 2005.