IEEE/ACM Transactions on Computational Biology and Bioinformatics

Issue No.01 - January-March (2008 vol.5)

pp: 91-100

ABSTRACT

Mass spectrometry has become one of the most popular analysis techniques in Proteomics and Systems Biology. With the creation of larger datasets, the automated recalibration of mass spectra becomes important to ensure that every peak in the sample spectrum is correctly assigned to some peptide and protein. Algorithms for recalibrating mass spectra have to be robust with respect to wrongly assigned peaks, as well as efficient, given the amount of mass spectrometry data. The recalibration of mass spectra leads us to the problem of finding an optimal matching between mass spectra under measurement errors. We have developed two deterministic methods that allow robust computation of such a matching: The first approach uses a computational geometry interpretation of the problem, and tries to find two parallel lines with constant distance that stab a maximal number of points in the plane. The second approach is based on finding a maximal common approximate subsequence, and improves on existing algorithms by one order of magnitude by exploiting the sequential nature of the matching problem. We compare our results to a computational geometry algorithm using a topological line-sweep.
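The geometric view in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: it assumes the recalibration is a pure shift (slope fixed at 1), so the two parallel lines become y = x + b - eps and y = x + b + eps, and each point (m, r) pairs a measured mass m with a reference mass r. The function name `best_shift` and the tolerance parameter `eps` are hypothetical. The sketch counts pairs inside the strip and does not enforce a one-to-one peak matching.

```python
from itertools import product


def best_shift(measured, reference, eps):
    """Find a shift b that puts the most (measured, reference) pairs in a strip.

    Simplifying assumption (not from the paper): slope is fixed at 1, so a pair
    (m, r) lies in the strip iff |r - (m + b)| <= eps.
    Returns (b, count); (0.0, 0) if either list is empty.
    """
    # Each pair contributes one candidate shift d = r - m.
    diffs = sorted(r - m for m, r in product(measured, reference))
    if not diffs:
        return 0.0, 0

    best_count, best_b = 0, 0.0
    lo = 0
    # Sliding window over sorted shifts: keep a window of width <= 2 * eps.
    for hi in range(len(diffs)):
        while diffs[hi] - diffs[lo] > 2 * eps:
            lo += 1
        count = hi - lo + 1
        if count > best_count:
            best_count = count
            # The midpoint of the window is within eps of every shift in it.
            best_b = (diffs[lo] + diffs[hi]) / 2
    return best_b, best_count


if __name__ == "__main__":
    measured = [100.0, 250.0, 430.0]
    reference = [100.5, 250.5, 430.6, 900.0]  # 900.0 acts as an unmatched peak
    b, count = best_shift(measured, reference, eps=0.1)
    print(f"shift = {b:.3f}, pairs in strip = {count}")
```

Because all pairwise differences are sorted, this sketch runs in O(nm log(nm)) time for n measured and m reference peaks. Fitting the slope as well, which the abstract's general formulation requires, needs a more careful search than is shown here.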

INDEX TERMS

biotechnology, mass spectrometry, combinatorial pattern matching, computational geometry

CITATION

Sebastian Böcker, Veli Mäkinen, "Combinatorial Approaches for Mass Spectra Recalibration",

*IEEE/ACM Transactions on Computational Biology and Bioinformatics*, vol. 5, no. 1, pp. 91-100, January-March 2008, doi:10.1109/tcbb.2007.1077