Efficient Peak-Labeling Algorithms for Whole-Sample Mass Spectrometry Proteomics
January-March 2010 (vol. 7 no. 1)
pp. 126-137
Richard Pelikan, University of Pittsburgh, Pittsburgh
Milos Hauskrecht, University of Pittsburgh, Pittsburgh
Whole-sample mass spectrometry (MS) proteomics allows parallel measurement of hundreds of proteins present in a variety of biospecimens. Unfortunately, the association between MS signals and these proteins is not straightforward. Interpreting mass spectra requires methods that accurately label the ion species in such profiles. To aid this process, we have developed a new peak-labeling procedure for associating protein and peptide labels with peaks. This computational method builds upon characteristics of the proteins expected to be in the sample, such as the amino acid sequence, molecular weight, and expected concentration within the sample. A new probabilistic score that incorporates this information is proposed. We evaluate and demonstrate our method's ability to label peaks, first on simulated mass spectra and then on mass spectra from human serum with a spiked-in calibration mixture.
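To make the idea of scoring a peak against expected proteins concrete, the following is a minimal sketch and not the score proposed in the article, whose exact form the abstract does not give. It assumes a toy score: a Gaussian mass-match term multiplied by a prior proportional to expected concentration. The function name `label_peaks`, the parameters `sigma` and `min_score`, and the candidate proteins are all hypothetical. It also assumes singly charged ions and compares peak m/z to protein mass directly, which is a simplification.

```python
import math

def label_peaks(peak_mzs, candidates, sigma=2.0, min_score=1e-6):
    """Give each peak the candidate with the highest toy score, or None.

    candidates: list of (name, mass_da, relative_concentration) tuples.
    sigma: assumed mass tolerance in Da (hypothetical parameter).
    min_score: scores at or below this leave the peak unlabeled.
    """
    total = sum(conc for _, _, conc in candidates)
    labels = []
    for mz in peak_mzs:
        best_name, best_score = None, min_score
        for name, mass, conc in candidates:
            # Toy score: Gaussian mass-match term times a concentration prior.
            match = math.exp(-0.5 * ((mz - mass) / sigma) ** 2)
            score = match * (conc / total)
            if score > best_score:
                best_name, best_score = name, score
        labels.append(best_name)
    return labels

# Hypothetical candidates: (name, mass in Da, relative concentration).
candidates = [
    ("protein_A", 1000.0, 0.7),
    ("protein_B", 1010.0, 0.1),
    ("protein_C", 5000.0, 0.2),
]
labels = label_peaks([1000.5, 1009.5, 5001.0, 8000.0], candidates)
print(labels)
```

In this sketch a peak far from every candidate mass stays unlabeled, and when two candidates lie within the mass tolerance of the same peak, the more abundant one tends to win. That is only one simple way to combine mass and concentration information.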

Index Terms:
Machine learning, biology and genetics, heuristics design.
Richard Pelikan, Milos Hauskrecht, "Efficient Peak-Labeling Algorithms for Whole-Sample Mass Spectrometry Proteomics," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 1, pp. 126-137, Jan.-March 2010, doi:10.1109/TCBB.2008.31