Issue No.04 - July/August (2011 vol.8)
pp: 1054-1066
Peng Zhang, University of Science and Technology of China, Hefei
Houqiang Li, University of Science and Technology of China, Hefei
Honghui Wang, National Institutes of Health, Bethesda
Stephen T.C. Wong, The Methodist Hospital Research Institute, Weill Cornell Medical College, Houston
Xiaobo Zhou, The Methodist Hospital Research Institute, Weill Cornell Medical College, Houston
ABSTRACT
Peak detection is one of the most important steps in mass spectrometry (MS) analysis. However, the detection result is strongly affected by severe spectrum variations. Most current peak detection methods are neither flexible enough to revise false detections nor robust enough to resist spectrum variations. To improve flexibility, we introduce the peak tree to represent the peak information in MS spectra. Each tree node is a peak judgment over a range of scales, and each tree decomposition, as a set of nodes, is a candidate peak detection result. To improve robustness, we combine peak detection and common peak alignment into a closed-loop framework, which finds the optimal decomposition using both peak intensity and common peak information. The common peak information is derived from, and iteratively refined by, density clustering of the latest peak detection result. Finally, we present an improved ant colony optimization method for biomarker selection to build a complete MS analysis system. Experiments show that our peak detection method resists spectrum variations better and provides higher sensitivity and lower false detection rates than conventional methods. The benefits of our peak-tree-based system for MS disease analysis are also demonstrated on real SELDI data.
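The multiscale idea behind the peak tree can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes plain Gaussian scale-space smoothing, a synthetic spectrum, and a nearest-neighbor rule for linking fine-scale maxima to coarse-scale ones. The function names (`gaussian_smooth`, `local_maxima`, `link_to_coarser`) and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np

def gaussian_smooth(y, sigma):
    """Smooth y with a Gaussian kernel truncated at about 4 sigma."""
    radius = int(4 * sigma) + 1
    k = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (k / sigma) ** 2)
    kernel /= kernel.sum()
    return np.convolve(y, kernel, mode="same")

def local_maxima(y, rel_height=1e-6):
    """Indices of strict local maxima higher than rel_height * max(y)."""
    mid = y[1:-1]
    is_peak = (mid > y[:-2]) & (mid > y[2:]) & (mid > rel_height * y.max())
    return np.flatnonzero(is_peak) + 1

def link_to_coarser(fine, coarse):
    """Attach each fine-scale maximum to the nearest coarse-scale maximum.

    This nearest-neighbor linking is an assumption of the sketch,
    not the construction used in the paper.
    """
    children = {int(c): [] for c in coarse}
    for f in fine:
        parent = coarse[np.argmin(np.abs(coarse - f))]
        children[int(parent)].append(int(f))
    return children

# Synthetic spectrum (made-up data): two overlapping peaks and one isolated peak.
x = np.arange(400)
spectrum = sum(np.exp(-0.5 * ((x - c) / 3.0) ** 2) for c in (100, 112, 300))

fine = local_maxima(gaussian_smooth(spectrum, sigma=1.0))     # small scale
coarse = local_maxima(gaussian_smooth(spectrum, sigma=10.0))  # large scale
tree = link_to_coarser(fine, coarse)

for parent, kids in tree.items():
    print(f"coarse peak at {parent} covers fine peaks {kids}")
```

In this sketch the two overlapping peaks are resolved at the small scale and merge into a single maximum at the large scale, so one coarse node has two children. Choosing between the parent and its children is the kind of per-node decision that a tree decomposition encodes.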
INDEX TERMS
Mass spectrometry, peak identification, peak tree, scale-space filtering, wavelets, feature selection.
CITATION
Peng Zhang, Houqiang Li, Honghui Wang, Stephen T.C. Wong, Xiaobo Zhou, "Peak Tree: A New Tool for Multiscale Hierarchical Representation and Peak Detection of Mass Spectrometry Data," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 4, pp. 1054-1066, July/August 2011, doi:10.1109/TCBB.2009.56