Probabilistic Mixture Regression Models for Alignment of LC-MS Data
September/October 2011 (vol. 8 no. 5)
pp. 1417-1424
Getachew K. Befekadu, Georgetown University Medical Center, Washington DC
Mahlet G. Tadesse, Georgetown University, Washington DC
Tsung-Heng Tsai, Virginia Polytechnic Institute and State University, Arlington
Habtom W. Ressom, Georgetown University Medical Center, Washington DC
A novel probabilistic mixture regression model (PMRM) framework is presented for aligning liquid chromatography-mass spectrometry (LC-MS) data with respect to retention time (RT) points. The expectation-maximization algorithm is used to jointly estimate the parameters of spline-based mixture regression models and prior transformation density models. The latter account for the variability in RT points and peak intensities. The applicability of PMRM to the alignment of LC-MS data is demonstrated on three data sets. The performance of PMRM is compared with that of other alignment approaches, including dynamic time warping, correlation optimized warping, and the continuous profile model, in terms of the coefficient of variation of replicate LC-MS runs and the accuracy in detecting differentially abundant peptides/proteins.
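
As a rough illustration of how expectation-maximization fits a mixture regression model, the sketch below estimates a mixture of straight-line regressions in plain Python. It is not the PMRM described in the abstract: it uses lines instead of splines and has no prior transformation density for RT shifts or intensity scaling. The function name `em_mixture_of_lines` and all of its parameters are hypothetical, chosen only for this example.

```python
import math
import random


def em_mixture_of_lines(xs, ys, n_components=2, n_iter=50, seed=0):
    """EM for a mixture of straight-line regressions y = a_k + b_k * x + noise.

    Returns (params, resp, loglik_trace), where params[k] is
    (mixing_weight, intercept, slope, variance). Illustrative only.
    """
    rng = random.Random(seed)
    n = len(xs)
    # Random soft assignments (responsibilities); each row sums to 1.
    resp = []
    for _ in range(n):
        row = [rng.random() + 0.01 for _ in range(n_components)]
        total = sum(row)
        resp.append([r / total for r in row])

    params, loglik_trace = [], []
    for _ in range(n_iter):
        # M-step: weighted least squares and weighted variance per component.
        params = []
        for k in range(n_components):
            w = [row[k] for row in resp]
            sw = max(sum(w), 1e-12)
            mx = sum(wi * x for wi, x in zip(w, xs)) / sw
            my = sum(wi * y for wi, y in zip(w, ys)) / sw
            sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
            sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
            slope = sxy / sxx if sxx > 1e-12 else 0.0
            intercept = my - slope * mx
            sse = sum(wi * (y - intercept - slope * x) ** 2
                      for wi, x, y in zip(w, xs, ys))
            params.append((sw / n, intercept, slope, max(sse / sw, 1e-8)))

        # E-step: posterior component probabilities, computed in log space
        # so that very small densities do not underflow to zero.
        loglik = 0.0
        for i in range(n):
            logs = []
            for weight, intercept, slope, var in params:
                r = ys[i] - intercept - slope * xs[i]
                logs.append(math.log(max(weight, 1e-300))
                            - 0.5 * math.log(2.0 * math.pi * var)
                            - 0.5 * r * r / var)
            top = max(logs)
            expd = [math.exp(v - top) for v in logs]
            total = sum(expd)
            resp[i] = [e / total for e in expd]
            loglik += top + math.log(total)
        loglik_trace.append(loglik)
    return params, resp, loglik_trace
```

Each iteration alternates a weighted least-squares fit per component (M-step) with a recomputation of the posterior membership probabilities (E-step). The returned log-likelihood trace should be non-decreasing, which is a convenient sanity check when adapting the sketch. Like any EM procedure, it can settle in a local optimum, so results depend on the initial soft assignments.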

Index Terms:
Liquid chromatography, mass spectrometry, mixed-regression model, expectation-maximization.
Getachew K. Befekadu, Mahlet G. Tadesse, Tsung-Heng Tsai, Habtom W. Ressom, "Probabilistic Mixture Regression Models for Alignment of LC-MS Data," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 5, pp. 1417-1424, Sept.-Oct. 2011, doi:10.1109/TCBB.2010.88