Faster Mass Spectrometry-Based Protein Inference: Junction Trees Are More Efficient than Sampling and Marginalization by Enumeration
Issue No. 03 - May-June (2012 vol. 9)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.26
W. S. Noble, Dept. of Genome Sci., Univ. of Washington, Seattle, WA, USA
O. Serang, Dept. of Pathology, Children's Hosp. Boston, Boston, MA, USA
The problem of identifying the proteins in a complex mixture using tandem mass spectrometry can be framed as an inference problem on a graph that connects peptides to proteins. Several existing protein identification methods make use of statistical inference methods for graphical models, including expectation maximization, Markov chain Monte Carlo, and full marginalization coupled with approximation heuristics. We show that, for this problem, the majority of the cost of inference usually comes from a few highly connected subgraphs. Furthermore, we evaluate three different statistical inference methods using a common graphical model, and we demonstrate that junction tree inference substantially improves rates of convergence compared to existing methods. The Python code used for this paper is available at http://noble.gs.washington.edu/proj/fido.
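The claim that a few highly connected subgraphs dominate the cost can be illustrated with a toy sketch. This is not the authors' code: the peptide and protein names are hypothetical, each protein is assumed to be a binary present/absent variable, and brute-force enumeration is assumed to visit every joint protein state within a connected component.

```python
from collections import defaultdict


def connected_components(edges):
    """Group proteins that share peptides, directly or transitively.

    `edges` is an iterable of (peptide, protein) pairs from a
    peptide-protein bipartite graph.
    """
    adjacency = defaultdict(set)
    for peptide, protein in edges:
        # Tag nodes by kind so a peptide and a protein never collide by name.
        adjacency[("pep", peptide)].add(("prot", protein))
        adjacency[("prot", protein)].add(("pep", peptide))

    seen = set()
    components = []
    for start in list(adjacency):
        if start in seen:
            continue
        # Iterative depth-first search over one component.
        stack = [start]
        seen.add(start)
        proteins = set()
        while stack:
            node = stack.pop()
            kind, name = node
            if kind == "prot":
                proteins.add(name)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(proteins)
    return components


def enumeration_cost(components):
    """Joint states visited per component under the binary-protein assumption."""
    return [2 ** len(proteins) for proteins in components]


# Hypothetical toy graph: two isolated proteins and one five-protein cluster.
edges = [
    ("pepA", "P1"),
    ("pepB", "P2"),
    ("pepC", "P3"), ("pepC", "P4"),
    ("pepD", "P4"), ("pepD", "P5"),
    ("pepE", "P5"), ("pepE", "P6"), ("pepE", "P7"),
]

components = connected_components(edges)
costs = enumeration_cost(components)
largest_share = max(costs) / sum(costs)
print(sorted(costs), round(largest_share, 2))
```

In this toy graph the single five-protein component accounts for 32 of the 36 enumerated states, which is the kind of imbalance the abstract describes. Splitting the graph into components first means the exponential cost is paid per component rather than over all proteins at once.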
trees (mathematics), biology computing, expectation-maximisation algorithm, inference mechanisms, Markov processes, mass spectroscopic chemical analysis, molecular biophysics, Monte Carlo methods, proteins, connected subgraphs, tandem mass spectrometry, mass spectrometry-based protein inference, protein identification method, statistical inference method, graphical models, expectation maximization, Markov chain Monte Carlo model, marginalization, approximation heuristics, junction tree inference, python code, Proteins, Peptides, Junctions, Databases, Computational modeling, Bioinformatics, Complexity theory, Bayesian inference, Mass spectrometry, protein identification, graphical models
W. S. Noble, O. Serang, "Faster Mass Spectrometry-Based Protein Inference: Junction Trees Are More Efficient than Sampling and Marginalization by Enumeration", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 3, pp. 809-817, May-June 2012, doi:10.1109/TCBB.2012.26