Deep learning-based MSMS spectra reduction in support of running multiple protein search engines on cloud
2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2017)
Kansas City, MO, USA
Nov. 13, 2017 to Nov. 16, 2017
Majdi Maabreh, Department of Computer Science, Western Michigan University, Kalamazoo, MI, USA
Basheer Qolomany, Department of Computer Science, Western Michigan University, Kalamazoo, MI, USA
Izzat Alsmadi, Department of Computing and Cyber Security, Texas A&M University, San Antonio, TX, USA
Ajay Gupta, Department of Computer Science, Western Michigan University, Kalamazoo, MI, USA
Protein search engines differ in their matching algorithms, their results overlap little, and their coverage varies, all of which encourages the proteomics community to use ensembles of several search engines. Advances in cloud computing and the availability of distributed processing clusters can support this task. However, transferring the data and combining the results could become the major bottleneck. A flood of billions of observed mass spectra, amounting to hundreds of gigabytes or potentially terabytes of data, could easily congest the network, increase the risk of failure, degrade performance, raise computation costs, and waste available resources. In this study, we therefore propose a deep learning model to reduce traffic over the cloud network and thus the cost of cloud computing. The model, which uses the top 50 intensities of each spectrum and their m/z values, removes any spectrum that is predicted not to pass the majority vote of the participating search engines. Our results, obtained with three search engines (pFind, Comet and X!Tandem) and four different datasets, are promising and encourage investment in deep learning for this type of big data problem.
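The reduction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the feature layout (intensities first, then m/z values), the zero-padding of spectra with fewer than 50 peaks, and the `predict_pass` callable standing in for the trained deep learning model are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Spectrum = Tuple[Sequence[float], Sequence[float]]  # (m/z values, intensities)


def top_k_peaks(mz: Sequence[float], intensity: Sequence[float], k: int = 50) -> List[float]:
    """Build a fixed-length feature vector: k intensities followed by their k m/z values."""
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    # Sort peaks by intensity, strongest first, and keep at most k of them.
    peaks = sorted(zip(intensity, mz), reverse=True)[:k]
    # Assumption: spectra with fewer than k peaks are padded with zeros.
    peaks += [(0.0, 0.0)] * (k - len(peaks))
    return [p[0] for p in peaks] + [p[1] for p in peaks]


def passes_majority(engine_hits: Sequence[bool]) -> bool:
    """Training label: True when more than half of the engines identified the spectrum."""
    return sum(engine_hits) * 2 > len(engine_hits)


def reduce_spectra(
    spectra: Sequence[Spectrum],
    predict_pass: Callable[[List[float]], bool],  # hypothetical trained classifier
    k: int = 50,
) -> List[Spectrum]:
    """Keep only the spectra that the model predicts will pass the majority vote."""
    return [s for s in spectra if predict_pass(top_k_peaks(s[0], s[1], k))]
```

In this sketch, `passes_majority` would label training spectra from the outputs of the three search engines, and `reduce_spectra` would run before upload so that only spectra predicted to pass are sent to the cloud.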
Search engines, Cloud computing, Proteins, Peptides, Machine learning, Databases, Proteomics
M. Maabreh, B. Qolomany, I. Alsmadi and A. Gupta, "Deep learning-based MSMS spectra reduction in support of running multiple protein search engines on cloud," 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Kansas City, MO, USA, 2017, pp. 1909-1914.