Using Gaussian Process with Test Rejection to Detect T-Cell Epitopes in Pathogen Genomes
October-December 2010 (vol. 7 no. 4)
pp. 741-751
Liwen You, University of Lund, Lund
Vladimir Brusic, Dana-Farber Cancer Institute, Boston
Marcus Gallagher, University of Queensland, St Lucia
Mikael Bodén, University of Queensland, St Lucia
A major challenge in the development of peptide-based vaccines is finding the right immunogenic element, with efficient and long-lasting immunization effects, among the large number of potential targets encoded by pathogen genomes. Computer models are convenient tools for scanning pathogen genomes to preselect candidate immunogenic peptides for experimental validation. Because true positives have a low prevalence, current methods predict many false positives. We develop a test rejection method based on the prediction uncertainty estimates determined by Gaussian process regression. This method filters false positives among predicted epitopes from a pathogen genome. The performance of stand-alone Gaussian process regression is compared to other state-of-the-art methods using cross validation on 11 benchmark data sets. The results show that the Gaussian process method has the same accuracy as the top performing algorithms. The combination of Gaussian process regression with the proposed test rejection method is used to detect true epitopes from the Vaccinia virus genome. The test rejection increases the prediction accuracy by reducing the number of false positives without sacrificing the method's sensitivity. We show that the Gaussian process in combination with test rejection is an effective method for prediction of T-cell epitopes in large and diverse pathogen genomes, where false positives are of concern.
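
The idea of rejecting predictions based on Gaussian process uncertainty can be illustrated with a minimal sketch. The abstract does not specify the kernel, the peptide encoding, or the exact rejection criterion, so everything below is an assumption made for illustration: a squared-exponential kernel, one-dimensional numeric inputs standing in for encoded peptides, and hypothetical helper names (`rbf_kernel`, `gp_predict`, `predict_with_rejection`) with hypothetical thresholds. It is not the authors' implementation.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between the rows of a and b (assumed kernel choice).
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise=1e-2, length_scale=1.0):
    # Standard GP regression: posterior mean and variance of the latent function.
    # The training targets are centered so that predictions far from the data
    # revert to the training mean rather than to zero.
    y_mean = y_train.mean()
    k = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_s = rbf_kernel(x_train, x_test, length_scale)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = y_mean + k_s.T @ alpha
    v = np.linalg.solve(chol, k_s)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance k(x, x) is 1 for this kernel
    return mean, np.maximum(var, 0.0)

def predict_with_rejection(mean, var, score_threshold, max_std):
    # Call a test point positive only if its predicted score is high AND
    # its predictive standard deviation is small; uncertain points are rejected.
    confident = np.sqrt(var) <= max_std
    return (mean >= score_threshold) & confident

# Toy data: three training points with high scores, one test point near the
# training data and one far from it.
x_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.array([0.9, 0.8, 0.7])
x_test = np.array([[1.0], [10.0]])

mean, var = gp_predict(x_train, y_train, x_test)
accepted = predict_with_rejection(mean, var, score_threshold=0.5, max_std=0.5)
print(mean, var, accepted)
```

In this toy setting both test points receive a predicted score above the threshold, but the point far from the training data has a predictive variance close to the prior variance and is rejected. This mirrors, in a simplified form, how uncertainty estimates can be used to filter likely false positives.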

Index Terms:
Immunology, amino acid sequence, epitope, machine learning, Gaussian processes, regression, false positives.
Liwen You, Vladimir Brusic, Marcus Gallagher, Mikael Bodén, "Using Gaussian Process with Test Rejection to Detect T-Cell Epitopes in Pathogen Genomes," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 4, pp. 741-751, Oct.-Dec. 2010, doi:10.1109/TCBB.2008.131