Bayesian Classification With Gaussian Processes
December 1998 (vol. 20 no. 12)
pp. 1342-1351

Abstract: We consider the problem of assigning an input vector to one of m classes by predicting P(c|$\mathbf{x}$) for c = 1, ..., m. For a two-class problem, the probability of class one given $\mathbf{x}$ is estimated by σ(y($\mathbf{x}$)), where σ(y) = 1/(1 + e^{-y}). A Gaussian process prior is placed on y($\mathbf{x}$) and combined with the training data to obtain predictions for new $\mathbf{x}$ points. We provide a Bayesian treatment, integrating over uncertainty in y and in the parameters that control the Gaussian process prior; the necessary integration over y is carried out using Laplace's approximation. The method is generalized to multiclass problems (m > 2) using the softmax function. We demonstrate the effectiveness of the method on a number of datasets.
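
As an illustration only, and not the authors' implementation, the two-class procedure outlined in the abstract might be sketched in Python with NumPy as follows. Several simplifying assumptions are made here that the abstract does not specify: a squared-exponential covariance with fixed hyperparameters (the paper instead integrates over them), targets coded as 0/1, a plain Newton iteration to find the mode used by Laplace's approximation, and a predictive probability taken as σ of the predictive mean rather than an average over the approximate posterior of y. The function names rbf_kernel, laplace_mode, and predict are hypothetical.

```python
import numpy as np

def sigmoid(y):
    """Logistic function sigma(y) = 1 / (1 + exp(-y))."""
    return 1.0 / (1.0 + np.exp(-y))

def softmax(y):
    """Multiclass counterpart of the sigmoid (shifted for numerical stability)."""
    e = np.exp(y - np.max(y))
    return e / e.sum()

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Assumed covariance function: squared exponential between rows of a and b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def laplace_mode(K, t, n_iter=50, tol=1e-8):
    """Newton iteration for the mode of p(y | t) under a GP prior with covariance K."""
    n = len(t)
    y = np.zeros(n)
    for _ in range(n_iter):
        pi = sigmoid(y)
        W = pi * (1.0 - pi)          # diagonal of the negative log-likelihood Hessian
        b = W * y + (t - pi)
        # Newton update: y_new = K (I + W K)^{-1} (W y + t - pi)
        y_new = K @ np.linalg.solve(np.eye(n) + W[:, None] * K, b)
        converged = np.max(np.abs(y_new - y)) < tol
        y = y_new
        if converged:
            break
    return y

def predict(X_train, t, X_test, **kernel_args):
    """Approximate P(class 1 | x) at the test inputs (hyperparameters held fixed)."""
    K = rbf_kernel(X_train, X_train, **kernel_args) + 1e-8 * np.eye(len(t))
    y_hat = laplace_mode(K, t)
    k_star = rbf_kernel(X_train, X_test, **kernel_args)
    # Predictive mean of y at the test points under the Laplace approximation.
    mean = k_star.T @ (t - sigmoid(y_hat))
    return sigmoid(mean)

# Toy one-dimensional data: class 0 on the left, class 1 on the right.
X_train = np.array([[-2.0], [-1.0], [1.0], [2.0]])
t_train = np.array([0.0, 0.0, 1.0, 1.0])
X_test = np.array([[-1.5], [1.5]])
print(predict(X_train, t_train, X_test))
```

On this toy data the first test point should receive a class-one probability below 0.5 and the second a probability above 0.5. A fuller treatment along the lines of the abstract would also average over the posterior uncertainty in y and over the covariance parameters, which this sketch omits.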

Index Terms:
Gaussian processes, classification problems, parameter uncertainty, Markov chain Monte Carlo, hybrid Monte Carlo, Bayesian classification.
Citation:
Christopher K.I. Williams, David Barber, "Bayesian Classification With Gaussian Processes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 12, pp. 1342-1351, Dec. 1998, doi:10.1109/34.735807