Recognizing Handwritten Digits Using Hierarchical Products of Experts
February 2002 (vol. 24 no. 2)
pp. 189-197

Abstract: The product of experts learning procedure can discover a set of stochastic binary features that constitute a nonlinear generative model of handwritten images of digits. The quality of generative models learned in this way can be assessed by learning a separate model for each class of digit and then comparing the unnormalized probabilities of test images under the 10 different class-specific models. To improve discriminative performance, a hierarchy of separate models can be learned for each digit class. Each model in the hierarchy learns a layer of binary feature detectors that model the probability distribution of vectors of activity of feature detectors in the layer below. The models in the hierarchy are trained sequentially, and each model uses a layer of binary feature detectors to learn a generative model of the patterns of feature activities in the preceding layer. After training, each layer of feature detectors produces a separate, unnormalized log probability score. With three layers of feature detectors for each of the 10 digit classes, a test image produces 30 scores, which can be used as inputs to a supervised, logistic classification network that is trained on separate data. On the MNIST database, our system is comparable with current state-of-the-art discriminative methods, demonstrating that the product of experts learning procedure can produce effective hierarchies of generative models of high-dimensional data.
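The scoring scheme described in the abstract can be illustrated with a minimal NumPy sketch. It rests on several assumptions that the abstract does not confirm: each layer is modeled as a binary restricted Boltzmann machine (one common form of product of experts), training uses a single step of contrastive divergence, hidden probabilities rather than sampled binary states are passed to the next layer, and random binary vectors stand in for digit images. All names (`RBM`, `cd1_step`, `train_hierarchy`, `hierarchy_scores`) and all sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary RBM, assumed here as a stand-in for one product-of-experts layer."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def score(self, v):
        """Unnormalized log probability of each row of v.

        log sum_h exp(-E(v, h)) = v.b + sum_j log(1 + exp(c_j + v.W[:, j]))
        """
        return v @ self.b + np.logaddexp(0.0, v @ self.W + self.c).sum(axis=1)

    def cd1_step(self, v0, lr=0.1):
        """One contrastive divergence (CD-1) update on a batch v0."""
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)  # one-step reconstruction
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)

def train_hierarchy(data, layer_sizes, rng, epochs=5):
    """Train layers one after another; each models the activities of the layer below."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x)
        layers.append(rbm)
        x = rbm.hidden_probs(x)  # assumption: pass probabilities upward
    return layers

def hierarchy_scores(layers, v):
    """Return one unnormalized log probability score per layer, per input row."""
    scores, x = [], v
    for rbm in layers:
        scores.append(rbm.score(x))
        x = rbm.hidden_probs(x)
    return np.stack(scores, axis=1)  # shape (n_samples, n_layers)

rng = np.random.default_rng(0)
n_classes, n_pixels = 10, 64  # toy sizes, not MNIST dimensions

# One three-layer hierarchy per digit class, trained on placeholder data.
models = []
for _ in range(n_classes):
    fake_images = (rng.random((20, n_pixels)) < 0.3).astype(float)
    models.append(train_hierarchy(fake_images, [16, 16, 16], rng, epochs=3))

# 10 classes x 3 layers = 30 scores per test image.
test_images = (rng.random((5, n_pixels)) < 0.3).astype(float)
features = np.concatenate([hierarchy_scores(m, test_images) for m in models], axis=1)
print(features.shape)  # (5, 30)
```

In this sketch, each row of `features` plays the role of the 30 scores that the abstract feeds into a supervised logistic classification network trained on separate data. The scores are unnormalized because the partition function of each model is never computed, which is why a trained classifier, rather than a direct comparison, is used to combine them.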


Index Terms:
Neural networks, products of experts, handwriting recognition, feature extraction, shape recognition, Boltzmann machines, model-based recognition, generative models.
Citation:
Guy Mayraz, Geoffrey E. Hinton, "Recognizing Handwritten Digits Using Hierarchical Products of Experts," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 2, pp. 189-197, Feb. 2002, doi:10.1109/34.982899