2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2010)
San Francisco, CA, USA
June 13, 2010 to June 18, 2010
Marc'Aurelio Ranzato, Department of Computer Science - University of Toronto, 10 King's College Road, Toronto, Canada
Geoffrey E. Hinton, Department of Computer Science - University of Toronto, 10 King's College Road, Toronto, Canada
Learning a generative model of natural images is a useful way of extracting features that capture interesting regularities. Previous work on learning such models has focused on methods in which the latent features are used to determine the mean and variance of each pixel independently, or on methods in which the hidden units determine the covariance matrix of a zero-mean Gaussian distribution. In this work, we propose a probabilistic model that combines these two approaches into a single framework. We represent each image using one set of binary latent features that models the image-specific covariance and a separate set that models the mean. We show that this approach provides a probabilistic framework for the widely used simple-cell complex-cell architecture, that it produces very realistic samples of natural images, and that it extracts features that yield state-of-the-art recognition accuracy on the challenging CIFAR-10 dataset.
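The split between covariance units and mean units can be illustrated with a small numerical sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the parameter names (C, P, W), the identity term in the precision, and the sign convention (non-negative P) are hypothetical choices, not necessarily the parameterization used in the paper. The sketch only shows how one set of binary units can gate the precision matrix of a Gaussian over pixels while a second set shifts its mean.

```python
import numpy as np

def conditional_gaussian(C, P, W, h_cov, h_mean):
    """Mean and covariance of pixels given two sets of binary hidden units.

    Assumed (hypothetical) shapes:
      C: (D, F) filters, P: (F, K) non-negative pooling weights,
      W: (D, J) mean weights, h_cov: (K,), h_mean: (J,).
    """
    # Each active covariance unit adds weight to the factors it pools over.
    gate = P @ h_cov                                       # shape (F,)
    # Precision = I + C diag(gate) C^T; positive definite when gate >= 0.
    precision = np.eye(C.shape[0]) + (C * gate) @ C.T
    cov = np.linalg.inv(precision)
    # The mean units contribute a linear term that shifts the mean.
    mean = cov @ (W @ h_mean)
    return mean, cov

rng = np.random.default_rng(0)
D, F, K, J = 4, 3, 2, 2
C = rng.normal(size=(D, F))
P = np.abs(rng.normal(size=(F, K)))
W = rng.normal(size=(D, J))
h_mean = np.array([1.0, 0.0])

# With all covariance units off, the pixels are independent with unit variance.
mean_off, cov_off = conditional_gaussian(C, P, W, np.zeros(K), h_mean)
# With covariance units on, the covariance becomes image-specific.
mean_on, cov_on = conditional_gaussian(C, P, W, np.ones(K), h_mean)
```

In this sketch, switching covariance units on can only shrink variance along the directions spanned by the filters, whereas the mean units move the center of the distribution without changing its shape.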
M. Ranzato and G. E. Hinton, "Modeling pixel means and covariances using factorized third-order Boltzmann machines," 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 2010, pp. 2551-2558.