2009 IEEE Conference on Computer Vision and Pattern Recognition (2009)
Miami, FL, USA
June 20, 2009 to June 25, 2009
ISBN: 978-1-4244-3992-8
pp: 1605-1612
K. Kavukcuoglu, Courant Inst. of Math. Sci., New York Univ., New York, NY, USA
M.A. Ranzato, Courant Inst. of Math. Sci., New York Univ., New York, NY, USA
R. Fergus, Courant Inst. of Math. Sci., New York Univ., New York, NY, USA
Yann LeCun, Courant Inst. of Math. Sci., New York Univ., New York, NY, USA
ABSTRACT
Several recently proposed architectures for high-performance object recognition are composed of two main stages: a feature extraction stage that extracts locally-invariant feature vectors from regularly spaced image patches, and a somewhat generic supervised classifier. The first stage is often composed of three main modules: (1) a bank of filters (often oriented edge detectors); (2) a non-linear transform, such as a point-wise squashing function, quantization, or normalization; (3) a spatial pooling operation which combines the outputs of similar filters over neighboring regions. We propose a method that automatically learns such feature extractors in an unsupervised fashion by simultaneously learning the filters and the pooling units that combine multiple filter outputs together. The method automatically generates topographic maps of similar filters that extract features of orientations, scales, and positions. These similar filters are pooled together, producing locally-invariant outputs. The learned feature descriptors give results comparable to SIFT on image recognition tasks for which SIFT is well suited, and better results than SIFT on tasks for which SIFT is less well suited.
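The three-module first stage described in the abstract (filter bank, point-wise non-linearity, spatial pooling) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' method: the filters are hand-written edge detectors rather than learned ones, tanh stands in for the squashing function, and the pooling is an L2 norm taken over a small spatial region and across the whole group of filters. All function names are hypothetical.

```python
import math


def correlate_valid(image, kernel):
    """Module 1: 'valid' 2-D cross-correlation of an image with one filter."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def squash(feature_map):
    """Module 2: point-wise squashing non-linearity (tanh is an assumption)."""
    return [[math.tanh(v) for v in row] for row in feature_map]


def l2_pool(feature_maps, size):
    """Module 3: pool over non-overlapping size x size regions and across
    all maps in the group (L2 pooling is an assumption of this sketch)."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    pooled = []
    for i in range(0, h - size + 1, size):
        row = []
        for j in range(0, w - size + 1, size):
            total = sum(
                fm[i + a][j + b] ** 2
                for fm in feature_maps
                for a in range(size) for b in range(size)
            )
            row.append(math.sqrt(total))
        pooled.append(row)
    return pooled


def extract_features(image, filters, pool_size=2):
    """Chain the three modules: filter bank -> squashing -> pooling."""
    maps = [squash(correlate_valid(image, f)) for f in filters]
    return l2_pool(maps, pool_size)


# Hand-written 2x2 edge detectors (illustrative only, not learned filters).
FILTERS = [
    [[-1, 1], [-1, 1]],   # responds to vertical edges
    [[-1, -1], [1, 1]],   # responds to horizontal edges
]

# A 5x5 image with a single vertical edge between columns 1 and 2.
IMAGE = [[0, 0, 1, 1, 1] for _ in range(5)]

features = extract_features(IMAGE, FILTERS)
```

Because each pooled value combines several filter responses over a neighborhood, shifting the edge by one pixel inside a pooling region leaves that output unchanged, which is the kind of local invariance the abstract refers to.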
INDEX TERMS
image recognition, invariant feature vectors, topographic filter maps, object recognition, feature extraction, spaced image patches, generic supervised classifier, squashing function, quantization, spatial pooling operation, learned feature descriptors
CITATION

K. Kavukcuoglu, M. Ranzato, R. Fergus and Y. LeCun, "Learning invariant features through topographic filter maps," 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 2009, pp. 1605-1612.
doi:10.1109/CVPRW.2009.5206545