Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)
Curitiba, Parana, Brazil
Sept. 23, 2007 to Sept. 26, 2007
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDAR.2007.35
M. Ranzato, New York University - New York, NY
Y. LeCun, New York University - New York, NY
We describe an unsupervised learning algorithm for extracting sparse and locally shift-invariant features. We also devise a principled procedure for learning hierarchies of invariant features. Each feature detector is composed of a set of trainable convolutional filters followed by a max-pooling layer over non-overlapping windows, and a point-wise sigmoid non-linearity. A second stage of more invariant features is fed with patches provided by the first-stage feature extractor, and is trained in the same way. The method is used to pre-train the first four layers of a deep convolutional network which achieves state-of-the-art performance on the MNIST dataset of handwritten digits. The final testing error rate is equal to 0.42%. Preliminary experiments on compression of bitonal document images show very promising results in terms of compression ratio and reconstruction error.
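As a rough illustration of the feature detector described in the abstract (convolutional filters, then max-pooling over non-overlapping windows, then a point-wise sigmoid), the forward pass of one stage might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the plain list-of-lists image representation, the absence of bias or gain terms, and the example filter are all hypothetical. The unsupervised training procedure and the sparsity mechanism are not shown.

```python
import math


def correlate_valid(image, kernel):
    """'Valid' 2-D cross-correlation of a list-of-lists image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(fmap, size):
    """Max over non-overlapping size x size windows; ragged borders are dropped."""
    rows = len(fmap) // size
    cols = len(fmap[0]) // size
    return [
        [
            max(fmap[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def sigmoid(x):
    """Point-wise logistic non-linearity (assumes moderate input magnitudes)."""
    return 1.0 / (1.0 + math.exp(-x))


def feature_stage(image, kernels, pool_size):
    """Apply filter -> max-pool -> sigmoid for each kernel; returns one map per kernel."""
    maps = []
    for kernel in kernels:
        pooled = max_pool(correlate_valid(image, kernel), pool_size)
        maps.append([[sigmoid(v) for v in row] for row in pooled])
    return maps


# Hypothetical example: a single active pixel, shifted by one position
# within the same pooling window, yields the same pooled feature map.
kernel = [[1, 0], [0, 0]]
image_a = [[0] * 5 for _ in range(5)]
image_a[1][1] = 1
image_b = [[0] * 5 for _ in range(5)]
image_b[0][0] = 1
features_a = feature_stage(image_a, [kernel], pool_size=2)
features_b = feature_stage(image_b, [kernel], pool_size=2)
```

In this toy case `features_a` and `features_b` are identical, because the max over each non-overlapping window discards the position of the response within that window. That is the sense in which the pooled features are locally shift-invariant; a shift that crosses a window boundary would change the output.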
M. Ranzato and Y. LeCun, "A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images," Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Curitiba, Parana, Brazil, 2007, pp. 1213-1217.