Issue No.08 - Aug. (2013 vol.35)
pp: 1872-1886
Joan Bruna, Courant Inst., New York Univ., New York, NY, USA
S. Mallat, Ecole Normale Super., Paris, France
ABSTRACT
A wavelet scattering network computes a translation invariant image representation which is stable to deformations and preserves high-frequency information for classification. It cascades wavelet transform convolutions with nonlinear modulus and averaging operators. The first network layer outputs SIFT-type descriptors, whereas the next layers provide complementary invariant information that improves classification. The mathematical analysis of wavelet scattering networks explains important properties of deep convolution networks for classification. A scattering representation of stationary processes incorporates higher order moments and can thus discriminate textures having the same Fourier power spectrum. State-of-the-art classification results are obtained for handwritten digits and texture discrimination, with a Gaussian kernel SVM and a generative PCA classifier.
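The cascade described in the abstract (wavelet convolution, then modulus, then averaging) can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the paper treats images, whereas this sketch uses a 1D signal, circular FFT convolutions, and Gaussian band-pass filters as a rough stand-in for wavelets. The names `gaussian_bandpass`, `gaussian_lowpass`, and `scattering_1d`, and the parameters `num_scales` and `avg_width`, are all assumptions made for illustration.

```python
import numpy as np

def gaussian_bandpass(n, center, width):
    """Frequency response of a hypothetical wavelet-like band-pass filter.

    Supported on positive frequencies only, so the filtered signal is
    complex and its modulus acts as an envelope. It is only approximately
    zero at frequency 0, unlike a true wavelet.
    """
    freqs = np.fft.fftfreq(n)  # cycles per sample, in [-0.5, 0.5)
    return np.exp(-0.5 * ((freqs - center) / width) ** 2)

def gaussian_lowpass(n, width):
    """Frequency response of the averaging (low-pass) filter."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * (freqs / width) ** 2)

def scattering_1d(x, num_scales=4, avg_width=0.01):
    """Order 0, 1 and 2 scattering-style coefficients of a 1D signal.

    Returns an array of shape (num_paths, len(x)), one row per path.
    """
    n = len(x)
    # Dyadic filter bank: center frequency and bandwidth halve at each scale.
    psis = [gaussian_bandpass(n, 0.25 / 2**j, 0.1 / 2**j)
            for j in range(num_scales)]
    phi = gaussian_lowpass(n, avg_width)

    def conv(signal, filt_hat):
        # Circular convolution computed in the Fourier domain.
        return np.fft.ifft(np.fft.fft(signal) * filt_hat)

    def average(signal):
        return np.real(conv(signal, phi))

    order0 = [average(x)]
    order1, order2 = [], []
    for j1, psi1 in enumerate(psis):
        u1 = np.abs(conv(x, psi1))          # wavelet convolution + modulus
        order1.append(average(u1))          # first-layer output
        # Assumption: only coarser second-layer scales are cascaded,
        # since the envelope u1 varies more slowly than x.
        for j2 in range(j1 + 1, num_scales):
            u2 = np.abs(conv(u1, psis[j2]))
            order2.append(average(u2))      # second-layer output
    return np.stack(order0 + order1 + order2)
```

Because every step is a circular convolution or a pointwise modulus, shifting the input circularly shifts each output row by the same amount, and the mean of each row over time is unchanged. This is only a toy version of the translation invariance the abstract refers to; it says nothing about stability to deformations.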
INDEX TERMS
Scattering, Convolution, Fourier transforms, Wavelet coefficients, Computer architecture, wavelets, Classification, convolution networks, deformations, invariants
CITATION
Joan Bruna, S. Mallat, "Invariant Scattering Convolution Networks", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 8, pp. 1872-1886, Aug. 2013, doi:10.1109/TPAMI.2012.230