Issue No.08 - Aug. (2013 vol.35)
pp: 1872-1886
Joan Bruna, Courant Inst., New York Univ., New York, NY, USA
S. Mallat, Ecole Normale Super., Paris, France
A wavelet scattering network computes a translation-invariant image representation that is stable to deformations and preserves high-frequency information for classification. It cascades wavelet transform convolutions with nonlinear modulus and averaging operators. The first network layer outputs SIFT-type descriptors, whereas the next layers provide complementary invariant information that improves classification. The mathematical analysis of wavelet scattering networks explains important properties of deep convolution networks for classification. A scattering representation of stationary processes incorporates higher-order moments and can thus discriminate textures having the same Fourier power spectrum. State-of-the-art classification results are obtained for handwritten digits and texture discrimination, with a Gaussian kernel SVM and a generative PCA classifier.
Scattering, Convolution, Fourier transforms, Wavelet coefficients, Computer architecture, wavelets, Classification, convolution networks, deformations, invariants
Joan Bruna, S. Mallat, "Invariant Scattering Convolution Networks", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 8, pp. 1872-1886, Aug. 2013, doi:10.1109/TPAMI.2012.230
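The cascade described in the abstract (wavelet convolution, then modulus, then averaging) can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: it assumes a 1D signal rather than an image, uses Gaussian band-pass filters as stand-ins for the wavelets, applies circular FFT convolutions without subsampling, and keeps only second-order paths with decreasing center frequency. The function names and the constants `0.35` and `0.1` are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_lowpass(n, sigma):
    """Frequency response of a Gaussian low-pass filter (the averaging operator)."""
    omega = np.fft.fftfreq(n)  # frequencies in cycles per sample
    return np.exp(-0.5 * (omega / sigma) ** 2)

def bandpass_bank(n, num_scales):
    """Gaussian band-pass filters at dyadic center frequencies.
    Assumption: these stand in for the wavelets used in the paper."""
    omega = np.fft.fftfreq(n)
    bank = []
    for j in range(num_scales):
        xi = 0.35 / 2 ** j      # hypothetical center frequency at scale j
        sigma = 0.1 / 2 ** j    # hypothetical bandwidth at scale j
        bank.append(np.exp(-0.5 * ((omega - xi) / sigma) ** 2))
    return bank

def conv(x, filt_hat):
    """Circular convolution computed in the Fourier domain."""
    return np.fft.ifft(np.fft.fft(x) * filt_hat)

def scattering(x, num_scales=4):
    """Return order-0, order-1 and order-2 coefficients, one row per path."""
    n = len(x)
    phi = gaussian_lowpass(n, sigma=0.1 / 2 ** num_scales)
    psis = bandpass_bank(n, num_scales)
    coeffs = [conv(x, phi).real]                  # order 0: averaging only
    for j1, psi1 in enumerate(psis):
        u1 = np.abs(conv(x, psi1))                # wavelet convolution + modulus
        coeffs.append(conv(u1, phi).real)         # order 1: averaged modulus
        for psi2 in psis[j1 + 1:]:                # lower-frequency filters only
            u2 = np.abs(conv(u1, psi2))           # second cascade stage
            coeffs.append(conv(u2, phi).real)     # order 2
    return np.stack(coeffs)

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
S = scattering(x)
S_shifted = scattering(np.roll(x, 7))
# Averaging every path over the whole signal removes the dependence on the shift.
print(np.allclose(S.mean(axis=1), S_shifted.mean(axis=1)))
```

In this sketch a circular shift of the input only shifts each output row, so the fully averaged coefficients are unchanged. With a finite averaging window, as in the low-pass filter `phi`, the invariance holds only approximately, for shifts that are small relative to the window.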