Sparse Representation for Coarse and Fine Object Recognition
April 2006 (vol. 28 no. 4)
pp. 555-567
This paper offers a sparse, multiscale representation of objects. Object appearance is captured by selecting basis functions from a very large dictionary of Gaussian derivatives. Learning is performed with the matching pursuit algorithm, while recognition relies on a polynomial approximation of the bases, turning image matching into a problem of polynomial evaluation. The method is suited for coarse recognition between objects and, by adding more bases, also for fine recognition of object pose. Compared with the common PCA representation, it offers three advantages: sampled points need not be stored for recognition; new objects can be added to an existing data set trivially, because the other object models need no retraining; and, significantly, in the important case where an image must be scanned over many locations in search of an object, the new representation is readily available, whereas PCA requires a projection at each location. Experiments on the COIL-100 data set demonstrate high recognition accuracy with real-time performance.
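The greedy atom-selection step of matching pursuit, which the abstract names as the learning procedure, can be sketched as below. This is a minimal illustration with a random dictionary, not the paper's actual Gaussian-derivative dictionary or its polynomial recognition stage; all names and sizes here are illustrative assumptions.

```python
import numpy as np

def matching_pursuit(x, D, n_atoms):
    """Greedy matching pursuit: approximate x with n_atoms columns of D.

    x : (m,) signal; D : (m, k) dictionary with unit-norm columns.
    Returns a sparse coefficient vector c such that D @ c approximates x.
    """
    residual = x.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        # Correlate the residual with every atom and pick the best match.
        corr = D.T @ residual
        j = np.argmax(np.abs(corr))
        # Add the atom's contribution and remove it from the residual.
        coeffs[j] += corr[j]
        residual -= corr[j] * D[:, j]
    return coeffs

# Toy dictionary: random atoms, each normalized to unit length.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)

# A signal built from three atoms should be recovered to low residual error.
x = 2.0 * D[:, 5] - 1.5 * D[:, 40] + 0.5 * D[:, 100]
c = matching_pursuit(x, D, n_atoms=10)
error = np.linalg.norm(x - D @ c) / np.linalg.norm(x)
```

Each iteration picks the single atom most correlated with the current residual, which is what makes the representation sparse: stopping after a few atoms gives a coarse description, and adding more atoms refines it, mirroring the coarse-to-fine recognition the paper describes.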

[1] Y. Bengio, J.-F. Paiement, P. Vincent, O. Delalleau, N. Le Roux, and M. Ouimet, “Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering,” Advances in Neural Information Processing Systems, vol. 16, 2004.
[2] P. Buhlmann and B. Yu, “Boosting with the L2 Loss: Regression and Classification,” J. Am. Statistical Assoc., vol. 98, pp. 324-340, 2001.
[3] S. Chen, D. Donoho, and M. Saunders, “Atomic Decomposition by Basis Pursuit,” SIAM J. Scientific Computation, vol. 20, no. 1, pp. 33-61, 1998.
[4] Y. Freund and R.E. Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting,” J. Computer and System Sciences, vol. 55, no. 1, pp. 119-139, 1997.
[5] J. Friedman, T. Hastie, and R. Tibshirani, “Additive Logistic Regression: A Statistical View of Boosting,” The Annals of Statistics, vol. 38, no. 2, pp. 337-374, 2000.
[6] J.M. Geusebroek, A.W.M. Smeulders, and J. van de Weijer, “Fast Anisotropic Gauss Filtering,” IEEE Trans. Image Processing, vol. 12, no. 8, pp. 938-943, 2003.
[7] F. Girosi, “An Equivalence between Sparse Approximation and Support Vector Machines,” Neural Computation, vol. 10, pp. 1455-1480, 1998.
[8] D.E. Knuth, The Art of Computer Programming: Seminumerical Algorithms, vol. 2, third ed., Addison-Wesley, 1997.
[9] J.J. Koenderink, “The Structure of Images,” Biological Cybernetics, vol. 50, pp. 363-370, 1984.
[10] J.J. Koenderink and A.J. van Doorn, “Representation of Local Geometry in the Visual System,” Biological Cybernetics, vol. 55, pp. 367-375, 1987.
[11] X. Liu, A. Srivastava, and K. Gallivan, “Optimal Linear Representations of Images for Object Recognition,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26, no. 5, pp. 662-666, May 2004.
[12] S. Mallat, A Wavelet Tour of Signal Processing. Academic Press, 1999.
[13] S. Mallat and Z. Zhang, “Matching Pursuits with Time-Frequency Dictionaries,” IEEE Trans. Signal Processing, vol. 41, no. 12, pp. 3397-3415, 1993.
[14] S. Mukherjee and S.K. Nayar, “Automatic Generation of RBF Networks Using Wavelets,” Pattern Recognition, vol. 29, pp. 1369-1383, 1996.
[15] H. Murase and S. Nayar, “Visual Learning and Recognition of 3-D Objects from Appearance,” Int'l J. Computer Vision, vol. 14, pp. 5-24, 1995.
[16] S.A. Nene, S.K. Nayar, and H. Murase, “Columbia Object Image Library (COIL-100),” Technical Report CUCS-006-96, Columbia Univ., 1996.
[17] P.J. Phillips, “Matching Pursuit Filters Applied to Face Identification,” IEEE Trans. Image Processing, vol. 7, no. 8, pp. 1150-1164, 1998.
[18] T. Poggio and S. Edelman, “A Network that Learns to Recognize Three-Dimensional Objects,” Nature, vol. 343, pp. 263-266, 1990.
[19] T. Poggio and F. Girosi, “Networks for Approximation and Learning,” Proc. IEEE, vol. 78, no. 9, pp. 1481-1497, 1990.
[20] M. Pontil and A. Verri, “Support Vector Machines for 3-D Object Recognition,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 6, pp. 637-646, June 1998.
[21] D. Roth, M-H. Yang, and N. Ahuja, “Learning to Recognize Three-Dimensional Objects,” Neural Computation, vol. 14, pp. 1071-1103, 2002.
[22] S. Roweis and L. Saul, “Nonlinear Dimensionality Reduction by Locally Linear Embedding,” Science, vol. 290, no. 5500, pp. 2323-2326, 2000.
[23] F. Schaffalitzky and A. Zisserman, “Viewpoint Invariant Texture Matching and Wide Baseline Stereo,” Proc. Int'l Conf. Computer Vision, pp. 636-643, 2001.
[24] R.E. Schapire, Y. Freund, P. Bartlett, and W.S. Lee, “Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods,” The Annals of Statistics, vol. 26, no. 5, pp. 1651-1686, 1998.
[25] C. Schmid and R. Mohr, “Local Grayvalue Invariants for Image Retrieval,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 5, pp. 530-534, May 1997.
[26] B. Schölkopf, A.J. Smola, and K.R. Müller, “Nonlinear Component Analysis as a Kernel Eigenvalue Problem,” Neural Computation, vol. 10, pp. 1299-1319, 1998.
[27] A.J. Smola and B. Schölkopf, “From Regularization Operators to Support Vector Kernels,” Advances in Neural Information Processing Systems, vol. 10, pp. 343-349, 1998.
[28] D.M.J. Tax and R.P.W. Duin, “Support Vector Domain Description,” Pattern Recognition Letters, vol. 20, nos. 11-13, pp. 1191-1199, 1999.
[29] J.B. Tenenbaum, V. de Silva, and J.C. Langford, “A Global Geometric Framework for Nonlinear Dimensionality Reduction,” Science, vol. 290, no. 5500, pp. 2319-2323, 2000.
[30] M.A. Turk and A.P. Pentland, “Face Recognition Using Eigenfaces,” J. Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[31] M. Unser, A. Aldroubi, and M. Eden, “B-Spline Signal Processing: Part I—Theory,” IEEE Trans. Signal Processing, vol. 41, no. 2, pp. 821-833, 1993.
[32] M. Unser, A. Aldroubi, and M. Eden, “B-Spline Signal Processing: Part II—Efficient Design and Applications,” IEEE Trans. Signal Processing, vol. 41, no. 2, pp. 834-848, 1993.
[33] L.J. van Vliet, I.T. Young, and P.W. Verbeek, “Recursive Gaussian Derivative Filters,” Proc. 14th Int'l Conf. Pattern Recognition, vol. I, pp. 509-514, 1998.
[34] V.N. Vapnik, Statistical Learning Theory. John Wiley and Sons, 1998.
[35] N. Vasconcelos, P. Ho, and P. Moreno, “The Kullback-Leibler Kernel as a Framework for Discriminant and Localized Representations for Visual Recognition,” Proc. Eighth European Conf. Computer Vision, vol. III, pp. 430-441, 2004.
[36] P. Vincent and Y. Bengio, “Kernel Matching Pursuit,” Machine Learning, vol. 48, no. 1, pp. 165-187, 2002.
[37] W.M. Wells, “Efficient Synthesis of Gaussian Filters by Cascaded Uniform Filters,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, no. 2, pp. 234-239, 1986.

Index Terms:
B-spline, Gaussian derivatives, matching pursuit, multiscale, PCA, polynomial approximation, sparse representation.
Citation:
Thang V. Pham, Arnold W.M. Smeulders, "Sparse Representation for Coarse and Fine Object Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 4, pp. 555-567, April 2006, doi:10.1109/TPAMI.2006.84