A Performance Evaluation of Local Descriptors
October 2005 (vol. 27 no. 10)
pp. 1615-1630
In this paper, we compare the performance of descriptors computed for local interest regions, for example, regions extracted by the Harris-Affine detector [32]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation criterion is recall with respect to precision, and the evaluation is carried out for different image transformations. We compare shape context [3], steerable filters [12], PCA-SIFT [19], differential invariants [20], spin images [21], SIFT [26], complex filters [37], moment invariants [43], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low-dimensional descriptors.
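
The criterion named in the abstract, recall with respect to precision, can be illustrated with a minimal sketch. It assumes the usual definitions (recall = correct matches accepted / ground-truth correspondences; precision = correct matches accepted / all matches accepted) and sweeps a descriptor-distance threshold. The function name, the input layout, and the sample values are hypothetical and are not taken from the paper, whose exact matching protocol is not given on this page.

```python
from typing import List, Tuple

Match = Tuple[float, bool]  # (descriptor distance, is the match correct?)


def recall_precision_curve(
    matches: List[Match], num_correspondences: int
) -> List[Tuple[float, float, float]]:
    """Return (threshold, recall, precision) for each accepted-match count.

    Assumption: a match is accepted when its distance is at or below the
    threshold, so sorting by distance and accepting one more match per step
    is equivalent to sweeping the threshold. Ties are kept in input order.
    """
    if num_correspondences <= 0:
        raise ValueError("num_correspondences must be positive")

    curve = []
    correct = 0
    ordered = sorted(matches, key=lambda m: m[0])
    for accepted, (distance, is_correct) in enumerate(ordered, start=1):
        if is_correct:
            correct += 1
        recall = correct / num_correspondences  # share of true pairs found
        precision = correct / accepted          # share of accepted pairs that are true
        curve.append((distance, recall, precision))
    return curve


if __name__ == "__main__":
    # Hypothetical toy data: four candidate matches, four true correspondences.
    toy = [(0.5, True), (0.1, True), (0.35, False), (0.2, True)]
    for threshold, recall, precision in recall_precision_curve(toy, 4):
        print(f"t={threshold:.2f} recall={recall:.2f} precision={precision:.2f}")
```

Under these assumptions, a descriptor whose curve keeps precision high as recall grows is both distinctive and robust, which is the property the abstract asks of a good descriptor.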

[1] A. Ashbrook, N. Thacker, P. Rockett, and C. Brown, “Robust Recognition of Scaled Shapes Using Pairwise Geometric Histograms,” Proc. Sixth British Machine Vision Conf., pp. 503-512, 1995.
[2] A. Baumberg, “Reliable Feature Matching across Widely Separated Views,” Proc. Conf. Computer Vision and Pattern Recognition, pp. 774-781, 2000.
[3] S. Belongie, J. Malik, and J. Puzicha, “Shape Matching and Object Recognition Using Shape Contexts,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 4, pp. 509-522, Apr. 2002.
[4] M. Brown and D. Lowe, “Recognising Panoramas,” Proc. Ninth Int'l Conf. Computer Vision, pp. 1218-1227, 2003.
[5] J. Canny, “A Computational Approach to Edge Detection,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, no. 6, pp. 679-698, 1986.
[6] G. Carneiro and A.D. Jepson, “Phase-Based Local Features,” Proc. Seventh European Conf. Computer Vision, pp. 282-296, 2002.
[7] Empirical Evaluation Methods in Computer Vision, vol. 50 of series in machine perception and artificial intelligence, H.I. Christensen and P.J. Phillips, eds. World Scientific Publishing Co., 2002.
[8] G. Dorko and C. Schmid, “Selection of Scale-Invariant Parts for Object Class Recognition,” Proc. Ninth Int'l Conf. Computer Vision, pp. 634-640, 2003.
[9] R. Fergus, P. Perona, and A. Zisserman, “Object Class Recognition by Unsupervised Scale-Invariant Learning,” Proc. Conf. Computer Vision and Pattern Recognition, pp. 264-271, 2003.
[10] V. Ferrari, T. Tuytelaars, and L. Van Gool, “Simultaneous Object Recognition and Segmentation by Image Exploration,” Proc. Eighth European Conf. Computer Vision, pp. 40-54, 2004.
[11] L. Florack, B. ter Haar Romeny, J. Koenderink, and M. Viergever, “General Intensity Transformations and Second Order Invariants,” Proc. Seventh Scandinavian Conf. Image Analysis, pp. 338-345, 1991.
[12] W. Freeman and E. Adelson, “The Design and Use of Steerable Filters,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, no. 9, pp. 891-906, Sept. 1991.
[13] D. Gabor, “Theory of Communication,” J. IEE, vol. 93, part III, pp. 429-457, 1946.
[14] V. Gouet, P. Montesinos, R. Deriche, and D. Pelé, “Evaluation de Détecteurs de Points d'Intérêt pour la Couleur,” Proc. 12ème Congrès Francophone AFRIF-AFIA de Reconnaissance des Formes et Intelligence Artificielle, pp. 257-266, 2000.
[15] C. Harris and M. Stephens, “A Combined Corner and Edge Detector,” Proc. Alvey Vision Conf., pp. 147-151, 1988.
[16] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge Univ. Press, 2000.
[17] A. Johnson and M. Hebert, “Object Recognition by Matching Oriented Points,” Proc. Conf. Computer Vision and Pattern Recognition, pp. 684-689, 1997.
[18] T. Kadir, M. Brady, and A. Zisserman, “An Affine Invariant Method for Selecting Salient Regions in Images,” Proc. Eighth European Conf. Computer Vision, pp. 345-457, 2004.
[19] Y. Ke and R. Sukthankar, “PCA-SIFT: A More Distinctive Representation for Local Image Descriptors,” Proc. Conf. Computer Vision and Pattern Recognition, pp. 511-517, 2004.
[20] J. Koenderink and A. van Doorn, “Representation of Local Geometry in the Visual System,” Biological Cybernetics, vol. 55, pp. 367-375, 1987.
[21] S. Lazebnik, C. Schmid, and J. Ponce, “Sparse Texture Representation Using Affine-Invariant Neighborhoods,” Proc. Conf. Computer Vision and Pattern Recognition, pp. 319-324, 2003.
[22] B. Leibe and B. Schiele, “Interleaved Object Categorization and Segmentation,” Proc. 14th British Machine Vision Conf., pp. 759-768, 2003.
[23] T. Lindeberg, “Feature Detection with Automatic Scale Selection,” Int'l J. Computer Vision, vol. 30, no. 2, pp. 79-116, 1998.
[24] T. Lindeberg and J. Gårding, “Shape-Adapted Smoothing in Estimation of 3-D Shape Cues from Affine Deformations of Local 2-D Brightness Structure,” Image and Vision Computing, vol. 15, no. 6, pp. 415-434, 1997.
[25] D. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” Int'l J. Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[26] D.G. Lowe, “Object Recognition from Local Scale-Invariant Features,” Proc. Seventh Int'l Conf. Computer Vision, pp. 1150-1157, 1999.
[27] M. Vetterli and J. Kovacevic, Wavelets and Subband Coding. Prentice Hall, 1995.
[28] J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust Wide Baseline Stereo from Maximally Stable Extremal Regions,” Proc. 13th British Machine Vision Conf., pp. 384-393, 2002.
[29] K. Mikolajczyk and C. Schmid, “Indexing Based on Scale Invariant Interest Points,” Proc. Eighth Int'l Conf. Computer Vision, pp. 525-531, 2001.
[30] K. Mikolajczyk and C. Schmid, “An Affine Invariant Interest Point Detector,” Proc. Seventh European Conf. Computer Vision, pp. 128-142, 2002.
[31] K. Mikolajczyk and C. Schmid, “A Performance Evaluation of Local Descriptors,” Proc. Conf. Computer Vision and Pattern Recognition, pp. 257-264, 2003.
[32] K. Mikolajczyk and C. Schmid, “Scale and Affine Invariant Interest Point Detectors,” Int'l J. Computer Vision, vol. 60, no. 1, pp. 63-86, 2004.
[33] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool, “A Comparison of Affine Region Detectors,” accepted by Int'l J. Computer Vision.
[34] T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971-987, July 2002.
[35] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer, “Weak Hypotheses and Boosting for Generic Object Detection and Recognition,” Proc. Eighth European Conf. Computer Vision, pp. 71-84, 2004.
[36] T. Randen and J.H. Husoy, “Filtering for Texture Classification: A Comparative Study,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 4, pp. 291-310, Apr. 1999.
[37] F. Schaffalitzky and A. Zisserman, “Multi-View Matching for Unordered Image Sets,” Proc. Seventh European Conf. Computer Vision, pp. 414-431, 2002.
[38] C. Schmid and R. Mohr, “Local Grayvalue Invariants for Image Retrieval,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 5, pp. 530-534, May 1997.
[39] C. Schmid, R. Mohr, and C. Bauckhage, “Evaluation of Interest Point Detectors,” Int'l J. Computer Vision, vol. 37, no. 2, pp. 151-172, 2000.
[40] S. Se, D. Lowe, and J. Little, “Global Localization Using Distinctive Visual Features,” Proc. Int'l Conf. Intelligent Robots and Systems, pp. 226-231, 2002.
[41] J. Sivic and A. Zisserman, “Video Google: A Text Retrieval Approach to Object Matching in Videos,” Proc. Ninth Int'l Conf. Computer Vision, pp. 1470-1478, 2003.
[42] T. Tuytelaars and L. Van Gool, “Matching Widely Separated Views Based on Affine Invariant Regions,” Int'l J. Computer Vision, vol. 59, no. 1, pp. 61-85, 2004.
[43] L. Van Gool, T. Moons, and D. Ungureanu, “Affine/Photometric Invariants for Planar Intensity Patterns,” Proc. Fourth European Conf. Computer Vision, pp. 642-651, 1996.
[44] M. Varma and A. Zisserman, “Texture Classification: Are Filter Banks Necessary?” Proc. Conf. Computer Vision and Pattern Recognition, pp. 477-484, 2003.
[45] R. Zabih and J. Woodfill, “Non-Parametric Local Transforms for Computing Visual Correspondence,” Proc. Third European Conf. Computer Vision, pp. 151-158, 1994.

Index Terms:
Local descriptors, interest points, interest regions, invariance, matching, recognition.
Citation:
Krystian Mikolajczyk, Cordelia Schmid, "A Performance Evaluation of Local Descriptors," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, Oct. 2005, doi:10.1109/TPAMI.2005.188