Issue No. 9 - September 2008 (vol. 30)
pp. 1632-1646
ABSTRACT
Some of the most effective recent methods for content-based image classification work by quantizing image descriptors and accumulating histograms of the resulting visual-word codes. Large numbers of descriptors and large codebooks are required for good results, and with k-means quantization this becomes slow. We introduce Extremely Randomized Clustering Forests (ERC-Forests), ensembles of randomly created clustering trees, and show that they provide more accurate results, much faster training and testing, and good resistance to background clutter. Second, we propose an efficient image classification method that couples ERC-Forests and saliency maps tightly with the extraction of image information: for a given image, the classifier builds a saliency map online and uses it to classify the image. We show on several state-of-the-art image classification tasks that this method speeds up the classification process enormously. Finally, we show that the proposed ERC-Forests can also be used very successfully for learning distances between images. The distance computation algorithm learns the characteristic differences between local descriptors sampled from pairs of images of the same or of different objects; these differences are vector quantized by an ERC-Forest, and the similarity measure is computed from this quantization. The similarity measure has been evaluated on four very different datasets and consistently outperforms state-of-the-art competing approaches.
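As an illustration of the codebook idea summarized above, the following Python sketch (not the authors' implementation) quantizes local descriptors with an ensemble of randomized trees and pools the resulting visual-word codes into a histogram. The class names, parameters, and the purely unsupervised random splits are assumptions made for this sketch; the ERC-Forests of the paper additionally use class labels and an information-gain acceptance test when growing the trees.

import numpy as np

class _Node:
    """Internal node (feature index + threshold) or leaf (visual-word id)."""
    __slots__ = ("feature", "threshold", "left", "right", "leaf_id")
    def __init__(self):
        self.feature = self.threshold = self.left = self.right = self.leaf_id = None

class ERCTree:
    """One randomized clustering tree: each split uses a randomly chosen
    descriptor dimension and a random threshold, so no k-means-style
    optimization is needed. The leaves act as visual words."""
    def __init__(self, max_depth=10, min_leaf=10, rng=None):
        self.max_depth, self.min_leaf = max_depth, min_leaf
        self.rng = rng or np.random.default_rng()
        self.n_leaves = 0
        self.root = None

    def fit(self, X):
        self.root = self._grow(X, depth=0)
        return self

    def _grow(self, X, depth):
        node = _Node()
        if depth >= self.max_depth or len(X) <= self.min_leaf:
            node.leaf_id = self.n_leaves          # this leaf becomes a codeword
            self.n_leaves += 1
            return node
        # Extremely randomized split: random dimension, random threshold
        # drawn between the min and max of that dimension in this node.
        f = self.rng.integers(X.shape[1])
        lo, hi = X[:, f].min(), X[:, f].max()
        if lo == hi:                              # degenerate dimension -> leaf
            node.leaf_id = self.n_leaves
            self.n_leaves += 1
            return node
        t = self.rng.uniform(lo, hi)
        mask = X[:, f] < t
        node.feature, node.threshold = f, t
        node.left = self._grow(X[mask], depth + 1)
        node.right = self._grow(X[~mask], depth + 1)
        return node

    def leaf_index(self, x):
        """Drop one descriptor down the tree and return its leaf (word) id."""
        node = self.root
        while node.leaf_id is None:
            node = node.left if x[node.feature] < node.threshold else node.right
        return node.leaf_id

class ERCForest:
    """Ensemble of randomized clustering trees; an image's descriptor set is
    encoded as the concatenation of the per-tree leaf histograms."""
    def __init__(self, n_trees=5, **tree_kwargs):
        self.trees = [ERCTree(rng=np.random.default_rng(seed), **tree_kwargs)
                      for seed in range(n_trees)]

    def fit(self, descriptors):
        for tree in self.trees:
            tree.fit(descriptors)
        return self

    def histogram(self, descriptors):
        parts = []
        for tree in self.trees:
            h = np.zeros(tree.n_leaves)
            for d in descriptors:
                h[tree.leaf_index(d)] += 1
            parts.append(h / max(len(descriptors), 1))   # normalized counts
        return np.concatenate(parts)

# Toy usage with random 128-D "SIFT-like" descriptors.
train_desc = np.random.default_rng(0).random((2000, 128))
forest = ERCForest(n_trees=5, max_depth=8, min_leaf=20).fit(train_desc)
image_desc = np.random.default_rng(1).random((300, 128))
bow = forest.histogram(image_desc)   # bag-of-words vector for a classifier

The concatenated histogram plays the role of the bag-of-words vector that a standard classifier (for example an SVM) would then consume; the same leaf-indexing machinery can, in principle, also quantize descriptor-pair differences for the distance-learning setting mentioned in the abstract.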
INDEX TERMS
Computer vision, Object recognition
CITATION
Frank Moosmann, Eric Nowak, Frederic Jurie, "Randomized Clustering Forests for Image Classification", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 30, no. 9, pp. 1632-1646, September 2008, doi:10.1109/TPAMI.2007.70822