Issue No. 10 - October (2009 vol. 31)
pp. 1747-1761
Long (Leo) Zhu, Massachusetts Institute of Technology, Cambridge
Alan Yuille, University of California, Los Angeles, Los Angeles
Hongjiang Zhang, Microsoft Advanced Technology Center, Beijing
ABSTRACT
We present a method to learn probabilistic object models (POMs) with minimal supervision, which exploit different visual cues and perform tasks such as classification, segmentation, and recognition. We formulate this as a structure induction and learning task, and our strategy is to learn and combine elementary POMs that make use of complementary image cues. We describe a novel structure induction procedure, which uses knowledge propagation to enable POMs to provide information to other POMs and “teach them” (which greatly reduces the amount of supervision required for training and speeds up inference). In particular, we learn a POM-IP defined on interest points using weak supervision [1], [2] and use this to train a POM-mask, defined on regional features, which yields a combined POM that performs segmentation/localization. This combined model can be used to train POM-edgelets, defined on edgelets, which gives a full POM with improved performance on classification. We give detailed experimental analysis on large data sets for classification and segmentation, with comparison to other methods. Inference takes five seconds, while learning takes approximately four hours. In addition, we show that the full POM is invariant to scale and rotation of the object (for learning and inference) and can learn hybrid object classes (i.e., when there are several objects and the identity of the object in each image is unknown). Finally, we show that POMs can be used to match between different objects of the same category and hence enable object recognition.
INDEX TERMS
Unsupervised learning, object classification, segmentation, recognition.
CITATION
Long (Leo) Zhu, Alan Yuille, Hongjiang Zhang, "Unsupervised Learning of Probabilistic Object Models (POMs) for Object Classification, Segmentation, and Recognition Using Knowledge Propagation", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 10, pp. 1747-1761, October 2009, doi:10.1109/TPAMI.2009.95
REFERENCES
[1] L. Zhu, Y. Chen, and A.L. Yuille, “Unsupervised Learning of a Probabilistic Grammar for Object Detection and Parsing,” Proc. Conf. Neural Information Processing Systems, pp. 1617-1624, 2006.
[2] L. Zhu, Y. Chen, and A.L. Yuille, “Unsupervised Learning of Probabilistic Grammar-Markov Models for Object Categories,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 1, pp. 114-128, Jan. 2009.
[3] R. Fergus, P. Perona, and A. Zisserman, “Object Class Recognition by Unsupervised Scale-Invariant Learning,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 264-271, 2003.
[4] B. Leibe, A. Leonardis, and B. Schiele, “Combined Object Categorization and Segmentation with an Implicit Shape Model,” Proc. European Conf. Computer Vision Workshop Statistical Learning in Computer Vision, pp. 17-32, May 2004.
[5] R. Fergus, P. Perona, and A. Zisserman, “A Sparse Object Category Model for Efficient Learning and Exhaustive Recognition,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 380-387, 2005.
[6] D.J. Crandall, P. Felzenszwalb, and D. Huttenlocher, “Spatial Priors for Part-Based Recognition Using Statistical Models,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 10-17, 2005.
[7] D.J. Crandall and D.P. Huttenlocher, “Weakly Supervised Learning of Part-Based Spatial Models for Visual Object Recognition,” Proc. European Conf. Computer Vision, vol. 1, pp. 16-29, 2006.
[8] A. Kushal, C. Schmid, and J. Ponce, “Flexible Object Models for Category-Level 3D Object Recognition,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2007.
[9] G. Bouchard and B. Triggs, “Hierarchical Part-Based Visual Object Categorization,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 710-715, 2005.
[10] E. Borenstein and S. Ullman, “Learning to Segment,” Proc. European Conf. Computer Vision, vol. 3, pp. 315-328, 2004.
[11] A. Levin and Y. Weiss, “Learning to Combine Bottom-Up and Top-Down Segmentation,” Proc. European Conf. Computer Vision, vol. 4, pp. 581-594, 2006.
[12] X. Ren, C. Fowlkes, and J. Malik, “Cue Integration for Figure/Ground Labeling,” Proc. Conf. Neural Information Processing Systems, 2005.
[13] J.M. Winn and N. Jojic, “Locus: Learning Object Classes with Unsupervised Segmentation,” Proc. Int'l Conf. Computer Vision, pp. 756-763, 2005.
[14] J. Sivic, B.C. Russell, A.A. Efros, A. Zisserman, and W.T. Freeman, “Discovering Objects and Their Localization in Images,” Proc. Int'l Conf. Computer Vision, pp. 370-377, 2005.
[15] L. Cao and L. Fei-Fei, “Spatially Coherent Latent Topic Model for Concurrent Object Segmentation and Classification,” Proc. Int'l Conf. Computer Vision, 2007.
[16] U. Grenander, Pattern Synthesis: Lectures in Pattern Theory, vol. 1. Springer, 1976.
[17] U. Grenander, Pattern Analysis: Lectures in Pattern Theory, vol. 2. Springer, 1978.
[18] Y. Chen, L. Zhu, A.L. Yuille, and H. Zhang, “Unsupervised Learning of Probabilistic Object Models (POMs) for Object Classification, Segmentation and Recognition,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2008.
[19] N. Friedman and D. Koller, “Being Bayesian about Bayesian Network Structure: A Bayesian Approach to Structure Discovery in Bayesian Networks,” Machine Learning, vol. 50, nos. 1/2, pp. 95-125, 2003.
[20] A. Blake, C. Rother, M. Brown, P. Pérez, and P.H.S. Torr, “Interactive Image Segmentation Using an Adaptive GMMRF Model,” Proc. European Conf. Computer Vision, vol. 1, pp. 428-441, 2004.
[21] Y. Boykov and M.-P. Jolly, “Interactive Graph Cuts for Optimal Boundary and Region Segmentation of Objects in N-D Images,” Proc. Int'l Conf. Computer Vision, pp. 105-112, 2001.
[22] Y. Boykov and V. Kolmogorov, “An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision,” Proc. Int'l Workshop Energy Minimization Methods in Computer Vision and Pattern Recognition, pp. 359-374, 2001.
[23] C. Rother, V. Kolmogorov, and A. Blake, “'Grabcut': Interactive Foreground Extraction Using Iterated Graph Cuts,” ACM Trans. Graphics, vol. 23, no. 3, pp. 309-314, 2004.
[24] M.P. Kumar, P.H.S. Torr, and A. Zisserman, “Obj Cut,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 18-25, 2005.
[25] N. Jojic, J.M. Winn, and L. Zitnick, “Escaping Local Minima through Hierarchical Model Selection: Automatic Object Discovery, Segmentation, and Tracking in Video,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 117-124, 2006.
[26] B. Frey and N. Jojic, “Transformation-Invariant Clustering Using the EM Algorithm,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 1, pp. 1-17, Jan. 2003.
[27] T. Kadir and M. Brady, “Saliency, Scale and Image Description,” Int'l J. Computer Vision, vol. 45, no. 2, pp. 83-105, 2001.
[28] D.G. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” Int'l J. Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[29] Y. Amit and D. Geman, “A Computational Model for Visual Selection,” Neural Computation, vol. 11, no. 7, pp. 1691-1715, 1999.
[30] S. Lazebnik, C. Schmid, and J. Ponce, “A Sparse Texture Representation Using Local Affine Regions,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1265-1278, Aug. 2005.
[31] Y. Wu, Z. Si, C. Fleming, and S. Zhu, “Deformable Template as Active Basis,” Proc. Int'l Conf. Computer Vision, 2007.
[32] R.M. Neal and G.E. Hinton, “A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants,” Learning in Graphical Models, pp. 355-368, MIT Press, 1999.
[33] L. Fei-Fei, R. Fergus, and P. Perona, “Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories,” Computer Vision and Image Understanding, vol. 106, no. 1, pp. 59-70, 2007.