Issue No.12 - December (2008 vol.30)
pp: 2158-2174
ABSTRACT
Suppose a set of arbitrary (unlabeled) images contains frequent occurrences of 2D objects from an unknown category. This paper is aimed at simultaneously solving the following related problems: (1) unsupervised identification of photometric, geometric, and topological properties of multiscale regions comprising instances of the 2D category; (2) learning a region-based structural model of the category in terms of these properties; and (3) detection, recognition, and segmentation of objects from the category in new images. To this end, each image is represented by a tree that captures a multiscale image segmentation. The trees are matched to extract the maximally matching subtrees across the set, which are taken as instances of the target category. The extracted subtrees are then fused into a tree-union that represents the canonical category model. Detection, recognition, and segmentation of objects from the learned category are achieved simultaneously by finding matches of the category model with the segmentation tree of a new image. Experimental validation on benchmark datasets demonstrates the robustness and high accuracy of the learned category models when only a few training examples are used for learning without any human supervision.
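The idea of matching segmentation trees to find shared subtrees can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not specify; it is a simplified greedy stand-in. The `Region` class, its two attributes (`area`, `gray`), the similarity formula, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Region:
    """Node of a segmentation tree (hypothetical attributes, both in [0, 1])."""
    area: float   # fraction of the image covered by the region
    gray: float   # mean intensity of the region
    children: List["Region"] = field(default_factory=list)


def node_similarity(a: Region, b: Region) -> float:
    # Assumed similarity: 1.0 for identical attributes, decreasing to 0.0.
    return 1.0 - 0.5 * (abs(a.area - b.area) + abs(a.gray - b.gray))


def match_score(a: Region, b: Region) -> float:
    """Score of matching the subtree rooted at a to the subtree rooted at b.

    Children are paired one-to-one greedily, best-scoring pair first.
    This is a heuristic, not an optimal assignment.
    """
    score = node_similarity(a, b)
    pairs = sorted(
        ((match_score(ca, cb), i, j)
         for i, ca in enumerate(a.children)
         for j, cb in enumerate(b.children)),
        reverse=True,
    )
    used_a, used_b = set(), set()
    for s, i, j in pairs:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            score += s
    return score


def subtrees(root: Region) -> Iterator[Region]:
    """Yield every node of the tree; each node roots one candidate subtree."""
    yield root
    for child in root.children:
        yield from subtrees(child)


def best_common_subtree(t1: Region, t2: Region) -> Tuple[float, Region, Region]:
    """Return (score, node in t1, node in t2) for the best-matching subtree pair."""
    return max(
        ((match_score(a, b), a, b) for a in subtrees(t1) for b in subtrees(t2)),
        key=lambda item: item[0],
    )
```

In this sketch, subtrees that score highly against each other across several images would play the role of candidate category instances; fusing them into a tree-union model is a separate step not shown here.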
INDEX TERMS
Object recognition, Segmentation, Graph Theory, Graph algorithms, Graph-theoretic methods, Trees, Hierarchical, Computer vision, Vision and Scene Understanding, Image Representation, Structural
CITATION
Sinisa Todorovic, Narendra Ahuja, "Unsupervised Category Modeling, Recognition, and Segmentation in Images", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.30, no. 12, pp. 2158-2174, December 2008, doi:10.1109/TPAMI.2008.24