Learning Shape-Classes Using a Mixture of Tree-Unions
June 2006 (vol. 28 no. 6)
pp. 954-967
This paper poses the problem of tree-clustering as that of fitting a mixture of tree-unions to a set of sample trees. The tree-unions are structures from which the individual data samples belonging to a cluster can be obtained by edit operations. The distribution of observed tree nodes in each cluster sample is assumed to be governed by a Bernoulli distribution. The clustering method is designed to operate when the correspondences between nodes are unknown and must be inferred as part of the learning process. We adopt a minimum description length approach to the problem of fitting the mixture model to data. We make maximum-likelihood estimates of the Bernoulli parameters. The tree-unions and the mixing proportions are sought so as to minimize the description length criterion. This is the sum of the negative logarithm of the Bernoulli distribution and a message-length criterion that encodes both the complexity of the tree-unions and the number of mixture components. We locate node correspondences by minimizing the edit distance with the current tree-unions, and show that the edit distance is linked to the description length criterion. The method can be applied to both unweighted and weighted trees. We illustrate the utility of the resulting algorithm on the problem of classifying 2D shapes using a shock graph representation.
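The description length criterion summarized above can be illustrated with a toy sketch. The code below is not the authors' implementation: it assumes node correspondences have already been resolved (each sample tree is reduced to the set of union nodes it contains), estimates each node's Bernoulli parameter by its observed frequency, and scores one cluster as the negative log Bernoulli likelihood plus a hypothetical per-parameter message-length penalty of (1/2) log n, a common MDL parameter-cost assumption.

```python
import math

def description_length(samples, union_nodes):
    """Toy MDL score for a single tree-union cluster.

    samples: list of sets, each holding the union-node ids observed in one
             sample tree (node correspondences assumed already resolved).
    union_nodes: list of node ids in the tree-union structure.
    """
    n = len(samples)
    # Maximum-likelihood Bernoulli parameter per union node:
    # the fraction of samples in which that node is observed.
    theta = {v: sum(v in s for s in samples) / n for v in union_nodes}
    # Negative log-likelihood of the observations under the Bernoulli model.
    nll = 0.0
    for s in samples:
        for v in union_nodes:
            p = theta[v] if v in s else 1.0 - theta[v]
            nll += -math.log(max(p, 1e-12))  # clamp to avoid log(0)
    # Message-length term penalising model complexity: (1/2) log(n)
    # per Bernoulli parameter (an assumed MDL parameter cost).
    model_cost = 0.5 * len(union_nodes) * math.log(n)
    return nll + model_cost
```

In the full method this score would also include the cost of encoding the union structures and the number of mixture components, and would be minimized jointly over node correspondences via tree edit distance.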

Index Terms:
Structural learning, tree clustering, mixture modeling, minimum description length, model codes, shock graphs.
Citation:
Andrea Torsello, Edwin R. Hancock, "Learning Shape-Classes Using a Mixture of Tree-Unions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 6, pp. 954-967, June 2006, doi:10.1109/TPAMI.2006.125