A Binary Linear Programming Formulation of the Graph Edit Distance
August 2006 (vol. 28 no. 8)
pp. 1200-1214
A binary linear programming formulation of the graph edit distance for unweighted, undirected graphs with vertex attributes is derived and applied to a graph recognition problem. A general formulation for editing graphs is used to derive a graph edit distance that is proven to be a metric, provided the cost function for individual edit operations is a metric. Then, a binary linear program is developed for computing this graph edit distance, and polynomial time methods for determining upper and lower bounds on the solution of the binary program are derived by applying solution methods for standard linear programming and the assignment problem. A recognition problem of comparing a sample input graph to a database of known prototype graphs in the context of a chemical information system is presented as an application of the new method. The costs associated with various edit operations are chosen by using a minimum normalized variance criterion applied to pairwise distances between nearest neighbors in the database of prototypes. The new metric is shown to perform quite well in comparison to existing metrics when applied to a database of chemical graphs.
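
To make the quantity in the abstract concrete, the following is a minimal brute-force sketch of an edit distance between two small unweighted, undirected graphs with vertex labels. It is not the binary linear program derived in the paper: it simply enumerates every vertex correspondence, so it is only practical for a handful of vertices. The function name `edit_distance`, the padding of both graphs with null (`None`) vertices, and the unit costs `sub_cost`, `indel_cost`, and `edge_cost` are assumptions made for illustration.

```python
from itertools import permutations


def edit_distance(labels_a, edges_a, labels_b, edges_b,
                  sub_cost=1.0, indel_cost=1.0, edge_cost=1.0):
    """Brute-force edit distance between two small labeled graphs.

    Both graphs are padded with null vertices (label None) up to
    n = |V_a| + |V_b| vertices, so that vertex insertions and deletions
    become substitutions to or from a null vertex. Edges are pairs of
    distinct vertex indices; self-loops are not supported.
    """
    n = len(labels_a) + len(labels_b)
    pad_a = list(labels_a) + [None] * (n - len(labels_a))
    pad_b = list(labels_b) + [None] * (n - len(labels_b))
    set_a = {frozenset(e) for e in edges_a}
    set_b = {frozenset(e) for e in edges_b}

    def vertex_cost(x, y):
        if x == y:
            return 0.0          # identical labels, or null to null
        if x is None or y is None:
            return indel_cost   # vertex insertion or deletion
        return sub_cost         # label substitution

    best = float("inf")
    for perm in permutations(range(n)):
        cost = sum(vertex_cost(pad_a[i], pad_b[perm[i]]) for i in range(n))
        # Relabel the edges of the first graph, then count the edges that
        # must be deleted or inserted to obtain the second graph.
        mapped = {frozenset(perm[v] for v in e) for e in set_a}
        cost += edge_cost * len(mapped ^ set_b)
        best = min(best, cost)
    return best


# Two three-atom chains that differ in one vertex label.
chain_o = (["C", "C", "O"], [(0, 1), (1, 2)])
chain_n = (["C", "C", "N"], [(0, 1), (1, 2)])
print(edit_distance(*chain_o, *chain_n))  # 1.0 (one label substitution)
```

With these unit costs the vertex cost function is itself a metric, which is the condition the abstract gives for the resulting graph distance to be a metric. The search visits all n! correspondences, which is the cost that the binary program and its polynomial-time bounds are meant to address.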


Index Terms:
Graph algorithms, similarity measures, structural pattern recognition, graphs and networks, linear programming, continuation (homotopy) methods.
Citation:
Derek Justice, Alfred Hero, "A Binary Linear Programming Formulation of the Graph Edit Distance," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 8, pp. 1200-1214, Aug. 2006, doi:10.1109/TPAMI.2006.152