Specificity: A Graph-Based Estimator of Divergence
December 2011 (vol. 33 no. 12)
pp. 2492-2505
Carole J. Twining, University of Manchester, Manchester
Christopher J. Taylor, The University of Manchester, Manchester
In statistical modeling, various techniques are used to build models from training data. Comparing modeling techniques quantitatively requires a method for evaluating the quality of the fit between the model probability density function (pdf) and the training data. One graph-based measure that has been used for this purpose is the specificity. We consider the large-numbers limit of the specificity and derive expressions which show that it can be considered an estimator of the divergence between the unknown pdf from which the training data were drawn and the model pdf built from the training data. Experiments using artificial data show that these limiting large-number relations give good quantitative and qualitative predictions of the behavior of the measured specificity, even for small numbers of training examples and in some extreme cases. We demonstrate that specificity can provide a more sensitive measure of difference between various modeling methods than some previous graph-based techniques. Key points are illustrated using real data sets. We thus establish a proper theoretical basis for the previously ad hoc concept of specificity, and obtain useful insights into the application of specificity in the analysis of real data.
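
The abstract does not state the formula for specificity. As a rough illustration only, the sketch below assumes the definition commonly used in the statistical shape modeling literature: draw samples from the model pdf and average the distance from each sample to its nearest training example. The function names, the exponent `gamma`, and the Gaussian toy data are all assumptions made for this example and are not taken from the paper.

```python
import math
import random


def specificity(model_samples, training_data, gamma=1.0):
    """Mean of (distance to nearest training point) ** gamma over model samples.

    Assumed definition for illustration; gamma is a hypothetical exponent.
    """
    total = 0.0
    for y in model_samples:
        # Brute-force nearest-neighbor search over the training set.
        nearest = min(math.dist(y, x) for x in training_data)
        total += nearest ** gamma
    return total / len(model_samples)


def gaussian_samples(n, mean, sigma, dim, rng):
    """Draw n points from an isotropic Gaussian (toy stand-in for a model pdf)."""
    return [tuple(rng.gauss(mean, sigma) for _ in range(dim)) for _ in range(n)]


rng = random.Random(0)

# Training data drawn from the "unknown" pdf: a unit-variance 2D Gaussian.
training = gaussian_samples(200, 0.0, 1.0, 2, rng)

# Two candidate models: one matching the data, one far too broad.
good_model = gaussian_samples(500, 0.0, 1.0, 2, rng)
poor_model = gaussian_samples(500, 0.0, 3.0, 2, rng)

spec_good = specificity(good_model, training)
spec_poor = specificity(poor_model, training)
print(f"matching model: {spec_good:.3f}   over-broad model: {spec_poor:.3f}")
```

Under this assumed definition, a lower value means the model generates samples that lie closer to the training data, so the over-broad model should score worse than the matching one. The brute-force search is quadratic in the number of points; a spatial index would be the usual choice for larger sets.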

Index Terms:
Specificity, generalization, assessment of modeling, graph-based estimators, entropy estimation, estimation of statistical distance, estimation of divergence, nearest-neighbor estimators, cross entropy, Kullback-Leibler divergence.
Carole J. Twining, Christopher J. Taylor, "Specificity: A Graph-Based Estimator of Divergence," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2492-2505, Dec. 2011, doi:10.1109/TPAMI.2011.90