Clustering by Scale-Space Filtering
December 2000 (vol. 22 no. 12)
pp. 1396-1410

Abstract—In pattern recognition and image processing, the major application areas of cluster analysis, human eyes seem to possess a singular aptitude to group objects and find important structures in an efficient and effective way. Thus, a clustering algorithm simulating a visual system may solve some basic problems in these areas of research. From this point of view, we propose a new approach to data clustering by modeling the blurring effect of lateral retinal interconnections based on scale space theory. In this approach, a data set is considered as an image with each light point located at a datum position. As we blur this image, smaller light blobs merge into larger ones until the whole image becomes one light blob at a low enough level of resolution. By identifying each blob with a cluster, the blurring process generates a family of clusterings along the hierarchy. The advantages of the proposed approach are: 1) The derived algorithms are computationally stable and insensitive to initialization, and they are totally free from solving difficult global optimization problems. 2) It facilitates the construction of new checks on cluster validity and provides the final clustering with a significant degree of robustness to noise in data and change in scale. 3) It is more robust in cases where hyperellipsoidal partitions may not be assumed. 4) It is suitable for the task of preserving the structure and integrity of the outliers in the clustering process. 5) The clustering is highly consistent with that perceived by human eyes. 6) The new approach provides a unified framework for scale-related clustering algorithms recently derived from many different fields such as estimation theory, recurrent signal processing on self-organizing feature maps, information theory and statistical mechanics, and radial basis function neural networks.
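The blurring idea in the abstract can be sketched in one dimension: place a Gaussian "light point" at each datum, blur the resulting image by increasing the scale sigma, and count the surviving blobs (local maxima), each of which stands for one cluster. The following is a minimal illustrative sketch of this idea, not the paper's actual algorithm; the grid-based maximum search and all parameter values are assumptions for illustration.

```python
import math

def blurred_image(x, data, sigma):
    """Blurred 'image' at position x: superposition of Gaussian light
    points, one centered at each datum (sigma is the blurring scale)."""
    return sum(math.exp(-(x - d) ** 2 / (2 * sigma ** 2)) for d in data)

def cluster_count(data, sigma, grid_step=0.01):
    """Count blobs (local maxima of the blurred image) at scale sigma,
    using a simple grid search; each blob corresponds to one cluster."""
    lo, hi = min(data) - 3 * sigma, max(data) + 3 * sigma
    n = int((hi - lo) / grid_step)
    vals = [blurred_image(lo + i * grid_step, data, sigma) for i in range(n)]
    return sum(1 for i in range(1, n - 1)
               if vals[i] > vals[i - 1] and vals[i] >= vals[i + 1])

# Two well-separated groups: at a small scale each group is one blob,
# and at a large enough scale the whole image merges into a single blob.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
for sigma in (0.2, 1.0, 5.0):
    print(sigma, cluster_count(data, sigma))
```

Sweeping sigma from small to large reproduces the hierarchy described in the abstract: the blob count decreases monotonically from many singleton blobs toward a single blob covering the whole data set, and the range of scales over which a given count persists can serve as a cluster-validity check.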

References:
[1] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis. New York: Wiley-Interscience, 1973.
[2] A.K. Jain and R.C. Dubes, Algorithms for Clustering Data. Englewood Cliffs, N.J.: Prentice Hall, 1988.
[3] M. Blatt, S. Wiseman, and E. Domany, “Data Clustering Using a Model Granular Magnet,” Neural Computation, vol. 9, pp. 1805-1847, 1997.
[4] R. Dubes and A.K. Jain, “Clustering Techniques: The User's Dilemma,” Pattern Recognition, vol. 8, pp. 247-260, 1976.
[5] L. Hubert, “Approximate Evaluation Technique for the Single-Link and Complete-Link Hierarchical Clustering Procedure,” J. Am. Statistical Assoc., vol. 69, p. 968, 1974.
[6] H.P. Friedman and J. Rubin, “On Some Invariant Criteria for Grouping Data,” J. Am. Statistical Assoc., vol. 62, p. 1159, 1967.
[7] S.C. Johnson, “Hierarchical Clustering Schemes,” Psychometrika, vol. 32, p. 241, 1967.
[8] C.T. Zahn, “Graph-Theoretical Methods for Detecting and Describing Gestalt Clusters,” IEEE Trans. Computers, vol. 20, pp. 68-86, 1971.
[9] D. Miller and K. Rose, “Hierarchical, Unsupervised Learning with Growing via Phase Transitions,” Neural Computation, vol. 8, pp. 425-450, 1996.
[10] J. Waldemark, “An Automated Procedure for Cluster Analysis of Multivariate Satellite Data,” Int'l J. Neural Systems, vol. 8, no. 1, pp. 3-15, 1997.
[11] S.J. Roberts, “Parametric and Nonparametric Unsupervised Clustering Analysis,” Pattern Recognition, vol. 30, no. 2, pp. 261-272, 1997.
[12] P. Tavan, H. Grubmüller, and H. Kühnel, “Self-Organization of Associative Memory and Pattern Classification: Recurrent Signal Processing on Topological Feature Maps,” Biological Cybernetics, vol. 64, pp. 95-105, 1990.
[13] R. Wilson and M. Spann, “A New Approach to Clustering,” Pattern Recognition, vol. 23, no. 12, pp. 1413-1425, 1990.
[14] Y.F. Wong, “Clustering Data by Melting,” Neural Computation, vol. 5, no. 1, pp. 89-104, 1993.
[15] S.V. Chakravarthy and J. Ghosh, “Scale-Based Clustering Using the Radial Basis Function Network,” IEEE Trans. Neural Networks, vol. 7, pp. 1250-1261, 1996.
[16] M.R. Anderberg, Cluster Analysis for Applications. New York: Academic Press, 1973.
[17] G. Ball and D. Hall, “A Clustering Technique for Summarizing Multivariate Data,” Behavioral Science, vol. 12, pp. 153-155, 1967.
[18] J.C. Bezdek, “A Convergence Theorem for the Fuzzy ISODATA Clustering Algorithms,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 2, pp. 1-8, 1980.
[19] S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi, “Optimization by Simulated Annealing,” Science, vol. 220, pp. 671-680, 1983.
[20] K. Rose, E. Gurewitz, and G.C. Fox, “A Deterministic Annealing Approach to Clustering,” Pattern Recognition Letters, vol. 11, no. 9, pp. 589-594, 1990.
[21] G. Celeux and G. Govaert, “A Classification EM Algorithm for Clustering and Two Stochastic Versions,” Computational Statistics and Data Analysis, vol. 14, pp. 315-332, 1992.
[22] A.P. Witkin, “Scale Space Filtering,” Proc. Int'l Joint Conf. Artificial Intelligence, pp. 1019-1022, 1983.
[23] A.P. Witkin, “Scale Space Filtering: A New Approach to MultiScale Description,” Image Understanding, S. Ullman and W. Richards, eds., Norwood, N.J.: Ablex, 1984.
[24] J.J. Koenderink, “The Structure of Images,” Biological Cybernetics, vol. 50, pp. 363-370, 1984.
[25] J. Babaud, A. Witkin, M. Baudin, and R. Duda, “Uniqueness of the Gaussian Kernel for Scale-Space Filtering,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, pp. 26-33, Jan. 1986.
[26] A. Yuille and T. Poggio, “Scaling Theorems for Zero Crossings,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, pp. 15-26, Jan. 1986.
[27] D. Marr, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: W.H. Freeman, 1982.
[28] R.A. Hummel and A.R. Moniot, “Reconstructions from Zero Crossings in Scale-Space,” IEEE Trans. Acoustics, Speech, and Signal Processing, Dec. 1989.
[29] D.H. Hubel, Eye, Brain, and Vision. New York: Scientific Am. Library, 1995.
[30] S. Coren, L.M. Ward, and J.T. Enns, Sensation and Perception. Harcourt Brace College Publishers, 1994.
[31] B. Everitt, Cluster Analysis. New York: Wiley, 1974.
[32] E.L. Allgower and K. Georg, Numerical Continuation Methods: An Introduction. New York: Springer, 1990.
[33] F. Mulier and V. Cherkassky, “Self-Organization as an Iterative Kernel Smoothing Process,” Neural Computation, vol. 7, pp. 1165-1177, 1995.
[34] E.A. Nadaraya, “On Estimating Regression,” Theory of Probability and Its Applications, vol. 9, pp. 141-142, 1964.
[35] G.S. Watson, “Smooth Regression Analysis,” Sankhya, series A, vol. 26, pp. 359-372, 1964.
[36] T. Lindeberg, “Scale-Space for Discrete Signals,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 3, pp. 234-254, 1990.
[37] L.M. Lifshitz and S.M. Pizer, “A Multiresolution Hierarchical Approach to Image Segmentation Based on Intensity Extrema,” internal report, Dept. of Computer Science and Radiology, Univ. of North Carolina, Chapel Hill, 1987.
[38] S. Roberts, D. Husmeier, I. Rezek, and W. Penny, “Bayesian Approaches to Gaussian Mixture Modeling,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 11, Nov. 1998.
[39] I. Gath and A.B. Geva, “Unsupervised Optimal Fuzzy Clustering,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, pp. 773-781, 1989.

Index Terms:
Hierarchical clustering, scale space theory, cluster validity.
Yee Leung, Jiang-She Zhang, Zong-Ben Xu, "Clustering by Scale-Space Filtering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1396-1410, Dec. 2000, doi:10.1109/34.895974