Issue No.11 - November (2009 vol.21)

pp: 1515-1531

Xiao-Feng Wang, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui

De-Shuang Huang, Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2009.21

ABSTRACT

In this paper, a new density-based clustering framework is proposed, based on the assumption that cluster centers in data space can be regarded as target objects in image space. First, level set evolution is used to find an approximation of the cluster centers, starting from a new initial boundary formation scheme. Three types of initial boundaries are defined so that each can evolve toward the cluster centers in a different way. To avoid the long iteration time of level set evolution in data space, an efficient termination criterion is presented that stops the evolution once no more cluster centers can be found. Then, a new density representation called level set density (LSD) is constructed from the evolution results. Finally, valley seeking clustering is used to group data points into their corresponding clusters based on the LSD. Experiments on synthetic and real data sets demonstrate the efficiency and effectiveness of the proposed framework. Comparisons with the DBSCAN, OPTICS, and valley seeking clustering methods further show that the proposed framework can avoid the overfitting phenomenon and resolve the confusion between cluster boundary points and outliers.
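The final grouping step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so a plain Gaussian kernel sum stands in here for the level set density, and the grouping is a simple density-ascent rule in the spirit of valley seeking (each point links to the densest point within a fixed radius, and points that end at the same peak share a cluster). The function names and the `bandwidth` and `radius` parameters are hypothetical and chosen only for illustration.

```python
import math


def gaussian_density(points, bandwidth):
    """Stand-in density (NOT the paper's LSD): unnormalized Gaussian kernel sum at each point."""
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            total += math.exp(-math.dist(p, q) ** 2 / (2.0 * bandwidth ** 2))
        densities.append(total)
    return densities


def density_ascent_labels(points, density, radius):
    """Group points by following links toward denser neighbors (assumed rule, for illustration)."""
    n = len(points)
    parent = list(range(n))
    for i in range(n):
        best = i
        for j in range(n):
            # Link to the strictly densest point within `radius`; a point with
            # no denser neighbor stays its own parent and acts as a density peak.
            if math.dist(points[i], points[j]) <= radius and density[j] > density[best]:
                best = j
        parent[i] = best

    # Links always go to strictly higher density, so following them terminates at a peak.
    roots = []
    for i in range(n):
        j = i
        while parent[j] != j:
            j = parent[j]
        roots.append(j)

    # Renumber the peaks as consecutive cluster labels 0, 1, 2, ...
    root_ids = {r: k for k, r in enumerate(sorted(set(roots)))}
    return [root_ids[r] for r in roots]


# Two small, well-separated groups of 2-D points (synthetic, for illustration only).
offsets = [(0.0, 0.0), (0.2, 0.0), (-0.2, 0.0), (0.0, 0.2), (0.0, -0.2),
           (0.4, 0.0), (-0.4, 0.0), (0.0, 0.4), (0.0, -0.4)]
points = offsets + [(x + 5.0, y + 5.0) for x, y in offsets]
density = gaussian_density(points, bandwidth=0.5)
labels = density_ascent_labels(points, density, radius=1.0)
```

In this sketch, low-density regions between peaks act as the valleys that separate clusters. Swapping in a different density estimate only requires replacing `gaussian_density`, since the grouping step depends only on the density values.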

INDEX TERMS

Density-based clustering, initial boundary, level set method, level set density, valley seeking clustering.

CITATION

Xiao-Feng Wang, De-Shuang Huang, "A Novel Density-Based Clustering Framework by Using Level Set Method," *IEEE Transactions on Knowledge & Data Engineering*, vol. 21, no. 11, pp. 1515-1531, November 2009, doi:10.1109/TKDE.2009.21