
Raymond T. Ng and Jiawei Han, "CLARANS: A Method for Clustering Objects for Spatial Data Mining," IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 5, pp. 1003–1016, September/October 2002. ISSN 1041-4347. DOI: http://doi.ieeecomputersociety.org/10.1109/TKDE.2002.1033770. IEEE Computer Society, Los Alamitos, CA, USA.

Index Terms: Spatial data mining, clustering algorithms, randomized search, computational geometry.

Abstract: Spatial data mining is the discovery of interesting relationships and characteristics that may exist implicitly in spatial databases. To this end, this paper makes three main contributions. First, we propose a new clustering method called CLARANS, whose aim is to identify spatial structures that may be present in the data. Experimental results indicate that, when compared with existing clustering methods, CLARANS is very efficient and effective. Second, we investigate how CLARANS can efficiently handle not only point objects but also polygon objects. One of the methods considered, called the IR-approximation, is very efficient in clustering convex and nonconvex polygon objects. Third, building on top of CLARANS, we develop two spatial data mining algorithms that aim to discover relationships between spatial and nonspatial attributes. Both algorithms can discover knowledge that is difficult to find with existing spatial data mining algorithms.
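
The abstract names CLARANS but does not spell out the procedure. As a rough illustration only, the sketch below follows the way CLARANS is commonly described: a randomized search over sets of k medoids, where a "neighbor" of the current set differs from it in exactly one medoid. The function names, the Euclidean distance on point objects, and the default values of `numlocal` and `maxneighbor` are assumptions made for this example and are not taken from this page; polygon objects and the IR-approximation are not covered.

```python
import math
import random


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def clarans(points, k, numlocal=2, maxneighbor=50, seed=0):
    """Randomized k-medoid search (illustrative sketch, requires k < len(points)).

    numlocal:    number of independent local searches (assumed default).
    maxneighbor: consecutive non-improving neighbors examined before the
                 current medoid set is accepted as a local minimum.
    """
    rng = random.Random(seed)
    n = len(points)
    best_medoids, best_cost = None, math.inf

    for _ in range(numlocal):
        # Start each local search from a random set of k medoid indices.
        current = rng.sample(range(n), k)
        current_cost = total_cost(points, current)
        tried = 0

        while tried < maxneighbor:
            # A random neighbor: swap one medoid for one non-medoid.
            i = rng.randrange(k)
            candidate = rng.choice([j for j in range(n) if j not in current])
            neighbor = current[:i] + [candidate] + current[i + 1:]
            neighbor_cost = total_cost(points, neighbor)

            if neighbor_cost < current_cost:
                # Move to the better neighbor and restart the counter.
                current, current_cost = neighbor, neighbor_cost
                tried = 0
            else:
                tried += 1

        # Keep the best local minimum seen across all restarts.
        if current_cost < best_cost:
            best_medoids, best_cost = current, current_cost

    return sorted(best_medoids), best_cost


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    medoids, cost = clarans(pts, k=2)
    print("medoid indices:", medoids, "cost:", cost)
```

Recomputing the full cost for every neighbor keeps the sketch short; a practical implementation would likely compute only the cost difference caused by the swap.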