
Charu C. Aggarwal, "A Human-Computer Interactive Method for Projected Clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 4, pp. 448-460, April 2004.  
Keywords: High-dimensional data mining, clustering, human-computer interaction.  
ISSN: 1041-4347  
DOI: http://doi.ieeecomputersociety.org/10.1109/TKDE.2004.1269669  
Publisher: IEEE Computer Society, Los Alamitos, CA, USA  
Abstract: Clustering is a central task in data mining applications such as customer segmentation. High-dimensional data has always been a challenge for clustering algorithms because of the inherent sparsity of the points. Therefore, techniques have recently been proposed to find clusters in hidden subspaces of the data. However, since the behavior of the data can vary considerably in different subspaces, it is often difficult to define the notion of a cluster with the use of simple mathematical formalizations. The widely used practice of treating clustering as the exact problem of optimizing an arbitrarily chosen objective function can often lead to misleading results. In fact, the proper clustering definition may vary not only with the application and data set but also with the perceptions of the end user. This makes it difficult to separate the definition of the clustering problem from the perception of an end user. In this paper, we propose a system which performs high-dimensional clustering by cooperation between the human and the computer. The complex task of cluster creation is accomplished through a combination of human intuition and the computational support provided by the computer. The result is a system which leverages the best abilities of both the human and the computer for solving the clustering problem.