
ASCII Text  
Anne Patrikainen, Marina Meila, "Comparing Subspace Clusterings," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 7, pp. 902-916, July 2006.  
BibTeX  
@article{10.1109/TKDE.2006.106,
  author = {Anne Patrikainen and Marina Meila},
  title = {Comparing Subspace Clusterings},
  journal = {IEEE Transactions on Knowledge and Data Engineering},
  volume = {18},
  number = {7},
  issn = {1041-4347},
  year = {2006},
  pages = {902--916},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TKDE.2006.106},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
RefWorks Procite/RefMan/Endnote  
TY  - JOUR
JO  - IEEE Transactions on Knowledge and Data Engineering
TI  - Comparing Subspace Clusterings
IS  - 7
SN  - 1041-4347
SP  - 902
EP  - 916
EPD - 902-916
A1  - Anne Patrikainen
A1  - Marina Meila
PY  - 2006
KW  - Subspace clustering
KW  - projected clustering
KW  - distance
KW  - feature selection
KW  - cluster validation
VL  - 18
JA  - IEEE Transactions on Knowledge and Data Engineering
ER  -
[1] C. Aggarwal, C. Procopiuc, J. Wolf, P. Yu, and J. Park, “A Framework for Finding Projected Clusters in High Dimensional Spaces,” Proc. ACM SIGMOD Int'l Conf. Management of Data, 1999.
[2] C.C. Aggarwal and P.S. Yu, “Finding Generalized Projected Clusters in High Dimensional Spaces,” Proc. ACM SIGMOD Int'l Conf. Management of Data, pp. 70-81, 2000, citeseer.nj.nec.comaggarwal00finding.html.
[3] K.Y. Yip, D.W. Cheung, and M.K. Ng, “HARP: A Practical Projected Clustering Algorithm,” IEEE Trans. Knowledge and Data Eng., vol. 16, no. 11, pp. 1387-1397, Nov. 2004.
[4] K.Y. Yip, D.W. Cheung, and M.K. Ng, “On Discovery of Extremely Low-Dimensional Clusters Using Semi-Supervised Projected Clustering,” Proc. 21st Int'l Conf. Data Eng., 2005.
[5] P.K. Agarwal and N.H. Mustafa, “K-Means Projective Clustering,” Proc. 23rd ACM SIGACT-SIGMOD-SIGART Symp. Principles of Database Systems, pp. 155-165, 2004.
[6] C.M. Procopiuc, M.T. Jones, P.K. Agarwal, and T.M. Murali, “A Monte Carlo Algorithm for Fast Projective Clustering,” Proc. ACM SIGMOD Int'l Conf. Management of Data, 2002.
[7] Y. Kluger, R. Basri, J.T. Chang, and M. Gerstein, “Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions,” Genome Research, vol. 13, no. 4, pp. 703-716, Apr. 2003.
[8] A. Banerjee, I. Dhillon, J. Ghosh, S. Merugu, and D.S. Modha, “A Generalized Maximum Entropy Approach to Bregman Co-Clustering and Matrix Approximation,” Proc. 10th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2004.
[9] H. Cho, I.S. Dhillon, Y. Guan, and S. Sra, “Minimum Sum-Squared Residue Co-Clustering of Gene Expression Data,” Proc. Fourth SIAM Int'l Conf. Data Mining, 2004.
[10] I.S. Dhillon, S. Mallela, and D.S. Modha, “Information-Theoretic Co-Clustering,” Proc. Ninth ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2003.
[11] G. Getz, E. Levine, and E. Domany, “Coupled Two-Way Clustering Analysis of Gene Microarray Data,” Proc. Nat'l Academy of Sciences, vol. 97, pp. 12079-12084, 2000.
[12] K. Pollard and M.J. van der Laan, “Statistical Inference for Simultaneous Clustering of Gene Expression Data,” Math. Biosciences, vol. 176, no. 1, pp. 99-121, 2002.
[13] J.A. Hartigan, “Direct Clustering of a Data Matrix,” J. Am. Statistical Assoc., vol. 67, no. 337, pp. 123-129, 1972.
[14] J.H. Friedman and J.J. Meulman, “Clustering Objects on Subsets of Attributes,” J. Royal Statistical Soc. B, vol. 66, pp. 1-25, 2004.
[15] E. Oja and J. Parkkinen, “On Subspace Clustering,” Proc. Seventh Int'l Conf. Pattern Recognition, 1984.
[16] R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan, “Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications,” Proc. ACM SIGMOD Int'l Conf. Management of Data, pp. 94-105, 1998, citeseer.nj.nec.com/agrawal98automatic.html.
[17] C. Böhm, K. Kailing, P. Kröger, and A. Zimek, “Computing Clusters of Correlation Connected Objects,” Proc. 2004 ACM SIGMOD Int'l Conf. Management of Data, pp. 455-466, 2004.
[18] C.H. Cheng, A.W.C. Fu, and Y. Zhang, “Entropy-Based Subspace Clustering for Mining Numerical Data,” Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 84-93, 1999, citeseer.nj.nec.com/articlecheng99entropybased.html.
[19] Y. Cheng and G.M. Church, “Biclustering of Expression Data,” Proc. Eighth Int'l Conf. Intelligent Systems for Molecular Biology, 2000.
[20] I.S. Dhillon, S. Mallela, and D.S. Modha, “Information-Theoretic Co-Clustering,” Proc. Ninth ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2003.
[21] C. Domeniconi, D. Papadopoulos, D. Gunopulos, and S. Ma, “Subspace Clustering of High Dimensional Data,” Proc. Fourth SIAM Int'l Conf. Data Mining, 2004.
[22] K. Kailing, H.-P. Kriegel, and P. Kröger, “Density-Connected Subspace Clustering for High-Dimensional Data,” Proc. Fourth SIAM Int'l Conf. Data Mining, pp. 246-257, 2004.
[23] J. Liu, W. Wang, and J. Yang, “A Framework for Ontology-Driven Subspace Clustering,” Proc. 10th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2004.
[24] A.A. Melkman and E. Shaham, “Sleeved Coclustering,” Proc. 10th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2004.
[25] H. Nagesh, S. Goil, and A. Choudhary, “MAFIA: Efficient and Scalable Subspace Clustering for Very Large Data Sets,” Technical Report 9906010, Northwestern Univ., 1999.
[26] A. Patrikainen and H. Mannila, “Subspace Clustering of High Dimensional Binary Data— A Probabilistic Approach,” Proc. Fourth SIAM Int'l Conf. Data Mining, Workshop Clustering High Dimensional Data and Its Applications, 2004.
[27] A. Tanay, R. Sharan, and R. Shamir, “Discovering Statistically Significant Biclusters in Gene Expression Data,” Bioinformatics, vol. 18, pp. 136-144, 2002.
[28] J. Yang, W. Wang, H. Wang, and P.S. Yu, “DeltaCluster: Capturing Subspace Correlation in a Large Data Set,” Proc. 18th IEEE Int'l Conf. Data Eng., pp. 517-528, 2002, citeseer.ist.psu.edu566513.html.
[29] L. Parsons, E. Haque, and H. Liu, “Evaluating Subspace Clustering Algorithms,” Proc. Fourth SIAM Int'l Conf. Data Mining, Workshop Clustering High Dimensional Data and Its Applications, 2004.
[30] L. Parsons, E. Haque, and H. Liu, “Subspace Clustering for High Dimensional Data: A Review,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, 2004.
[31] K.Y. Yip, M.K. Ng, and D.W. Cheung, “A Review on Projected Clustering Algorithms,” Int'l J. Applied Math., vol. 13, pp. 35-47, 2003.
[32] S. Theodoridis and K. Koutroumbas, Pattern Recognition. Academic Press, 1999.
[33] M. Meila, “Comparing Clusterings by the Variation of Information,” Proc. 16th Ann. Conf. Computational Learning Theory, pp. 173-187, 2003.
[34] W.M. Rand, “Objective Criteria for the Evaluation of Clustering Methods,” J. Am. Statistical Assoc., vol. 66, pp. 846-850, 1971.
[35] B. Mirkin, Mathematical Classification and Clustering. Kluwer Academic Press, 1996.
[36] D.L. Wallace, “Comment,” J. Am. Statistical Assoc., vol. 78, no. 383, pp. 569-576, 1983.
[37] T. Lange, V. Roth, M.L. Braun, and J.M. Buhmann, “Stability-Based Validation of Clustering Solutions,” Neural Computation, vol. 16, pp. 1299-1323, 2004.
[38] S. Dudoit and J. Fridlyand, “A Prediction-Based Resampling Method for Estimating the Number of Clusters in a Dataset,” Genome Biology, vol. 3, no. 7, 2002.
[39] E. Levine and E. Domany, “Resampling Method for Unsupervised Estimation of Cluster Validity,” Neural Computation, vol. 13, pp. 2573-2593, 2001.
[40] A. Gionis, H. Mannila, and P. Tsaparas, “Clustering Aggregation,” Proc. 21st Int'l Conf. Data Eng., 2005.
[41] A. Strehl and J. Ghosh, “Cluster Ensembles: A Knowledge Reuse Framework for Combining Multiple Partitions,” J. Machine Learning Research, vol. 3, pp. 583-617, 2002.
[42] A. Topchy, A. Jain, and W. Punch, “A Mixture Model for Clustering Ensembles,” Proc. Fourth SIAM Int'l Conf. Data Mining, 2004.
[43] A. Topchy, M.H. Law, A.K. Jain, and A. Fred, “Analysis of Consensus Partition in Cluster Ensemble,” Proc. Fourth IEEE Int'l Conf. Data Mining, 2004.
[44] A. Topchy, B. Minaei-Bidgoli, A. Jain, and W. Punch, “Adaptive Clustering Ensembles,” Proc. 17th Int'l Conf. Pattern Recognition, pp. 272-275, 2004.
[45] P. Artigas, A. Goldenberg, A. Likhodedov, and R. Caruana, “Meta Clustering,” http://www2.cs.cmu.edu/~artigas/classproj mlproj.ps, 2000.
[46] A.K. Jain, A. Topchy, M.H. Law, and J. Buhmann, “Landscape of Clustering Algorithms,” Proc. 17th Int'l Conf. Pattern Recognition, pp. 260-263, 2004.
[47] C.H. Papadimitriou and K. Steiglitz, Combinatorial Optimization, Algorithms and Complexity. Prentice-Hall, 1982.
[48] E.B. Fowlkes and C.L. Mallows, “A Method for Comparing Two Hierarchical Clusterings,” J. Am. Statistical Assoc., vol. 78, no. 383, pp. 553-569, 1983.
[49] P. Jaccard, “The Distribution of Flora in the Alpine Zone,” The New Phytologist, vol. 11, no. 2, pp. 37-50, 1912.
[50] M. Meila, “Comparing Clusterings— An Axiomatic View,” Proc. 22nd Int'l Conf. Machine Learning (ICML '05), L.D. Raedt and S. Wrobel, eds., vol. 22, 2005.
[51] C.V. Rijsbergen, Information Retrieval, second ed. Butterworths, 1979.
[52] A. Patrikainen and M. Meila, “Comparing Subspace Clusterings,” Technical Report UWCSE2004101, Univ. Washington, 2004.
[53] M.E. Argentati, “Principal Angles between Subspaces,” http://wwwmath.cudenver.edu/~aknyazev/teaching/ ricotalk_defense.pdf, 2006.
[54] A. Björk and G.H. Golub, “Numerical Methods for Computing Angles between Linear Subspaces,” Math. Computation, vol. 27, pp. 579-594, 1973.
[55] Z. Drmac, “On Principal Angles between Subspaces of Euclidean Space,” SIAM J. Matrix Analysis Applications, vol. 22, no. 1, pp. 173-194, 2000.
[56] K.Y. Yip, “HARP: A Practical Projected Clustering Algorithm for Mining Gene Expression Data,” master's thesis, The University of Hong Kong, Pokfulam Road, Hong Kong, http://www.csis. hku.hk/~ylyip/papersthesis.pdf , 2004.
[57] C. Yang, U. Fayyad, and P.S. Bradley, “Efficient Discovery of Error-Tolerant Frequent Itemsets in High Dimensions,” Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2000.
[58] J.K. Seppänen and H. Mannila, “Dense Itemsets,” Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2004.
[59] A. Gionis, H. Mannila, and J.K. Seppänen, “Geometric and Combinatorial Tiles in 0-1 Data,” Proc. European Conf. Principles and Practice of Knowledge Discovery in Databases, 2004.
[60] J. Besson, C. Robardet, and J.-F. Boulicaut, “Mining Alpha-Beta Concepts as Relevant Bi-Sets from Transactional Data,” Proc. Third Int'l Workshop Knowledge Discovery in Inductive Databases (KDID '04), 2004.
[61] N. Mishra, D. Ron, and R. Swaminathan, “A New Conceptual Clustering Framework,” Machine Learning, vol. 56, nos. 1-3, pp. 115-151, 2004.
[62] A. Kaban, E. Bingham, and T. Hirsimäki, “Learning to Read between the Lines: The Aspect Bernoulli Model,” Proc. Fourth SIAM Int'l Conf. Data Mining, pp. 462-466, 2004.
[63] A. Ben-Hur, A. Elisseeff, and I. Guyon, “A Stability Based Method for Discovering Structure in Clustered Data,” Proc. Pacific Symp. Biocomputing, pp. 6-17, 2002.