Anne Patrikainen, Marina Meila, "Comparing Subspace Clusterings," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 7, pp. 902-916, July 2006.
@article{10.1109/TKDE.2006.106,
  author = {Anne Patrikainen and Marina Meila},
  title = {Comparing Subspace Clusterings},
  journal = {IEEE Transactions on Knowledge and Data Engineering},
  volume = {18},
  number = {7},
  issn = {1041-4347},
  year = {2006},
  pages = {902-916},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TKDE.2006.106},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
TY - JOUR
JO - IEEE Transactions on Knowledge and Data Engineering
TI - Comparing Subspace Clusterings
IS - 7
SN - 1041-4347
SP - 902
EP - 916
EPD - 902-916
A1 - Anne Patrikainen
A1 - Marina Meila
PY - 2006
KW - Subspace clustering
KW - projected clustering
KW - distance
KW - feature selection
KW - cluster validation
VL - 18
JA - IEEE Transactions on Knowledge and Data Engineering
ER -
[1] C. Aggarwal, C. Procopiuc, J. Wolf, P. Yu, and J. Park, “A Framework for Finding Projected Clusters in High Dimensional Spaces,” Proc. ACM SIGMOD Int'l Conf. Management of Data, 1999.
[2] C.C. Aggarwal and P.S. Yu, “Finding Generalized Projected Clusters in High Dimensional Spaces,” Proc. ACM SIGMOD Int'l Conf. Management of Data, pp. 70-81, 2000, citeseer.nj.nec.com/aggarwal00finding.html.
[3] K.Y. Yip, D.W. Cheung, and M.K. Ng, “HARP: A Practical Projected Clustering Algorithm,” IEEE Trans. Knowledge and Data Eng., vol. 16, no. 11, pp. 1387-1397, Nov. 2004.
[4] K.Y. Yip, D.W. Cheung, and M.K. Ng, “On Discovery of Extremely Low-Dimensional Clusters Using Semi-Supervised Projected Clustering,” Proc. 21st Int'l Conf. Data Eng., 2005.
[5] P.K. Agarwal and N.H. Mustafa, “K-Means Projective Clustering,” Proc. 23rd ACM SIGACT-SIGMOD-SIGART Symp. Principles of Database Systems, pp. 155-165, 2004.
[6] C.M. Procopiuc, M.T. Jones, P.K. Agarwal, and T.M. Murali, “A Monte Carlo Algorithm for Fast Projective Clustering,” Proc. ACM SIGMOD Int'l Conf. Management of Data, 2002.
[7] Y. Kluger, R. Basri, J.T. Chang, and M. Gerstein, “Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions,” Genome Research, vol. 13, no. 4, pp. 703-716, Apr. 2003.
[8] A. Banerjee, I. Dhillon, J. Ghosh, S. Merugu, and D.S. Modha, “A Generalized Maximum Entropy Approach to Bregman Co-Clustering and Matrix Approximation,” Proc. 10th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2004.
[9] H. Cho, I.S. Dhillon, Y. Guan, and S. Sra, “Minimum Sum-Squared Residue Co-Clustering of Gene Expression Data,” Proc. Fourth SIAM Int'l Conf. Data Mining, 2004.
[10] I.S. Dhillon, S. Mallela, and D.S. Modha, “Information-Theoretic Co-Clustering,” Proc. Ninth ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2003.
[11] G. Getz, E. Levine, and E. Domany, “Coupled Two-Way Clustering Analysis of Gene Microarray Data,” Proc. Nat'l Academy of Sciences, vol. 97, pp. 12079-12084, 2000.
[12] K. Pollard and M.J. van der Laan, “Statistical Inference for Simultaneous Clustering of Gene Expression Data,” Math. Biosciences, vol. 176, no. 1, pp. 99-121, 2002.
[13] J.A. Hartigan, “Direct Clustering of a Data Matrix,” J. Am. Statistical Assoc., vol. 67, no. 337, pp. 123-129, 1972.
[14] J.H. Friedman and J.J. Meulman, “Clustering Objects on Subsets of Attributes,” J. Royal Statistical Soc. B, vol. 66, pp. 1-25, 2004.
[15] E. Oja and J. Parkkinen, “On Subspace Clustering,” Proc. Seventh Int'l Conf. Pattern Recognition, 1984.
[16] R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan, “Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications,” Proc. ACM SIGMOD Int'l Conf. Management of Data, pp. 94-105, 1998, citeseer.nj.nec.com/agrawal98automatic.html.
[17] C. Böhm, K. Kailing, P. Kröger, and A. Zimek, “Computing Clusters of Correlation Connected Objects,” Proc. 2004 ACM SIGMOD Int'l Conf. Management of Data, pp. 455-466, 2004.
[18] C.H. Cheng, A.W.-C. Fu, and Y. Zhang, “Entropy-Based Subspace Clustering for Mining Numerical Data,” Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 84-93, 1999, citeseer.nj.nec.com/articlecheng99entropybased.html.
[19] Y. Cheng and G.M. Church, “Biclustering of Expression Data,” Proc. Eighth Int'l Conf. Intelligent Systems for Molecular Biology, 2000.
[20] I.S. Dhillon, S. Mallela, and D.S. Modha, “Information-Theoretic Co-Clustering,” Proc. Ninth ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2003.
[21] C. Domeniconi, D. Papadopoulos, D. Gunopulos, and S. Ma, “Subspace Clustering of High Dimensional Data,” Proc. Fourth SIAM Int'l Conf. Data Mining, 2004.
[22] K. Kailing, H.-P. Kriegel, and P. Kröger, “Density-Connected Subspace Clustering for High-Dimensional Data,” Proc. Fourth SIAM Int'l Conf. Data Mining, pp. 246-257, 2004.
[23] J. Liu, W. Wang, and J. Yang, “A Framework for Ontology-Driven Subspace Clustering,” Proc. 10th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2004.
[24] A.A. Melkman and E. Shaham, “Sleeved Coclustering,” Proc. 10th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2004.
[25] H. Nagesh, S. Goil, and A. Choudhary, “MAFIA: Efficient and Scalable Subspace Clustering for Very Large Data Sets,” Technical Report 9906-010, Northwestern Univ., 1999.
[26] A. Patrikainen and H. Mannila, “Subspace Clustering of High Dimensional Binary Data - A Probabilistic Approach,” Proc. Fourth SIAM Int'l Conf. Data Mining, Workshop Clustering High Dimensional Data and Its Applications, 2004.
[27] A. Tanay, R. Sharan, and R. Shamir, “Discovering Statistically Significant Biclusters in Gene Expression Data,” Bioinformatics, vol. 18, pp. 136-144, 2002.
[28] J. Yang, W. Wang, H. Wang, and P.S. Yu, “Delta-Cluster: Capturing Subspace Correlation in a Large Data Set,” Proc. 18th IEEE Int'l Conf. Data Eng., pp. 517-528, 2002, citeseer.ist.psu.edu/566513.html.
[29] L. Parsons, E. Haque, and H. Liu, “Evaluating Subspace Clustering Algorithms,” Proc. Fourth SIAM Int'l Conf. Data Mining, Workshop Clustering High Dimensional Data and Its Applications, 2004.
[30] L. Parsons, E. Haque, and H. Liu, “Subspace Clustering for High Dimensional Data: A Review,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, 2004.
[31] K.Y. Yip, M.K. Ng, and D.W. Cheung, “A Review on Projected Clustering Algorithms,” Int'l J. Applied Math., vol. 13, pp. 35-47, 2003.
[32] S. Theodoridis and K. Koutroumbas, Pattern Recognition. Academic Press, 1999.
[33] M. Meila, “Comparing Clusterings by the Variation of Information,” Proc. 16th Ann. Conf. Computational Learning Theory, pp. 173-187, 2003.
[34] W.M. Rand, “Objective Criteria for the Evaluation of Clustering Methods,” J. Am. Statistical Assoc., vol. 66, pp. 846-850, 1971.
[35] B. Mirkin, Mathematical Classification and Clustering. Kluwer Academic Press, 1996.
[36] D.L. Wallace, “Comment,” J. Am. Statistical Assoc., vol. 78, no. 383, pp. 569-576, 1983.
[37] T. Lange, V. Roth, M.L. Braun, and J.M. Buhmann, “Stability-Based Validation of Clustering Solutions,” Neural Computation, vol. 16, pp. 1299-1323, 2004.
[38] S. Dudoit and J. Fridlyand, “A Prediction-Based Resampling Method for Estimating the Number of Clusters in a Dataset,” Genome Biology, vol. 3, no. 7, 2002.
[39] E. Levine and E. Domany, “Resampling Method for Unsupervised Estimation of Cluster Validity,” Neural Computation, vol. 13, pp. 2573-2593, 2001.
[40] A. Gionis, H. Mannila, and P. Tsaparas, “Clustering Aggregation,” Proc. 21st Int'l Conf. Data Eng., 2005.
[41] A. Strehl and J. Ghosh, “Cluster Ensembles - A Knowledge Reuse Framework for Combining Multiple Partitions,” J. Machine Learning Research, vol. 3, pp. 583-617, 2002.
[42] A. Topchy, A. Jain, and W. Punch, “A Mixture Model for Clustering Ensembles,” Proc. Fourth SIAM Int'l Conf. Data Mining, 2004.
[43] A. Topchy, M.H. Law, A.K. Jain, and A. Fred, “Analysis of Consensus Partition in Cluster Ensemble,” Proc. Fourth IEEE Int'l Conf. Data Mining, 2004.
[44] A. Topchy, B. Minaei-Bidgoli, A. Jain, and W. Punch, “Adaptive Clustering Ensembles,” Proc. 17th Int'l Conf. Pattern Recognition, pp. 272-275, 2004.
[45] P. Artigas, A. Goldenberg, A. Likhodedov, and R. Caruana, “Meta Clustering,” http://www-2.cs.cmu.edu/~artigas/classproj mlproj.ps, 2000.
[46] A.K. Jain, A. Topchy, M.H. Law, and J. Buhmann, “Landscape of Clustering Algorithms,” Proc. 17th Int'l Conf. Pattern Recognition, pp. 260-263, 2004.
[47] C.H. Papadimitriou and K. Steiglitz, Combinatorial Optimization, Algorithms and Complexity. Prentice-Hall, 1982.
[48] E.B. Fowlkes and C.L. Mallows, “A Method for Comparing Two Hierarchical Clusterings,” J. Am. Statistical Assoc., vol. 78, no. 383, pp. 553-569, 1983.
[49] P. Jaccard, “The Distribution of Flora in the Alpine Zone,” The New Phytologist, vol. 11, no. 2, pp. 37-50, 1912.
[50] M. Meila, “Comparing Clusterings - An Axiomatic View,” Proc. 22nd Int'l Conf. Machine Learning (ICML '05), L.D. Raedt and S. Wrobel, eds., vol. 22, 2005.
[51] C.J. van Rijsbergen, Information Retrieval, second ed. Butterworths, 1979.
[52] A. Patrikainen and M. Meila, “Comparing Subspace Clusterings,” Technical Report UW-CSE-2004-10-1, Univ. Washington, 2004.
[53] M.E. Argentati, “Principal Angles between Subspaces,” http://www-math.cudenver.edu/~aknyazev/teaching/ ricotalk_defense.pdf, 2006.
[54] Å. Björck and G.H. Golub, “Numerical Methods for Computing Angles between Linear Subspaces,” Math. Computation, vol. 27, pp. 579-594, 1973.
[55] Z. Drmac, “On Principal Angles between Subspaces of Euclidean Space,” SIAM J. Matrix Analysis Applications, vol. 22, no. 1, pp. 173-194, 2000.
[56] K.Y. Yip, “HARP: A Practical Projected Clustering Algorithm for Mining Gene Expression Data,” master's thesis, The University of Hong Kong, Pokfulam Road, Hong Kong, http://www.csis.hku.hk/~ylyip/papersthesis.pdf, 2004.
[57] C. Yang, U. Fayyad, and P.S. Bradley, “Efficient Discovery of Error-Tolerant Frequent Itemsets in High Dimensions,” Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2000.
[58] J.K. Seppänen and H. Mannila, “Dense Itemsets,” Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, 2004.
[59] A. Gionis, H. Mannila, and J.K. Seppänen, “Geometric and Combinatorial Tiles in 0-1 Data,” Proc. European Conf. Principles and Practice of Knowledge Discovery in Databases, 2004.
[60] J. Besson, C. Robardet, and J.-F. Boulicaut, “Mining Alpha-Beta Concepts as Relevant Bi-Sets from Transactional Data,” Proc. Third Int'l Workshop Knowledge Discovery in Inductive Databases (KDID '04), 2004.
[61] N. Mishra, D. Ron, and R. Swaminathan, “A New Conceptual Clustering Framework,” Machine Learning, vol. 56, nos. 1-3, pp. 115-151, 2004.
[62] A. Kaban, E. Bingham, and T. Hirsimäki, “Learning to Read between the Lines: The Aspect Bernoulli Model,” Proc. Fourth SIAM Int'l Conf. Data Mining, pp. 462-466, 2004.
[63] A. Ben-Hur, A. Elisseeff, and I. Guyon, “A Stability Based Method for Discovering Structure in Clustered Data,” Proc. Pacific Symp. Biocomputing, pp. 6-17, 2002.

