Issue No.09 - September (2011 vol.23)

pp: 1406-1418

Xiaofei He, Zhejiang University, Hangzhou

Deng Cai, Zhejiang University, Hangzhou

Yuanlong Shao, Zhejiang University, Hangzhou

Hujun Bao, Zhejiang University, Hangzhou

Jiawei Han, University of Illinois at Urbana-Champaign, Urbana

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2010.259

ABSTRACT

Gaussian Mixture Models (GMMs) are among the most statistically mature methods for clustering. Each cluster is represented by a Gaussian distribution, so clustering reduces to estimating the parameters of the Gaussian mixture, usually with the Expectation-Maximization algorithm. In this paper, we consider the case where the probability distribution that generates the data is supported on a submanifold of the ambient space. It is natural to assume that if two points are close in the intrinsic geometry of the probability distribution, then their conditional probability distributions are similar. Specifically, we introduce a regularized probabilistic model for data clustering based on manifold structure, called the Laplacian regularized Gaussian Mixture Model (LapGMM). The data manifold is modeled by a nearest neighbor graph, and the graph structure is incorporated into the maximum likelihood objective function. As a result, the obtained conditional probability distribution varies smoothly along the geodesics of the data manifold. Experimental results on real data sets demonstrate the effectiveness of the proposed approach.
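To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the exact form of the regularizer, so this sketch assumes a squared-difference penalty on the posteriors of graph neighbors, weighted by a hypothetical trade-off parameter `lam`. It also assumes isotropic Gaussians with scalar variances and a symmetric 0/1 k-nearest-neighbor graph. All function names are illustrative.

```python
import math


def knn_graph(points, k):
    """Symmetric 0/1 weights: W[i][j] = 1 if j is one of the k nearest
    neighbors of i, or i is one of the k nearest neighbors of j."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in others[:k]:
            W[i][j] = W[j][i] = 1.0
    return W


def gaussian_pdf(x, mean, var):
    """Density of an isotropic Gaussian (assumption: scalar variance)."""
    d = len(x)
    sq = sum((a - b) ** 2 for a, b in zip(x, mean))
    return math.exp(-sq / (2.0 * var)) / ((2.0 * math.pi * var) ** (d / 2.0))


def posteriors(points, weights, means, variances):
    """E-step of a plain GMM: P(k | x_i) for every point and component."""
    result = []
    for x in points:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        result.append([p / total for p in joint])
    return result


def smoothness_penalty(P, W):
    """0.5 * sum_k sum_{i,j} W[i][j] * (P[i][k] - P[j][k])**2.
    Equals sum_k p_k^T L p_k for the graph Laplacian L = D - W."""
    n, K = len(P), len(P[0])
    total = 0.0
    for i in range(n):
        for j in range(n):
            if W[i][j]:
                total += W[i][j] * sum((P[i][c] - P[j][c]) ** 2
                                       for c in range(K))
    return 0.5 * total


def regularized_objective(points, weights, means, variances, W, lam):
    """Log-likelihood minus lam times the graph smoothness penalty."""
    log_lik = sum(
        math.log(sum(w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in points
    )
    P = posteriors(points, weights, means, variances)
    return log_lik - lam * smoothness_penalty(P, W)
```

With `lam = 0` the objective is the ordinary GMM log-likelihood. Larger values of `lam` penalize parameter settings whose posteriors differ sharply between neighboring points on the graph, which is the smoothness behavior the abstract describes.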

INDEX TERMS

Gaussian mixture model, clustering, graph Laplacian, manifold structure.

CITATION

Xiaofei He, Deng Cai, Yuanlong Shao, Hujun Bao, Jiawei Han, "Laplacian Regularized Gaussian Mixture Model for Data Clustering",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 23, no. 9, pp. 1406-1418, September 2011, doi:10.1109/TKDE.2010.259