Issue No.11 - November (2010 vol.32)
pp: 2006-2021
Zhaoshui He, RIKEN Brain Science Institute, Saitama, and South China University of Technology, Guangzhou
Andrzej Cichocki, RIKEN Brain Science Institute, Saitama; Polish Academy of Sciences, Warsaw; and Warsaw University of Technology, Warsaw
Shengli Xie, South China University of Technology, Guangzhou
Kyuwan Choi, ATR Computational Neuroscience Laboratories, Kyoto
ABSTRACT
Recently, there has been growing interest in multiway probabilistic clustering, and several efficient algorithms have been developed for this problem. However, little attention has been paid to how to detect the number of clusters in general n-way clustering (n ≥ 2). To fill this gap, this paper investigates the problem using n-way algebraic theory. A simple yet efficient detection method based on eigenvalue decomposition (EVD) is proposed, which is easy to implement. We justify this method and demonstrate its effectiveness in experiments on both simulated and real-world data sets.
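To give a rough sense of how eigenvalues can reveal a cluster count, the sketch below handles only the 2-way (pairwise affinity matrix) case and uses a generic largest-eigengap heuristic. This is an assumption made for illustration and is not the detection criterion proposed in the paper, which the abstract does not spell out. The function names `estimate_num_clusters` and `block_affinity`, and the synthetic block-structured data, are hypothetical.

```python
import numpy as np


def estimate_num_clusters(affinity, max_k=None):
    """Guess the number of clusters from a symmetric affinity matrix.

    Generic eigengap heuristic (illustrative only, not the paper's method):
    sort the eigenvalues in descending order and return the position of the
    largest gap between consecutive eigenvalues.
    """
    a = np.asarray(affinity, dtype=float)
    a = 0.5 * (a + a.T)                      # enforce exact symmetry
    eigvals = np.linalg.eigvalsh(a)[::-1]    # eigenvalues, largest first
    if max_k is None:
        max_k = len(eigvals) - 1
    gaps = eigvals[:max_k] - eigvals[1:max_k + 1]
    return int(np.argmax(gaps)) + 1


def block_affinity(sizes, noise=0.05, seed=0):
    """Build a synthetic affinity matrix: 1 within a cluster, 0 across,
    plus small symmetric Gaussian noise (hypothetical test data)."""
    rng = np.random.default_rng(seed)
    n = sum(sizes)
    labels = np.repeat(np.arange(len(sizes)), sizes)
    a = (labels[:, None] == labels[None, :]).astype(float)
    e = noise * rng.standard_normal((n, n))
    return a + 0.5 * (e + e.T)


if __name__ == "__main__":
    affinity = block_affinity([10, 12, 8])
    print(estimate_num_clusters(affinity))
```

In the noise-free case the block matrix has one nonzero eigenvalue per cluster (equal to the cluster size) and zeros elsewhere, so the largest gap falls right after the last cluster eigenvalue. Real affinity data is rarely this clean, and the general n-way case involves affinity arrays rather than matrices, which this sketch does not cover.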
INDEX TERMS
Multiway clustering, probabilistic clustering, hypergraph, parallel factor analysis (PARAFAC), model order selection, multiway array, higher order tensor, supersymmetric tensors, affinity arrays, enumeration of clusters, estimation of PARAFAC components, principal components enumeration.
CITATION
Zhaoshui He, Andrzej Cichocki, Shengli Xie, Kyuwan Choi, "Detecting the Number of Clusters in n-Way Probabilistic Clustering", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.32, no. 11, pp. 2006-2021, November 2010, doi:10.1109/TPAMI.2010.15