Bayesian Clustering of Fuzzy Feature Vectors Using a Quasi-Likelihood Approach
January 2009 (vol. 31 no. 1)
pp. 74-85
Pekka Marttinen, University of Helsinki, Helsinki
Jing Tang, University of Helsinki, Helsinki
Bernard De Baets, Ghent University, Ghent
Peter Dawyndt, Ghent University, Ghent
Jukka Corander, Åbo Akademi University, Fänriksgatan
Bayesian model-based classifiers, both unsupervised and supervised, have been studied extensively, and their value and versatility have been demonstrated on a wide spectrum of applications within science and engineering. A majority of the classifiers are built on the assumption that the considered data features are intrinsically discrete, or on discretizing them prior to modeling. On the other hand, Gaussian mixture classifiers have also been utilized to a large extent for continuous features in the Bayesian framework. Often the primary reason for discretization in the classification context is to simplify the analytical and numerical properties of the models. However, discretization can be problematic due to its ad hoc nature and the decreased statistical power to detect the correct classes in the resulting procedure. We introduce an unsupervised classification approach for fuzzy feature vectors that utilizes a discrete model structure while preserving the continuous characteristics of data. This is achieved by replacing the ordinary likelihood by a binomial quasi-likelihood to yield an analytical expression for the posterior probability of a given clustering solution. The resulting model can be justified from an information-theoretic perspective. Our method is shown to yield highly accurate clusterings for challenging synthetic and empirical data sets.
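
The idea of scoring a clustering with a binomial quasi-likelihood can be illustrated with a minimal sketch. It is not the paper's exact formula. It assumes each fuzzy feature value lies in [0, 1] and is treated as a fractional Bernoulli success, that features and clusters are independent, and that each cluster-feature parameter has a Beta(alpha, beta) prior; under those assumptions the parameter integrates out in closed form. The function names and the default prior values are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal_quasi_likelihood(data, labels, alpha=0.5, beta=0.5):
    """Score a clustering of fuzzy feature vectors (illustrative sketch).

    data   : list of equal-length lists with entries in [0, 1]
    labels : one cluster label per row of data
    alpha, beta : assumed Beta prior hyperparameters (hypothetical defaults)
    """
    n_features = len(data[0])

    # Group the rows by cluster label.
    clusters = {}
    for row, label in zip(data, labels):
        clusters.setdefault(label, []).append(row)

    total = 0.0
    for rows in clusters.values():
        n = len(rows)
        for j in range(n_features):
            # Fractional "success count": the sum of fuzzy memberships.
            s = sum(row[j] for row in rows)
            # Quasi-likelihood theta^s * (1 - theta)^(n - s), integrated
            # against the Beta(alpha, beta) prior on theta.
            total += log_beta(alpha + s, beta + n - s) - log_beta(alpha, beta)
    return total


data = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
score_matched = log_marginal_quasi_likelihood(data, [0, 0, 1, 1])
score_mixed = log_marginal_quasi_likelihood(data, [0, 1, 0, 1])
```

With these toy vectors, the partition that groups similar rows together receives the higher score. Because the fuzzy values enter the score directly as fractional counts, no thresholding of the features is needed, and for strictly binary data the expression reduces to the usual Beta-Bernoulli marginal likelihood.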

Index Terms:
Bayesian clustering, quasi-likelihood, fuzzy modeling, continuous data
Pekka Marttinen, Jing Tang, Bernard De Baets, Peter Dawyndt, Jukka Corander, "Bayesian Clustering of Fuzzy Feature Vectors Using a Quasi-Likelihood Approach," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 1, pp. 74-85, Jan. 2009, doi:10.1109/TPAMI.2008.53