Issue No.08 - August (2009 vol.31)
pp: 1429-1443
Sabri Boutemedjet, Université de Sherbrooke, Sherbrooke
Nizar Bouguila, Concordia University, Montreal
Djemel Ziou, Université de Sherbrooke, Sherbrooke
ABSTRACT
This paper presents an unsupervised approach for feature selection and extraction in mixtures of generalized Dirichlet (GD) distributions. Our method defines a new mixture model that is able to extract independent and non-Gaussian features without loss of accuracy. The proposed model is learned using the Expectation-Maximization algorithm by minimizing the message length of the data set. Experimental results show the merits of the proposed methodology in the categorization of object images.
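As a point of reference for the abstract, the sketch below evaluates the log-density of a single generalized Dirichlet (GD) component in its standard Connor-Mosimann form. It is only an illustration under stated assumptions: the function name `gd_log_density` and its argument layout are hypothetical, and the paper's full model (feature relevance, EM updates, and the message length criterion) is not reproduced here.

```python
import math


def gd_log_density(x, alpha, beta):
    """Log-density of a generalized Dirichlet distribution (standard form).

    x     : d positive proportions with sum(x) < 1
    alpha : d positive shape parameters
    beta  : d positive shape parameters
    Hypothetical helper for illustration; not the authors' implementation.
    """
    d = len(x)
    if not (len(alpha) == d and len(beta) == d):
        raise ValueError("x, alpha and beta must have the same length")
    if any(v <= 0.0 for v in x) or sum(x) >= 1.0:
        raise ValueError("x must be positive and sum to less than 1")

    log_p = 0.0
    cumulative = 0.0  # running sum x_1 + ... + x_i
    for i in range(d):
        cumulative += x[i]
        # gamma_i = beta_i - alpha_{i+1} - beta_{i+1} for i < d, and beta_d - 1 for i = d
        if i < d - 1:
            gamma_i = beta[i] - alpha[i + 1] - beta[i + 1]
        else:
            gamma_i = beta[i] - 1.0
        # Log of the Beta-type normalizing constant for dimension i
        log_p += (math.lgamma(alpha[i] + beta[i])
                  - math.lgamma(alpha[i]) - math.lgamma(beta[i]))
        # Log of x_i^(alpha_i - 1) * (1 - x_1 - ... - x_i)^gamma_i
        log_p += (alpha[i] - 1.0) * math.log(x[i])
        log_p += gamma_i * math.log(1.0 - cumulative)
    return log_p
```

With one dimension this reduces to a Beta density, and when beta_i = alpha_{i+1} + beta_{i+1} for every i < d it reduces to an ordinary Dirichlet density, which gives two simple sanity checks for the sketch.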
INDEX TERMS
Unsupervised learning, mixture models, feature selection, dimensionality reduction, generalized Dirichlet mixture, EM, MML, information theory, object image categorization.
CITATION
Sabri Boutemedjet, Nizar Bouguila, Djemel Ziou, "A Hybrid Feature Extraction Selection Approach for High-Dimensional Non-Gaussian Data Clustering", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 8, pp. 1429-1443, August 2009, doi:10.1109/TPAMI.2008.155