Issue No.12 - Dec. (2012 vol.24)
pp: 2184-2202
Nizar Bouguila, Concordia University, Montreal
ABSTRACT
The work proposed in this paper is motivated by the need to develop powerful models and approaches to classify and learn proportional data. Indeed, an abundance of interesting data in several applications occurs naturally in this form. Our goal is to discover and capture the intrinsic nature of the data by proposing approaches that combine the major advantages of generative models, namely finite mixtures, and discriminative techniques, namely support vector machines (SVMs). SVMs often rely on classic kernels, which are not generally meaningful for proportional data. One serious limitation of these kernels is that they do not take into account the nature of the data to be classified, and choosing a suitable kernel continues to be a formidable challenge for data mining and machine learning researchers. Our approach builds on selecting accurate kernels generated from finite mixtures of Dirichlet, generalized Dirichlet, and Beta-Liouville distributions, whose chief advantage is their flexibility and explanatory capabilities in the case of heterogeneous proportional data. Using extensive simulations and a number of experiments involving scene modeling and classification, and automatic image orientation detection, we show the merits of the proposed mixture models and the accuracy of the generated kernels.
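As a rough illustration of what a kernel generated from a distribution over proportional data can look like, the sketch below computes the Bhattacharyya kernel (the integral of the square root of the product of two densities) between two single Dirichlet distributions, which has a closed form in terms of the multivariate beta function. This is an assumption-laden example, not necessarily one of the kernels derived in the paper: it handles single Dirichlet components rather than full mixtures, and the function names `log_multivariate_beta` and `bhattacharyya_dirichlet_kernel` are hypothetical.

```python
import math


def log_multivariate_beta(alpha):
    """Log of the multivariate beta function B(alpha) = prod(Gamma(a_i)) / Gamma(sum(a_i))."""
    return sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))


def bhattacharyya_dirichlet_kernel(alpha, beta):
    """Bhattacharyya kernel between Dirichlet(alpha) and Dirichlet(beta).

    sqrt(p * q) is proportional to a Dirichlet density with parameters
    (alpha + beta) / 2, so the integral reduces to
    B((alpha + beta) / 2) / sqrt(B(alpha) * B(beta)).
    Illustrative sketch only; hypothetical helper, not the paper's code.
    """
    if len(alpha) != len(beta):
        raise ValueError("parameter vectors must have the same dimension")
    if any(a <= 0 for a in alpha) or any(b <= 0 for b in beta):
        raise ValueError("Dirichlet parameters must be strictly positive")
    midpoint = [(a + b) / 2.0 for a, b in zip(alpha, beta)]
    # Work in log space to avoid overflow in the gamma function.
    log_k = log_multivariate_beta(midpoint) - 0.5 * (
        log_multivariate_beta(alpha) + log_multivariate_beta(beta)
    )
    return math.exp(log_k)


if __name__ == "__main__":
    flat = [1.0, 1.0, 1.0]      # uniform distribution on the simplex
    peaked = [2.0, 2.0, 2.0]    # mass concentrated near the center
    print(bhattacharyya_dirichlet_kernel(flat, flat))
    print(bhattacharyya_dirichlet_kernel(flat, peaked))
```

The kernel equals 1 when both distributions coincide and decreases as their parameters diverge. A Gram matrix of such values between fitted models could then be passed to an SVM implementation that accepts precomputed kernels.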
INDEX TERMS
Data models, Kernel, Hidden Markov models, Support vector machine classification, Machine learning, image orientation, Generative/discriminative learning, proportional data, finite mixture models, SVMs, kernels, model selection, Dirichlet, generalized Dirichlet, Liouville, scene classification
CITATION
Nizar Bouguila, "Hybrid Generative/Discriminative Approaches for Proportional Data Modeling and Classification", IEEE Transactions on Knowledge & Data Engineering, vol.24, no. 12, pp. 2184-2202, Dec. 2012, doi:10.1109/TKDE.2011.162