Issue No.11 - Nov. (2013 vol.35)
pp: 2665-2679
N. Rasiwasia , Yahoo! Labs. Bangalore, Bangalore, India
N. Vasconcelos , Univ. of California San Diego, La Jolla, CA, USA
ABSTRACT
Two new extensions of latent Dirichlet allocation (LDA), denoted topic-supervised LDA (ts-LDA) and class-specific-simplex LDA (css-LDA), are proposed for image classification. An analysis of the supervised LDA models currently used for this task shows that the impact of class information on the topics discovered by these models is very weak in general. This implies that the discovered topics are driven by general image regularities, rather than the semantic regularities of interest for classification. To address this, ts-LDA models are introduced, which replace the automated topic discovery of LDA with specified topics, identical to the classes of interest for classification. While this results in improvements in classification accuracy over existing LDA models, it compromises the ability of LDA to discover unanticipated structure of interest. This limitation is addressed by the introduction of css-LDA, an LDA model with class supervision at the level of image features. In css-LDA, topics are discovered per class, i.e., a single set of topics shared across classes is replaced by multiple class-specific topic sets. The css-LDA model is shown to combine the labeling strength of topic supervision with the flexibility of topic discovery. Its effectiveness is demonstrated through an extensive experimental evaluation, involving multiple benchmark datasets, where it is shown to outperform existing LDA-based image classification approaches.
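As an illustration of what "class-specific topic sets" could mean in practice, the following is a minimal sketch of a css-LDA-style generative process for one image treated as a bag of visual words. It is not taken from the paper: the function name, the uniform class prior, the symmetric Dirichlet parameters, and the array layout are all assumptions made for the example. The only idea carried over from the abstract is that each class owns its own set of topic-word distributions instead of sharing a single set across classes.

```python
import numpy as np

def sample_css_lda_image(rng, class_prior, alpha, class_topics, n_words):
    """Sample one bag of visual words under an assumed css-LDA-style process.

    class_topics has shape (n_classes, n_topics, vocab_size): every class has
    its own topic-word distributions (hypothetical layout, not from the paper).
    """
    n_topics, vocab_size = class_topics.shape[1], class_topics.shape[2]
    y = rng.choice(len(class_prior), p=class_prior)    # draw the image class
    theta = rng.dirichlet(alpha)                       # topic mixing weights
    z = rng.choice(n_topics, size=n_words, p=theta)    # one topic per word
    # Each word is drawn from the topic chosen for it, within class y's topic set.
    words = np.array([rng.choice(vocab_size, p=class_topics[y, k]) for k in z])
    return y, z, words

rng = np.random.default_rng(0)
n_classes, n_topics, vocab_size = 3, 4, 50             # illustrative sizes only
# Random class-specific topic sets; in a real model these would be learned.
class_topics = rng.dirichlet(np.ones(vocab_size), size=(n_classes, n_topics))
y, z, words = sample_css_lda_image(
    rng,
    class_prior=np.full(n_classes, 1.0 / n_classes),
    alpha=np.ones(n_topics),
    class_topics=class_topics,
    n_words=20,
)
```

Under these assumptions, collapsing the first axis of `class_topics` to a single shared topic set gives back an ordinary LDA-style sampler, which is the contrast the abstract draws.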
INDEX TERMS
Image classification, Visualization, Semantics, Computational modeling, Mathematical model, Resource management, Analytical models, attributes, graphical models, latent Dirichlet allocation, semantic classification
CITATION
N. Rasiwasia, N. Vasconcelos, "Latent Dirichlet Allocation Models for Image Classification", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.35, no. 11, pp. 2665-2679, Nov. 2013, doi:10.1109/TPAMI.2013.69