Batch Mode Active Learning with Applications to Text Categorization and Image Retrieval
September 2009 (vol. 21 no. 9)
pp. 1233-1248
Steven C.H. Hoi, Nanyang Technological University, Singapore
Rong Jin, Michigan State University, East Lansing
Michael R. Lyu, The Chinese University of Hong Kong, Shatin
Most machine learning tasks in data classification and information retrieval require manually labeled data examples in the training stage. The goal of active learning is to select the most informative examples for manual labeling in these learning tasks. Most of the previous studies in active learning have focused on selecting a single unlabeled example in each iteration. This could be inefficient, since the classification model has to be retrained for every acquired labeled example. It is also inappropriate for the setup of information retrieval tasks where the user's relevance feedback is often provided for the top K retrieved items. In this paper, we present a framework for batch mode active learning, which selects a number of informative examples for manual labeling in each iteration. The key feature of batch mode active learning is to reduce the redundancy among the selected examples such that each example provides unique information for model updating. To this end, we employ the Fisher information matrix as the measurement of model uncertainty, and choose the set of unlabeled examples that can efficiently reduce the Fisher information of the classification model. We apply our batch mode active learning framework to both text categorization and image retrieval. Promising results show that our algorithms are significantly more effective than the active learning approaches that select unlabeled examples based only on their informativeness for the classification model.
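The selection criterion sketched in the abstract can be illustrated concretely. The following is a minimal, hypothetical sketch (not the authors' implementation) of greedy batch selection for logistic regression: it scores a candidate batch q against the unlabeled pool p by the Fisher-information ratio tr(I_q(θ)⁻¹ I_p(θ)), where I(θ) = Σᵢ pᵢ(1−pᵢ)xᵢxᵢᵀ is the logistic-regression Fisher information, and greedily grows the batch. The function names, the regularization term delta, and the greedy strategy are illustrative assumptions.

```python
import numpy as np

def fisher_info(X, probs, delta=1e-3):
    # Fisher information of logistic regression over examples X:
    #   I(theta) = sum_i p_i (1 - p_i) x_i x_i^T
    # delta * I is an assumed regularizer to keep the matrix invertible.
    d = X.shape[1]
    w = probs * (1.0 - probs)
    return (X * w[:, None]).T @ X + delta * np.eye(d)

def select_batch(X_pool, probs, k):
    """Greedily pick k pool examples that minimize tr(I_q^{-1} I_p),
    the Fisher-information ratio between the selected batch q and the
    whole unlabeled pool p; probs are the current model's predictions."""
    I_p = fisher_info(X_pool, probs)
    selected = []
    for _ in range(k):
        best, best_score = None, np.inf
        for i in range(len(X_pool)):
            if i in selected:
                continue
            idx = selected + [i]
            I_q = fisher_info(X_pool[idx], probs[idx])
            # trace of I_q^{-1} I_p without forming the explicit inverse
            score = np.trace(np.linalg.solve(I_q, I_p))
            if score < best_score:
                best, best_score = i, score
        selected.append(best)
    return selected
```

Because each candidate is scored jointly with the examples already chosen, an example highly redundant with the current batch barely changes I_q and is penalized, which is the redundancy-reduction behavior the abstract describes.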

References:
[1] S. Fine, R. Gilad-Bachrach, and E. Shamir, “Query by Committee, Linear Separation and Random Walks,” Theoretical Computer Science, vol. 284, no. 1, pp. 25-51, 2002.
[2] Y. Freund, H.S. Seung, E. Shamir, and N. Tishby, “Selective Sampling Using the Query by Committee Algorithm,” Machine Learning, vol. 28, nos. 2/3, pp. 133-168, 1997.
[3] H.S. Seung, M. Opper, and H. Sompolinsky, “Query by Committee,” Proc. Workshop Computational Learning Theory, pp. 287-294, 1992.
[4] C. Campbell, N. Cristianini, and A.J. Smola, “Query Learning with Large Margin Classifiers,” Proc. 17th Int'l Conf. Machine Learning (ICML '00), pp. 111-118, 2000.
[5] G. Schohn and D. Cohn, “Less is More: Active Learning with Support Vector Machines,” Proc. 17th Int'l Conf. Machine Learning, pp. 839-846, 2000.
[6] S. Tong and D. Koller, “Support Vector Machine Active Learning with Applications to Text Classification,” Proc. 17th Int'l Conf. Machine Learning (ICML '00), pp. 999-1006, 2000.
[7] A.K. McCallum and K. Nigam, “Employing EM and Pool-Based Active Learning for Text Classification,” Proc. 15th Int'l Conf. Machine Learning, pp. 350-358, 1998.
[8] N. Roy and A. McCallum, “Toward Optimal Active Learning through Sampling Estimation of Error Reduction,” Proc. 18th Int'l Conf. Machine Learning (ICML '01), pp. 441-448, 2001.
[9] T. Luo, K. Kramer, S. Samson, and A. Remsen, “Active Learning to Recognize Multiple Types of Plankton,” Proc. Int'l Conf. Pattern Recognition (ICPR '04), pp. 478-481, 2004.
[10] A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, “Content-Based Image Retrieval at the End of the Early Years,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, Dec. 2000.
[11] X. Shen and C. Zhai, “Active Feedback in Ad Hoc Information Retrieval,” Proc. ACM SIGIR '05 Conf., pp. 59-66, 2005.
[12] S.C. Hoi, R. Jin, and M.R. Lyu, “Large-Scale Text Categorization by Batch Mode Active Learning,” Proc. 15th Int'l World Wide Web Conf. (WWW '06), May 2006.
[13] D.D. Lewis and W.A. Gale, “A Sequential Algorithm for Training Text Classifiers,” Proc. 17th ACM Int'l SIGIR Conf., pp. 3-12, 1994.
[14] R. Liere and P. Tadepalli, “Active Learning with Committees for Text Categorization,” Proc. 14th Conf. Am. Assoc. for Artificial Intelligence (AAAI '97), pp. 591-596, 1997.
[15] S. Tong and E. Chang, “Support Vector Machine Active Learning for Image Retrieval,” Proc. Ninth ACM Int'l Conf. Multimedia, pp. 107-118, 2001.
[16] S.C. Hoi, M.R. Lyu, and E.Y. Chang, “Learning the Unified Kernel Machines for Classification,” Proc. 20th ACM SIGKDD Conf. (KDD '06), pp. 187-196, Aug. 2006.
[17] S.C. Hoi, R. Jin, J. Zhu, and M.R. Lyu, “Batch Mode Active Learning and Its Application to Medical Image Classification,” Proc. 23rd Int'l Conf. Machine Learning (ICML '06), June 2006.
[18] Y. Guo and D. Schuurmans, “Discriminative Batch Mode Active Learning,” Proc. Conf. Advances in Neural Information Processing Systems (NIPS '07), 2007.
[19] S.C. Hoi, R. Jin, J. Zhu, and M.R. Lyu, “Semi-Supervised SVM Batch Mode Active Learning and Its Applications to Image Retrieval,” ACM Trans. Information Systems, vol. 27, no. 3, pp. 1-29, 2009.
[20] Y. Yang, “An Evaluation of Statistical Approaches to Text Categorization,” J. Information Retrieval, vol. 1, nos. 1/2, pp. 67-88, 1999.
[21] Y. Yang and J.O. Pedersen, “A Comparative Study on Feature Selection in Text Categorization,” Proc. 14th Int'l Conf. Machine Learning (ICML '97), pp. 412-420, 1997.
[22] B. Masand, G. Lino, and D. Waltz, “Classifying News Stories Using Memory Based Reasoning,” Proc. 15th ACM SIGIR Conf., pp. 59-65, 1992.
[23] C. Apte, F. Damerau, and S. Weiss, “Automated Learning of Decision Rules for Text Categorization,” ACM Trans. Information Systems, vol. 12, no. 3, pp. 233-251, 1994.
[24] K. Tzeras and S. Hartmann, “Automatic Indexing Based on Bayesian Inference Networks,” Proc. 16th ACM Int'l SIGIR Conf., pp. 22-34, 1993.
[25] W.W. Cohen, “Text Categorization and Relational Learning,” Proc. 12th Int'l Conf. Machine Learning (ICML '95), pp. 124-132, 1995.
[26] M.E. Ruiz and P. Srinivasan, “Hierarchical Text Categorization Using Neural Networks,” Information Retrieval, vol. 5, no. 1, pp. 87-118, 2002.
[27] T. Joachims, “Text Categorization with Support Vector Machines: Learning with Many Relevant Features,” Proc. 10th European Conf. Machine Learning (ECML '98), pp. 137-142, 1998.
[28] P. Komarek and A. Moore, “Fast Robust Logistic Regression for Large Sparse Data Sets with Binary Outputs,” Proc. Conf. Artificial Intelligence and Statistics (AISTAT), 2003.
[29] P. Komarek and A. Moore, “Making Logistic Regression a Core Data Mining Tool: A Practical Investigation of Accuracy, Speed, and Simplicity,” Technical Report TR-05-27, Robotics Inst., Carnegie Mellon Univ., May 2005.
[30] M.I. Jordan and R.A. Jacobs, “Hierarchical Mixtures of Experts and the EM Algorithm,” Neural Computation, vol. 6, no. 2, pp. 181-214, 1994.
[31] X. Zhu, “Semi-Supervised Learning Literature Survey,” technical report, Univ. of Wisconsin-Madison, 2005.
[32] D. MacKay, “Information-Based Objective Functions for Active Data Selection,” Neural Computation, vol. 4, no. 4, pp. 590-604, 1992.
[33] N. Vasconcelos and A. Lippman, “Learning from User Feedback in Image Retrieval Systems,” Proc. Conf. Advances in Neural Information Processing Systems, 1999.
[34] K. Tieu and P. Viola, “Boosting Image Retrieval,” Proc. IEEE Computer Vision and Pattern Recognition Conf. (CVPR '00), vol. 1, pp. 228-235, 2000.
[35] T.S. Huang and X.S. Zhou, “Image Retrieval by Relevance Feedback: From Heuristic Weight Adjustment to Optimal Learning Methods,” Proc. IEEE Int'l Conf. Image Processing, vol. 3, pp. 2-5, 2001.
[36] L. Zhang, F. Lin, and B. Zhang, “Support Vector Machine Learning for Image Retrieval,” Proc. Int'l Conf. Image Processing (ICIP '01), vol. 2, pp. 721-724, 2001.
[37] C.-H. Hoi and M.R. Lyu, “A Novel Log-Based Relevance Feedback Technique in Content-Based Image Retrieval,” Proc. 12th ACM Int'l Conf. Multimedia (MM '04), pp. 24-31, 2004.
[38] S.C.H. Hoi, M.R. Lyu, and R. Jin, “A Unified Log-Based Relevance Feedback Scheme for Image Retrieval,” IEEE Trans. Knowledge and Data Eng., vol. 18, no. 4, pp. 509-524, Apr. 2006.
[39] V.N. Vapnik, Statistical Learning Theory. Wiley, 1998.
[40] J. Zhang, R. Jin, Y. Yang, and A. Hauptmann, “Modified Logistic Regression: An Approximation to SVM and Its Applications in Large-Scale Text Categorization,” Proc. Int'l Conf. Machine Learning, 2003.
[41] G. Kimeldorf and G. Wahba, “Some Results on Tchebycheffian Spline Functions,” J. Math. Analysis and Applications, vol. 33, no. 1, pp. 82-95, 1971.
[42] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning. Springer-Verlag, 2001.
[43] J. Zhu and T. Hastie, “Kernel Logistic Regression and the Import Vector Machine,” Proc. Conf. Advances in Neural Information Processing Systems, vol. 14, pp. 1081-1088, 2001.
[44] T. Zhang and F.J. Oles, “A Probability Analysis on the Value of Unlabeled Data for Classification Problems,” Proc. 17th Int'l Conf. Machine Learning (ICML), 2000.
[45] S.D. Silvey, Statistical Inference. Chapman and Hall, 1975.
[46] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge Univ. Press, 2003.
[47] J. Sturm, “Using SeDuMi: A MATLAB Toolbox for Optimization over Symmetric Cones,” Optimization Methods and Software, vols. 11/12, pp. 625-653, 1999.
[48] E. Anderson et al., LAPACK Users' Guide, third ed. SIAM, 1999.
[49] T. Joachims, “Making Large-Scale SVM Learning Practical,” Advances in Kernel Methods—Support Vector Learning, MIT Press, 1999.

Index Terms:
Batch mode active learning, logistic regressions, kernel logistic regressions, convex optimization, text categorization, image retrieval.
Steven C.H. Hoi, Rong Jin, Michael R. Lyu, "Batch Mode Active Learning with Applications to Text Categorization and Image Retrieval," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 9, pp. 1233-1248, Sept. 2009, doi:10.1109/TKDE.2009.60