Multilabel Neural Networks with Applications to Functional Genomics and Text Categorization
October 2006 (vol. 18 no. 10)
pp. 1338-1351
In multilabel learning, each instance in the training set is associated with a set of labels, and the task is to output, for each unseen instance, a label set whose size is unknown a priori. In this paper, the problem is addressed by proposing a neural network algorithm named BP-MLL, i.e., Backpropagation for Multilabel Learning. It is derived from the popular Backpropagation algorithm by employing a novel error function that captures a key characteristic of multilabel learning: the labels belonging to an instance should be ranked higher than those not belonging to that instance. Applications to two real-world multilabel learning problems, functional genomics and text categorization, show that the performance of BP-MLL is superior to that of some well-established multilabel learning algorithms.
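The ranking criterion described above admits a compact illustration. What follows is a minimal NumPy sketch of a pairwise ranking error term of the kind the abstract describes; the exponential form and the normalization over label pairs are assumptions inferred from the abstract's wording rather than a verbatim transcription of BP-MLL's error function, and all names are illustrative.

# Sketch of a pairwise ranking error for multilabel outputs. Assumed
# exponential pairwise form; see the paper for BP-MLL's exact definition.
import numpy as np

def pairwise_ranking_error(outputs: np.ndarray, labels: np.ndarray) -> float:
    """Penalize every (relevant, irrelevant) label pair whose network
    outputs are ranked incorrectly.

    outputs -- real-valued network outputs, one per label
    labels  -- binary vector: 1 for labels the instance has, 0 otherwise
    """
    relevant = outputs[labels == 1]    # outputs for the instance's proper labels
    irrelevant = outputs[labels == 0]  # outputs for its non-labels
    if relevant.size == 0 or irrelevant.size == 0:
        return 0.0                     # no pairs to rank
    # exp(-(c_k - c_l)) grows whenever an irrelevant label's output c_l
    # approaches or exceeds a relevant label's output c_k
    diffs = relevant[:, None] - irrelevant[None, :]
    return float(np.exp(-diffs).sum() / (relevant.size * irrelevant.size))

# Example: the outputs rank label 0 above labels 1 and 2, matching the
# true label set, so the error is small.
print(pairwise_ranking_error(np.array([2.0, -1.0, 0.5]),
                             np.array([1, 0, 0])))

Minimizing a term of this form over all training instances pushes the outputs for an instance's proper labels above those of its non-labels, which is the ranking behavior the abstract attributes to BP-MLL.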

Index Terms:
Machine learning, data mining, multilabel learning, neural networks, backpropagation, functional genomics, text categorization.
Citation:
Min-Ling Zhang, Zhi-Hua Zhou, "Multilabel Neural Networks with Applications to Functional Genomics and Text Categorization," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 10, pp. 1338-1351, Oct. 2006, doi:10.1109/TKDE.2006.162