Issue No.08 - August (2009 vol.21)
pp: 1118-1132
Pasquale Rullo , University of Calabria, Rende
Veronica Lucia Policicchio , University of Calabria, Rende
Chiara Cumbo , Exeura S.r.l., Rende
Salvatore Iiritano , Exeura S.r.l., Rende
This paper describes Olex, a novel method for the automatic induction of rule-based text classifiers. Olex supports a hypothesis language of the form "if $T_1$ or $\cdots$ or $T_n$ occurs in document $d$, and none of $T_{n+1}, \ldots, T_{n+m}$ occurs in $d$, then classify $d$ under category $c$," where each $T_i$ is a conjunction of terms. The proposed method is simple and elegant. Nevertheless, systematic experiments on the Reuters-21578, Ohsumed, and ODP data collections show that Olex provides classifiers that are accurate, compact, and comprehensible. A comparative analysis against several well-known learning algorithms (namely, Naive Bayes, Ripper, C4.5, SVM, and Linear Logistic Regression) shows that it is more than competitive in terms of both predictive accuracy and efficiency.
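To make the hypothesis language concrete, the following is a minimal sketch of how a single rule of that form could be evaluated against a document. It illustrates only the rule semantics stated in the abstract, not the Olex induction algorithm itself. The names `occurs` and `classify`, the representation of a document as a set of terms, and the example terms are all assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, Set

# A conjunction T_i is modeled as a set of terms that must all appear in the document.
Conjunction = FrozenSet[str]


def occurs(conjunction: Conjunction, doc_terms: Set[str]) -> bool:
    """Return True if every term of the conjunction appears in the document."""
    return conjunction <= doc_terms


def classify(
    doc_terms: Set[str],
    positive: Iterable[Conjunction],
    negative: Iterable[Conjunction],
) -> bool:
    """Evaluate one rule of the form described in the abstract.

    The document is assigned to the category if at least one positive
    conjunction (T_1 .. T_n) occurs in it and no negative conjunction
    (T_{n+1} .. T_{n+m}) occurs in it.
    """
    if any(occurs(t, doc_terms) for t in negative):
        return False
    return any(occurs(t, doc_terms) for t in positive)


# Hypothetical rule for an illustrative category; the terms are invented.
positive_terms = [frozenset({"wheat"}), frozenset({"grain", "export"})]
negative_terms = [frozenset({"corn", "futures"})]

doc_a = {"wheat", "harvest", "report"}
doc_b = {"grain", "export", "corn", "futures"}
doc_c = {"grain", "prices"}

result_a = classify(doc_a, positive_terms, negative_terms)
result_b = classify(doc_b, positive_terms, negative_terms)
result_c = classify(doc_c, positive_terms, negative_terms)
```

Here `doc_a` matches the positive conjunction `{"wheat"}` and no negative one, so it is assigned to the category; `doc_b` matches a positive conjunction but is blocked by the negative conjunction; `doc_c` matches no positive conjunction at all. Representing conjunctions as frozen sets keeps the occurrence test a plain subset check.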
Data mining, text mining, clustering, classification, and association rules, mining methods and algorithms.
Pasquale Rullo, Veronica Lucia Policicchio, Chiara Cumbo, Salvatore Iiritano, "Olex: Effective Rule Learning for Text Categorization", IEEE Transactions on Knowledge & Data Engineering, vol. 21, no. 8, pp. 1118-1132, August 2009, doi:10.1109/TKDE.2008.206