Issue No.02 - Feb. (2013 vol.25)

pp: 476-480

K. Shehzad, University of Engineering and Technology, Taxila, Pakistan

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2011.237

ABSTRACT

Pruning achieves the dual goal of reducing the complexity of the final hypothesis for improved comprehensibility, and improving its predictive accuracy by minimizing the overfitting due to noisy data. This paper presents a new hybrid pruning technique for rule induction, as well as an incremental postpruning technique based on a misclassification tolerance. Although both have been designed for RULES-7, the latter is also applicable to any rule induction algorithm in general. A thorough empirical evaluation reveals that the proposed techniques enable RULES-7 to outperform other state-of-the-art classification techniques. The improved classifier is also more accurate and up to two orders of magnitude faster than before.

INDEX TERMS

Accuracy, Noise measurement, Classification algorithms, Training data, Noise, Runtime, Earth, classification, Overfitting, pruning, noise handling, data mining, knowledge discovery, machine learning, inductive learning, supervised learning, rule induction

CITATION

K. Shehzad, "Simple Hybrid and Incremental Postpruning Techniques for Rule Induction," *IEEE Transactions on Knowledge & Data Engineering*, vol. 25, no. 2, pp. 476-480, Feb. 2013, doi:10.1109/TKDE.2011.237