Fast Discovery and the Generalization of Strong Jumping Emerging Patterns for Building Compact and Accurate Classifiers
Issue No. 06 - June (2006 vol. 18)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2006.95
Classification of large data sets is an important data mining problem that has wide applications. Jumping Emerging Patterns (JEPs) are those itemsets whose supports increase abruptly from zero in one data set to nonzero in another data set. In this paper, we propose a fast, accurate, and less complex classifier based on a subset of JEPs, called Strong Jumping Emerging Patterns (SJEPs). The support constraint of SJEPs removes potentially less useful JEPs while retaining those with high discriminating power. Previous algorithms based on the manipulation of borders, as well as ConsEPMiner, cannot directly mine SJEPs. Here, we present a new tree-based algorithm for their efficient discovery. Experimental results show that: 1) the training of our classifier is typically 10 times faster than earlier approaches, 2) our classifier uses far fewer patterns than the JEP-Classifier to achieve a similar (and, often, improved) accuracy, and 3) in many cases, it is superior to other state-of-the-art classification systems such as Naive Bayes, CBA, C4.5, and bagged and boosted versions of C4.5. We argue that SJEPs are high-quality patterns which possess the most differentiating power. As a consequence, they represent sufficient information for the construction of accurate classifiers. In addition, we generalize these patterns by introducing Noise-tolerant Emerging Patterns (NEPs) and Generalized Noise-tolerant Emerging Patterns (GNEPs). Our tree-based algorithms can be adapted to easily discover these variations. We experimentally demonstrate that SJEPs, NEPs, and GNEPs are extremely useful for building effective classifiers that can deal well with noise.
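To make the JEP definition above concrete, the following is a minimal sketch that checks the "zero support in one data set, nonzero in the other" condition by brute-force enumeration. It is not the tree-based algorithm of the paper. The function names, the toy transactions, and the `min_support` parameter are assumptions introduced for illustration. The support filter only loosely mirrors the support constraint mentioned for SJEPs, whose full definition may include further conditions not stated in the abstract.

```python
from itertools import combinations


def support(itemset, dataset):
    """Fraction of transactions in `dataset` that contain every item of `itemset`."""
    if not dataset:
        return 0.0
    items = frozenset(itemset)
    return sum(1 for transaction in dataset if items <= transaction) / len(dataset)


def is_jep(itemset, background, target):
    """True if support jumps from zero in `background` to nonzero in `target`."""
    return support(itemset, background) == 0 and support(itemset, target) > 0


def jeps_with_min_support(background, target, min_support):
    """Naively enumerate JEPs whose target support is at least `min_support`.

    `min_support` is a hypothetical threshold standing in for a support
    constraint; the cost is exponential in the number of distinct items.
    """
    items = sorted(set().union(*target))
    found = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            if (is_jep(candidate, background, target)
                    and support(candidate, target) >= min_support):
                found.append(frozenset(candidate))
    return found


# Toy data (assumed): each transaction is a set of single-letter items.
background = [frozenset("ab"), frozenset("bc")]
target = [frozenset("ac"), frozenset("acd"), frozenset("bd")]

strong = jeps_with_min_support(background, target, min_support=0.5)
```

On this toy data, `{a, d}` is a JEP (absent from `background`, present in one of three `target` transactions) but falls below the 0.5 threshold, which is the kind of low-support pattern a support constraint is meant to discard.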
Data mining, machine learning, emerging patterns, classification, frequent patterns, mining methods and algorithms.
Hongjian Fan, Kotagiri Ramamohanarao, "Fast Discovery and the Generalization of Strong Jumping Emerging Patterns for Building Compact and Accurate Classifiers", IEEE Transactions on Knowledge & Data Engineering, vol.18, no. 6, pp. 721-737, June 2006, doi:10.1109/TKDE.2006.95