K-Means+ID3: A Novel Method for Supervised Anomaly Detection by Cascading K-Means Clustering and ID3 Decision Tree Learning Methods
March 2007 (vol. 19 no. 3)
pp. 345-354
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2007.44
In this paper, we present "K-Means+ID3," a method that cascades k-Means clustering and ID3 decision tree learning to classify anomalous and normal activities in a computer network, an active electronic circuit, and a mechanical mass-beam system. The k-Means clustering method first partitions the training instances into k clusters, using Euclidean distance as the similarity measure. On each cluster, which represents a density region of normal or anomalous instances, we build an ID3 decision tree. The decision tree on each cluster refines the decision boundaries by learning the subgroups within the cluster. To obtain a final classification decision, the decisions of the k-Means and ID3 methods are combined using two rules: 1) the Nearest-neighbor rule and 2) the Nearest-consensus rule. We perform experiments on three data sets: 1) Network Anomaly Data (NAD), 2) Duffing Equation Data (DED), and 3) Mechanical System Data (MSD), which contain measurements from three distinct application domains: computer networks, an electronic circuit implementing a forced Duffing Equation, and a mechanical system, respectively. Results show that the detection accuracy of the K-Means+ID3 method is as high as 96.24 percent at a false-positive rate of 0.03 percent on NAD; the total accuracy is as high as 80.01 percent on MSD and 79.9 percent on DED.
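The cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans`, `id3`, `fit_cascade`, `classify`), the equal-width binning used to turn continuous measurements into categorical attributes for ID3, and the toy data are all assumptions made for illustration. The abstract names the Nearest-neighbor and Nearest-consensus rules without defining them, so the sketch substitutes a simpler stand-in: a test instance is routed to the nearest non-empty cluster and labeled by that cluster's tree.

```python
import math
from collections import Counter

def nearest(p, centroids):
    """Index of the centroid closest to p in Euclidean distance."""
    return min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; centroids start at the first k points (assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        assign = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(p, centroids) for p in points]

def discretize(p, width):
    """Hypothetical equal-width binning so ID3 sees categorical attributes."""
    return tuple(math.floor(x / width) for x in p)

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def id3(rows, labels, attrs):
    """Returns a label (leaf) or a node (attribute, {value: subtree}, majority label)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority

    def gain(a):
        groups = {}
        for r, y in zip(rows, labels):
            groups.setdefault(r[a], []).append(y)
        rest = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
        return entropy(labels) - rest

    best = max(attrs, key=gain)  # attribute with the highest information gain
    remaining = [a for a in attrs if a != best]
    branches = {}
    for v in {r[best] for r in rows}:
        sub = [(r, y) for r, y in zip(rows, labels) if r[best] == v]
        branches[v] = id3([r for r, _ in sub], [y for _, y in sub], remaining)
    return (best, branches, majority)

def predict_tree(tree, row):
    """Walk the tree; an unseen attribute value falls back to the node's majority label."""
    while isinstance(tree, tuple):
        attr, branches, majority = tree
        tree = branches.get(row[attr], majority)
    return tree

def fit_cascade(points, labels, k, width=1.0):
    """Partition with k-means, then train one ID3 tree per cluster."""
    centroids, assign = kmeans(points, k)
    trees = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        rows = [discretize(points[i], width) for i in idx]
        ys = [labels[i] for i in idx]
        trees.append(id3(rows, ys, list(range(len(points[0])))) if ys else None)
    return centroids, trees

def classify(model, p, width=1.0):
    """Simplified combination: use the tree of the nearest non-empty cluster."""
    centroids, trees = model
    for c in sorted(range(len(centroids)), key=lambda c: math.dist(p, centroids[c])):
        if trees[c] is not None:
            return predict_tree(trees[c], discretize(p, width))

# Toy data (made up): two density regions, each mixing normal and anomalous points.
points = [(0.2, 0.3), (10.2, 10.3), (0.5, 0.8), (0.7, 0.4), (1.5, 0.5),
          (1.6, 0.2), (1.8, 0.7), (10.6, 10.7), (11.4, 10.5), (11.7, 10.2)]
labels = ["normal", "anomaly", "normal", "normal", "anomaly",
          "anomaly", "anomaly", "anomaly", "normal", "normal"]
model = fit_cascade(points, labels, k=2)
```

On this toy data, each cluster contains both classes, so a cluster-level label alone would misclassify half of its members; the per-cluster tree separates them by splitting on the first binned attribute. Calling `classify(model, (1.7, 0.4))` routes the point to the cluster near the origin and then applies that cluster's tree.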
Index Terms:
Anomaly detection, classification, decision trees, k-Means clustering, receiver operating characteristic (ROC) curves.
Citation:
Shekhar R. Gaddam, Vir V. Phoha, Kiran S. Balagani, "K-Means+ID3: A Novel Method for Supervised Anomaly Detection by Cascading K-Means Clustering and ID3 Decision Tree Learning Methods," IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 3, pp. 345-354, March 2007, doi:10.1109/TKDE.2007.44

