Fifth International Conference on Machine Learning and Applications (ICMLA'06)
Semi-supervised Data Organization for Interactive Anomaly Analysis.
Orlando, Florida
December 14-16, 2006
ISBN: 0-7695-2735-3
BibTeX:
@inproceedings{10.1109/ICMLA.2006.47,
  author    = {Javed Aslam and Sergey Bratus and Virgil Pavlu},
  title     = {Semi-supervised Data Organization for Interactive Anomaly Analysis},
  booktitle = {Fifth International Conference on Machine Learning and Applications (ICMLA'06)},
  year      = {2006},
  isbn      = {0-7695-2735-3},
  pages     = {55-62},
  doi       = {10.1109/ICMLA.2006.47},
  publisher = {IEEE Computer Society},
  address   = {Los Alamitos, CA, USA},
}
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICMLA.2006.47
We consider the problem of interactive, iterative analysis of data sets that consist of a large number of records represented as feature vectors. The record set is known to contain a number of anomalous records, which the analyst wants to locate and describe concisely and comprehensively. The nature of the anomaly is not known in advance (in particular, it is not known which features or feature values identify the anomalous records and which are irrelevant to the search); it becomes clear only in the process of analysis, as the description of the target subset is gradually refined. This situation is common in computer intrusion analysis, where a forensic analyst browses logs to locate traces of an intrusion of unknown nature and origin, and it extends to other tasks and data sets.
To facilitate such "browsing for anomalies", we propose an unsupervised data organization technique for the initial summarization and representation of data sets, and a semi-supervised learning technique for iteratively modifying that representation. Our approach is based on information content and Jensen-Shannon divergence and is related to information bottleneck methods. We have implemented it as part of the Kerf log analysis toolkit.
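The abstract names two information-theoretic quantities without defining them. As a rough illustration only, and not the authors' algorithm or any part of the Kerf toolkit, the sketch below computes them for discrete feature values: the information content of a value x under a distribution P is -log2 P(x), and the Jensen-Shannon divergence of P and Q with weights wp and wq is H(wp·P + wq·Q) - wp·H(P) - wq·H(Q), where H is Shannon entropy. All function names and the sample port numbers are hypothetical and chosen for this example.

```python
import math
from collections import Counter


def distribution(values):
    """Empirical distribution of a list of feature values (value -> probability)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}


def entropy(p):
    """Shannon entropy in bits of a distribution given as value -> probability."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def information_content(value, p):
    """Surprisal in bits of one value under distribution p; rare values score high."""
    return -math.log2(p[value])


def js_divergence(p, q, wp=0.5, wq=0.5):
    """Weighted Jensen-Shannon divergence in bits between two distributions."""
    support = set(p) | set(q)
    mixture = {k: wp * p.get(k, 0.0) + wq * q.get(k, 0.0) for k in support}
    return entropy(mixture) - wp * entropy(p) - wq * entropy(q)


if __name__ == "__main__":
    # Hypothetical destination ports seen in two groups of log records.
    group_a = distribution([22, 22, 22, 80])
    group_b = distribution([6667, 6667, 80, 6667])
    print("JS divergence:", js_divergence(group_a, group_b))
    print("Surprisal of port 80 in group A:", information_content(80, group_a))
```

With equal weights the divergence lies between 0 bits (identical distributions) and 1 bit (disjoint supports), so it can serve as a bounded measure of how differently two groups of records use a feature. How the paper actually combines these quantities is not stated in the abstract.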
Citation:
Javed Aslam, Sergey Bratus, Virgil Pavlu, "Semi-supervised Data Organization for Interactive Anomaly Analysis," Fifth International Conference on Machine Learning and Applications (ICMLA'06), pp. 55-62, 2006.
