2012 IEEE 12th International Conference on Data Mining Workshops
Thorough Analysis of Log Data with Dependency Rules: Practical Solutions and Theoretical Challenges
Brussels, Belgium
December 10, 2012
ISBN: 978-1-4673-5164-5
In this paper, we present our vision of how statistical dependency rule mining could be applied to a thorough analysis of log data. Dependency rules are especially attractive as a first-step mining method due to their efficient algorithms and globally optimal results. The major drawback is the rather specific form of the dependencies, which requires binary data. It is not always clear how heterogeneous real-world data should be binarized and how the tools should be used so that all interesting dependencies are caught. We give an overview of typical problems in analyzing log data. The three major problems are: 1) How to balance between groups and individuals so that both general regularities and individual peculiarities can be found? 2) How to handle numerical and periodic variables? 3) How to extract features from the intrinsic dimensions of log data? For each problem, we give practical solutions in the form of preprocessing techniques and constraints that can be used with the existing tools. We also point out important open problems and algorithmic challenges that require further research.
Index Terms:
Data mining, Automata, Cows, Redundancy, Feeds, Algorithm design and analysis, Feature extraction, preprocessing, Dependency rule, log data, numerical variable, discretization, hierarchical variable, intrinsic dimensionality
Citation:
Wilhelmiina Hamalainen, "Thorough Analysis of Log Data with Dependency Rules: Practical Solutions and Theoretical Challenges," icdmw, pp.579-586, 2012 IEEE 12th International Conference on Data Mining Workshops, 2012