Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007)
Omaha, Nebraska, USA
Oct. 28, 2007 to Oct. 31, 2007
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2007.52
Grid systems are complex heterogeneous systems, and their modeling constitutes a highly challenging goal. This paper is interested in modeling the jobs handled by the EGEE grid, by mining the Logging and Bookkeeping files. The goal is to discover meaningful job clusters, going beyond the coarse categories of "successfully terminated jobs" and "other jobs". The presented approach is a three-step process: i) Data slicing is used to alleviate the job heterogeneity and afford discriminant learning; ii) Constructive induction proceeds by learning discriminant hypotheses from each data slice; iii) Finally, double clustering is used on the representation built by constructive induction; the clusters are fully validated after the stability criteria proposed by Meila (2006). Lastly, the job clusters are submitted to the experts and some meaningful interpretations are found.
Cécile Germain, Xiangliang Zhang, Michèle Sebag, "Toward Behavioral Modeling of a Grid System: Mining the Logging and Bookkeeping Files", Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007), pp. 581-588, 2007, doi:10.1109/ICDMW.2007.52