Fifth IEEE International Conference on Data Mining (ICDM'05)
A Thorough Experimental Study of Datasets for Frequent Itemsets
Houston, Texas
November 27-November 30
ISBN: 0-7695-2278-5
Frédéric Flouvat, Laboratoire LIMOS, UMR CNRS and Université Clermont-Ferrand II
Fabien De Marchi, Laboratoire LIRIS, UMR CNRS and Université Lyon I
Jean-Marc Petit, Laboratoire LIRIS, UMR CNRS and INSA Lyon

The discovery of frequent patterns is a well-known problem in data mining. While many algorithms have been proposed over the last decade, only a few contributions have tried to understand how datasets influence the algorithms' behavior. Explaining why certain algorithms perform very well or very poorly on some datasets remains an open question.

In this setting, we describe a thorough experimental study of datasets with respect to frequent itemsets. We study the distribution of frequent itemsets by itemset size, together with the distribution of three concise representations: frequent closed, frequent free, and frequent essential itemsets. For each of them, we also study the distribution of their positive and negative borders whenever possible.
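
To make the notion of a distribution by itemset size concrete, the following is a minimal brute-force sketch in Python. It is not the authors' experimental code. The toy transactions, the threshold `min_support`, and all function names are assumptions introduced purely for illustration. It uses the standard definitions: an itemset is closed if no proper superset has the same support, free if no proper subset has the same support, and the positive and negative borders are the maximal frequent and minimal infrequent itemsets. Essential itemsets are left out of this sketch.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy dataset (not taken from the paper).
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
    {"c", "d"},
]
min_support = 2  # absolute support threshold, chosen arbitrarily

items = sorted(set().union(*transactions))


def support(itemset):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


# Brute-force enumeration of all non-empty itemsets (fine for a toy example only).
candidates = [
    frozenset(c)
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
]
frequent = {x: support(x) for x in candidates if support(x) >= min_support}


def is_closed(x):
    # Support is anti-monotone, so checking one-item extensions is sufficient.
    return all(support(x | {i}) < frequent[x] for i in items if i not in x)


def is_free(x):
    # Likewise, checking the subsets obtained by dropping one item is sufficient.
    # For a single item, the subset is the empty set, contained in every transaction.
    return all(support(x - {i}) > frequent[x] for i in x)


def is_maximal(x):
    return all((x | {i}) not in frequent for i in items if i not in x)


def is_minimal_infrequent(x):
    if x in frequent:
        return False
    # Every subset with one item removed must be frequent (the empty set counts as frequent).
    return all(len(x) == 1 or (x - {i}) in frequent for i in x)


def size_distribution(itemsets):
    """Map each itemset size to the number of itemsets of that size."""
    return dict(sorted(Counter(len(x) for x in itemsets).items()))


closed = [x for x in frequent if is_closed(x)]
free = [x for x in frequent if is_free(x)]
positive_border = [x for x in frequent if is_maximal(x)]
negative_border = [x for x in candidates if is_minimal_infrequent(x)]

for name, family in [
    ("frequent", frequent),
    ("closed", closed),
    ("free", free),
    ("positive border", positive_border),
    ("negative border", negative_border),
]:
    print(name, size_distribution(family))
```

Exhaustive enumeration is exponential in the number of items, so this sketch only serves to illustrate the quantities being compared. On real datasets one would rely on a dedicated mining algorithm.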

From this analysis, we derive a new characterization of datasets and some invariants that make it possible to better predict the behavior of well-known algorithms.

The main perspective opened by this work is the design of adaptive algorithms that take dataset characteristics into account.

Citation:
Frédéric Flouvat, Fabien De Marchi, Jean-Marc Petit, "A Thorough Experimental Study of Datasets for Frequent Itemsets," Fifth IEEE International Conference on Data Mining (ICDM'05), pp. 162-169, 2005