Sixth IEEE International Conference on Data Mining (ICDM'06)
Tolerance Closed Frequent Itemsets
Hong Kong
December 18-December 22
ISBN: 0-7695-2701-9
James Cheng, The Hong Kong University of Science and Technology, Hong Kong
Yiping Ke, The Hong Kong University of Science and Technology, Hong Kong
Wilfred Ng, The Hong Kong University of Science and Technology, Hong Kong
In this paper, we study an inherent problem of mining Frequent Itemsets (FIs): the number of FIs mined is often too large. The large number of FIs not only degrades mining performance, but also severely hinders the application of FI mining. In the literature, Closed FIs (CFIs) and Maximal FIs (MFIs) have been proposed as concise representations of FIs. However, the number of CFIs is still too large in many cases, while MFIs lose information about the frequency of the FIs. To address this problem, we relax the restrictive definition of CFIs and propose the δ-Tolerance CFIs (δ-TCFIs). Mining δ-TCFIs recursively removes all subsets of a δ-TCFI that fall within a frequency distance bounded by δ. We propose two algorithms, CFI2TCFI and MineTCFI, to mine δ-TCFIs. CFI2TCFI achieves very high accuracy on the estimated frequency of the recovered FIs but is less efficient when the number of CFIs is large, since it is based on CFI mining. MineTCFI is significantly faster and consumes less memory than the algorithms for state-of-the-art concise representations of FIs, while its accuracy is only slightly lower than that of CFI2TCFI.
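To make the idea of a tolerance-based relaxation concrete, here is a minimal brute-force sketch in Python. It is not CFI2TCFI or MineTCFI, and the abstract does not state the formal definition, so the rule used here is an assumption: an FI X is kept unless some frequent superset Y with exactly one more item has support(Y) >= (1 - δ) * support(X). Under this assumed rule, δ = 0 yields the closed FIs and δ = 1 yields the maximal FIs. The function names and the toy data are hypothetical, and the frequency-estimation step mentioned in the abstract is not modelled.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration: map each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                result[candidate] = support
                found = True
        if not found:
            break  # anti-monotonicity: no larger itemset can be frequent
    return result


def tolerance_closed(fis, delta):
    """Filter FIs under the ASSUMED rule: drop X if a frequent superset Y with
    |Y| = |X| + 1 has support(Y) >= (1 - delta) * support(X)."""
    kept = {}
    for x, sx in fis.items():
        absorbed = any(
            len(y) == len(x) + 1 and x < y and sy >= (1 - delta) * sx
            for y, sy in fis.items()
        )
        if not absorbed:
            kept[x] = sx
    return kept


# Hypothetical toy database: five transactions over items a, b, c.
transactions = [frozenset(t) for t in ("abc", "abc", "abc", "ab", "a")]
fis = frequent_itemsets(transactions, min_support=2)

for delta in (0.0, 0.22, 1.0):
    kept = tolerance_closed(fis, delta)
    print(delta, sorted("".join(sorted(x)) for x in kept))
```

Raising δ in this sketch lets more itemsets be absorbed by their supersets, which illustrates the trade-off the abstract describes between the size of the representation and the precision of the retained frequency information.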
Citation:
James Cheng, Yiping Ke, Wilfred Ng, "Tolerance Closed Frequent Itemsets," Sixth IEEE International Conference on Data Mining (ICDM'06), pp. 139-148, 2006