loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Fourth IEEE International Conference on Data Mining (ICDM'04)
Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding Window
Brighton, United Kingdom
November 01-November 04
ISBN: 0-7695-2142-8
Yun Chi, University of California, Los Angeles
Haixun Wang, IBM Thomas J. Watson Research Center, Hawthorne, NY
Philip S. Yu, IBM Thomas J. Watson Research Center, Hawthorne, NY
Richard R. Muntz, University of California, Los Angeles
This paper considers the problem of mining closed frequent itemsets over a sliding window using limited memory space. We design a synopsis data structure to monitor transactions in the sliding window so that we can output the current closed frequent itemsets at any time. Due to time and memory constraints, the synopsis data structure cannot monitor all possible itemsets. However, monitoring only frequent itemsets will make it impossible to detect new itemsets when they become frequent. In this paper, we introduce a compact data structure, the closed enumeration tree (CET), to maintain a dynamically selected set of itemsets over a sliding-window. The selected itemsets consist of a boundary between closed frequent itemsets and the rest of the itemsets. Concept drifts in a data stream are reflected by boundary movements in the CET. In other words, a status change of any itemset (e.g., from non-frequent to frequent) must occur through the boundary. Because the boundary is relatively stable, the cost of mining closed frequent itemsets over a sliding window is dramatically reduced to that of mining transactions that can possibly cause boundary movements in the CET. Our experiments show that our algorithm performs much better than previous approaches.
Citation:
Yun Chi, Haixun Wang, Philip S. Yu, Richard R. Muntz, "Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding Window," icdm, pp.59-66, Fourth IEEE International Conference on Data Mining (ICDM'04), 2004
Usage of this product signifies your acceptance of the Terms of Use.