16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'04)
A Scalable Algorithm for Mining Maximal Frequent Sequences Using Sampling
Boca Raton, Florida
November 15-17, 2004
ISBN: 0-7695-2236-X
Congnan Luo, Wright State University
Soon M. Chung, Wright State University
In this paper, we propose an efficient and scalable algorithm for mining Maximal Sequential Patterns using Sampling (MSPS). MSPS prunes more of the search space than other algorithms because it applies both subsequence-infrequency-based pruning and supersequence-frequency-based pruning. A sampling technique is used to identify long frequent sequences early, instead of enumerating all of their subsequences. We describe how to adjust the user-specified minimum support level when mining a sample of the database to achieve better performance; this adjustment makes sampling more efficient when the minimum support is small. A signature technique is used for subsequence-infrequency-based pruning when the seed set of frequent sequences for candidate generation is too large to be loaded into memory. A prefix tree structure is developed to count candidate sequences of different sizes during a database scan, and it also facilitates customer sequence trimming. Our experiments show that MSPS performs very well and scales better than other algorithms.
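To make the terms in the abstract concrete, the following minimal Python sketch illustrates the underlying definitions only: subsequence testing, support counting, subsequence-infrequency-based pruning, and filtering a set of frequent sequences down to the maximal ones. It is not the MSPS algorithm. All function names are hypothetical, and each customer sequence is assumed to be a tuple of single items rather than a sequence of itemsets, which is a simplification.

```python
def is_subsequence(small, big):
    """True if `small` occurs in `big` in order, not necessarily contiguously."""
    it = iter(big)
    # `item in it` advances the iterator, so order is enforced.
    return all(item in it for item in small)


def support(candidate, database):
    """Fraction of customer sequences in `database` that contain `candidate`."""
    return sum(is_subsequence(candidate, seq) for seq in database) / len(database)


def has_infrequent_subsequence(candidate, frequent_shorter):
    """Subsequence-infrequency pruning test (illustrative).

    A length-k candidate can be frequent only if every subsequence obtained
    by dropping one item is in the set of known frequent (k-1)-sequences.
    """
    return any(
        candidate[:i] + candidate[i + 1:] not in frequent_shorter
        for i in range(len(candidate))
    )


def maximal_only(frequent):
    """Keep only sequences that are not a subsequence of another frequent one."""
    return {
        s for s in frequent
        if not any(s != t and is_subsequence(s, t) for t in frequent)
    }


# Hypothetical toy database of four customer sequences.
database = [
    ("a", "b", "c"),
    ("a", "c"),
    ("a", "b", "c", "d"),
    ("b", "c"),
]
```

With this toy database, `support(("a", "c"), database)` counts the three sequences that contain `a` followed later by `c`. A candidate such as `("a", "b", "c")` would be pruned without a database scan if any of `("a", "b")`, `("a", "c")`, or `("b", "c")` were already known to be infrequent.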
Citation:
Congnan Luo, Soon M. Chung, "A Scalable Algorithm for Mining Maximal Frequent Sequences Using Sampling," 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'04), pp. 156-165, 2004