Database Engineering and Applications Symposium, International (2006)
Delhi, India
Dec. 11, 2006 to Dec. 14, 2006
ISSN: 1098-8068
ISBN: 0-7695-2577-6
pp: 113-120
Yitong Wang, Fudan University
Masaru Kitsuregawa, University of Tokyo
Zhenglu Yang, University of Tokyo
ABSTRACT
Sequential pattern mining is important because it underlies many applications. Implementing it efficiently is difficult, however, because of an inherent characteristic of the problem: the large size of the datasets involved. Although considerable effort has gone into sequential pattern mining in recent years, performance remains far from satisfactory. In this paper, we propose a new algorithm called PAssed Item Deduced sequential pattern mining (abbreviated as PAID), which efficiently finds all frequent sequential patterns in a large database. The main difference between our strategy and existing work is that other algorithms accumulate candidate support from scratch in each iteration, whereas PAID reuses the intermediate results (support values) of k-length frequent patterns when discovering (k+1)-length patterns, which greatly reduces the search space. Our experimental results and performance studies show that PAID outperforms previous work by meaningful margins on large datasets.
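The idea of reusing k-length results when discovering (k+1)-length patterns can be illustrated with a minimal sketch. This is not the PAID algorithm itself, whose details are not given in the abstract; it is a generic prefix-extension miner written under several assumptions: each sequence is a list of single items (no itemsets), and the function name `mine_sequential_patterns` and its parameters are hypothetical. For every frequent k-length pattern it keeps the end position of the earliest match in each supporting sequence, so a (k+1)-length candidate is counted by scanning only those sequences, and only past that position.

```python
from collections import defaultdict


def mine_sequential_patterns(db, min_support):
    """Return {pattern tuple: support} for all frequent sequential patterns.

    Illustrative sketch only (not PAID): db is a list of sequences, each a
    list of single items; itemset elements are not handled.
    """
    # Level 1: for each item, record its earliest position in each sequence.
    # Mapping: pattern -> {sequence id: end position of the earliest match}.
    first_level = defaultdict(dict)
    for sid, seq in enumerate(db):
        for pos, item in enumerate(seq):
            first_level[(item,)].setdefault(sid, pos)
    level = {p: ends for p, ends in first_level.items()
             if len(ends) >= min_support}

    frequent = {}
    while level:
        # The support of a pattern is the number of sequences that contain it.
        frequent.update({p: len(ends) for p, ends in level.items()})
        next_level = defaultdict(dict)
        for pattern, ends in level.items():
            # Reuse the k-length results: visit only sequences that already
            # support the pattern, resuming after the stored end position.
            for sid, end in ends.items():
                seq = db[sid]
                for pos in range(end + 1, len(seq)):
                    next_level[pattern + (seq[pos],)].setdefault(sid, pos)
        level = {p: ends for p, ends in next_level.items()
                 if len(ends) >= min_support}
    return frequent


if __name__ == "__main__":
    toy_db = [list("abcb"), list("acb"), list("bca")]
    for pattern, support in sorted(mine_sequential_patterns(toy_db, 2).items()):
        print("".join(pattern), support)
```

Because each (k+1)-length pattern is generated only from its own k-length prefix, a sequence that failed to support the prefix is never rescanned, which is one simple way the search space can shrink from level to level.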
CITATION
Yitong Wang, Masaru Kitsuregawa, Zhenglu Yang, "PAID: Mining Sequential Patterns by Passed Item Deduction in Large Databases", Database Engineering and Applications Symposium, International, pp. 113-120, 2006, doi:10.1109/IDEAS.2006.34