27th Annual International Computer Software and Applications Conference
Mining Sequential Patterns Using Graph Search Techniques
Dallas, Texas
November 03-November 06
ISBN: 0-7695-2020-0
Sequential pattern discovery has emerged as an important problem in data mining. In this paper, we propose an effective algorithm, GST, for mining sequential patterns in a large transaction database. Unlike Apriori-like algorithms, the GST algorithm can find large k-sequences (k >= 3) out of order; i.e., it can find large k-sequences without deriving them directly from large (k-1)-sequences. As a result, our algorithm performs much better than Apriori-like algorithms. We also propose a method that, after the database has grown, finds new sequential patterns by scanning only the new transactions. In several comprehensive experiments, the GST algorithm achieves a significant performance improvement over Apriori-like algorithms. We also found that, as long as the ratio of the items purchased in new transactions is not close to 100%, scanning only the new transactions is always much better than scanning the entire database.
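To make the term "large k-sequence" concrete, the following is a minimal sketch of support counting under the usual sequential-pattern definition. It is not the GST algorithm, which the abstract does not describe in detail. It assumes that each customer's history is an ordered list of transactions (sets of items), that a pattern is an ordered list of single items, and that a pattern is "large" when the fraction of customers whose history contains it reaches a minimum support. The names `contains`, `support`, `is_large`, and `min_support` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Set


def contains(customer: Sequence[Set[str]], pattern: Sequence[str]) -> bool:
    """Return True if the pattern's items occur in order in the customer's
    history, each item in a strictly later transaction than the previous one.
    (Assumed containment rule; one transaction matches at most one item.)"""
    pos = 0  # index of the next pattern item still to be matched
    for transaction in customer:
        if pos < len(pattern) and pattern[pos] in transaction:
            pos += 1
    return pos == len(pattern)


def support(db: List[Sequence[Set[str]]], pattern: Sequence[str]) -> float:
    """Fraction of customers whose history contains the pattern."""
    if not db:
        return 0.0
    return sum(contains(customer, pattern) for customer in db) / len(db)


def is_large(db: List[Sequence[Set[str]]], pattern: Sequence[str],
             min_support: float) -> bool:
    """A k-sequence is 'large' if its support meets the assumed threshold."""
    return support(db, pattern) >= min_support


# Toy database: four customers, each an ordered list of transactions.
db = [
    [{"a"}, {"b"}, {"c"}],
    [{"a", "b"}, {"c"}],
    [{"a"}, {"c"}, {"b"}],
    [{"b"}, {"c"}],
]
```

With this toy database and a minimum support of 0.5, the 2-sequence `["a", "c"]` is large (three of four customers contain it), while the 3-sequence `["a", "b", "c"]` is not (only the first customer contains it). An Apriori-like miner would generate 3-sequence candidates from the large 2-sequences level by level; the abstract's claim is that GST avoids this strict level-by-level order.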