Issue No.12 - December (2009 vol.21)

pp: 1692-1707

Qian Wan, York University, Toronto

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2009.59

ABSTRACT

A transaction database usually consists of a set of time-stamped transactions. Mining frequent patterns in transaction databases has been studied extensively in data mining research. However, most of the existing frequent pattern mining algorithms (such as Apriori and FP-growth) do not consider the time stamps associated with the transactions. In this paper, we extend the existing frequent pattern mining framework to take into account the time stamp of each transaction and discover patterns whose frequency changes dramatically over time. We define a new type of pattern, called transitional patterns, to capture the dynamic behavior of frequent patterns in a transaction database. Transitional patterns include both positive and negative transitional patterns, whose frequencies respectively increase or decrease dramatically at some time points in a transaction database. We introduce the concept of significant milestones for a transitional pattern, which are time points at which the frequency of the pattern changes most significantly. Moreover, we develop an algorithm to mine the set of transitional patterns, along with their significant milestones, from a transaction database. Our experimental studies on real-world databases illustrate that mining positive and negative transitional patterns is highly promising as a practical and useful approach for discovering novel and interesting knowledge from large databases.
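As an illustration only, the idea of a pattern whose frequency shifts at some time point can be sketched in a few lines of Python. This is not the paper's algorithm: the abstract does not give the exact definitions, so the score below (the difference between the support after and before a split point, divided by the larger of the two), the `margin` parameter, the function names, and the toy database `db` are all assumptions made for this sketch.

```python
from typing import FrozenSet, Sequence, Tuple

Transaction = FrozenSet[str]


def support(pattern: FrozenSet[str], transactions: Sequence[Transaction]) -> float:
    """Fraction of transactions that contain every item of `pattern`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if pattern <= t)
    return hits / len(transactions)


def transition_score(pattern: FrozenSet[str],
                     transactions: Sequence[Transaction],
                     k: int) -> float:
    """Assumed score in [-1, 1] for splitting the time-ordered list at index k.

    Positive values mean the pattern is more frequent after the split,
    negative values mean it is less frequent.
    """
    before = support(pattern, transactions[:k])
    after = support(pattern, transactions[k:])
    largest = max(before, after)
    if largest == 0.0:
        return 0.0  # pattern never occurs, so there is nothing to compare
    return (after - before) / largest


def best_milestone(pattern: FrozenSet[str],
                   transactions: Sequence[Transaction],
                   margin: int = 1) -> Tuple[int, float]:
    """Return the split index with the largest absolute score.

    `margin` (an assumption of this sketch) keeps at least that many
    transactions on each side of the split. Ties go to the earliest index.
    """
    candidates = range(margin, len(transactions) - margin + 1)
    if len(candidates) == 0:
        raise ValueError("not enough transactions for the requested margin")
    scored = [(k, transition_score(pattern, transactions, k)) for k in candidates]
    return max(scored, key=lambda pair: abs(pair[1]))


# Toy database, already sorted by time stamp (hypothetical data).
db = [
    frozenset({"a", "b"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a"}),
    frozenset({"c"}),
    frozenset({"b", "c"}),
    frozenset({"c"}),
    frozenset({"a", "c"}),
]

if __name__ == "__main__":
    print(best_milestone(frozenset({"a", "b"}), db))
```

In this toy data the pattern {a, b} appears only in the first three transactions, so the sketch reports a negative score at the split after the third transaction, which corresponds to what the abstract calls a negative transitional pattern. A real implementation would also need the frequency thresholds and the efficient mining procedure that the paper develops.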

INDEX TERMS

Data mining, association rule, frequent pattern, transitional pattern, significant milestone.

CITATION

Qian Wan, "Discovering Transitional Patterns and Their Significant Milestones in Transaction Databases",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 21, no. 12, pp. 1692-1707, December 2009, doi:10.1109/TKDE.2009.59