Issue No.12 - December (2009 vol.21)
pp: 1692-1707
Qian Wan, York University, Toronto
Aijun An, York University, Toronto
A transaction database usually consists of a set of time-stamped transactions. Mining frequent patterns in transaction databases has been studied extensively in data mining research. However, most existing frequent pattern mining algorithms (such as Apriori and FP-growth) do not consider the time stamps associated with the transactions. In this paper, we extend the existing frequent pattern mining framework to take into account the time stamp of each transaction and to discover patterns whose frequency changes dramatically over time. We define a new type of pattern, called a transitional pattern, to capture the dynamic behavior of frequent patterns in a transaction database. Transitional patterns are either positive or negative: the frequency of a positive transitional pattern increases dramatically at some time point of the transaction database, and that of a negative transitional pattern decreases dramatically. We introduce the concept of significant milestones for a transitional pattern, which are the time points at which the frequency of the pattern changes most significantly. Moreover, we develop an algorithm that mines the set of transitional patterns, along with their significant milestones, from a transaction database. Our experimental studies on real-world databases illustrate that mining positive and negative transitional patterns is a promising, practical approach for discovering novel and interesting knowledge from large databases.
Data mining, association rule, frequent pattern, transitional pattern, significant milestone.
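To make the idea in the abstract concrete, the following Python sketch measures how the frequency of a pattern differs before and after a split point in a time-ordered transaction list, and picks the split with the largest change. The abstract does not give the formal definitions, so everything here is an assumption made for illustration: the normalized difference used in `change_ratio`, the `min_side` guard, the function names, and the toy data in `DB` are hypothetical and may differ from the measures and algorithm defined in the paper.

```python
from typing import FrozenSet, List, Tuple

# A transaction is (time stamp, set of items); the list is assumed sorted by time.
Transaction = Tuple[int, FrozenSet[str]]


def support(transactions: List[Transaction], pattern: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item of the pattern."""
    if not transactions:
        return 0.0
    hits = sum(1 for _, items in transactions if pattern <= items)
    return hits / len(transactions)


def change_ratio(transactions: List[Transaction],
                 pattern: FrozenSet[str], split: int) -> float:
    """Normalized support change across a split (an assumed measure).

    The first `split` transactions form the 'before' part, the rest the
    'after' part. The result lies in [-1, 1]: positive when the pattern
    becomes more frequent, negative when it becomes less frequent.
    """
    before = support(transactions[:split], pattern)
    after = support(transactions[split:], pattern)
    top = max(before, after)
    return 0.0 if top == 0 else (after - before) / top


def best_split(transactions: List[Transaction],
               pattern: FrozenSet[str], min_side: int = 2) -> int:
    """Split index with the largest absolute change ratio.

    `min_side` keeps at least that many transactions on each side so that
    a support value is not computed from a single transaction.
    """
    candidates = range(min_side, len(transactions) - min_side + 1)
    return max(candidates,
               key=lambda k: abs(change_ratio(transactions, pattern, k)))


# Toy data (hypothetical): the pattern {a, b} is rare early and frequent later.
DB: List[Transaction] = [
    (1, frozenset({"a", "b"})),
    (2, frozenset({"a"})),
    (3, frozenset({"b"})),
    (4, frozenset({"a", "c"})),
    (5, frozenset({"a", "b"})),
    (6, frozenset({"a", "b", "c"})),
    (7, frozenset({"a", "b"})),
    (8, frozenset({"a", "b"})),
]
PATTERN = frozenset({"a", "b"})

k = best_split(DB, PATTERN)
print(k, DB[k][0], change_ratio(DB, PATTERN, k))  # prints: 4 5 0.75
```

On this toy data the largest change occurs when the first four transactions are separated from the last four: the support of {a, b} rises from 0.25 to 1.0, giving a ratio of 0.75, so the transaction at time stamp 5 plays the role of a milestone for a positive change. Reversing the order of the transactions yields a ratio of -0.75 at the same split, the negative case.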
Qian Wan, Aijun An, "Discovering Transitional Patterns and Their Significant Milestones in Transaction Databases", IEEE Transactions on Knowledge & Data Engineering, vol.21, no. 12, pp. 1692-1707, December 2009, doi:10.1109/TKDE.2009.59