Progressive Partition Miner: An Efficient Algorithm for Mining General Temporal Association Rules
July/August 2003 (vol. 15 no. 4)
pp. 1004-1017

Abstract: In this paper, we explore a new problem of mining general temporal association rules in publication databases. In essence, a publication database is a set of transactions where each transaction T is a set of items, each of which has its own exhibition period. The current model of association rule mining cannot handle the publication database because of two fundamental problems: 1) lack of consideration of the exhibition period of each individual item and 2) lack of an equitable support counting basis for each item. To remedy this, we propose an innovative algorithm, Progressive-Partition-Miner (abbreviated as PPM), to discover general temporal association rules in a publication database. The basic idea of PPM is to first partition the publication database in light of the exhibition periods of items and then progressively accumulate the occurrence count of each candidate 2-itemset based on the intrinsic partitioning characteristics. Algorithm PPM is also designed to employ a filtering threshold in each partition to prune cumulatively infrequent 2-itemsets early. Because the number of candidate 2-itemsets generated by PPM is very close to the number of frequent 2-itemsets, we can employ the scan reduction technique to effectively reduce the number of database scans. The execution time of PPM is orders of magnitude smaller than that required by other competitive schemes that are directly extended from existing methods. The correctness of PPM is proven and some of its theoretical properties are derived. Sensitivity analysis of various parameters is conducted to provide insight into Algorithm PPM.
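The partition-then-accumulate idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' PPM implementation: the function name `progressive_candidates`, the data layout (a list of partitions, each a list of item sets), and the simplification that a pair's counting period starts at the first partition in which it is seen and runs to the current partition are all assumptions made for illustration. The sketch only shows how a per-partition filtering threshold, measured against the transactions seen since a pair's own start, prunes cumulatively infrequent 2-itemsets.

```python
from itertools import combinations


def progressive_candidates(partitions, min_sup):
    """Return candidate 2-itemsets that survive per-partition filtering.

    partitions: list of partitions; each partition is a list of
                transactions, and each transaction is a set of items.
    min_sup:    minimum support ratio in (0, 1].

    Hypothetical helper for illustration only, not the published algorithm.
    """
    sizes = [len(part) for part in partitions]
    # Maps a pair to [index of the partition where it was first counted,
    # count accumulated since that partition].
    cands = {}
    for k, part in enumerate(partitions):
        # Accumulate occurrence counts for every pair in this partition.
        for txn in part:
            for pair in combinations(sorted(txn), 2):
                if pair in cands:
                    cands[pair][1] += 1
                else:
                    cands[pair] = [k, 1]
        # Filtering step: a pair is judged only against the transactions
        # in partitions start..k, not against the whole database.
        for pair in list(cands):
            start, count = cands[pair]
            seen = sum(sizes[start:k + 1])
            if count < min_sup * seen:
                del cands[pair]
    return {pair: tuple(info) for pair, info in cands.items()}


# Toy data (invented): items D and E only appear in the second partition.
example_partitions = [
    [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"B"}],
    [{"A", "B"}, {"D", "E"}, {"D", "E"}, {"C"}],
]
example_result = progressive_candidates(example_partitions, 0.5)
print(example_result)  # {('D', 'E'): (1, 2)}
```

In the toy data, the pair (D, E) occurs in only 2 of 8 transactions overall, yet it survives because it is measured against the 4 transactions of the partition in which it first appears. The pair (A, B) is pruned because its 3 occurrences fall below half of the 8 transactions in its longer period. This mirrors, in a simplified way, the equitable support counting basis the abstract calls for.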


Index Terms:
Data mining, general temporal association rule, exhibition period, publication database.
Citation:
Chang-Hung Lee, Ming-Syan Chen, Cheng-Ru Lin, "Progressive Partition Miner: An Efficient Algorithm for Mining General Temporal Association Rules," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 4, pp. 1004-1017, July-Aug. 2003, doi:10.1109/TKDE.2003.1209015