Issue No.03 - March (2013 vol.25)
pp: 541-555
Luca Cagliero, Politecnico di Torino, Turin
ABSTRACT
Frequent itemset mining is a widely used exploratory technique that focuses on discovering recurrent correlations among data. The steady evolution of markets and business environments creates the need for data mining algorithms that discover significant correlation changes, so that product and service provision can be promptly adapted to customer needs. Change mining, in the context of frequent itemsets, focuses on detecting and reporting significant changes in the set of mined itemsets from one time period to another. The discovery of frequent generalized itemsets, i.e., itemsets that 1) frequently occur in the source data, and 2) provide a high-level abstraction of the mined knowledge, poses new challenges in the analysis of itemsets that become rare, and thus are no longer extracted, from a certain point on. This paper proposes a novel kind of dynamic pattern, namely the History Generalized Pattern (HiGen), that represents the evolution of an itemset in consecutive time periods by reporting, whenever the itemset becomes infrequent in a certain time period, its frequent generalizations characterized by minimal redundancy (i.e., minimum level of abstraction). To mine HiGens, the paper proposes HiGen Miner, an algorithm that avoids itemset mining followed by postprocessing by exploiting a support-driven itemset generalization approach. To focus attention on the minimally redundant frequent generalizations, and thus reduce the number of generated patterns, the discovery of a subset of HiGens, namely the Non-redundant HiGens, is addressed as well. Experiments performed on both real and synthetic datasets show the efficiency and the effectiveness of the proposed approach as well as its usefulness in a real application context.
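The idea of tracking an itemset across time periods and falling back to its least abstract frequent generalization can be illustrated with a minimal sketch. This is not the HiGen Miner algorithm, whose details the abstract does not give. It assumes a taxonomy stored as a child-to-parent dictionary, a relative minimum support threshold, and a simplified generalization step that lifts every item by one taxonomy level at a time. All names (`support`, `generalize_one_level`, `history`) and the toy data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Taxonomy = Dict[str, str]  # assumed representation: item -> parent item


def with_ancestors(transaction: FrozenSet[str], taxonomy: Taxonomy) -> set:
    """Return the transaction extended with every ancestor of its items."""
    extended = set(transaction)
    for item in transaction:
        while item in taxonomy:
            item = taxonomy[item]
            extended.add(item)
    return extended


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]],
            taxonomy: Taxonomy) -> float:
    """Fraction of transactions that contain the (possibly generalized) itemset."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions
               if itemset <= with_ancestors(t, taxonomy))
    return hits / len(transactions)


def generalize_one_level(itemset: FrozenSet[str],
                         taxonomy: Taxonomy) -> FrozenSet[str]:
    """Replace each item by its parent; root items stay unchanged (simplification)."""
    return frozenset(taxonomy.get(item, item) for item in itemset)


def history(itemset: FrozenSet[str], periods: List[List[FrozenSet[str]]],
            taxonomy: Taxonomy, min_sup: float
            ) -> List[Tuple[Optional[FrozenSet[str]], int, float]]:
    """For each period, report (pattern, generalization level, support).

    The pattern is the itemset itself when it is frequent, otherwise the first
    frequent generalization found while climbing the taxonomy, or None if no
    generalization reaches the threshold.
    """
    result = []
    for transactions in periods:
        current, level = frozenset(itemset), 0
        while True:
            sup = support(current, transactions, taxonomy)
            if sup >= min_sup:
                result.append((current, level, sup))
                break
            parent = generalize_one_level(current, taxonomy)
            if parent == current:  # top of the taxonomy reached
                result.append((None, level, sup))
                break
            current, level = parent, level + 1
    return result


# Toy data (hypothetical): two time periods over a two-level taxonomy.
taxonomy = {"Turin": "Italy", "Milan": "Italy",
            "jacket": "outerwear", "coat": "outerwear"}
period_1 = [frozenset({"Turin", "jacket"}), frozenset({"Turin", "jacket"}),
            frozenset({"Milan", "coat"}), frozenset({"Turin", "coat"})]
period_2 = [frozenset({"Turin", "jacket"}), frozenset({"Milan", "coat"}),
            frozenset({"Milan", "jacket"}), frozenset({"Turin", "coat"})]

for pattern, level, sup in history(frozenset({"Turin", "jacket"}),
                                   [period_1, period_2], taxonomy, min_sup=0.5):
    print(sorted(pattern) if pattern else None, level, sup)
```

In this toy run the itemset {Turin, jacket} meets the threshold in the first period and is reported as is. In the second period it falls below the threshold, so the sketch reports its one-level generalization {Italy, outerwear} instead of dropping the pattern from the history.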
INDEX TERMS
Itemsets, Data mining, Taxonomy, Context awareness, Information retrieval, Search methods, Mining methods and algorithms
CITATION
Luca Cagliero, "Discovering Temporal Change Patterns in the Presence of Taxonomies", IEEE Transactions on Knowledge & Data Engineering, vol.25, no. 3, pp. 541-555, March 2013, doi:10.1109/TKDE.2011.233