Issue No.06 - June (2010 vol.22)
pp: 784-797
Claudia Marinica, KOD Team, LINA CNRS, Polytech'Nantes, Site de la Chantrerie, France
In data mining, the usefulness of association rules is strongly limited by the huge number of rules delivered. To overcome this drawback, several methods have been proposed in the literature, such as concise itemset representations, redundancy reduction, and postprocessing. However, because they are generally based on statistical information, most of these methods do not guarantee that the extracted rules are interesting to the user. It is therefore crucial to help the decision-maker with an efficient postprocessing step that reduces the number of rules. This paper proposes a new interactive approach to prune and filter discovered rules. First, we propose to use ontologies to improve the integration of user knowledge in the postprocessing task. Second, we propose the Rule Schema formalism, which extends the specification language proposed by Liu et al. for user expectations. Furthermore, an interactive framework is designed to assist the user throughout the analysis task. By applying our approach to voluminous sets of rules and integrating domain expert knowledge in the postprocessing step, we were able to reduce the number of rules to several dozen or fewer. Moreover, the quality of the filtered rules was validated by the domain expert at various points in the interactive process.
Clustering, classification, and association rules, interactive data exploration and discovery, knowledge management applications.
Claudia Marinica, "Knowledge-Based Interactive Postmining of Association Rules Using Ontologies", IEEE Transactions on Knowledge & Data Engineering, vol.22, no. 6, pp. 784-797, June 2010, doi:10.1109/TKDE.2010.29
[1] R. Agrawal, T. Imielinski, and A. Swami, "Mining Association Rules between Sets of Items in Large Databases," Proc. ACM SIGMOD, pp. 207-216, 1993.
[2] U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press, 1996.
[3] A. Silberschatz and A. Tuzhilin, "What Makes Patterns Interesting in Knowledge Discovery Systems," IEEE Trans. Knowledge and Data Eng., vol. 8, no. 6, pp. 970-974, Dec. 1996.
[4] M.J. Zaki and M. Ogihara, "Theoretical Foundations of Association Rules," Proc. Workshop Research Issues in Data Mining and Knowledge Discovery (DMKD '98), pp. 1-8, June 1998.
[5] D. Burdick, M. Calimlim, J. Flannick, J. Gehrke, and T. Yiu, "Mafia: A Maximal Frequent Itemset Algorithm," IEEE Trans. Knowledge and Data Eng., vol. 17, no. 11, pp. 1490-1504, Nov. 2005.
[6] J. Li, "On Optimal Rule Discovery," IEEE Trans. Knowledge and Data Eng., vol. 18, no. 4, pp. 460-471, Apr. 2006.
[7] M.J. Zaki, "Generating Non-Redundant Association Rules," Proc. Int'l Conf. Knowledge Discovery and Data Mining, pp. 34-43, 2000.
[8] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal, "Efficient Mining of Association Rules Using Closed Itemset Lattices," Information Systems, vol. 24, pp. 25-46, 1999.
[9] H. Toivonen, M. Klemettinen, P. Ronkainen, K. Hatonen, and H. Mannila, "Pruning and Grouping of Discovered Association Rules," Proc. ECML-95 Workshop Statistics, Machine Learning, and Knowledge Discovery in Databases, pp. 47-52, 1995.
[10] B. Baesens, S. Viaene, and J. Vanthienen, "Post-Processing of Association Rules," Proc. Workshop Post-Processing in Machine Learning and Data Mining: Interpretation, Visualization, Integration, and Related Topics with Sixth ACM SIGKDD, pp. 20-23, 2000.
[11] J. Blanchard, F. Guillet, and H. Briand, "A User-Driven and Quality-Oriented Visualization for Mining Association Rules," Proc. Third IEEE Int'l Conf. Data Mining, pp. 493-496, 2003.
[12] B. Liu, W. Hsu, K. Wang, and S. Chen, "Visually Aided Exploration of Interesting Association Rules," Proc. Pacific-Asia Conf. Knowledge Discovery and Data Mining (PAKDD), pp. 380-389, 1999.
[13] G. Birkhoff, Lattice Theory, vol. 25. Am. Math. Soc., 1967.
[14] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal, "Discovering Frequent Closed Itemsets for Association Rules," Proc. Seventh Int'l Conf. Database Theory (ICDT '99), pp. 398-416, 1999.
[15] M. Zaki, "Mining Non-Redundant Association Rules," Data Mining and Knowledge Discovery, vol. 9, pp. 223-248, 2004.
[16] A. Maedche and S. Staab, "Ontology Learning for the Semantic Web," IEEE Intelligent Systems, vol. 16, no. 2, pp. 72-79, Mar. 2001.
[17] B. Liu, W. Hsu, L.-F. Mun, and H.-Y. Lee, "Finding Interesting Patterns Using User Expectations," IEEE Trans. Knowledge and Data Eng., vol. 11, no. 6, pp. 817-832, Nov. 1999.
[18] I. Horrocks and P.F. Patel-Schneider, "Reducing OWL Entailment to Description Logic Satisfiability," J. Web Semantics, vol. 2870, pp. 17-29, 2003.
[19] J. Pei, J. Han, and R. Mao, "Closet: An Efficient Algorithm for Mining Frequent Closed Itemsets," Proc. ACM SIGMOD Workshop Research Issues in Data Mining and Knowledge Discovery, pp. 21-30, 2000.
[20] M.J. Zaki and C.J. Hsiao, "Charm: An Efficient Algorithm for Closed Itemset Mining," Proc. Second SIAM Int'l Conf. Data Mining, pp. 34-43, 2002.
[21] M.Z. Ashrafi, D. Taniar, and K. Smith, "Redundant Association Rules Reduction Techniques," AI 2005: Advances in Artificial Intelligence – Proc. 18th Australian Joint Conf. Artificial Intelligence, pp. 254-263, 2005.
[22] M. Hahsler, C. Buchta, and K. Hornik, "Selective Association Rule Generation," Computational Statistics, vol. 23, no. 2, pp. 303-315, Kluwer Academic Publishers, 2008.
[23] R.J. Bayardo Jr. and R. Agrawal, "Mining the Most Interesting Rules," Proc. ACM SIGKDD, pp. 145-154, 1999.
[24] R.J. Bayardo Jr., R. Agrawal, and D. Gunopulos, "Constraint-Based Rule Mining in Large, Dense Databases," Proc. 15th Int'l Conf. Data Eng. (ICDE '99), pp. 188-197, 1999.
[25] E.R. Omiecinski, "Alternative Interest Measures for Mining Associations in Databases," IEEE Trans. Knowledge and Data Eng., vol. 15, no. 1, pp. 57-69, Jan./Feb. 2003.
[26] F. Guillet and H. Hamilton, Quality Measures in Data Mining. Springer, 2007.
[27] P.-N. Tan, V. Kumar, and J. Srivastava, "Selecting the Right Objective Measure for Association Analysis," Information Systems, vol. 29, pp. 293-313, 2004.
[28] G. Piatetsky-Shapiro and C.J. Matheus, "The Interestingness of Deviations," Proc. AAAI'94 Workshop Knowledge Discovery in Databases, pp. 25-36, 1994.
[29] M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen, and A.I. Verkamo, "Finding Interesting Rules from Large Sets of Discovered Association Rules," Proc. Int'l Conf. Information and Knowledge Management (CIKM), pp. 401-407, 1994.
[30] E. Baralis and G. Psaila, "Designing Templates for Mining Association Rules," J. Intelligent Information Systems, vol. 9, pp. 7-32, 1997.
[31] B. Padmanabhan and A. Tuzhilin, "Unexpectedness as a Measure of Interestingness in Knowledge Discovery," Proc. Workshop Information Technology and Systems (WITS), pp. 81-90, 1997.
[32] T. Imielinski, A. Virmani, and A. Abdulghani, "Datamine: Application Programming Interface and Query Language for Database Mining," Proc. Int'l Conf. Knowledge Discovery and Data Mining (KDD), pp. 256-262, 1996.
[33] R.T. Ng, L.V.S. Lakshmanan, J. Han, and A. Pang, "Exploratory Mining and Pruning Optimizations of Constrained Association Rules," Proc. ACM SIGMOD Int'l Conf. Management of Data, vol. 27, pp. 13-24, 1998.
[34] A. An, S. Khan, and X. Huang, "Objective and Subjective Algorithms for Grouping Association Rules," Proc. Third IEEE Int'l Conf. Data Mining (ICDM '03), pp. 477-480, 2003.
[35] A. Berrado and G.C. Runger, "Using Metarules to Organize and Group Discovered Association Rules," Data Mining and Knowledge Discovery, vol. 14, no. 3, pp. 409-431, 2007.
[36] M. Uschold and M. Grüninger, "Ontologies: Principles, Methods, and Applications," Knowledge Eng. Rev., vol. 11, pp. 93-155, 1996.
[37] T.R. Gruber, "A Translation Approach to Portable Ontology Specifications," Knowledge Acquisition, vol. 5, pp. 199-220, 1993.
[38] N. Guarino, "Formal Ontology in Information Systems," Proc. First Int'l Conf. Formal Ontology in Information Systems, pp. 3-15, 1998.
[39] H. Nigro, S.G. Cisaro, and D. Xodo, Data Mining with Ontologies: Implementations, Findings and Frameworks. Idea Group, Inc., 2007.
[40] R. Srikant and R. Agrawal, "Mining Generalized Association Rules," Proc. 21st Int'l Conf. Very Large Databases, pp. 407-419, 1995.
[41] V. Svatek and M. Tomeckova, "Roles of Medical Ontology in Association Mining CRISP-DM Cycle," Proc. Workshop Knowledge Discovery and Ontologies in ECML/PKDD, 2004.
[42] X. Zhou and J. Geller, "Raising, to Enhance Rule Mining in Web Marketing with the Use of an Ontology," Data Mining with Ontologies: Implementations, Findings and Frameworks, pp. 18-36, Idea Group Reference, 2007.
[43] M.A. Domingues and S.A. Rezende, "Using Taxonomies to Facilitate the Analysis of the Association Rules," Proc. Second Int'l Workshop Knowledge Discovery and Ontologies, held with ECML/PKDD, pp. 59-66, 2005.
[44] A. Bellandi, B. Furletti, V. Grossi, and A. Romei, "Ontology-Driven Association Rule Extraction: A Case Study," Proc. Workshop Context and Ontologies: Representation and Reasoning, pp. 1-10, 2007.
[45] R. Natarajan and B. Shekar, "A Relatedness-Based Data-Driven Approach to Determination of Interestingness of Association Rules," Proc. 2005 ACM Symp. Applied Computing (SAC), pp. 551-552, 2005.
[46] A.C.B. Garcia and A.S. Vivacqua, "Does Ontology Help Make Sense of a Complex World or Does It Create a Biased Interpretation?" Proc. Sensemaking Workshop in CHI '08 Conf. Human Factors in Computing Systems, 2008.
[47] A.C.B. Garcia, I. Ferraz, and A.S. Vivacqua, "From Data to Knowledge Mining," Artificial Intelligence for Eng. Design, Analysis and Manufacturing, vol. 23, pp. 427-441, 2009.
[48] L.M. Garshol, "Metadata? Thesauri? Taxonomies? Topic Maps Making Sense of It All," J. Information Science, vol. 30, no. 4, pp. 378-391, 2004.
[49] I. Horrocks and P.F. Patel-Schneider, "A Proposal for an OWL Rules Language," Proc. 13th Int'l Conf. World Wide Web, pp. 723-731, 2004.
[50] W.E. Grosso, H. Eriksson, R.W. Fergerson, J.H. Gennari, S.W. Tu, and M.A. Musen, "Knowledge Modeling at the Millennium (the Design and Evolution of Protege-2000)," Proc. 12th Workshop Knowledge Acquisition, Modeling and Management (KAW '99), 1999.
[51] M.-A. Storey, N.F. Noy, M. Musen, C. Best, R. Fergerson, and N. Ernst, "Jambalaya: An Interactive Environment for Exploring Ontologies," Proc. Seventh Int'l Conf. Intelligent User Interfaces (IUI '02), p. 239, 2002.