On Characterization and Discovery of Minimal Unexpected Patterns in Rule Discovery
February 2006 (vol. 18 no. 2)
pp. 202-216
A drawback of traditional data-mining methods is that they do not leverage prior knowledge of users. In prior work, we proposed a method that could discover unexpected patterns in data by using domain knowledge in a systematic manner. In this paper, we present new methods for discovering a minimal set of unexpected patterns by combining the two independent concepts of minimality and unexpectedness, both of which have been well-studied in the KDD literature. We demonstrate the strengths of this approach experimentally using a case study in a marketing domain.

Index Terms:
Data mining, association rules, unexpectedness, minimality.
Citation:
Balaji Padmanabhan, Alexander Tuzhilin, "On Characterization and Discovery of Minimal Unexpected Patterns in Rule Discovery," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 2, pp. 202-216, Feb. 2006, doi:10.1109/TKDE.2006.32