On Optimal Rule Discovery
April 2006 (vol. 18 no. 4)
pp. 460-471
In machine learning and data mining, heuristic rule discovery and association rule discovery are the two dominant schemes for rule discovery. Heuristic rule discovery usually produces a small set of accurate rules, but fails to find many globally optimal rules. Association rule discovery generates all rules satisfying some constraints, but yields too many rules and is infeasible when the minimum support is small. Here, we present a unified framework for the discovery of a family of optimal rule sets and characterize its relationships with other rule-discovery schemes, such as nonredundant association rule discovery. We show theoretically and empirically that optimal rule discovery is significantly more efficient than association rule discovery, independent of data structure and implementation. Optimal rule discovery is an efficient alternative to association rule discovery, especially when the minimum support is low.
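
The abstract does not define an optimal rule set, so the following is only a rough sketch under stated assumptions: a rule is dropped when a more general rule (one whose antecedent is a proper subset) has an interestingness value at least as high, and confidence is used as the interestingness metric. The toy data, the function names `all_rules` and `optimal_rules`, and the brute-force enumeration are hypothetical and are not the paper's algorithm; the sketch only illustrates why an optimal rule set can be much smaller than the full association rule set.

```python
from itertools import combinations

# Hypothetical toy data: (set of items, class label) per record.
ROWS = [
    ({"a", "b"}, "y"),
    ({"a", "b"}, "y"),
    ({"a", "c"}, "y"),
    ({"b"}, "n"),
    ({"b", "c"}, "n"),
    ({"c"}, "n"),
]


def all_rules(rows, target, min_support=1):
    """Enumerate every rule 'antecedent -> target' that meets min_support.

    Returns a dict mapping each antecedent (frozenset) to its confidence.
    Support is counted as the number of records matching antecedent and target.
    """
    items = sorted(set().union(*(record for record, _ in rows)))
    rules = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            antecedent = frozenset(combo)
            labels = [label for record, label in rows if antecedent <= record]
            support = sum(1 for label in labels if label == target)
            if support >= min_support:
                rules[antecedent] = support / len(labels)
    return rules


def optimal_rules(rules):
    """Keep a rule only if no more general rule is at least as confident.

    Assumed pruning criterion: antecedent G is more general than A when
    G is a proper subset of A.
    """
    kept = {}
    for antecedent, conf in rules.items():
        dominated = any(
            general < antecedent and general_conf >= conf
            for general, general_conf in rules.items()
        )
        if not dominated:
            kept[antecedent] = conf
    return kept


if __name__ == "__main__":
    rules = all_rules(ROWS, "y")
    best = optimal_rules(rules)
    print(len(rules), "rules before pruning,", len(best), "after")
```

On this toy data, the rules with antecedents {a, b} and {a, c} are dropped because the more general rule with antecedent {a} already reaches the same confidence.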

Index Terms:
Data mining, rule discovery, optimal rule set.
Citation:
Jiuyong Li, "On Optimal Rule Discovery," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 4, pp. 460-471, April 2006, doi:10.1109/TKDE.2006.65