Issue No. 06 - November/December (2003 vol. 15)
Sam Y. Sung , IEEE Computer Society
Peter A. Ng , IEEE
<p><b>Abstract</b>: An important issue when using <it>data mining tools</it> is whether the rules remain valid outside the data set from which they were generated. Rules are typically derived from the patterns in a particular data set. When a new situation occurs, the set of rules obtained from the new data set may change significantly. In this paper, we provide a novel model for understanding how the differences between two situations affect the changes in the rules, based on the concept of finely partitioned groups that we call <it>caucuses</it>. Using this model, we provide a simple technique, called <it>Combination Data Set</it>, to obtain a good estimate of the set of rules for a new situation. Our approach works independently of the core mining process and can be easily implemented with all variations of rule mining techniques. Through experiments with real-life and synthetic data sets, we show the effectiveness of our technique in finding the correct set of rules under different situations.</p>
Combination data set, data mining, extending association rule, fine partition, proportionate sampling.
S. Y. Sung, P. A. Ng, C. L. Tan and Z. Li, "Forecasting Association Rules Using Existing Data Sets," in IEEE Transactions on Knowledge & Data Engineering, vol. 15, no. 6, pp. 1448-1459, 2003.