Maui, HI, USA
Jan. 6, 2001 to Jan. 6, 2001
The mining association algorithm is one of the most important data mining algorithms for deriving association rules from huge databases at high speed. However, the algorithm tends to derive rules that contain noise such as stopwords, so some systems remove the noise with noise filters. We have been improving the algorithm and using it to develop navigation systems for semi-structured data, and we also use a dictionary to remove noise from the derived association rules. To derive effective rules, it is very important to determine system parameters appropriately, such as the threshold values of the minimum support and the minimum confidence. We have therefore adapted ROC analysis to the algorithm in our navigation systems and evaluated the performance of the derived rules. In this paper, we import parameters from ROC analysis into the algorithm and propose extended mining association algorithms. Moreover, we evaluate the performance of the proposed algorithms on an experimental database and show how they can derive effective association rules. We also show that the proposed algorithms can remove stopwords automatically from raw data.
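To make the quantities in the abstract concrete, the following minimal sketch computes support and confidence for a rule A -> B, together with two ROC-style quantities. It is an illustration only, not the authors' algorithm: the function name `rule_measures`, the toy transactions, and the mapping of a rule onto ROC terms (true positive rate as the share of transactions containing B that also contain A, false positive rate as the share of transactions without B that contain A) are all assumptions made for this example.

```python
from typing import Dict, Iterable, List, Set


def rule_measures(
    transactions: List[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Return support, confidence and ROC-style rates for antecedent -> consequent.

    Hypothetical helper for illustration; the ROC mapping is an assumption.
    """
    a = frozenset(antecedent)
    b = frozenset(consequent)
    n = len(transactions)

    # Confusion-matrix counts, treating "contains A" as the prediction
    # and "contains B" as the actual class.
    tp = sum(1 for t in transactions if a <= t and b <= t)
    fp = sum(1 for t in transactions if a <= t and not b <= t)
    fn = sum(1 for t in transactions if not a <= t and b <= t)
    tn = n - tp - fp - fn

    return {
        "support": tp / n if n else 0.0,
        "confidence": tp / (tp + fp) if (tp + fp) else 0.0,
        "tpr": tp / (tp + fn) if (tp + fn) else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Toy data (made up): "the" plays the role of a stopword that occurs everywhere.
transactions = [
    {"the", "data", "mining"},
    {"the", "data", "mining"},
    {"the", "data", "rules"},
    {"the", "web"},
    {"the", "web", "search"},
    {"the", "mining"},
]

informative = rule_measures(transactions, {"data"}, {"mining"})
stopword = rule_measures(transactions, {"the"}, {"mining"})

print("data -> mining:", informative)
print("the  -> mining:", stopword)
```

On this toy data the stopword rule has higher support than the informative rule, so a minimum-support threshold alone would keep it. Under the assumed mapping, however, its true positive rate equals its false positive rate, which places it on the ROC diagonal. This is one plausible reading of why a threshold based on ROC analysis could filter stopwords without a dictionary.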
M. Kawahara, H. Kawano, "Mining Association Algorithm with Threshold based on ROC Analysis", HICSS, 2001, Proceedings of Hawaii International Conference on System Sciences. HICSS-34, pp. 3010, doi:10.1109/HICSS.2001.926303