Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06) (2006)
Hong Kong, China
Dec. 18, 2006 to Dec. 22, 2006
Lei Zou, HuaZhong University of Science and Technology, Wuhan, 430074, P. R. China
Yansheng Lu, HuaZhong University of Science and Technology, Wuhan, 430074, P. R. China
Huaming Zhang, University of Alabama in Huntsville
Rong Hu, HuaZhong University of Science and Technology, Wuhan, 430074, P. R. China
Mining frequent induced subtree patterns is useful in domains such as XML databases and web log analysis. However, because of combinatorial explosion, mining all frequent subtree patterns becomes infeasible for a large and dense tree database. Too many frequent subtree patterns also confuse users, since usually only a small subset of the mining results is of interest to them. In this paper, we propose the problem of discovering frequent induced subtree patterns that are supertrees of a given pattern tree specified by the user, i.e., frequent induced subtree patterns with a subtree constraint. Most existing frequent subtree mining algorithms are based on right-most extension, which does not work well for this new problem, so we present free extension to replace it. To avoid the duplicate patterns that free extension can generate, we develop an efficient method that ensures no duplicate patterns appear during the mining process or in the results. We then give the Subtree-Constraint Frequent Subtree Patterns Mining algorithm (SCFS). Experimental results show that our algorithm achieves good performance.
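To make the problem statement concrete, the following is a minimal brute-force sketch of what "frequent induced subtree patterns with a subtree constraint" means. It is not the SCFS algorithm and does not implement free extension or duplicate elimination. It assumes rooted, ordered, labeled trees and a per-tree support count; the tuple representation and the helper names (`t`, `matches_at`, `contains`, `support`, `filter_patterns`) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed representation: a tree is (label, (child, child, ...)).
Tree = Tuple[str, tuple]

def t(label: str, *children: Tree) -> Tree:
    """Hypothetical helper to build a tree node."""
    return (label, tuple(children))

def matches_at(pattern: Tree, node: Tree) -> bool:
    """True if `pattern` occurs as an induced subtree rooted at `node`.

    Induced means parent-child edges are preserved. Sibling order is kept,
    but siblings in `node` may be skipped. Greedy left-to-right matching is
    sufficient for this order-preserving subsequence check.
    """
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    i = 0  # index of the next pattern child still to be matched
    for child in n_children:
        if i < len(p_children) and matches_at(p_children[i], child):
            i += 1
    return i == len(p_children)

def contains(tree: Tree, pattern: Tree) -> bool:
    """True if `pattern` is an induced subtree of `tree` at any node."""
    if matches_at(pattern, tree):
        return True
    return any(contains(child, pattern) for child in tree[1])

def support(database: List[Tree], pattern: Tree) -> int:
    """Number of database trees that contain `pattern` (assumed definition)."""
    return sum(1 for tree in database if contains(tree, pattern))

def satisfies_constraint(candidate: Tree, constraint: Tree) -> bool:
    """A candidate qualifies only if it is a supertree of the constraint."""
    return contains(candidate, constraint)

def filter_patterns(database, candidates, constraint, min_support):
    """Keep candidates that are frequent and are supertrees of `constraint`."""
    return [
        c for c in candidates
        if satisfies_constraint(c, constraint) and support(database, c) >= min_support
    ]

# Toy database of three trees (illustrative data, not from the paper).
db = [
    t("a", t("b"), t("c", t("d"))),
    t("a", t("b"), t("c")),
    t("a", t("c", t("d"))),
]
constraint = t("c", t("d"))          # user-specified pattern tree
candidates = [
    t("a", t("b"), t("c")),          # frequent, but not a supertree of the constraint
    t("a", t("c", t("d"))),          # frequent and a supertree of the constraint
    t("a", t("b"), t("c", t("d"))),  # supertree of the constraint, but infrequent
]
result = filter_patterns(db, candidates, constraint, min_support=2)
```

Filtering an already enumerated candidate list, as done here, is exactly the costly approach the abstract argues against: it requires generating all patterns first. The sketch only pins down which patterns count as answers, while the paper's contribution is growing such patterns directly from the constraint tree.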
Y. Lu, R. Hu, L. Zou and H. Zhang, "Mining Frequent Induced Subtree Patterns with Subtree-Constraint," Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06), Hong Kong, China, 2006, pp. 3-7.