Mining frequent trees is useful in domains such as bioinformatics, Web mining, and semistructured data mining. We formulate the problem of mining (embedded) subtrees in a forest of rooted, labeled, and ordered trees. We present TreeMiner, a novel algorithm to discover all frequent subtrees in a forest, using a new data structure called a scope-list. We contrast TreeMiner with a pattern-matching tree mining algorithm (PatternMatcher), and we also compare it with TreeMinerD, which counts only distinct occurrences of a pattern. We conduct detailed experiments to test the performance and scalability of these methods. We also use tree mining to analyze RNA structure and phylogenetics data sets from the bioinformatics domain.
Index Terms: Frequent tree mining, rooted, ordered, labeled trees, subtree enumeration, pattern matching, RNA structure, phylogenetic trees, data mining.
Mohammed J. Zaki, "Efficiently Mining Frequent Trees in a Forest: Algorithms and Applications", IEEE Transactions on Knowledge & Data Engineering, vol. 17, no. , pp. 1021-1035, August 2005, doi:10.1109/TKDE.2005.125
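
To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of what it means for a pattern to occur as an embedded subtree of a rooted, labeled, ordered tree, and how its support in a forest could be counted. It is not TreeMiner, PatternMatcher, or TreeMinerD, and it does not use scope-lists. The tuple representation `(label, [children])`, the function names, and the choice to define support as the number of trees containing at least one occurrence are all assumptions made for illustration. An embedding here maps each pattern edge to an ancestor-descendant pair in the tree and keeps the left-to-right order of siblings.

```python
def flatten(tree):
    """Preorder list of [label, scope_end] for a (label, [children]) tree.

    scope_end is the preorder index of the node's rightmost descendant
    (or of the node itself if it is a leaf).
    """
    nodes = []

    def visit(node):
        label, children = node
        idx = len(nodes)
        nodes.append([label, idx])
        for child in children:
            visit(child)
        nodes[idx][1] = len(nodes) - 1

    visit(tree)
    return nodes


def embeds_at(pattern, nodes, t):
    """True if `pattern` can be embedded with its root mapped to tree node t."""
    label, children = pattern
    if nodes[t][0] != label:
        return False
    # Children must map to proper descendants of t: preorder indices t+1..scope_end.
    return place_children(children, 0, nodes, t + 1, nodes[t][1])


def place_children(children, i, nodes, lo, hi):
    """Place pattern children i.. in order among tree nodes lo..hi (backtracking)."""
    if i == len(children):
        return True
    for d in range(lo, hi + 1):
        # The next sibling must start after the whole subtree rooted at d,
        # which keeps sibling order and keeps the mapped subtrees disjoint.
        if embeds_at(children[i], nodes, d) and place_children(
            children, i + 1, nodes, nodes[d][1] + 1, hi
        ):
            return True
    return False


def contains(pattern, tree):
    """True if `pattern` is an embedded subtree of `tree` (any root position)."""
    nodes = flatten(tree)
    return any(embeds_at(pattern, nodes, t) for t in range(len(nodes)))


def support(pattern, forest):
    """Number of trees in `forest` with at least one occurrence (assumed definition)."""
    return sum(contains(pattern, tree) for tree in forest)


# Hypothetical toy forest, not data from the paper.
t1 = ("A", [("B", [("C", []), ("D", [])]), ("E", [])])
t2 = ("A", [("C", [])])
t3 = ("B", [("D", [])])
forest = [t1, t2, t3]
```

In this toy forest, the pattern `("A", [("C", []), ("E", [])])` occurs in `t1` even though `C` is a grandchild of `A` there, which is what distinguishes embedded subtrees from induced ones; swapping the two children gives a pattern that does not occur, because sibling order matters. The search is exponential in the worst case and is meant only to illustrate the definition.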