Third IEEE International Conference on Data Mining (ICDM'03)
MaPle: A Fast Algorithm for Maximal Pattern-based Clustering
Melbourne, Florida
November 19-November 22
ISBN: 0-7695-1978-4
Pattern-based clustering is important in many applications, such as DNA micro-array data analysis, automatic recommendation systems, and target marketing systems. However, pattern-based clustering in large databases is challenging. On the one hand, there can be a huge number of clusters, and many of them can be redundant, which makes pattern-based clustering ineffective. On the other hand, the previously proposed methods may not be efficient or scalable in mining large databases. In this paper, we study the problem of maximal pattern-based clustering. Redundant clusters are avoided completely by mining only the maximal pattern-based clusters. MaPle, an efficient and scalable mining algorithm, is developed. It conducts a depth-first, divide-and-conquer search and prunes unnecessary branches smartly. Our extensive performance study on both synthetic and real data sets shows that maximal pattern-based clustering is effective: it reduces the number of clusters substantially. Moreover, MaPle is more efficient and scalable than the previously proposed pattern-based clustering methods in mining large databases.
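The abstract does not define a pattern-based cluster or spell out what "maximal" means, so the following is only an illustrative sketch, not the MaPle algorithm. It assumes the δ-pCluster definition commonly used in the pattern-based clustering literature: a set of objects and a set of attributes form a cluster if, for every pair of objects and every pair of attributes, the difference between the two objects' attribute differences is at most δ. A cluster is treated as maximal if no other cluster contains both its objects and its attributes. The function names `is_delta_pcluster` and `keep_maximal`, the toy data, and the brute-force checks are all assumptions made for illustration; MaPle itself uses a depth-first, divide-and-conquer search with pruning rather than exhaustive checks.

```python
from itertools import combinations


def is_delta_pcluster(data, objects, attrs, delta):
    """Return True if every 2x2 submatrix of (objects, attrs) has pScore <= delta.

    Assumed definition: for objects x, y and attributes a, b,
    pScore = |(d[x][a] - d[x][b]) - (d[y][a] - d[y][b])|.
    """
    for x, y in combinations(objects, 2):
        for a, b in combinations(attrs, 2):
            p_score = abs((data[x][a] - data[x][b]) - (data[y][a] - data[y][b]))
            if p_score > delta:
                return False
    return True


def keep_maximal(clusters):
    """Drop every cluster strictly contained in another cluster.

    Each cluster is a pair (frozenset of objects, frozenset of attributes).
    """
    maximal = []
    for i, (objs, attrs) in enumerate(clusters):
        dominated = any(
            j != i
            and objs <= other_objs
            and attrs <= other_attrs
            and (objs, attrs) != (other_objs, other_attrs)
            for j, (other_objs, other_attrs) in enumerate(clusters)
        )
        if not dominated:
            maximal.append((objs, attrs))
    return maximal


# Toy data: rows are objects, columns are attributes.
data = [
    [1, 2, 4],
    [3, 4, 6],   # row 0 shifted by a constant, so the pattern is identical
    [10, 2, 5],  # follows a similar pattern only on attributes 1 and 2
]

clusters = [
    (frozenset({0, 1}), frozenset({0, 1, 2})),
    (frozenset({0, 1}), frozenset({0, 1})),      # redundant: contained in the first
    (frozenset({0, 1, 2}), frozenset({1, 2})),
]
maximal_clusters = keep_maximal(clusters)
```

In this toy example, the second cluster is a sub-cluster of the first and is dropped, which is the kind of redundancy that mining only maximal clusters is meant to avoid.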
Citation:
Jian Pei, Xiaoling Zhang, Moonjung Cho, Haixun Wang, Philip S. Yu, "MaPle: A Fast Algorithm for Maximal Pattern-based Clustering," icdm, pp.259, Third IEEE International Conference on Data Mining (ICDM'03), 2003