Issue No. 06 - June (2015 vol. 27)
Shixia Liu, School of Software, Tsinghua University, Beijing, China
Xiting Wang, Tsinghua University
Yangqiu Song, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL
Baining Guo, Microsoft Research Asia and Tsinghua University, Beijing, China
We present an evolutionary multi-branch tree clustering method to model hierarchical topics and their evolutionary patterns over time. The method builds evolutionary trees in a Bayesian online filtering framework. Tree construction is formulated as an online posterior estimation problem that balances the fitness of the current tree against the smoothness between trees. The state-of-the-art multi-branch clustering method, Bayesian rose trees, is employed to generate a topic tree with a high fitness value. A constraint model is also introduced to preserve the smoothness between trees. A comprehensive set of experiments on real-world news data demonstrates that the proposed method better incorporates historical tree information and is more efficient and effective than the traditional evolutionary hierarchical clustering algorithm. In contrast to our previous method, we implement two additional baseline algorithms and compare them with our algorithm. We also evaluate the performance of the clustering algorithm based on multiple constraint trees. Furthermore, two case studies are conducted to demonstrate the effectiveness and usefulness of our algorithm in helping users understand the major hierarchical topic evolutionary patterns in text data.
Fans, Clustering algorithms, Bayes methods, Merging, Vegetation, Binary trees, Electronic mail
S. Liu, X. Wang, Y. Song and B. Guo, "Evolutionary Bayesian Rose Trees," in IEEE Transactions on Knowledge & Data Engineering, vol. 27, no. 6, pp. 1533-1546, 2015.