Parallel Architectures, Algorithms and Programming, International Symposium on (2010)
Dalian, Liaoning, China
Dec. 18, 2010 to Dec. 20, 2010
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PAAP.2010.38
This paper proposes SMHP, an algorithm that aims to improve the efficiency of topic detection. SMHP uses a T-MI-TFIDF model, which introduces mutual information (MI) and increases the weight of terms that appear in the title. A vector space model (VSM) is then constructed from the term weights, and its dimensionality is reduced by combining H-TOPN (harmonic average Top-N) and PCA. Topics are then grouped using SMHP. Experimental results show that the proposed methods are better suited to clustering topics. With these approaches, SMHP effectively addresses the problem of relationships among multiple stories and improves the accuracy of the clustering results.
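The abstract does not give the T-MI-TFIDF or H-TOPN formulas, so the following is only a minimal sketch of the general idea of boosting title terms in a TF-IDF weighting and then keeping the highest-weighted terms. The function names, the `title_boost` parameter, the IDF variant, and the plain top-N cut are all assumptions made for illustration; the mutual information term, the harmonic average, PCA, and the hypergraph partition are not reproduced here.

```python
import math
from collections import Counter


def title_boosted_tfidf(docs, title_boost=2.0):
    """Weight terms by TF-IDF, counting title terms extra.

    docs: list of (title_tokens, body_tokens) pairs.
    title_boost: hypothetical extra count added for each title occurrence.
    Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for title, body in docs:
        df.update(set(title) | set(body))

    vectors = []
    for title, body in docs:
        tf = Counter(body)
        for term in title:
            tf[term] += title_boost  # assumed form of the title emphasis
        total = sum(tf.values())
        # Assumed IDF variant: log(N / df) + 1, so shared terms keep some weight.
        vectors.append({
            term: (count / total) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def top_n(vector, n):
    """Keep only the n highest-weighted terms (a plain top-N cut)."""
    ranked = sorted(vector.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n])


docs = [
    (["quake"], ["quake", "hits", "city"]),
    (["election"], ["city", "votes", "election"]),
]
vectors = title_boosted_tfidf(docs)
reduced = [top_n(v, 2) for v in vectors]
```

In this toy input, the title term of each story ends up with the largest weight, and the term shared by both stories is the first to be dropped by the top-N cut.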
H. Shen, X. Liu, F. Ma and H. Lin, "Hypergraph Partition with Harmonic Average Top-N and PCA for Topic Detection," Parallel Architectures, Algorithms and Programming, International Symposium on (PAAP), Dalian, Liaoning, China, 2010, pp. 269-276.