Sixth IEEE International Conference on Data Mining (ICDM'06)
Diverse Topic Phrase Extraction through Latent Semantic Analysis
Hong Kong
December 18-22, 2006
ISBN: 0-7695-2701-9
Jilin Chen, University of Minnesota, USA
Jun Yan, Microsoft Research Asia, China
Benyu Zhang, Microsoft Research Asia, China
Qiang Yang, Hong Kong University of Science and Technology, Hong Kong
Zheng Chen, Microsoft Research Asia, China
We propose a novel algorithm for extracting diverse topic phrases to summarize large corpora. Previous work often ignores diversity and thus extracts phrases that crowd around a few hot topics while failing to cover less obvious but important ones. We address this problem through document re-weighting and phrase diversification based on latent semantic analysis (LSA). Experiments on various datasets show that the new algorithm can improve both the relevance and the topical diversity of the extracted phrases.
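The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a small phrase-by-document count matrix, projects phrases into a latent space with a truncated SVD (the core of LSA), and then picks phrases greedily with a maximal-marginal-relevance-style score that stands in for the paper's phrase diversification. The document re-weighting step is omitted. All names (`lsa_phrase_vectors`, `select_diverse`, `trade_off`) and the toy data are hypothetical.

```python
import numpy as np


def lsa_phrase_vectors(counts, k):
    """Project phrases (rows of counts) into a k-dimensional latent space."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    # Keep the k largest singular directions, scaled by their singular values.
    return u[:, :k] * s[:k]


def select_diverse(vectors, n, trade_off=0.5):
    """Greedy MMR-style selection (an assumption, not the paper's procedure).

    Relevance is the normalized length of a phrase's latent vector;
    redundancy is its highest cosine similarity to any phrase already chosen.
    """
    norms = np.linalg.norm(vectors, axis=1)
    relevance = norms / norms.max()
    unit = vectors / np.where(norms > 0, norms, 1.0)[:, None]

    selected = [int(np.argmax(relevance))]
    while len(selected) < min(n, len(vectors)):
        redundancy = (unit @ unit[selected].T).max(axis=1)
        score = trade_off * relevance - (1.0 - trade_off) * redundancy
        score[selected] = -np.inf  # never pick the same phrase twice
        selected.append(int(np.argmax(score)))
    return selected


# Toy corpus: four machine-learning documents and two finance documents.
phrases = ["neural network", "deep learning", "gradient descent", "stock market"]
counts = np.array([
    [3, 2, 3, 2, 0, 0],
    [2, 3, 2, 3, 0, 0],
    [2, 2, 1, 2, 0, 0],
    [0, 0, 0, 0, 2, 1],
], dtype=float)

vectors = lsa_phrase_vectors(counts, k=2)
by_relevance = [phrases[i] for i in select_diverse(vectors, 2, trade_off=1.0)]
diversified = [phrases[i] for i in select_diverse(vectors, 2, trade_off=0.5)]
print("relevance only:", by_relevance)
print("diversified:   ", diversified)
```

On this toy matrix, ranking by relevance alone returns two phrases from the dominant machine-learning topic, while the diversified selection also covers the smaller finance topic. This is the crowding effect the abstract describes, shown at miniature scale.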
Citation:
Jilin Chen, Jun Yan, Benyu Zhang, Qiang Yang, Zheng Chen, "Diverse Topic Phrase Extraction through Latent Semantic Analysis," Sixth IEEE International Conference on Data Mining (ICDM'06), pp. 834-838, 2006