2008 Eighth IEEE International Conference on Data Mining
Collective Latent Dirichlet Allocation
December 15-19, 2008
ISBN: 978-0-7695-3502-9
In this paper, we propose a new variant of Latent Dirichlet Allocation (LDA) for modeling multiple corpora: Collective LDA (C-LDA). C-LDA combines multiple corpora during learning so that it can transfer knowledge from one corpus to another; at the same time, it keeps a discriminative node, representing the corpus ID, to constrain the topics learned in each corpus. Compared with LDA applied locally to the target corpus, C-LDA yields a refined topic-word distribution. Compared with LDA applied globally and directly to the combined corpus, C-LDA keeps each topic specific to a single corpus. We demonstrate through experiments on several benchmark document data sets that these advantages give C-LDA improved performance.
Index Terms:
collective LDA
Citation:
Zhi-Yong Shen, Jun Sun, Yi-Dong Shen, "Collective Latent Dirichlet Allocation," 2008 Eighth IEEE International Conference on Data Mining (ICDM), pp. 1019-1024, 2008