2011 IEEE 11th International Conference on Data Mining
Learning Dirichlet Processes from Partially Observed Groups
Vancouver, Canada
December 11-December 14
ISBN: 978-0-7695-4408-3
Motivated by the task of vernacular news analysis using known news topics from national newspapers, we study the task of topic analysis: given source datasets with observed topics, data items from a target dataset must be assigned either to observed source topics or to new ones. Using Hierarchical Dirichlet Processes for this task imposes unnecessary and often inappropriate generative assumptions on the observed source topics. In this paper, we explore Dirichlet Processes with partially observed groups (POG-DP). POG-DP avoids modeling the given source topics. Instead, it directly models the conditional distribution of the target data as a mixture of a Dirichlet Process and the posterior distribution of a Hierarchical Dirichlet Process with known groups and topics. This introduces coupling between the selection probabilities of all topics within a source, leading to effective identification of source topics. We further improve on this with a Combinatorial Dirichlet Process with partially observed groups (POG-CDP), which captures finer-grained coupling between related topics by choosing intersections between sources. We evaluate our models in three different real-world applications. Through extensive experiments, we compare against several baselines and show that our model performs significantly better in all three applications.
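As a rough illustration of the "known topic or new topic" choice described above, the sketch below computes the standard Chinese restaurant process prior for a plain Dirichlet Process. It is not the POG-DP or POG-CDP model from the paper: it ignores the data likelihood and the coupling between topics of the same source. The function names, the topic labels, and the value of `alpha` are all hypothetical and chosen only for illustration.

```python
import random


def crp_assignment_probs(topic_counts, alpha):
    """Prior probability of joining each known topic or opening a new one.

    topic_counts: dict mapping a known topic name to the number of items
                  already assigned to it (hypothetical example data).
    alpha:        concentration parameter (> 0); the mass reserved for a
                  previously unseen topic.
    """
    total = sum(topic_counts.values()) + alpha
    probs = {topic: count / total for topic, count in topic_counts.items()}
    probs["<new topic>"] = alpha / total
    return probs


def sample_assignment(probs, rng):
    """Draw one topic label according to the given probabilities."""
    topics = list(probs)
    weights = [probs[t] for t in topics]
    return rng.choices(topics, weights=weights, k=1)[0]


# Hypothetical source topics with their item counts.
counts = {"elections": 6, "cricket": 3}
probs = crp_assignment_probs(counts, alpha=1.0)
choice = sample_assignment(probs, random.Random(0))
```

With these counts, a new item is drawn toward a known topic in proportion to that topic's size, while `alpha` controls how readily a new topic is opened. A full mixture model would additionally weight each option by how well the topic explains the item.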
Index Terms:
topic analysis, grouped data, partial observations, Dirichlet Process
Citation:
Avinava Dubey, Indrajit Bhattacharya, Mrinal Das, Tanveer Faruquie, Chiranjib Bhattacharyya, "Learning Dirichlet Processes from Partially Observed Groups," 2011 IEEE 11th International Conference on Data Mining (ICDM), pp. 141-150, 2011