2010 IEEE International Conference on Data Mining
Topic Modeling Ensembles
Sydney, Australia
December 13-December 17
ISBN: 978-0-7695-4256-0
BibTeX:
@article{10.1109/ICDM.2010.113,
  author    = {Zhiyong Shen and Ping Luo and Shengwen Yang and Xukun Shen},
  title     = {Topic Modeling Ensembles},
  journal   = {Data Mining, IEEE International Conference on},
  volume    = {0},
  year      = {2010},
  issn      = {1550-4786},
  pages     = {1031-1036},
  doi       = {http://doi.ieeecomputersociety.org/10.1109/ICDM.2010.113},
  publisher = {IEEE Computer Society},
  address   = {Los Alamitos, CA, USA},
}
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2010.113
In this paper we propose a framework of topic modeling ensembles, a novel way to combine the models that topic modeling learns over each partition of a corpus. It has potential applications such as distributed topic modeling for large corpora and incremental topic modeling for rapidly growing corpora. Since the ensemble requires only the base models, not the original documents, all these applications can be performed in a privacy-preserving manner. We explore the theoretical foundation of the proposed framework, give its geometric interpretation, and implement it for both PLSA and LDA. Evaluation of the implementations on synthetic and real-life data sets shows that the proposed framework is much more efficient than modeling the original corpus directly while achieving comparable effectiveness in terms of perplexity and classification accuracy.
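The abstract names perplexity as one of its effectiveness measures but does not spell out the computation. As a minimal sketch of the standard definition (the exponential of the negative average per-token log-likelihood), assuming a model is available as a document-topic matrix and a topic-word matrix, it could be computed as follows. The function name `perplexity` and the argument names are hypothetical and are not taken from the paper.

```python
import math


def perplexity(counts, doc_topic, topic_word):
    """Perplexity of a topic model on a bag-of-words corpus (illustrative sketch).

    counts[d][w]     : number of times word w occurs in document d
    doc_topic[d][k]  : P(topic k | document d), each row sums to 1
    topic_word[k][w] : P(word w | topic k), each row sums to 1

    Assumes the corpus has at least one token and that every observed
    word has non-zero probability under the model (otherwise math.log
    raises ValueError).
    """
    num_topics = len(topic_word)
    total_log_lik = 0.0
    total_tokens = 0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n == 0:
                continue  # unobserved words contribute nothing
            # P(w | d) is a mixture over topics.
            p = sum(doc_topic[d][k] * topic_word[k][w] for k in range(num_topics))
            total_log_lik += n * math.log(p)
            total_tokens += n
    return math.exp(-total_log_lik / total_tokens)


# A single-topic model that is uniform over 4 words is as uncertain as a
# fair 4-way choice, so its perplexity is approximately 4.
example = perplexity(
    counts=[[1, 2, 0, 3]],
    doc_topic=[[1.0]],
    topic_word=[[0.25, 0.25, 0.25, 0.25]],
)
```

Lower perplexity means the model assigns higher probability to the evaluated documents, so "comparable effectiveness" here would correspond to the ensemble and the directly trained model reaching similar values on the same evaluation documents.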
Index Terms:
Topic model, Ensemble
Citation:
Zhiyong Shen, Ping Luo, Shengwen Yang, Xukun Shen, "Topic Modeling Ensembles," icdm, pp.1031-1036, 2010 IEEE International Conference on Data Mining, 2010
