May 10, 2006 to May 12, 2006
Yuichiro Sekiguchi, NTT Corporation, Japan
Harumi Kawashima, NTT Corporation, Japan
Hidenori Okuda, NTT Corporation, Japan
Masahiro Oku, NTT Corporation, Japan
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MDM.2006.153
In this paper, we describe a method to detect topic words from blog documents. We define "topic words" as words frequently used by people who share the same interests. In this method, each blogger's interests are extracted from each blog site, and interest similarities between bloggers are calculated. Unusual words that are used by bloggers who have a high level of similarity are then extracted as topic words. We evaluated the precision of this method using blog documents, and the results show that the proposed method is superior (by 4.4%) to the traditional TF-IDF method in terms of precision.
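The three steps named in the abstract (extract each blogger's interests, compute interest similarity between bloggers, then pick unusual words shared by similar bloggers) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: interests are modeled as raw term counts, similarity as cosine similarity, "unusual" as a log-based rarity weight over bloggers, and the names `interest_vector`, `cosine`, `topic_words`, `min_similarity`, and `top_n` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def interest_vector(posts):
    """Assumed interest profile: term counts over all of a blogger's posts."""
    return Counter(word for post in posts for word in post.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count profiles."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topic_words(blogs, min_similarity=0.3, top_n=5):
    """Score words shared by similar bloggers, weighted by rarity.

    blogs maps a blogger name to a list of post strings. The threshold and
    the scoring formula are illustrative assumptions, not the paper's.
    """
    profiles = {name: interest_vector(posts) for name, posts in blogs.items()}
    n_bloggers = len(profiles)
    # Number of bloggers who use each word at least once.
    blogger_freq = Counter(w for profile in profiles.values() for w in profile)

    scores = Counter()
    for (_, pa), (_, pb) in combinations(profiles.items(), 2):
        sim = cosine(pa, pb)
        if sim < min_similarity:
            continue  # only pairs with similar interests contribute
        for w in pa.keys() & pb.keys():
            # Words used by every blogger get weight log(1) = 0.
            scores[w] += sim * math.log(n_bloggers / blogger_freq[w])
    return [w for w, s in scores.most_common(top_n) if s > 0]
```

In this sketch a word used by every blogger receives zero weight, so it is never returned, while a word shared only within a group of similar bloggers accumulates a positive score.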
Yuichiro Sekiguchi, Harumi Kawashima, Hidenori Okuda, Masahiro Oku, "Topic Detection from Blog Documents Using Users' Interests", 7th International Conference on Mobile Data Management (MDM), 2006, pp. 108, doi:10.1109/MDM.2006.153