Web Intelligence and Intelligent Agent Technology, IEEE/WIC/ACM International Conference on (2013)
Atlanta, GA, USA
Nov. 17, 2013 to Nov. 20, 2013
A hash tag is a word or phrase prefixed with the symbol "#". Hash tags are widely used on social media sites such as Twitter and Google+, where they serve as meta tags for categorizing users' messages and for propagating ideas and topic trends. Their use has become an integral part of social media culture. However, the free-form nature and varied contexts of hash tags raise a challenge: how can we understand hash tags and discover their relationships? In this paper, we propose Tag-Latent Dirichlet Allocation (TLDA), a new topic modeling approach that bridges hash tags and topics. TLDA extends Latent Dirichlet Allocation by incorporating the observed hash tags into the generative process. In TLDA, each hash tag is represented as a mixture of shared topics, and this representation in turn enables analysis of the relationships between hash tags. Applying our model to tweet data, we first illustrate its ability to explain hard-to-understand hash tags in terms of topics. We then demonstrate that it enables users to further analyze the relationships between hash tags.
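To make the idea of relating hash tags through their topic mixtures concrete, here is a minimal sketch in Python. It is not the TLDA model and it is not the authors' method: the three hash tags, their topic weights, the choice of cosine similarity, and the helper names `cosine_similarity` and `most_similar` are all assumptions made for illustration. It only shows how, once each hash tag is expressed as a mixture over the same shared topics, two tags can be compared directly.

```python
import math

def cosine_similarity(p, q):
    """Cosine similarity between two topic-mixture vectors of equal length."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

# Hypothetical mixtures over three shared topics (illustrative values only,
# not results from the paper). Each vector sums to 1.
tag_topics = {
    "#nba": [0.8, 0.1, 0.1],
    "#playoffs": [0.7, 0.2, 0.1],
    "#election": [0.1, 0.1, 0.8],
}

def most_similar(tag, mixtures):
    """Return (other_tag, similarity) for the tag closest to `tag`."""
    scored = [
        (other, cosine_similarity(mixtures[tag], vec))
        for other, vec in mixtures.items()
        if other != tag
    ]
    return max(scored, key=lambda pair: pair[1])

if __name__ == "__main__":
    print(most_similar("#nba", tag_topics))
```

Cosine similarity is used here only because it is simple; a divergence defined on probability distributions could be substituted without changing the overall shape of the comparison.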
Twitter analysis, topic model, hashtag
Z. Ma, W. Dou, X. Wang and S. Akella, "Tag-Latent Dirichlet Allocation: Understanding Hashtags and Their Relationships," 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT)(WI-IAT), Atlanta, GA, USA, 2013, pp. 260-267.