2013 IEEE 29th International Conference on Data Engineering (ICDE) (2013)
Brisbane, Australia
Apr. 8, 2013 to Apr. 12, 2013
ISSN: 1063-6382
ISBN: 978-1-4673-4909-3
pp: 1308-1311
C. P. Sayers, Hewlett-Packard Labs., Palo Alto, CA, USA
Meichun Hsu, Hewlett-Packard Labs., Palo Alto, CA, USA
ABSTRACT
To enable the interactive exploration of large social media datasets, we exploit the temporal distributions of word n-grams within the message stream to discover “interesting” concepts, determine “relatedness” between concepts, and find representative examples for display. We present a new algorithm for context-dependent “interestingness” using the coefficient of variation of the temporal distribution, apply the well-known technique of Pearson's Correlation to tweets using equi-height histogramming to determine correlation, and employ an asymmetric variant for computing “relatedness” to encourage exploration. We further introduce techniques using interestingness, correlation, and relatedness to automatically discover concepts and select preferred word n-grams for display. These techniques are demonstrated on an 800,000-tweet dataset from the Academy Awards.
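The abstract names two standard statistics without giving their exact formulation, so the following is only a minimal sketch of how they might be computed over a term's temporal histogram. It assumes message timestamps are plain numbers, builds equi-height bin edges from the whole stream (each bin holding roughly the same number of messages), and then scores a term by the coefficient of variation of its per-bin counts and compares two terms with Pearson's correlation. All function names are hypothetical, and the paper's context-dependent scoring and asymmetric relatedness variant are not reproduced here.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def equi_height_edges(all_timestamps, bins):
    """Bin edges chosen so each bin holds roughly the same number of messages."""
    ts = sorted(all_timestamps)
    n = len(ts)
    return [ts[(i * n) // bins] for i in range(bins)] + [ts[-1]]


def histogram(term_timestamps, edges):
    """Count a term's occurrences in each bin defined by the given edges."""
    bins = len(edges) - 1
    counts = [0] * bins
    for t in term_timestamps:
        # Clamp so the final edge falls into the last bin.
        idx = min(max(bisect_right(edges, t) - 1, 0), bins - 1)
        counts[idx] += 1
    return counts


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(counts)
    return pstdev(counts) / m if m else 0.0


def pearson(xs, ys):
    """Pearson's correlation; returns 0.0 when either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0


# Toy stream: 12 messages at times 0..11, split into 3 equi-height bins.
edges = equi_height_edges(range(12), 3)
bursty = histogram([0, 1, 2, 3], edges)   # term concentrated in one bin
steady = histogram([0, 4, 8], edges)      # term spread evenly over time
```

In this toy stream the bursty term has a high coefficient of variation and the steady term has a coefficient of zero, which matches the intuition that a term whose usage spikes at particular moments is a better candidate for an "interesting" concept than one used at a constant rate.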
INDEX TERMS
Correlation, Awards activities, Histograms, Media, Visualization, Context, Twitter
CITATION

C. P. Sayers and Meichun Hsu, "Extracting interesting related context-dependent concepts from social media streams using temporal distributions," 2013 29th IEEE International Conference on Data Engineering (ICDE 2013), Brisbane, QLD, 2013, pp. 1308-1311.
doi:10.1109/ICDE.2013.6544931