From the May 2015 issue

On Summarization and Timeline Generation for Evolutionary Tweet Streams

By Zhenhua Wang, Lidan Shou, Ke Chen, Gang Chen, and Sharad Mehrotra

Featured Article

Short-text messages such as tweets are being created and shared at an unprecedented rate. Tweets in their raw form are informative but can also be overwhelming. For both end users and data analysts, it is a nightmare to plow through millions of tweets that contain an enormous amount of noise and redundancy. In this paper, we propose a novel continuous summarization framework called Sumblr to alleviate the problem. In contrast to traditional document summarization methods, which focus on static and small-scale data sets, Sumblr is designed to deal with dynamic, fast-arriving, and large-scale tweet streams. Our proposed framework consists of three major components. First, we propose an online tweet stream clustering algorithm to cluster tweets and maintain distilled statistics in a data structure called the tweet cluster vector (TCV). Second, we develop a TCV-Rank summarization technique for generating online summaries and historical summaries of arbitrary time durations. Third, we design an effective topic evolution detection method, which monitors summary-based and volume-based variations to produce timelines automatically from tweet streams. Our experiments on large-scale real tweets demonstrate the efficiency and effectiveness of our framework.
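To make the first component more concrete, the following is a minimal sketch of online stream clustering with per-cluster summary statistics. It is not the authors' algorithm: the abstract does not specify what a TCV stores or how tweets are assigned to clusters. The `ClusterSummary` class, its fields (summed term counts, tweet count, timestamp sum), the `cluster_stream` function, the whitespace tokenizer, and the cosine-similarity threshold of 0.3 are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ClusterSummary:
    """Hypothetical stand-in for a tweet cluster vector (TCV).

    The fields are illustrative assumptions, not the paper's definition.
    """
    term_sum: Counter = field(default_factory=Counter)  # summed term counts
    count: int = 0                                      # tweets absorbed so far
    time_sum: float = 0.0                               # sum of tweet timestamps

    def add(self, terms: Counter, timestamp: float) -> None:
        # Update the statistics incrementally; the raw tweet is not stored.
        self.term_sum.update(terms)
        self.count += 1
        self.time_sum += timestamp

    def mean_time(self) -> float:
        return self.time_sum / self.count


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_stream(tweets, threshold=0.3):
    """Assign each (timestamp, text) pair to its most similar cluster.

    A new cluster is opened when no existing cluster reaches the
    (assumed) similarity threshold.
    """
    clusters = []
    for timestamp, text in tweets:
        terms = Counter(text.lower().split())  # naive whitespace tokenizer
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(terms, cluster.term_sum)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is None or best_sim < threshold:
            best = ClusterSummary()
            clusters.append(best)
        best.add(terms, timestamp)
    return clusters


stream = [
    (1.0, "earthquake hits city"),
    (2.0, "earthquake hits the city center"),
    (3.0, "new phone released today"),
]
clusters = cluster_stream(stream)
for c in clusters:
    print(c.count, c.mean_time(), c.term_sum.most_common(3))
```

Because each cluster keeps only aggregate statistics, memory grows with the number of clusters rather than the number of tweets, which is the general motivation for maintaining distilled summaries over a fast-arriving stream.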

  • TKDE celebrates its 25th Anniversary. Editor-in-Chief Jian Pei says, "We are celebrating the 25th Anniversary of TKDE. Since its first issue in March 1989, TKDE has published 2,981 articles, and another 220 articles in the early access portal. With 898 submissions and 79 accepted articles in 2012, TKDE is now the premier journal in the broad and general fields of data management, data mining, and knowledge engineering. We thank all the authors, reviewers, and readers for their continuing support to TKDE. As always, we are eager to hear your ideas and suggestions, and will do our best to meet your expectations. With all your passions, contributions, and supports, TKDE is embracing the new era of big data and big data analytics. Happy birthday to TKDE!"


