International Asia-Pacific Web Conference (2010)
Busan, Korea
Apr. 6, 2010 to Apr. 8, 2010
ISBN: 978-0-7695-4012-2
pp: 372-374
Social media such as weblogs and Internet forums generate rich historical textual datasets that record many valuable events. Automatic event detection aims to discover important and interesting events together with their related documents. Existing event detection solutions, however, were mostly proposed for high-quality news stories and may not work well on noisy social media datasets, where content quality varies drastically from informative to trivial or even spam. This paper proposes an event detection framework that directly uses the burst property of events to filter out noise. Experimental results on a real dataset from the Tencent Internet forum, a popular forum in China, demonstrate the effectiveness of the proposed framework.
Event detection, burst property, noisy textual dataset

