2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS) (2017)
Atlanta, Georgia, USA
June 5, 2017 to June 8, 2017
We study the problem of using Social Media to detect natural disasters, of which we are interested in a special kind, namely landslides. Employing information from Social Media presents unique research challenges, as there exists a considerable amount of noise due to multiple meanings of the search keywords, such as "landslide" and "mudslide". To tackle these challenges, we propose REX, a rapid ensemble classification system which can filter out noisy information by implementing two key ideas: (I) a new method for constructing independent classifiers that can be used for rapid ensemble classification of Social Media texts, where each classifier is built using randomized Explicit Semantic Analysis; and (II) a self-correction approach which takes advantage of the observation that the majority label assigned to Social Media texts belonging to a large event is highly accurate. We perform experiments using real data from Twitter over 1.5 years to show that REX classification achieves 0.98 in F-measure, which outperforms the standard Bag-of-Words algorithm by an average of 0.14 and the state-of-the-art Word2Vec algorithm by 0.04. We also release the annotated datasets used in the experiments as a contribution to the research community containing 282k labeled items.
Terrain factors, Semantics, Twitter, Encyclopedias, Electronic publishing
A. Musaev, D. Wang, J. Xie and C. Pu, "REX: Rapid Ensemble Classification System for Landslide Detection Using Social Media," 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), Atlanta, Georgia, USA, 2017, pp. 1240-1249.