2016 International Conference on Big Data and Smart Computing (BigComp) (2016)
Hong Kong, China
Jan. 18, 2016 to Jan. 20, 2016
ISSN: 2375-9356
ISBN: 978-1-4673-8795-8
pp: 129-136
Kumar TK Ashwin, Department of Computer Science, Oklahoma State University, Stillwater, USA
Prashanth Kammarpally, Department of Computer Science, Oklahoma State University, Stillwater, USA
KM George, Department of Computer Science, Oklahoma State University, Stillwater, USA
ABSTRACT
Twitter is a powerful real-time micro-blogging service and a platform where users communicate with each other instantaneously. Thus, tweets form an integral part of the big data ecosystem. While this platform serves as an efficient information diffusion medium, it can also be used to spread misinformation, intentionally or unintentionally, which can damage the reputation of an individual or a corporation. Misinformation can also be harmful to society in general. As veracity in big data gains more attention, it is important to develop methods to estimate the veracity of tweets. There are no definitive measures to determine the veracity of tweets from the tweets themselves, and other information required to verify tweets may not be readily available. Hence, there is a need for mechanisms that determine the level of accuracy of tweets from available data. In this paper we propose three quantitative measures, which we name topic diffusion, geographic dispersion, and spam index, as indicators of the veracity of tweets. These measures are derived from the tweets themselves, independent of any corroborating data. The proposed measures are tested on tweets about oil companies. To validate the measures, information extracted from the tweets is compared with information collected from official data sources. Our experiments show that the proposed measures were able to estimate the level of veracity among tweets in most topics we tested. We also found the measures useful for comparing the veracity of different topics as points in a 3-dimensional space. Another application of the veracity indices, to the positions of political candidates, is also described.
INDEX TERMS
Twitter, Media, Big data, Companies, Nominations and elections, Position measurement
CITATION

K. T. Ashwin, P. Kammarpally and K. George, "Veracity of information in twitter data: A case study," 2016 International Conference on Big Data and Smart Computing (BigComp), Hong Kong, China, 2016, pp. 129-136.
doi:10.1109/BIGCOMP.2016.7425811