2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) (2018)
Barcelona, Spain
Aug. 28, 2018 to Aug. 31, 2018
ISSN: 2473-9928
ISBN: 978-1-5386-6052-2
pp: 151-158
Muhammad Rizal Khaefi, Pulse Lab Jakarta of United Nations Global Pulse
Rajius Idzalika, Pulse Lab Jakarta of United Nations Global Pulse
Imaduddin Amin, Pulse Lab Jakarta of United Nations Global Pulse
Zakiya Pramestri, Pulse Lab Jakarta of United Nations Global Pulse
Pamungkas Jutta, Pulse Lab Jakarta of United Nations Global Pulse
Yulistina Riyadi, Pulse Lab Jakarta of United Nations Global Pulse
George Hodge, Pulse Lab Jakarta of United Nations Global Pulse
Jong Gun Lee, Pulse Lab Jakarta of United Nations Global Pulse
ABSTRACT
Text-based media possess a wealth of insights that can be mined to understand perceptions and actions. Researchers and public officials can use these data to inform development policy and humanitarian action. An important step in analyzing text-based databases, such as social media, is the creation of taxonomies, which are used to filter information relevant to topics of interest. We worked with thousands of online volunteers to translate 2,137 English keywords or phrases into formal or vernacular expressions in 29 different languages, with the aim of understanding human responses to natural disasters and of developing corpora for less popular languages (non-English and non-EU languages), which have so far received limited study. In processing the data set, we faced a challenge in selecting a set of quality translations for each language. This paper aims to estimate the quality of crowdsourced translations produced by non-professional translators. It presents an extensive empirical study using 91 features from corpora in 29 languages to describe (a) translators, (b) source expressions, and (c) translated expressions. Our results show that our approach, which explores two regression models and two supervised learning methods, produces better results than a baseline approach based on a commonly used metric, namely peer-review scores.
INDEX TERMS
crowdsourcing translation, translation quality estimation, text analysis
CITATION

M. R. Khaefi et al., "Estimating the Quality of Crowdsourced Translations Based on the Characteristics of Source and Target Words and Participants," 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Barcelona, Spain, 2018, pp. 151-158.
doi:10.1109/ASONAM.2018.8508319