2011 IEEE 27th International Conference on Data Engineering (2011)
Apr. 11, 2011 to Apr. 16, 2011
Daniel Deutch, Tel-Aviv University, Israel
Ohad Greenshpan, Tel-Aviv University, Israel
Boris Kostenko, Tel-Aviv University, Israel
Tova Milo, Tel-Aviv University, Israel
In this demonstration we introduce Trivia Masster, a system that generates a very large database of facts on a variety of topics and uses it for question answering. The facts are collected from human users (the "crowd"); the system motivates users to contribute to the database through a trivia game in which they earn points based on their contribution. A key challenge is to provide a suitable data cleaning mechanism that identifies which of the facts (answers to trivia questions) submitted by users are correct and reliable. That judgment in turn determines how many points to grant users, how to answer questions based on the collected data, and which questions to present to the trivia players in order to improve data quality. As no single existing data cleaning technique provides a satisfactory solution to this challenge, we propose a novel approach based on a declarative framework for defining recursive and probabilistic data cleaning rules. Our solution employs an algorithm based on Markov Chain Monte Carlo methods.
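To make the idea of a recursive, probabilistic cleaning rule concrete, here is a minimal Python sketch. It is not the authors' algorithm or rule language, which the abstract does not specify. It assumes a single hypothetical rule: an answer is likely correct when reliable users support it, and a user is reliable when their answers to other questions are believed correct. The submissions, names, and smoothing constants are all illustrative. The sampler simulates a Markov chain over "believed correct" answers and reports how often each answer is selected.

```python
import random
from collections import Counter

# Hypothetical toy data: (user, question, answer) submissions.
SUBMISSIONS = [
    ("alice", "capital_of_france", "Paris"),
    ("bob", "capital_of_france", "Paris"),
    ("carol", "capital_of_france", "Lyon"),
    ("alice", "largest_planet", "Jupiter"),
    ("bob", "largest_planet", "Jupiter"),
    ("carol", "largest_planet", "Saturn"),
]


def estimate_answer_probabilities(submissions, steps=20000, burn_in=2000, seed=0):
    """Estimate, by Markov chain simulation, how often each answer is believed correct.

    Assumed rule (illustrative only): an answer's weight is the summed
    reliability of its supporters; a user's reliability is the smoothed
    fraction of their answers to *other* questions that match the current state.
    """
    rng = random.Random(seed)
    questions = sorted({q for _, q, _ in submissions})
    candidates = {
        q: sorted({a for _, qq, a in submissions if qq == q}) for q in questions
    }
    # Chain state: one currently believed-correct answer per question.
    state = {q: rng.choice(candidates[q]) for q in questions}
    counts = {q: Counter() for q in questions}

    def reliability(user, skip_question):
        agree = total = 0
        for u, q, a in submissions:
            if u == user and q != skip_question:
                total += 1
                agree += state[q] == a
        # Add-one smoothing keeps every weight strictly positive.
        return (agree + 1) / (total + 2)

    for step in range(steps):
        # Resample the believed answer of one randomly chosen question.
        q = rng.choice(questions)
        weights = [
            sum(reliability(u, q) for u, qq, a in submissions if qq == q and a == cand)
            for cand in candidates[q]
        ]
        state[q] = rng.choices(candidates[q], weights=weights, k=1)[0]
        # After burn-in, record the full state to estimate long-run frequencies.
        if step >= burn_in:
            for qq in questions:
                counts[qq][state[qq]] += 1

    kept = steps - burn_in
    return {q: {a: counts[q][a] / kept for a in candidates[q]} for q in questions}


if __name__ == "__main__":
    for question, probs in estimate_answer_probabilities(SUBMISSIONS).items():
        print(question, probs)
```

In this toy setting the majority answers end up with the higher estimated probability, because the two agreeing users reinforce each other's reliability across questions. Estimates of this kind could, in principle, feed into scoring or question selection, though how Trivia Masster does so is not described here.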
D. Deutch, T. Milo, B. Kostenko and O. Greenshpan, "Using Markov Chain Monte Carlo to play Trivia," 2011 IEEE 27th International Conference on Data Engineering (ICDE), Hannover, Germany, 2011, pp. 1308-1311.