Long Beach, CA, USA
Mar. 1, 2010 to Mar. 6, 2010
ISBN: 978-1-4244-5445-7
pp: 321-332
Kuang Chen, Department of Electrical Engineering and Computer Science, University of California, Berkeley, 2599 Hearst Ave, 94720 USA
Harr Chen, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, 02139 USA
Neil Conway, Department of Electrical Engineering and Computer Science, University of California, Berkeley, 2599 Hearst Ave, 94720 USA
Joseph M. Hellerstein, Department of Electrical Engineering and Computer Science, University of California, Berkeley, 2599 Hearst Ave, 94720 USA
Tapan S. Parikh, School of Information, University of California, Berkeley, 102 South Hall, 94720 USA
ABSTRACT
Data quality is a critical problem in modern databases. Data entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been little research into automatic methods for improving data quality at entry time. In this paper, we propose USHER, an end-to-end system for form design, entry, and data quality assurance. Using previous form submissions, USHER learns a probabilistic model over the questions of the form. USHER then applies this model at every step of the data entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible. During entry, it dynamically adapts the form to the values being entered, and enables real-time feedback to guide the data enterer toward their intended values. After entry, it re-asks questions that it deems likely to have been entered incorrectly. We evaluate all three components of USHER using two real-world data sets. Our results demonstrate that each component has the potential to improve data quality considerably, at a reduced cost when compared to current practice.
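As a rough illustration of the workflow the abstract describes, the sketch below orders questions and flags suspicious answers using only per-question answer frequencies from earlier submissions. It is a deliberately simplified stand-in, not USHER's actual model: the abstract describes a probabilistic model over the questions of the form, whereas this sketch treats every question independently. The function names, the field names, the toy data, and the 0.2 threshold are all hypothetical.

```python
from collections import Counter
from math import log2


def answer_distribution(submissions, question):
    """Empirical distribution of answers to one question in past submissions."""
    counts = Counter(row[question] for row in submissions)
    total = sum(counts.values())
    return {answer: n / total for answer, n in counts.items()}


def entropy(distribution):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * log2(p) for p in distribution.values() if p > 0)


def order_questions(submissions, questions):
    """Put the least predictable questions first (assumed ordering heuristic)."""
    return sorted(
        questions,
        key=lambda q: entropy(answer_distribution(submissions, q)),
        reverse=True,
    )


def questions_to_reask(submissions, entry, threshold=0.2):
    """Flag questions whose entered answer was rare in past submissions."""
    flagged = []
    for question, answer in entry.items():
        probability = answer_distribution(submissions, question).get(answer, 0.0)
        if probability < threshold:  # hypothetical cut-off
            flagged.append(question)
    return flagged


# Toy data with hypothetical fields, for illustration only.
past = [
    {"sex": "F", "pregnant": "no", "district": "A"},
    {"sex": "F", "pregnant": "yes", "district": "A"},
    {"sex": "M", "pregnant": "no", "district": "A"},
    {"sex": "M", "pregnant": "no", "district": "A"},
    {"sex": "F", "pregnant": "no", "district": "A"},
    {"sex": "M", "pregnant": "no", "district": "A"},
]

layout = order_questions(past, ["district", "pregnant", "sex"])
reask = questions_to_reask(past, {"sex": "M", "pregnant": "yes", "district": "A"})
```

Because the questions are treated independently here, the "pregnant" answer is flagged only because "yes" is rare overall, not because it conflicts with the "sex" answer. Catching that kind of cross-question inconsistency would need a model of the relationships between questions.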
CITATION
Kuang Chen, Harr Chen, Neil Conway, Joseph M. Hellerstein, Tapan S. Parikh, "USHER: Improving data quality with dynamic forms", 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), 2010, pp. 321-332, doi:10.1109/ICDE.2010.5447832