Issue No. 06 - June (2014 vol. 26)
ISSN: 1041-4347
pp: 1332-1348
Lukasz Golab, Department of Engineering, University of Waterloo, Waterloo, ON, Canada
Howard Karloff, AT&T Labs-Research, Florham Park, NJ, USA
Flip Korn, AT&T Labs-Research, Florham Park, NJ, USA
Barna Saha, AT&T Labs-Research, Florham Park, NJ, USA
Divesh Srivastava, AT&T Labs-Research, Florham Park, NJ, USA
ABSTRACT
Many applications process data in which a “conservation law” holds between related quantities. For example, in traffic monitoring, every incoming event, such as a packet entering a router or a car entering an intersection, should ideally have an immediate outgoing counterpart. We propose a new class of constraints, Conservation Rules, that express the semantics and characterize the data quality of such applications. We give confidence metrics that quantify how strongly a conservation rule holds and present approximation algorithms (with error guarantees) for the problem of discovering a concise summary of subsets of the data that satisfy a given conservation rule. Using real data, we demonstrate the utility of conservation rules and show order-of-magnitude performance improvements of our discovery algorithms over naive approaches.
INDEX TERMS
data mining, approximation theory
CITATION

L. Golab, H. Karloff, F. Korn, B. Saha and D. Srivastava, "Discovering Conservation Rules," in IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 6, pp. 1332-1348, 2014.
doi:10.1109/TKDE.2012.171