2012 IEEE 28th International Conference on Data Engineering (2012)
Arlington, Virginia USA
Apr. 1, 2012 to Apr. 5, 2012
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDE.2012.105
Many applications process data in which there exists a "conservation law" between related quantities. For example, in traffic monitoring, every incoming event, such as a packet's entering a router or a car's entering an intersection, should ideally have an immediate outgoing counterpart. We propose a new class of constraints, Conservation Rules, that express the semantics and characterize the data quality of such applications. We give confidence metrics that quantify how strongly a conservation rule holds and present approximation algorithms (with error guarantees) for the problem of discovering a concise summary of subsets of the data that satisfy a given conservation rule. Using real data, we demonstrate the utility of conservation rules and we show order-of-magnitude performance improvements of our discovery algorithms over naive approaches.
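The abstract does not define its confidence metrics, so the following is only an illustrative sketch of how one might score how closely outgoing events track incoming events over a time window. The function name `conservation_confidence` and the scoring rule (the ratio of the areas under the two cumulative count curves) are assumptions made for this example, not the definitions used in the paper.

```python
from itertools import accumulate


def conservation_confidence(incoming, outgoing):
    """Hypothetical score in [0, 1] for how well outgoing counts track incoming counts.

    Both arguments are per-time-step event counts over the same window.
    The score compares the areas under the two cumulative count curves;
    1.0 means the cumulative totals have equal area over the window.
    This is an illustrative assumption, not the paper's metric.
    """
    if len(incoming) != len(outgoing):
        raise ValueError("series must have the same length")

    # Running totals of events seen so far on each side.
    cum_in = list(accumulate(incoming))
    cum_out = list(accumulate(outgoing))

    # Discrete "area under the curve" for each cumulative series.
    area_in = sum(cum_in)
    area_out = sum(cum_out)

    larger = max(area_in, area_out)
    smaller = min(area_in, area_out)
    if larger == 0:
        # No events on either side: the rule holds trivially.
        return 1.0
    return smaller / larger
```

With this sketch, a window in which every arrival is matched by a departure in the same time step scores 1.0, while delayed or missing departures shrink the outgoing area and lower the score.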
B. Saha, F. Korn, H. Karloff, D. Srivastava and L. Golab, "Discovering Conservation Rules," 2012 IEEE 28th International Conference on Data Engineering (ICDE), Arlington, Virginia, USA, 2012, pp. 738-749.