Issue No. 03 - March (2008 vol. 20)
We consider the problem of creating a sample view of a database table. A sample view is an indexed, materialized view that permits efficient sampling from an arbitrary range query over the view. Such "sample views" are very useful to applications that require random samples from a database: approximate query processing, online aggregation, data mining, and randomized algorithms are a few examples. Our core technical contribution is a new file organization called the ACE Tree that is suitable for organizing and indexing a sample view. One of the most important aspects of the ACE Tree is that it supports online random sampling from the view. That is, at all times, the set of records returned so far by the ACE Tree constitutes a statistically random sample of the database records satisfying the relational selection predicate over the view. Our paper presents experimental results that demonstrate the utility of the ACE Tree.
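To make the online sampling property concrete, the following is a minimal sketch of a naive in-memory baseline, not the ACE Tree itself. The function name `online_range_sample`, the `(key, payload)` record layout, and the inclusive range bounds are assumptions introduced for illustration only. The sketch filters the records by a range predicate and yields them in uniformly random order, so every prefix of the output is a uniform random sample (without replacement) of the matching records.

```python
import random
from typing import Iterator, Sequence, Tuple

# Hypothetical record layout for illustration: (key, payload).
Record = Tuple[int, str]


def online_range_sample(records: Sequence[Record], lo: int, hi: int,
                        rng: random.Random) -> Iterator[Record]:
    """Yield the records with lo <= key <= hi in uniformly random order.

    Any prefix of a uniformly random permutation is a uniform random
    sample without replacement, which is the "online" property the
    abstract describes. This baseline scans every record first, so it
    shows only the semantics, not the efficiency of an indexed view.
    """
    matching = [r for r in records if lo <= r[0] <= hi]
    rng.shuffle(matching)  # uniform random permutation of the matches
    for record in matching:
        yield record


if __name__ == "__main__":
    table = [(k, f"row{k}") for k in range(1000)]
    stream = online_range_sample(table, 100, 199, random.Random(42))
    # Stop after five records; they form a random sample of the range.
    first_five = [next(stream) for _ in range(5)]
    print(first_five)
```

A consumer can stop reading from the generator at any point and treat what it has received as a random sample of the range; an index built for this purpose aims to deliver the same guarantee without first scanning the whole table.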
Indexing methods, Query processing, Sampling
S. Joshi and C. Jermaine, "Materialized Sample Views for Database Approximation," in IEEE Transactions on Knowledge & Data Engineering, vol. 20, no. 3, pp. 337-351, 2007.