Robert Krauthgamer, Weizmann Institute, Rehovot, Israel and IBM Almaden, San Jose, CA, USA.
Aranyak Mehta, Google Inc., Mountain View, CA, USA.
Vijayshankar Raman, IBM Almaden, San Jose, CA, USA.
Atri Rudra, University at Buffalo, State University of New York, Buffalo, NY, USA.
A common technique for processing conjunctive queries is to first match each predicate separately using an index lookup and then compute the intersection of the resulting rowid lists via an AND-tree. The performance of this technique depends crucially on the order of the lists in this tree: intersections that produce small results should be computed early. This optimization is hard to do when the data or predicates are correlated. We present a new algorithm that orders the lists in an AND-tree by sampling the intermediate intersection sizes. We prove that our algorithm is near-optimal and validate its effectiveness experimentally on datasets with a variety of distributions.
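
To make the role of ordering concrete, the following is a minimal Python sketch, not the algorithm presented in this paper. It assumes each rowid list is held as an in-memory set, counts one probe per surviving rowid as the cost of an intersection step, and uses a simple greedy rule that picks the next list by testing a random sample of the current list. The function names, the cost model, the sample size, and the toy data are all illustrative assumptions.

```python
import random

def intersect_in_order(lists, order):
    """Intersect rowid sets in the given order; return (result, probes)."""
    result = set(lists[order[0]])
    probes = 0
    for i in order[1:]:
        probes += len(result)   # assumed cost: one probe per surviving rowid
        result &= lists[i]
    return result, probes

def greedy_order_by_sampling(lists, sample_size=200, seed=0):
    """Hypothetical greedy ordering driven by a sample of rowids."""
    rng = random.Random(seed)
    remaining = set(range(len(lists)))
    # Start from the shortest list.
    first = min(remaining, key=lambda i: len(lists[i]))
    order = [first]
    remaining.remove(first)
    pool = sorted(lists[first])
    sample = rng.sample(pool, min(sample_size, len(pool)))
    while remaining:
        # Pick the list that keeps the fewest sampled rowids, i.e. the one
        # estimated to give the smallest intermediate intersection.
        nxt = min(remaining,
                  key=lambda i: sum(1 for r in sample if r in lists[i]))
        order.append(nxt)
        remaining.remove(nxt)
        sample = [r for r in sample if r in lists[nxt]]
    return order

# Toy data: A and B are strongly correlated, C overlaps both only slightly.
A = set(range(0, 1000))
B = set(range(0, 900))
C = set(range(880, 2000))
lists = [A, B, C]

naive_result, naive_probes = intersect_in_order(lists, [0, 1, 2])
sampled_order = greedy_order_by_sampling(lists)
sampled_result, sampled_probes = intersect_in_order(lists, sampled_order)
```

On this toy input both orders return the same rowids, but the sampled order intersects B with C first and therefore carries a much smaller intermediate result into the last step. Ordering by list length alone would not detect this, because the two shortest lists, A and B, overlap almost entirely.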

