2008 IEEE 24th International Conference on Data Engineering (2008)
Apr. 7, 2008 to Apr. 12, 2008
Robert Krauthgamer, Weizmann Institute, Rehovot, Israel and IBM Almaden, San Jose, CA, USA. firstname.lastname@example.org
Aranyak Mehta, Google Inc., Mountain View, CA, USA. email@example.com
Vijayshankar Raman, IBM Almaden, San Jose, CA, USA. firstname.lastname@example.org
Atri Rudra, University at Buffalo, State University of New York, Buffalo, NY, USA. email@example.com
A common technique for processing conjunctive queries is to first match each predicate separately using an index lookup, and then compute the intersection of the resulting rowid lists via an AND-tree. The performance of this technique depends crucially on the order of the lists in this tree: it is important to compute early the intersections that will produce small results. This optimization is hard to do, however, when the data or predicates are correlated. We present a new algorithm for ordering the lists in an AND-tree by sampling the intermediate intersection sizes. We prove that our algorithm is near-optimal and validate its effectiveness experimentally on datasets with a variety of distributions.
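The general idea in the abstract, choosing at each step the list whose intersection with the current intermediate result looks smallest according to a sample, can be illustrated with a short Python sketch. This is not the paper's algorithm or its analysis. The function names `estimate_intersection_size` and `greedy_intersect`, the `sample_size` parameter, and the choice to start from the shortest list are all assumptions made for illustration.

```python
import random


def estimate_intersection_size(current, candidate, sample_size, rng):
    """Estimate |current ∩ candidate| from a random sample of `current`.

    `current` is a list of rowids and `candidate` is a set of rowids.
    When sample_size >= len(current), the estimate is exact.
    """
    if not current:
        return 0.0
    k = min(sample_size, len(current))
    sample = rng.sample(current, k)
    hits = sum(1 for rowid in sample if rowid in candidate)
    # Scale the sampled hit rate up to the full intermediate result.
    return hits / k * len(current)


def greedy_intersect(rowid_lists, sample_size=64, seed=0):
    """Intersect rowid lists, picking the next list greedily.

    Returns (sorted intersection, order in which lists were used).
    Hypothetical sketch: starts from the shortest list (an assumption),
    then repeatedly picks the list with the smallest estimated
    intersection with the current intermediate result.
    """
    rng = random.Random(seed)
    sets = [set(lst) for lst in rowid_lists]
    if not sets:
        return [], []

    pending = set(range(len(sets)))
    first = min(pending, key=lambda i: (len(sets[i]), i))
    pending.remove(first)
    order = [first]
    current = sorted(sets[first])

    while pending and current:
        best = min(
            pending,
            key=lambda i: (
                estimate_intersection_size(current, sets[i], sample_size, rng),
                i,
            ),
        )
        pending.remove(best)
        order.append(best)
        current = [rowid for rowid in current if rowid in sets[best]]

    # Once the intermediate result is empty, the remaining order is irrelevant.
    order.extend(sorted(pending))
    return current, order


if __name__ == "__main__":
    lists = [
        list(range(100)),        # unselective predicate
        list(range(0, 100, 2)),  # even rowids
        [0, 10, 20, 30],
        [1, 3, 10],
    ]
    result, order = greedy_intersect(lists)
    print(result, order)
```

The final intersection is the same for any order. What the ordering changes is how large the intermediate results are, and the sampling step is what lets the choice reflect correlation between predicates rather than relying on per-list sizes alone.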
R. Krauthgamer, V. Raman, A. Mehta and A. Rudra, "Greedy List Intersection," 2008 IEEE 24th International Conference on Data Engineering (ICDE), Cancun, Mexico, 2008, pp. 1033-1042.