2008 IEEE 24th International Conference on Data Engineering (2008)
Cancun, Mexico
Apr. 7, 2008 to Apr. 12, 2008
ISBN: 978-1-4244-1836-7
pp: 1033-1042
Robert Krauthgamer, Weizmann Institute, Rehovot, Israel and IBM Almaden, San Jose, CA, USA. robert.krauthgamer@weizmann.ac.il
Aranyak Mehta, Google Inc., Mountain View, CA, USA. aranyak@google.com
Vijayshankar Raman, IBM Almaden, San Jose, CA, USA. ravijay@us.ibm.com
Atri Rudra, University at Buffalo, State University of New York, Buffalo, NY, USA. atri@cse.buffalo.edu
ABSTRACT
A common technique for processing conjunctive queries is to first match each predicate separately using an index lookup, and then compute the intersection of the resulting rowid lists via an AND-tree. The performance of this technique depends crucially on the order of the lists in this tree: it is important to compute early the intersections that will produce small results. But this optimization is hard to do when the data or predicates are correlated. We present a new algorithm for ordering the lists in an AND-tree by sampling the intermediate intersection sizes. We prove that our algorithm is near-optimal and validate its effectiveness experimentally on datasets with a variety of distributions.
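The idea of ordering an intersection by sampled intermediate sizes can be illustrated with a small sketch. This is not the algorithm from the paper, and it carries none of the paper's guarantees; it is a simplified greedy heuristic written only to show the general shape of the approach. The names greedy_intersect and estimate_intersection_size, and the parameters sample_size and seed, are hypothetical. The sketch starts from the shortest rowid list and, at each step, applies the remaining list whose sampled intersection with the current intermediate result is estimated to be smallest.

```python
import random


def estimate_intersection_size(current, candidate_set, sample_size, rng):
    """Estimate |current ∩ candidate_set| by probing a random sample of `current`."""
    if not current:
        return 0.0
    k = min(sample_size, len(current))
    sample = rng.sample(current, k)
    hits = sum(1 for rowid in sample if rowid in candidate_set)
    # Scale the hit rate in the sample up to the full intermediate result.
    return hits / k * len(current)


def greedy_intersect(lists, sample_size=64, seed=0):
    """Intersect rowid lists, choosing each next list by a sampled size estimate.

    Illustrative only (hypothetical helper, not the paper's method).
    Returns (sorted intersection, order in which list indices were applied).
    """
    rng = random.Random(seed)
    sets = [set(rowids) for rowids in lists]
    remaining = set(range(len(lists)))

    # Assumption: begin with the shortest list, since it bounds the result size.
    first = min(sorted(remaining), key=lambda i: len(sets[i]))
    remaining.remove(first)
    current = sorted(sets[first])
    order = [first]

    # Stop early if the intermediate result is already empty.
    while remaining and current:
        best = min(
            sorted(remaining),
            key=lambda i: (
                estimate_intersection_size(current, sets[i], sample_size, rng),
                i,  # tie-break on list index for determinism
            ),
        )
        remaining.remove(best)
        current = [rowid for rowid in current if rowid in sets[best]]
        order.append(best)

    return current, order


if __name__ == "__main__":
    rowid_lists = [
        list(range(0, 1000)),      # predicate matching every row
        list(range(0, 1000, 2)),   # predicate matching even rowids
        list(range(0, 1000, 3)),   # predicate matching multiples of 3
        [6, 12, 500, 999],         # highly selective predicate
    ]
    result, order = greedy_intersect(rowid_lists)
    print(result, order)
```

In this toy input the selective four-element list is applied first, so every later step probes at most four rowids instead of scanning the long lists. Because each estimate is taken against the current intermediate result rather than against the original lists, the ordering does not rely on the predicates being independent, and correlation between predicates is reflected in the estimates.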
CITATION

R. Krauthgamer, V. Raman, A. Mehta and A. Rudra, "Greedy List Intersection," 2008 IEEE 24th International Conference on Data Engineering (ICDE), Cancun, Mexico, 2008, pp. 1033-1042.
doi:10.1109/ICDE.2008.4497512