Proceedings 18th International Conference on Data Engineering (2002)
San Jose, California
Feb. 26, 2002 to Mar. 1, 2002
AnHai Doan, University of Washington
Alon Halevy, University of Washington
The goal of a data integration system is to provide a uniform interface to a multitude of data sources. Given a user query formulated in this interface, the system translates it into a set of query plans. Each plan is a query formulated over the data sources, and specifies a way to access sources and combine data to answer the user query. In practice, when the number of sources is large, a data integration system must generate and execute many query plans whose utilities vary significantly. Hence, it is crucial that the system find the best plans efficiently and execute them first, to guarantee both an acceptable time to the first answers and the quality of those answers. We describe efficient solutions to this problem. First, we formally define the problem of ordering query plans. Second, we identify several interesting structural properties of the problem and describe three ordering algorithms that exploit these properties. Finally, we describe experimental results that suggest guidance on which algorithms perform best under which conditions.
A. Halevy and A. Doan, "Efficiently Ordering Query Plans for Data Integration," Proceedings 18th International Conference on Data Engineering (ICDE), San Jose, California, 2002, p. 393.