2007 IEEE 23rd International Conference on Data Engineering (2007)
Apr. 15, 2007 to Apr. 20, 2007
Bolin Ding , The Chinese University of Hong Kong, China, firstname.lastname@example.org
Jeffrey Xu Yu , The Chinese University of Hong Kong, China, email@example.com
Shan Wang , Key Laboratory of Data Engineering and Knowledge Engineering, MOE of China, Renmin University of China, China, firstname.lastname@example.org
Lu Qin , The Chinese University of Hong Kong, China, email@example.com
Xiao Zhang , Key Laboratory of Data Engineering and Knowledge Engineering, MOE of China, Renmin University of China, China, firstname.lastname@example.org
Xuemin Lin , The University of New South Wales&NICTA, Australia, email@example.com
It is widely realized that the integration of database and information retrieval techniques will provide users with a wide range of high quality services. In this paper, we study processing an l-keyword query, p1, p2, ???, pl, against a relational database which can be modeled as a weighted graph, G(V, E). Here V is a set of nodes (tuples) and E is a set of edges representing foreign key references between tuples. Let Vi V be a set of nodes that contain the keyword pi. We study finding top-k minimum cost connected trees that contain at least one node in every subset Vi, and denote our problem as GST-k. When k = 1, it is known as a minimum cost group Steiner tree problem which is NP-Complete. We observe that the number of keywords, l, is small, and propose a novel parameterized solution, with l as a parameter, to find the optimal GST-1, in time complexity O(3ln + 2l((l + log n)n + m)), where n and m are the numbers of nodes and edges in graph G. Our solution can handle graphs with a large number of nodes. Our GST-1 solution can be easily extended to support GST-k, which outperforms the existing GST-k solutions over both weighted undirected/directed graphs. We conducted extensive experimental studies, and report our finding.
S. Wang, B. Ding, X. Zhang, L. Qin, J. X. Yu and X. Lin, "Finding Top-k Min-Cost Connected Trees in Databases," 2007 IEEE 23rd International Conference on Data Engineering(ICDE), Istanbul, Turkey, 2007, pp. 836-845.