2007 IEEE 23rd International Conference on Data Engineering (2007)

Istanbul, Turkey

Apr. 15, 2007 to Apr. 20, 2007

ISBN: 1-4244-0802-4

pp: 836-845

Bolin Ding, The Chinese University of Hong Kong, China, blding@se.cuhk.edu.hk

Jeffrey Xu Yu, The Chinese University of Hong Kong, China, yu@se.cuhk.edu.hk

Shan Wang, Key Laboratory of Data Engineering and Knowledge Engineering, MOE of China, Renmin University of China, China, swang@ruc.edu.cn

Lu Qin, The Chinese University of Hong Kong, China, lqin@se.cuhk.edu.hk

Xiao Zhang, Key Laboratory of Data Engineering and Knowledge Engineering, MOE of China, Renmin University of China, China, zhangxiao@ruc.edu.cn

Xuemin Lin, The University of New South Wales & NICTA, Australia, lxue@cse.unsw.edu.au

ABSTRACT

It is widely recognized that integrating database and information retrieval techniques will provide users with a wide range of high-quality services. In this paper, we study processing an l-keyword query, p_1, p_2, ..., p_l, against a relational database that can be modeled as a weighted graph, G(V, E). Here V is a set of nodes (tuples) and E is a set of edges representing foreign key references between tuples. Let V_i ⊆ V be the set of nodes that contain the keyword p_i. We study finding the top-k minimum-cost connected trees that contain at least one node from every subset V_i, and denote this problem GST-k. When k = 1, it is the minimum-cost group Steiner tree problem, which is NP-Complete. We observe that the number of keywords, l, is small, and propose a novel parameterized solution, with l as the parameter, that finds the optimal GST-1 in time O(3^l n + 2^l ((l + log n) n + m)), where n and m are the numbers of nodes and edges in G. Our solution can handle graphs with a large number of nodes. Our GST-1 solution can be easily extended to support GST-k, and the extension outperforms the existing GST-k solutions over both weighted undirected and directed graphs. We conducted extensive experimental studies and report our findings.
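
To make the GST-1 problem statement concrete, the following is a minimal Python sketch of one standard way to compute the optimal cost with the number of keywords as the parameter: a best-first dynamic program over states (node, subset of keywords covered). The abstract does not spell out the algorithm, so this is an illustration under stated assumptions, not necessarily the authors' exact method. The function name `min_group_steiner_cost`, the edge-list input format, and the use of integer node ids are all hypothetical choices made for this example. It handles undirected graphs with non-negative edge weights only.

```python
import heapq
from collections import defaultdict


def min_group_steiner_cost(edges, groups):
    """Return the cost of a minimum-cost tree touching every group, or None.

    edges:  iterable of (u, v, weight), undirected, weight >= 0 (assumed format)
    groups: list of node sets; groups[i] plays the role of V_i for keyword p_i
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    full = (1 << len(groups)) - 1   # bitmask with every keyword covered
    settled = defaultdict(dict)     # settled[v][mask] = final cost of state (v, mask)
    heap = []                       # entries are (cost, node, keyword bitmask)

    # Base case: a node that contains keyword i covers {i} at cost 0.
    for i, group in enumerate(groups):
        for v in group:
            heapq.heappush(heap, (0, v, 1 << i))

    while heap:
        cost, v, mask = heapq.heappop(heap)
        if mask in settled[v]:
            continue                # state was already finalized at no higher cost
        settled[v][mask] = cost
        if mask == full:
            return cost             # first full-cover state popped is optimal

        # Tree growth: re-root the same tree at a neighbor by adding one edge.
        for u, w in adj[v]:
            if mask not in settled[u]:
                heapq.heappush(heap, (cost + w, u, mask))

        # Tree merge: join two trees rooted at v that cover disjoint keyword sets.
        for other, other_cost in list(settled[v].items()):
            if other & mask == 0 and (other | mask) not in settled[v]:
                heapq.heappush(heap, (cost + other_cost, v, other | mask))

    return None                     # no connected tree covers every group
```

For example, `min_group_steiner_cost([(0, 1, 1), (1, 2, 1), (1, 3, 1)], [{0}, {2}, {3}])` explores trees rooted at each node and settles on the star centered at node 1. The merge step enumerates pairs of disjoint keyword subsets, which is where a 3^l factor arises in this style of dynamic program, and the priority queue over up to 2^l n states accounts for a log n term.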

CITATION

S. Wang, B. Ding, X. Zhang, L. Qin, J. X. Yu and X. Lin, "Finding Top-k Min-Cost Connected Trees in Databases,"

*2007 IEEE 23rd International Conference on Data Engineering (ICDE)*, Istanbul, Turkey, 2007, pp. 836-845.

doi:10.1109/ICDE.2007.367929
