Issue No. 07 - July (2014 vol. 26)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2013.85
Mehdi Kargar, Department of Computer Science and Engineering, York University, Toronto, ON, Canada
Aijun An, Department of Computer Science and Engineering, York University, Toronto, ON, Canada
Xiaohui Yu, School of Information Technology, York University, Toronto, ON, Canada
Keyword search over a graph finds a subgraph that contains a set of query keywords. A problem with most existing keyword search methods is that they may produce duplicate answers: answers that contain the same set of content nodes (i.e., nodes containing a query keyword) but connect those nodes differently. Users may thus be presented with many similar answers that differ only trivially. In addition, some nodes in an answer may contain only query keywords that are already covered by other nodes in the answer. Removing these nodes does not change the coverage of the answer but can make the answer more compact. In this paper, an answer in which each content node contains at least one unique query keyword is called a minimal answer. We define the problem of finding duplication-free and minimal answers, and propose algorithms for finding such answers efficiently. Extensive performance studies using two large real data sets confirm the efficiency and effectiveness of the proposed methods.
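The two definitions in the abstract can be illustrated with a small sketch. It is not the algorithm proposed in the paper. It only checks the stated properties on answers that have already been found. The answer representation (a dict with a "content" map from node id to keywords and a "path" list) and the function names is_minimal and drop_duplicates are assumptions made for illustration.

```python
from typing import Dict, List, Set


def is_minimal(content: Dict[str, Set[str]], query: Set[str]) -> bool:
    """Return True if every content node holds at least one query
    keyword that no other content node in the same answer holds."""
    for node, keywords in content.items():
        own = keywords & query
        # Query keywords covered by all the *other* content nodes.
        covered_elsewhere: Set[str] = set()
        for other, other_keywords in content.items():
            if other != node:
                covered_elsewhere |= other_keywords & query
        if not (own - covered_elsewhere):
            return False  # this node contributes no unique keyword
    return True


def drop_duplicates(answers: List[dict]) -> List[dict]:
    """Keep one answer per distinct set of content nodes, regardless
    of how those nodes are connected (hypothetical 'path' field)."""
    seen = set()
    kept = []
    for answer in answers:
        key = frozenset(answer["content"])  # set of content node ids
        if key not in seen:
            seen.add(key)
            kept.append(answer)
    return kept


query = {"graph", "keyword", "search"}

# Two answers with the same content nodes, connected through
# different intermediate nodes: duplicates in the abstract's sense.
a = {"content": {"n1": {"graph", "keyword"}, "n2": {"search"}},
     "path": ["n1", "x", "n2"]}
b = {"content": {"n1": {"graph", "keyword"}, "n2": {"search"}},
     "path": ["n1", "y", "n2"]}
# n3 only repeats "keyword", which n1 already covers: not minimal.
c = {"content": {"n1": {"graph", "keyword"}, "n2": {"search"},
                 "n3": {"keyword"}},
     "path": ["n1", "n3", "n2"]}

unique_answers = drop_duplicates([a, b, c])
minimal_answers = [ans for ans in unique_answers
                   if is_minimal(ans["content"], query)]
```

Filtering after the fact, as done here, still requires generating the duplicate and non-minimal answers first. The abstract's stated goal is to find duplication-free and minimal answers efficiently, which this sketch does not attempt.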
query processing, graph theory