Issue No.12 - December (2011 vol.23)
pp: 1781-1794
Jianhua Feng , Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
ABSTRACT
Existing studies on keyword search over relational databases usually return Steiner trees composed of connected database tuples as answers. They identify Steiner trees on the fly by discovering rich structural relationships between database tuples, and neglect the fact that such structural relationships can be precomputed and indexed. Recently, tuple units have been proposed to improve search efficiency by indexing structural relationships, and existing methods identify a single tuple unit to answer a keyword query. In many cases, however, multiple tuple units must be integrated to answer a keyword query, so these methods produce false negatives: relevant answers that span several tuple units are missed. To address this problem, we study how to integrate multiple related tuple units to answer keyword queries effectively. To achieve high performance, we devise two novel indexes, the single-keyword-based structure-aware index and the keyword-pair-based structure-aware index, and incorporate structural relationships between different tuple units into the indexes. We use the indexes to efficiently identify answers composed of integrated tuple units. We develop new ranking techniques and algorithms to progressively find the top-k answers. We have implemented our method in real database systems, and the experimental results show that our approach achieves high search efficiency and result quality and significantly outperforms state-of-the-art methods.
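The following is a minimal illustrative sketch of the idea described above, not the paper's actual index structures or ranking function. It assumes a toy model in which each tuple unit is just a set of keywords and the relationships between tuple units are given as unordered pairs. The names build_indexes, top_k, units, and related are hypothetical, and the ranking (fewer tuple units first) is a stand-in for the ranking techniques the abstract mentions.

```python
import heapq
from collections import defaultdict
from itertools import combinations


def build_indexes(units, related):
    """Build toy indexes (hypothetical layout, not the paper's).

    units:   {unit_id: set of keywords contained in that tuple unit}
    related: set of frozenset({a, b}) for directly related tuple units
    """
    # Single-keyword index: keyword -> ids of tuple units containing it.
    single = defaultdict(set)
    for uid, words in units.items():
        for w in words:
            single[w].add(uid)

    # Keyword-pair index: (kw1, kw2) -> candidate answers, where an answer
    # is a tuple of one unit id or of two related unit ids.
    pair = defaultdict(set)
    for uid, words in units.items():
        for w1, w2 in combinations(sorted(words), 2):
            pair[(w1, w2)].add((uid,))
    for link in related:
        a, b = sorted(link)
        for w1 in units[a]:
            for w2 in units[b]:
                if w1 != w2:
                    pair[tuple(sorted((w1, w2)))].add((a, b))
    return single, pair


def top_k(pair, kw1, kw2, k):
    """Return up to k answers for a two-keyword query.

    Assumed ranking: answers with fewer tuple units come first,
    with ties broken by unit id.
    """
    candidates = pair.get(tuple(sorted((kw1, kw2))), set())
    return heapq.nsmallest(k, candidates, key=lambda ans: (len(ans), ans))


# Toy data (invented for illustration only).
units = {
    "u1": {"keyword", "search"},
    "u2": {"steiner", "tree"},
    "u3": {"search", "index"},
}
related = {frozenset({"u1", "u2"}), frozenset({"u2", "u3"})}
single, pair = build_indexes(units, related)

# No single tuple unit contains both "search" and "tree" ...
single_unit_hits = single["search"] & single["tree"]
# ... but integrating related tuple units yields answers.
integrated_hits = top_k(pair, "search", "tree", 2)
```

In this toy data, a method restricted to a single tuple unit finds nothing for the query "search tree", while the pair lookup returns the two combinations of related units that cover both keywords. Precomputing the pair entries moves the cost of discovering relationships from query time to index-build time, which is the trade-off the abstract argues for.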
INDEX TERMS
trees (mathematics), database indexing, query processing, relational databases, ranking technique, top-k answer, keyword search, Steiner tree, connected database tuple, tuple structural relationship, keyword query, single-keyword-based structure-aware index, keyword-pair-based structure-aware index, information retrieval, periodic structures, indexes, tuple units, single-keyword-based index, keyword-pair-based index
CITATION
Jianhua Feng, "Finding Top-k Answers in Keyword Search over Relational Databases Using Tuple Units," IEEE Transactions on Knowledge & Data Engineering, vol. 23, no. 12, pp. 1781-1794, December 2011, doi:10.1109/TKDE.2011.61