Issue No.12 - December (2011 vol.23)
pp: 1811-1825
Xiping Liu , Jiangxi University of Finance and Economics, Nanchang
Changxuan Wan , Jiangxi University of Finance and Economics, Nanchang
Lei Chen , The Hong Kong University of Science and Technology, Hong Kong
ABSTRACT
Keyword search is an effective paradigm for information discovery and has recently been introduced to query XML documents. In this paper, we address the problem of returning clustered results for keyword search on XML documents. We first propose a novel semantics for answers to an XML keyword query. The core of the semantics is the "conceptually related" relationship between keyword matches, which is based on the conceptual relationship between nodes in XML trees. Then, we propose a new clustering methodology for XML search results, which clusters results according to the way they match the given query. Two approaches to implementing the methodology are discussed. The first is a conventional approach that clusters search results after they have been retrieved; the second clusters search results actively, giving it the character of on-the-fly clustering. The generated clusters are then organized into a cluster hierarchy with different granularities so that users can locate the results of interest easily and precisely. Experimental results demonstrate the meaningfulness of the proposed semantics as well as the efficiency of the proposed methods.
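The idea of clustering results "according to the way they match the given query" can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes SLCA-style results (the lowest elements whose subtree contains every keyword) in place of the proposed "conceptually related" semantics, uses plain substring matching on element text, and groups results by the label paths at which each keyword matches. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data, not taken from the paper.
SAMPLE = """
<bib>
  <paper><title>XML keyword search</title><author>Liu</author></paper>
  <paper><title>Clustering</title><author>Liu</author><abstract>on XML data</abstract></paper>
  <paper><title>XML views</title><author>Liu</author></paper>
</bib>
"""

def match_paths(elem, keyword, prefix=()):
    """Label paths (from elem downward) of nodes whose own text contains keyword."""
    path = prefix + (elem.tag,)
    found = set()
    if keyword.lower() in (elem.text or "").lower():
        found.add("/".join(path))
    for child in elem:
        found |= match_paths(child, keyword, path)
    return found

def smallest_covering(elem, keywords):
    """Lowest elements whose subtree contains every keyword (SLCA-style assumption)."""
    if not all(match_paths(elem, k) for k in keywords):
        return []
    lower = [r for child in elem for r in smallest_covering(child, keywords)]
    return lower or [elem]

def cluster_results(xml_text, keywords):
    """Group results by where each keyword matches inside the result subtree."""
    root = ET.fromstring(xml_text)
    clusters = defaultdict(list)
    for result in smallest_covering(root, keywords):
        signature = tuple(
            (k, tuple(sorted(match_paths(result, k)))) for k in keywords
        )
        clusters[signature].append(result)
    return dict(clusters)

if __name__ == "__main__":
    for signature, results in cluster_results(SAMPLE, ["xml", "liu"]).items():
        print(signature, len(results))
```

On the sample document, the first and third papers fall into one cluster (the keyword "xml" matches in the title), while the second paper forms its own cluster (the keyword matches in the abstract). This is a post-retrieval grouping in the spirit of the first, conventional approach; an on-the-fly variant would have to build the groups while results are being generated.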
INDEX TERMS
XML keyword search, search results clustering, cluster hierarchy.
CITATION
Xiping Liu, Changxuan Wan, Lei Chen, "Returning Clustered Results for Keyword Search on XML Documents," IEEE Transactions on Knowledge & Data Engineering, vol. 23, no. 12, pp. 1811-1825, December 2011, doi:10.1109/TKDE.2011.183