Issue No.07 - July (2011 vol.23)
pp: 1035-1049
Guoliang Li , Tsinghua University, Beijing
Jianhua Feng , Tsinghua University, Beijing
Jianyong Wang , Tsinghua University, Beijing
Lizhu Zhou , Tsinghua University, Beijing
ABSTRACT
This paper studies the problem of XML message brokering with user-subscribed profiles of keyword queries and presents a KEyword-based XML Message Broker (KEMB) to address this problem. In contrast to traditional XML message brokers based on path expressions, KEMB stores a large number of user profiles in the form of keyword queries, rather than path expressions such as XPath/XQuery expressions, to capture the data requirements of users and applications. KEMB brings new challenges: 1) how to effectively identify relevant answers of keyword queries in XML data streams; and 2) how to efficiently answer large numbers of concurrent keyword queries. We adopt compact lowest common ancestors (CLCAs) to effectively identify relevant answers. We devise an automaton-based method to process large numbers of queries, together with an effective optimization strategy to enhance performance and scalability. We have implemented and evaluated KEMB on various data sets. The experimental results show that KEMB achieves high performance and scales very well.
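The idea of matching keyword-query profiles against an XML message can be illustrated with a small sketch. It is not the KEMB algorithm: the abstract does not define CLCAs or the automaton, so the code below swaps in the simpler, well-known smallest-LCA notion (an element whose subtree covers every keyword while no descendant's subtree does) and evaluates each profile independently rather than sharing work across queries. The names `smallest_lcas`, `match_profiles`, `MESSAGE`, and `PROFILES` are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def keywords_in(elem):
    """Lower-cased words in the element's tag and its own text (tail text is ignored)."""
    words = {elem.tag.lower()}
    words.update((elem.text or "").lower().split())
    return words


def smallest_lcas(root, query):
    """Elements whose subtree covers all query keywords and that contain no smaller answer.

    Assumption: this is a smallest-LCA stand-in, not the paper's CLCA semantics.
    """
    query = {k.lower() for k in query}
    results = []

    def visit(elem):
        # Returns (query keywords covered by this subtree, subtree already holds an answer).
        covered = keywords_in(elem) & query
        has_answer = False
        for child in elem:
            child_covered, child_has_answer = visit(child)
            covered |= child_covered
            has_answer = has_answer or child_has_answer
        if covered == query and not has_answer:
            results.append(elem)
            has_answer = True
        return covered, has_answer

    visit(root)
    return results


def match_profiles(xml_text, profiles):
    """Evaluate every stored profile against one message, one profile at a time."""
    root = ET.fromstring(xml_text)
    return {user: [e.tag for e in smallest_lcas(root, kws)]
            for user, kws in profiles.items()}


# Hypothetical message and subscriber profiles, invented for this example.
MESSAGE = """
<bib>
  <paper><title>XML keyword search</title><author>Li</author></paper>
  <paper><title>Stream filtering</title><author>Feng</author></paper>
</bib>
"""
PROFILES = {"u1": ["xml", "li"], "u2": ["stream", "feng"], "u3": ["xml", "feng"]}

if __name__ == "__main__":
    print(match_profiles(MESSAGE, PROFILES))
```

In this example the keywords of `u1` and `u2` each fall inside a single `paper` element, while those of `u3` are split across two papers, so only the `bib` root covers both. Running every profile separately costs one tree traversal per profile, which is the scalability concern the abstract's second challenge describes.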
INDEX TERMS
Keyword search, XML data stream, XML message brokers, compact lowest common ancestor (CLCA).
CITATION
Guoliang Li, Jianhua Feng, Jianyong Wang, Lizhu Zhou, "KEMB: A Keyword-Based XML Message Broker", IEEE Transactions on Knowledge & Data Engineering, vol. 23, no. 7, pp. 1035-1049, July 2011, doi:10.1109/TKDE.2010.159