Issue No.02 - March/April (2003 vol.15)
Sourav S. Bhowmick , IEEE Computer Society
Sanjay Kumar Madria , IEEE
Wee Keong Ng , IEEE Computer Society
<p><b>Abstract</b>—In this paper, we present a mechanism for detecting and representing changes, given the old and new versions of a set of interlinked Web documents retrieved in response to a user's query. In particular, we show how to detect and represent <it>Web deltas</it>, i.e., changes in Web documents that are relevant to a user's query, in the context of our <it>Web warehousing</it> system, W<scp>howeda</scp> (<it>W</it>are<it>h</it>ouse <it>o</it>f <it>We</it>b <it>Da</it>ta). In W<scp>howeda</scp>, Web information is stored as materialized views in <it>Web tables</it>, in the form of <it>Web tuples</it>. These Web tuples, represented as directed graphs, can be manipulated using a set of <it>Web algebraic operators</it>. We detect <it>relevant</it> Web deltas using two of these operators, the <it>Web join</it> and the <it>outer Web join</it>. Web join is used to detect <it>identical documents</it> residing in two Web tables, whereas outer Web join, a derivative of Web join, is used to identify <it>dangling Web tuples</it>. We show how to represent these changes using <it>delta Web tables</it>, and we develop formal algorithms for generating delta Web tables that identify the Web documents added, deleted, or modified since the last query.</p>
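As a rough illustration of the added/deleted/modified classification described in the abstract, the Python sketch below compares two versions of a query result. It is not the paper's algorithm: it assumes each version can be flattened into documents keyed by URL, and it ignores the directed-graph structure of Web tuples. The names `Doc`, `fingerprint`, and `detect_deltas` are hypothetical, and treating two documents as identical when their URL and fingerprint match is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """A flattened stand-in for a Web document (hypothetical type)."""
    url: str
    fingerprint: str  # assumed: a content hash or last-modified stamp


def detect_deltas(old_docs, new_docs):
    """Classify documents as added, deleted, modified, or unchanged."""
    old = {d.url: d for d in old_docs}
    new = {d.url: d for d in new_docs}

    # Join-like step: documents that are identical in both versions.
    identical = {
        url for url in old.keys() & new.keys()
        if old[url].fingerprint == new[url].fingerprint
    }

    # Outer-join-like step: documents left unmatched ("dangling") on each side.
    dangling_old = old.keys() - identical
    dangling_new = new.keys() - identical

    return {
        "added": sorted(dangling_new - old.keys()),
        "deleted": sorted(dangling_old - new.keys()),
        "modified": sorted(dangling_old & dangling_new),
        "unchanged": sorted(identical),
    }


if __name__ == "__main__":
    old_version = [Doc("a.html", "v1"), Doc("b.html", "v1"), Doc("c.html", "v1")]
    new_version = [Doc("a.html", "v1"), Doc("b.html", "v2"), Doc("d.html", "v1")]
    print(detect_deltas(old_version, new_version))
```

In this simplification, the three non-empty change lists loosely play the role that the delta Web tables play in the abstract; a faithful version would have to operate on Web tuples rather than on individual documents.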
Web deltas, Web warehouse, Web join, outer Web join, delta Web tables, algorithm.
Sourav S. Bhowmick, Sanjay Kumar Madria, Wee Keong Ng, "Detecting and Representing Relevant Web Deltas in WHOWEDA", IEEE Transactions on Knowledge & Data Engineering, vol.15, no. 2, pp. 423-441, March/April 2003, doi:10.1109/TKDE.2003.1185843