Issue No. 07 - July (2007 vol. 19)
Ka Cheung Sia, IEEE
Junghoo Cho, IEEE
Hyun-Kyu Cho, IEEE
Recently, there has been a dramatic increase in the use of XML data to deliver information over the Web. Personal Weblogs, news Web sites, and discussion forums now publish RSS feeds so that their subscribers can retrieve new postings. As the popularity of personal Weblogs and RSS feeds grows rapidly, RSS aggregation services and blog search engines have appeared, which try to provide a central point for simpler access to, and discovery of, new content from a large number of diverse RSS sources. In this paper, we study how RSS aggregation services should monitor the data sources to retrieve new content quickly using minimal resources and to provide their subscribers with fast news alerts. We believe that the change characteristics of RSS sources and the general user access behavior pose distinct requirements that make this task significantly different from the traditional index refresh problem for Web search engines. Our studies on a collection of 10,000 RSS feeds reveal some general characteristics of RSS feeds and show that, with proper resource allocation and scheduling, an RSS aggregator can provide news alerts significantly faster than the best existing approach.
Information search and retrieval, online information services, performance evaluation, user profiles, alert services.
K. C. Sia, J. Cho, and H. Cho, "Efficient Monitoring Algorithm for Fast News Alerts," in IEEE Transactions on Knowledge & Data Engineering, vol. 19, no. 7, pp. 950-961, 2007.