Keeping Up with the Changing Web
May 2000 (vol. 33 no. 5)
pp. 52-58

Because information depreciates over time, keeping Web pages current presents new design challenges. This article quantifies what "current" means for Web search engines and estimates how often they must reindex the Web to keep current with its changing pages and structure.

Most information--from a newspaper story to a temperature sensor measurement to a Web page--is dynamic. When monitoring an information source, when do our previous observations become stale and need refreshing? How can we schedule these refresh operations to satisfy a required level of currency without violating resource constraints--such as bandwidth or computing limitations on how much data can be observed in a given time?

The authors investigate the trade-offs involved in monitoring dynamic information sources and discuss the Web in detail, estimating how fast documents change and exploring what constitutes a "current" Web index. For a simple class of Web-monitoring systems--search engines--they combine their idea of currency with actual measured data to estimate revisit rates.

Citation:
Brian E. Brewington, George Cybenko, "Keeping Up with the Changing Web," Computer, vol. 33, no. 5, pp. 52-58, May 2000, doi:10.1109/2.841784