2008 IEEE International Conference on Data Mining Workshops (2008)
Dec. 15, 2008 to Dec. 19, 2008
ISBN: 978-0-7695-3503-6
pp: 909-918
ABSTRACT
Recently, a new temporal dataset has been made public: it consists of a series of twelve snapshots of the .uk domain, each of about 100M pages [BSVLTAG]. The Web graphs of the twelve snapshots have been merged into a single time-aware graph that provides constant-time access to temporal information. In this paper we present the first statistical analysis performed on this graph, with the goal of checking whether the information contained in the graph is reliable (i.e., whether it depends essentially on the appearance and disappearance of pages and links, or on the crawler behaviour). We perform a number of tests that show that the graph is actually reliable, and provide the first public data on the evolution of the Web that combine a large scale with a significant diversity in the sites considered.
INDEX TERMS
Temporal-evolution, Web-evolution, Web-characterization
CITATION

D. Donato, P. Boldi, S. Vigna, M. Santini and I. Bordino, "Temporal Evolution of the UK Web," 2008 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 909-918, 2008.
doi:10.1109/ICDMW.2008.88