Cholula, Mexico
Oct. 25, 2006 to Oct. 27, 2006
ISBN: 0-7695-2693-4
pp: 209-219
David F. Nettleton, University Pompeu Fabra, Spain
Liliana Calderón-Benavides, University Pompeu Fabra, Spain
Ricardo Baeza-Yates, Yahoo! Research, Spain
ABSTRACT
In this paper we process and analyze web search engine query and click data from the perspective of the documents (URLs) selected. We first define possible document categories and select descriptive variables to characterize the documents. The URL dataset is preprocessed and analyzed using traditional statistical methods, and then processed with the Kohonen SOM clustering technique [5], which we use to produce a two-level clustering. The clusters are interpreted in terms of the document categories and variables defined initially. We then apply the C4.5 [9] rule induction algorithm to produce a decision tree for the document category. The objective of the work is to apply a systematic data mining process to click data, contrasting unsupervised (Kohonen) and supervised (C4.5) methods to cluster and model the data, in order to identify document profiles that relate to theoretical user behavior and to document (URL) organization.
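As an illustration of the unsupervised step only, the following is a minimal sketch of a Kohonen self-organizing map in plain Python. It is not the authors' implementation: the grid size, learning-rate schedule, neighbourhood function, and the two toy document variables are all assumptions made for the example, and the second clustering level and the C4.5 step are omitted.

```python
import math
import random


def best_matching_unit(nodes, row):
    """Return the grid position whose weight vector is closest to `row`."""
    return min(
        nodes,
        key=lambda pos: sum((w - v) ** 2 for w, v in zip(nodes[pos], row)),
    )


def train_som(rows, grid_w=2, grid_h=2, epochs=50, lr0=0.5, seed=0):
    """Train a tiny SOM; all parameter defaults are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(rows[0])
    # One weight vector per grid node, initialised from randomly chosen rows.
    nodes = {
        (x, y): list(rng.choice(rows))
        for x in range(grid_w)
        for y in range(grid_h)
    }
    sigma0 = max(grid_w, grid_h) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.1)   # neighbourhood radius shrinks
        order = list(rows)
        rng.shuffle(order)
        for row in order:
            bmu = best_matching_unit(nodes, row)
            for pos, weights in nodes.items():
                dist2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                influence = math.exp(-dist2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    weights[i] += lr * influence * (row[i] - weights[i])
    return nodes


# Toy, made-up document vectors (two hypothetical variables scaled to [0, 1]).
toy_rows = [
    [0.05, 0.10], [0.10, 0.05], [0.08, 0.12],
    [0.90, 0.95], [0.95, 0.88], [0.92, 0.91],
]
som = train_som(toy_rows)
# Each document is assigned to the grid node (cluster) that best matches it.
assignments = [best_matching_unit(som, row) for row in toy_rows]
```

Each document ends up labelled with a grid position, which plays the role of a cluster identifier; in a pipeline like the one described, those clusters would then be interpreted against the predefined document categories.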
CITATION
David F. Nettleton, Liliana Calderón-Benavides, Ricardo Baeza-Yates, "Analysis of Web Search Engine Clicked Documents", Latin American Web Congress (LA-WEB), 2006, pp. 209-219, doi:10.1109/LA-WEB.2006.6