Runtime Optimizations for Tree-Based Machine Learning Models
Found in: IEEE Transactions on Knowledge and Data Engineering
By Nima Asadi, Jimmy Lin, Arjen P. de Vries
Issue Date:September 2014
pp. 1-1
Tree-based models have proven to be an effective solution for web ranking as well as other machine learning problems in diverse domains. This paper focuses on optimizing the runtime performance of applying such models to make predictions, specifically usin...
 
Obtaining High-Quality Relevance Judgments Using Crowdsourcing
Found in: IEEE Internet Computing
By Jeroen B.P. Vuurens, Arjen P. de Vries
Issue Date:September 2012
pp. 20-27
The performance of information retrieval (IR) systems is commonly evaluated using a test set with known relevance. Crowdsourcing is one method for learning the relevant documents to each query in the test set. However, the quality of relevance learned thro...
 
Understanding Similarity Metrics in Neighbour-based Recommender Systems
Found in: Proceedings of the 2013 Conference on the Theory of Information Retrieval (ICTIR '13)
By Alejandro Bellogín, Arjen P. de Vries
Issue Date:September 2013
pp. 48-55
Neighbour-based collaborative filtering is a recommendation technique that provides meaningful and, usually, accurate recommendations. The method's success depends however critically upon the similarity metric used to find the most similar users (neighbour...
     
Characterizing stages of a multi-session complex search task through direct and indirect query modifications
Found in: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '13)
By Arjen P. de Vries, Jiyin He, Marc Bron
Issue Date:July 2013
pp. 897-900
Search systems use context to effectively satisfy a user's information need as expressed by a query. Tasks are important factors in determining user context during search and many studies have been conducted that identify tasks and task stages through user...
     
Copulas for information retrieval
Found in: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '13)
By Arjen P. de Vries, Carsten Eickhoff, Kevyn Collins-Thompson
Issue Date:July 2013
pp. 663-672
In many domains of information retrieval, system estimates of document relevance are based on multidimensional quality criteria that have to be accommodated in a unidimensional result ranking. Current solutions to this challenge are often inconsistent with...
     
The downside of markup: examining the harmful effects of CSS and javascript on indexing today's web
Found in: Proceedings of the 21st ACM international conference on Information and knowledge management (CIKM '12)
By Arjen P. de Vries, Carsten Eickhoff, Karl Gyllstrom, Marie-Francine Moens
Issue Date:October 2012
pp. 1990-1994
The continued development and maturation of advanced HTML features such as Cascading style sheets (CSS), Javascript, and AJAX, as well as their widespread adoption by browsers, has enabled web pages to flourish with sophistication and interactivity. Unfort...
     
Contextualization using hyperlinks and internal hierarchical structure of Wikipedia documents
Found in: Proceedings of the 21st ACM international conference on Information and knowledge management (CIKM '12)
By Arjen P. de Vries, Muhammad Ali Norozi, Paavo Arvola
Issue Date:October 2012
pp. 734-743
Context surrounding hyperlinked semi-structured documents, externally in the form of citations and internally in the form of hierarchical structure, contains a wealth of useful but implicit evidence about a document's relevance. These rich sources of infor...
     
Want a coffee?: predicting users' trails
Found in: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '12)
By Arjen P. de Vries, Carsten Eickhoff, Wen Li
Issue Date:August 2012
pp. 1171-1172
Twitter and Foursquare are two well-connected platforms for sharing information where growing numbers of users post location-related messages. In contrast to the longitude-latitude geotags commonly used online, e.g., on photos and tweets, new place-tags co...
     
Quality through flow and immersion: gamifying crowdsourced relevance assessments
Found in: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (SIGIR '12)
By Arjen P. de Vries, Carsten Eickhoff, Christopher G. Harris, Padmini Srinivasan
Issue Date:August 2012
pp. 871-880
Crowdsourcing is a market of steadily-growing importance upon which both academia and industry increasingly rely. However, this market appears to be inherently infested with a significant share of malicious workers who try to maximise their profits through...
     
What to do when one size does not fit all?
Found in: Proceedings of the fourth workshop on Exploiting semantic annotations in information retrieval (ESAIR '11)
By Arjen P. de Vries
Issue Date:October 2011
pp. 1-2
This talk addresses the theme of how semantic annotations could improve information access. In this context, "semantic annotation" may refer to any (perhaps typed) clue about documents in a collection that can be useful for retrieval purposes: the call for pa...
     
The where in the tweet
Found in: Proceedings of the 20th ACM international conference on Information and knowledge management (CIKM '11)
By Arjen P. de Vries, Carsten Eickhoff, Martha Larson, Pavel Serdyukov, Wen Li
Issue Date:October 2011
pp. 2473-2476
Twitter is a widely-used social networking service which enables its users to post text-based messages, so-called tweets. POI tags on tweets can show more human-readable high-level information about a place rather than just a pair of coordinates. In this p...
     
The task-dependent effect of tags and ratings on social media access
Found in: ACM Transactions on Information Systems (TOIS)
By Arjen P. de Vries, Maarten Clements, Marcel J. T. Reinders
Issue Date:November 2010
pp. 1-42
Recently, online social networks have emerged that allow people to share their multimedia files, retrieve interesting content, and discover like-minded people. These systems often provide the possibility to annotate the content with tags and ratings. Using...
     
Search by strategy
Found in: Proceedings of the third workshop on Exploiting semantic annotations in information retrieval (ESAIR '10)
By Arjen P. de Vries, Roberto Cornacchia, Wouter Alink
Issue Date:October 2010
pp. 27-28
This position statement advocates that the integration of information retrieval and databases, a topic that has been studied for many years (see e.g. [3]), is now in a state where the technology is ready to be brought out of the laboratory, and that this t...
     
Web page classification on child suitability
Found in: Proceedings of the 19th ACM international conference on Information and knowledge management (CIKM '10)
By Arjen P. de Vries, Carsten Eickhoff, Pavel Serdyukov
Issue Date:October 2010
pp. 1425-1428
Children spend significant amounts of time on the Internet. Recent studies showed that during these periods they are often not under adult supervision. This work presents an automatic approach to identifying suitable web pages for children based on topica...
     
Using flickr geotags to predict user travel behaviour
Found in: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval (SIGIR '10)
By Arjen P. de Vries, Maarten Clements, Marcel J.T. Reinders, Pavel Serdyukov
Issue Date:July 2010
pp. 851-852
We propose a method to predict a user's favourite locations in a city, based on his Flickr geotags in other cities. We define a similarity between the geotag distributions of two users based on a Gaussian kernel convolution. The geotags of the most similar...
     
Image annotation using clickthrough data
Found in: Proceeding of the ACM International Conference on Image and Video Retrieval (CIVR '09)
By Anastasios Delopoulos, Arjen P. de Vries, Christos Diou, Theodora Tsikrika
Issue Date:July 2009
pp. 1-8
Automatic image annotation using supervised learning is performed by concept classifiers trained on labelled example images. This work proposes the use of clickthrough data collected from search logs as a source for the automatic generation of concept trai...
     
Detecting synonyms in social tagging systems to improve content retrieval
Found in: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '08)
By Arjen P. de Vries, Maarten Clements, Marcel J.T. Reinders
Issue Date:July 2008
pp. 2-2
Collaborative tagging used in online social content systems is naturally characterized by many synonyms, causing low precision retrieval. We propose a mechanism based on user preference profiles to identify synonyms that can be used to retrieve more releva...
     
Relevance assessment: are judges exchangeable and does it matter
Found in: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '08)
By Arjen P. de Vries, Emine Yilmaz, Ian Soboroff, Nick Craswell, Paul Thomas, Peter Bailey
Issue Date:July 2008
pp. 2-2
We investigate to what extent people making relevance judgements for a reusable IR test collection are exchangeable. We consider three classes of judge: "gold standard" judges, who are topic originators and are experts in a particular information seeking t...
     
Unified relevance models for rating prediction in collaborative filtering
Found in: ACM Transactions on Information Systems (TOIS)
By Arjen P. de Vries, Jun Wang, Marcel J. T. Reinders
Issue Date:June 2008
pp. 1-42
Collaborative filtering aims at predicting a user's interest for a given item based on a collection of user profiles. This article views collaborative filtering as a problem highly related to information retrieval, drawing an analogy between the concepts o...
     
Using small XML elements to support relevance
Found in: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '06)
By Arjen P. de Vries, Georgina Ramirez, Thijs Westerveld
Issue Date:August 2006
pp. 693-694
Small XML elements are often estimated relevant by the retrieval model but they are not desirable retrieval units. This paper presents a generic model that exploits the information obtained from small elements. We identify relationships between small and r...
     
Unifying user-based and item-based collaborative filtering approaches by similarity fusion
Found in: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '06)
By Arjen P. de Vries, Jun Wang, Marcel J. T. Reinders
Issue Date:August 2006
pp. 501-508
Memory-based methods for collaborative filtering predict new ratings by averaging (weighted) ratings between, respectively, pairs of similar users or items. In practice, a large number of ratings from similar users or similar items are not available, due t...
     
The overlap problem in content-oriented XML retrieval evaluation
Found in: Proceedings of the 27th annual international conference on Research and development in information retrieval (SIGIR '04)
By Arjen P. de Vries, Gabriella Kazai, Mounia Lalmas
Issue Date:July 2004
pp. 72-79
Within the INitiative for the Evaluation of XML Retrieval (INEX) a number of metrics to evaluate the effectiveness of content-oriented XML retrieval approaches were developed. Although these metrics provide a solution towards addressing the problem of overl...
     
A case study on array query optimisation
Found in: Proceedings of the 1st international workshop on Computer vision meets databases (CVDB '04)
By Alex van Ballegooij, Arjen P. de Vries, Roberto Cornacchia
Issue Date:June 2004
pp. 3-10
The development of applications involving multi-dimensional data sets on top of a RDBMS raises several difficulties that are not directly related to the scientific problem being addressed. In particular, an additional effort is needed to solve the mismatch...
     
Efficient k-NN search on vertically decomposed data
Found in: Proceedings of the 2002 ACM SIGMOD international conference on Management of data (SIGMOD '02)
By Arjen P. de Vries, Martin Kersten, Niels Nes, Nikos Mamoulis
Issue Date:June 2002
pp. 322-333
Applications like multimedia retrieval require efficient support for similarity search on large data collections. Yet, nearest neighbor search is a difficult problem in high dimensional spaces, rendering efficient applications hard to realize: index struct...
     
The psychology of multimedia databases
Found in: Proceedings of the fifth ACM conference on Digital libraries (DL '00)
By Arjen P. de Vries, Mark G. L. M. van Doorn
Issue Date:June 2000
pp. 1-9
Multimedia information retrieval in digital libraries is a difficult task for computers in general. Humans on the other hand are experts in perception, concept representation, knowledge organization and memory retrieval. Cognitive psychology and sci...
     