Examining the Limits of Crowdsourcing for Relevance Assessment
July-Aug. 2013 (vol. 17 no. 4)
pp. 32-38
Paul Clough, University of Sheffield
Mark Sanderson, RMIT University
Jiayu Tang, Alibaba.com
Tim Gollins, The National Archives, UK
Amy Warner, Royal Holloway, University of London
Evaluation is instrumental to developing and managing effective information retrieval systems, and crowdsourcing has proven a viable way to gather the relevance judgments this process requires. Less well understood, however, are crowdsourcing's limits for evaluation, particularly in domain-specific search. The authors compare relevance assessments gathered through crowdsourcing with those from a domain expert to evaluate different search engines over a large government archive. Although the crowdsourced judgments rank the tested search engines in the same order as the expert judgments, crowdsourced workers appear unable to distinguish between levels of highly accurate search results the way an expert assessor can.
Index Terms:
Performance evaluation, Navigation, Search engines, System analysis and design, Internet, Crowdsourcing, Search methods, Information retrieval, crowdsourcing, information search and retrieval, performance of systems
Citation:
Paul Clough, Mark Sanderson, Jiayu Tang, Tim Gollins, Amy Warner, "Examining the Limits of Crowdsourcing for Relevance Assessment," IEEE Internet Computing, vol. 17, no. 4, pp. 32-38, July-Aug. 2013, doi:10.1109/MIC.2012.95