The Effect of Assessor Errors on IR System Evaluation
Published
Author(s)
Ben Carterette, Ian Soboroff
Abstract
Recent efforts in test collection building have focused on scaling back the number of necessary relevance judgments and then scaling up the number of search topics. Since the largest source of variation in a Cranfield-style experiment comes from the topics, this is a reasonable approach. However, as topic set sizes grow, and researchers look to crowdsourcing and Amazon's Mechanical Turk to collect relevance judgments, we are faced with issues of quality control. This paper examines the robustness of the TREC Million Query track methods when some assessors make significant and systematic errors. We find that while averages are robust, assessor errors can have a large effect on system rankings.
Proceedings Title
Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Carterette, B. and Soboroff, I. (2010), The Effect of Assessor Errors on IR System Evaluation, Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Geneva, CH (Accessed October 31, 2024)