
Search Publications by: Ian Soboroff (Fed)

Displaying 51 - 69 of 69

Overview of the TREC 2006 Enterprise Track

February 25, 2008
Author(s)
Ian M. Soboroff, Arjen de Vries, Nick Craswell
The goal of the enterprise track is to conduct experiments with enterprise data --- intranet pages, email archives, document repositories --- that reflect the experiences of users in real organizations, such that for example, an email ranking technique

Overview of the TREC 2006 Blog Track

November 27, 2007
Author(s)
Iadh Ounis, Maarten de Rijke, Craig Macdonald, Gilad Mishne, Ian Soboroff
The Blog track began this year, with the aim of exploring information seeking behaviour in the blogosphere. For this purpose, a new large-scale test collection, namely the TREC Blog06 collection, has been created. In the first pilot run of the track in

A Comparison of Pooled and Sampled Relevance Judgments

August 29, 2007
Author(s)
Ian M. Soboroff
Test collections are most useful when they are reusable, that is, when they can be reliably used to rank systems that did not contribute to the pools. Pooled relevance judgments for very large collections may not be reusable for two reasons: they will be

The TREC 2005 Terabyte Track

August 27, 2007
Author(s)
Charles L. Clarke, Falk Scholer, Ian Soboroff
The Terabyte Track explores how retrieval and evaluation techniques can scale to terabyte-sized collections, examining both efficiency and effectiveness issues. TREC 2005 is the second year for the track. The track was introduced as part of TREC 2004, with

Problems with Kendall's Tau

July 23, 2007
Author(s)
Mark Sanderson, Ian Soboroff
Test collections are most useful when they are reusable, that is, when they can be reliably used to rank systems that did not contribute to the pools. Pooled relevance judgments for very large collections may not be reusable for two reasons: they will be

Reliable Information Retrieval Evaluation With Incomplete and Biased Judgements

July 23, 2007
Author(s)
Stefan Buttcher, Charles L. Clarke, Peter C. Yeung, Ian Soboroff
Information retrieval evaluation based on the pooling method is inherently biased against systems that did not contribute to the pool of judged documents. This may distort the results obtained about the relative quality of the systems evaluated and thus

Bias and the Limits of Pooling for Large Collections

July 17, 2007
Author(s)
C E. Buckley, Darrin L. Dimmick, Ian Soboroff, Ellen M. Voorhees
Modern retrieval test collections are built through a process called pooling in which only a sample of the entire document set is judged for each topic. The idea behind pooling is to find enough relevant documents such that when unjudged documents are

Dynamic Test Collections: Measuring Search Effectiveness on the Live Web

January 22, 2007
Author(s)
Ian M. Soboroff
Existing methods for measuring the quality of search algorithms use a static collection of documents. A set of queries and a mapping from the queries to the relevant documents allow the experimenter to see how well different search engines or engine

Overview of the TREC-2005 Enterprise Track

October 16, 2006
Author(s)
Nick Craswell, Arjen P. de Vries, Ian Soboroff
The goal of the enterprise track is to conduct experiments with enterprise data -- intranet pages, email archives, document repositories -- that reflect the experiences of real users in real organizations, such that, for example, an email ranking technique

Overview of the TREC 2004 Terabyte Track

October 3, 2005
Author(s)
Charles L. Clarke, Nick Craswell, Ian Soboroff
The Terabyte Track explores how adhoc retrieval and evaluation techniques can scale to terabyte-sized collections. For TREC 2004, our first year, 50 new adhoc topics were created and evaluated over a 426GB collection of 25 million documents taken from the

Novelty Detection: The TREC Experience

October 1, 2005
Author(s)
Ian M. Soboroff, Donna K. Harman
A challenge for search systems is to detect not only when an item is relevant to the user's information need, but also when it contains something new which the user has not seen before. In the TREC novelty track, the task was to highlight sentences

Overview of the TREC 2004 Novelty Track

August 1, 2005
Author(s)
Ian M. Soboroff
TREC 2004 marks the third and final year for the Novelty Track. The task is as follows: Given a TREC topic and an ordered list of documents, systems must find relevant and novel sentences that should be returned to the user from this set. This task

Building a Filtering Test Collection for TREC 2002

July 28, 2003
Author(s)
Ian M. Soboroff, S E. Robertson
Test collections for the filtering track in TREC have typically used either past sets of relevance judgments, or categorized collections such as Reuters Corpus Volume 1 or OHSUMED, because filtering systems need relevance judgments during the experiment

Building a Filtering Test Collection for TREC 2002

July 1, 2003
Author(s)
Ian Soboroff, S E. Robertson
Test collections for the filtering track in TREC have typically used either past sets of relevance judgments, or categorized collections such as Reuters Corpus Volume 1 or OHSUMED, because filtering systems need relevance judgments during the experiment

The TREC-2002 Filtering Track Report

April 1, 2003
Author(s)
S E. Robertson, Ian Soboroff
The TREC-11 filtering track measures the ability of systems to build persistent user profiles which successfully separate relevant and non-relevant documents in an incoming stream. It consists of three major subtasks: adaptive filtering, batch filtering

The TREC-2001 Filtering Track Report

April 1, 2002
Author(s)
S E. Robertson, Ian Soboroff
The TREC-10 filtering track measures the ability of systems to build persistent user profiles which successfully separate relevant and non-relevant documents. It consists of three major subtasks: adaptive filtering, batch filtering, and routing. In