
Information Retrieval Research Conference Adds Tracks on Health Records, Crowd-sourcing and Micro-blogging

From NIST Tech Beat: February 15, 2011


Contact: Evelyn Brown
301-975-5661

If you found this article through a search engine, you can thank an automated text retrieval system. For 20 years, the Text REtrieval Conference (TREC), sponsored by the National Institute of Standards and Technology (NIST), has been one of the major research efforts in the field. TREC is now seeking researchers to participate in this year's workshop series, which will add new tracks to study information retrieval in more challenging areas such as electronic healthcare records.

Applications are being accepted until May.

A recent economic impact study* prepared for NIST found that the NIST-led TREC project has significantly improved search engines' ability to retrieve electronic data. The report attributes about one-third of the web-search advances between 1999 and 2009 to TREC-related improvements and notes that those improvements may have saved up to 3 billion hours of web-search time.

TREC brings together scientists from academia and public and private-sector organizations to focus on improving information retrieval in specific areas. The groups develop algorithms to find information from large, challenging datasets often provided by NIST.

New tracks in 2011 include health records; crowd-sourcing, which is outsourcing tasks to a community through an open call; and "micro-blogging," which is blogging smaller items and information, such as status updates on social networking sites. Tracks continuing from 2010 are chemical, entity (non-documents), session, web and the growing legal track.

"We are searching for concepts that are well beyond the simple string-of-words technique," explained TREC organizer Ellen Voorhees. "For example, in electronic health care records, we want to find data beyond what is in the typical fields for case number and patient name. We are interested in finding information in the care providers' notes that could, for example, use many different phrases to describe that a patient has a history of smoking." Those phrases could include "heavy tobacco use" or "occasional smoker," she added.
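To make the distinction concrete, here is a minimal Python sketch contrasting a literal string search with a search over several phrasings of one concept. It is an illustration only, not a TREC method: the sample notes, the `SMOKING_PATTERNS` list and both function names are invented for this example, and a hand-written pattern list stands in for whatever concept vocabulary a real system would use.

```python
import re

# Hypothetical care-provider notes, invented for illustration only.
NOTES = {
    "note-1": "Patient reports heavy tobacco use for 20 years.",
    "note-2": "Occasional smoker; denies alcohol.",
    "note-3": "History of smoking, quit in 2005.",
    "note-4": "No known allergies. Non-contributory family history.",
}

# Assumed, hand-picked phrase variants for the concept "smoking history".
SMOKING_PATTERNS = [r"\bsmok(?:er|ing|es|ed)\b", r"\btobacco use\b"]


def literal_match(notes, query):
    """Return ids of notes that contain the exact query string, ignoring case."""
    q = query.lower()
    return sorted(nid for nid, text in notes.items() if q in text.lower())


def concept_match(notes, patterns):
    """Return ids of notes that match any of the phrase-variant patterns."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return sorted(
        nid for nid, text in notes.items()
        if any(c.search(text) for c in compiled)
    )


if __name__ == "__main__":
    print(literal_match(NOTES, "history of smoking"))
    print(concept_match(NOTES, SMOKING_PATTERNS))
```

On these sample notes the literal search finds only the note that uses the exact wording, while the pattern list also catches "heavy tobacco use" and "occasional smoker." The sketch ignores negation (a note reading "non-smoker" would still match), which is one reason concept-level retrieval is harder than matching a string of words.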

Results of the studies are shared at a workshop in November at NIST's Gaithersburg, Md., campus. TREC research is precompetitive and results are shared openly.

For more information on TREC and how to participate, see trec.nist.gov.

Note: The author relied heavily on web-based information retrieval for this article.

* B.R. Rowe, D.W. Wood, A.N. Link and D.A. Simoni. Economic Impact Assessment of NIST's Text REtrieval Conference (TREC) Program. RTI Project Number 0211875, July 2010. Available online at: trec.nist.gov/pubs/2010.economic.impact.pdf.