
Machine Translation Evaluation


The Multimodal Information Group's machine translation (MT) program includes several activities contributing to machine translation technology and metrology advancements, primarily through systematic and targeted annual evaluations.

Since 2002, the Multimodal Information Group has coordinated evaluations of text-to-text MT technology through our OpenMT series. NIST-led open evaluations such as OpenMT provide a test bed for experimenting with evaluation techniques that may then be applied to sponsored MT technology evaluations. Similarly, NIST's Metrics for Machine Translation Challenge (MetricsMaTr) provides a forum to research and promote innovative techniques that advance the measurement sciences used in MT evaluations. Participation in these open evaluation activities is open to all researchers who find the tasks of interest and are able to abide by the particular evaluation's task protocols and rules.

NIST also organizes and implements evaluations for sponsored programs that focus on specific aspects of MT, such as the DARPA BOLT and DARPA MADCAT programs.


Current MT technology evaluation activities:

  • OpenMT: A biannual NIST evaluation of text-to-text MT technology. Focus is placed on the core task of MT, where lessons learned are applicable to other types of MT technology.
  • MADCAT: The Multilingual Automatic Document Classification Analysis and Translation program is a DARPA-sponsored program to evaluate technologies that translate Arabic document images into English text. NIST evaluates the overall system as well as the major components (OCR and MT) of the system.
  • OpenHaRT: The NIST Handwriting Recognition and Translation evaluation focuses on evaluating technologies that contribute to document understanding with emphasis on core tasks such as recognition and translation.
  • BOLT: The Broad Operational Language Translation evaluation is a DARPA-sponsored program to evaluate technologies that translate and extract information as well as facilitate bilingual communication.

Current MT metrology evaluation activities:

  • MetricsMaTr: A biannual evaluation of MT metrology. Focus is placed on improving automated measurement techniques for MT technology, specifically toward providing insight into the quality of a translation.
  • MFLTS: This activity is sponsored by the US Army Machine Foreign Language Translation System program. NIST chairs the Metrics-IPT working group, whose task is to develop a new metric grounded in the Interagency Language Roundtable (ILR) rating system.
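
The automated measurement techniques studied in activities like MetricsMaTr typically score system output against human reference translations. As an illustration only (this is a generic BLEU-style sketch, not a NIST metric), a minimal sentence-level score based on clipped n-gram precision and a brevity penalty might look like:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Counts of all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu_sketch(candidate, reference, max_n=4):
    """Sentence-level BLEU-style score: geometric mean of clipped
    n-gram precisions (n = 1..max_n) times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand, n)
        ref_ngrams = ngrams(ref, n)
        overlap = sum((cand_ngrams & ref_ngrams).values())  # clipped match counts
        total = max(sum(cand_ngrams.values()), 1)
        # Smooth zero precisions so the log is defined.
        log_prec_sum += math.log(max(overlap, 1e-9) / total)
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_prec_sum / max_n)
```

Real evaluation metrics add refinements this sketch omits (multiple references, corpus-level aggregation, principled smoothing), but the core idea is the same: a repeatable, automatic comparison of a translation against trusted human references.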

Past MT evaluation activities:

  • TRANSTAC: The Spoken Language Communication and Translation System for Tactical Use project was a DARPA-sponsored program to develop and field speech-to-speech MT technology, enabling two-way spoken communication between U.S. Soldiers and Marines (speaking only English) and civilian populations who speak only other languages. NIST evaluated the performance of the TRANSTAC systems, including the systems as a whole, as well as their speech recognition, machine translation, and text-to-speech components.
  • GALE: The Global Autonomous Language Exploitation program was a five-year DARPA program that included both an MT and a Distillation component. NIST organized and implemented GALE's yearly evaluations for speech-to-text and text-to-text MT technology.
