The goal of the OpenASR (Open Automatic Speech Recognition) Challenge is to assess the state of the art of ASR technologies for low-resource languages.
The OpenASR Challenge is an open challenge created out of the IARPA (Intelligence Advanced Research Projects Activity) MATERIAL (Machine Translation for English Retrieval of Information in Any Language) program, which encompasses additional tasks including cross-language information retrieval (CLIR), domain classification, and summarization. See also NIST's MATERIAL page.
For every year of MATERIAL, NIST supports a simplified, smaller scale evaluation open to all, focusing on a particular technology aspect of MATERIAL. CLIR technologies were the focus of the first open challenge in 2019, OpenCLIR. Since 2020, the focus has been on ASR. The capabilities tested in the open challenges are expected to ultimately support the MATERIAL task of effective triage and analysis of large volumes of text and audio content in a variety of less-studied languages.
Please email openasr_poc [at] nist [dot] gov with any questions or comments regarding the OpenASR Challenge.
The second OpenASR Challenge associated with MATERIAL, OpenASR21, opened for registration on August 9, 2021, with an evaluation period in November 2021. OpenASR21 features ASR evaluation opportunities for 15 low-resource languages:
For the languages from OpenASR20, the same evaluation datasets from 2020 will be used, consisting of conversational telephone speech (CTS) data. For the five new languages, the main evaluation dataset will also consist of CTS data. These datasets will be scored (where applicable) case-insensitively.
New for OpenASR21 is case-sensitive scoring for three of the new languages, as indicated below. For these languages, case-sensitive scoring will be performed on system output from separate evaluation datasets drawn from a mix of genres, in order to assess low-resource ASR performance specifically on proper nouns.
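The difference between the two scoring modes can be illustrated with a minimal word error rate (WER) sketch. This is not NIST's official scoring pipeline (which uses sclite-style alignment and normalization); the function names and the toy sentences are illustrative only:

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance: substitutions + insertions + deletions."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, 1):
        cur = [i]
        for j, h in enumerate(hyp_words, 1):
            cur.append(min(prev[j] + 1,             # deletion
                           cur[j - 1] + 1,          # insertion
                           prev[j - 1] + (r != h))) # substitution (0 if words match)
        prev = cur
    return prev[-1]

def word_error_rate(ref, hyp, case_sensitive=False):
    """WER = word-level edits divided by reference length.

    Case-insensitive mode lowercases both sides before comparison,
    so 'President' vs. 'president' is not counted as an error.
    """
    if not case_sensitive:
        ref, hyp = ref.lower(), hyp.lower()
    ref_words, hyp_words = ref.split(), hyp.split()
    return edit_distance(ref_words, hyp_words) / len(ref_words)
```

Under case-sensitive scoring, a hypothesis that gets a proper noun right except for capitalization is penalized; under case-insensitive scoring it is not.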
OpenASR21 languages:
OpenASR21 will be implemented as a track of NIST’s OpenSAT (Open Speech Analytic Technologies) evaluation series, using the OpenSAT web server for registration, data access, submission, and scoring purposes.
For more details, please refer to the OpenASR21 Challenge Evaluation Plan in the Documentation and Resources section below.
| Milestone | Date |
|---|---|
| Evaluation plan release | July 2021 |
| Registration period | August 9 – October 15, 2021 |
| Development period | August 9 – November 2, 2021 (potentially longer but excluding the evaluation period) |
| - Build and Dev datasets release | August 9, 2021 |
| - Scoring server accepts submissions for Dev datasets | August 30 – November 2, 2021 (potentially longer but excluding the evaluation period) |
| Registration closes | October 15, 2021 |
| Evaluation period | November 3 – 10, 2021 |
| - Release of Eval datasets | November 3, 2021 |
| - Scoring server accepts submissions | November 4 – 10, 2021 |
| - System output due at NIST | November 10, 2021 |
| System description due at NIST | November 19, 2021 |
Registration opened on August 9, 2021. Please register via the OpenSAT web server.
The OpenASR21 evaluation was conducted in November 2021. Please see the OpenASR21 Challenge Results page.
OpenASR21 participants, as well as anyone else working in the low-resource ASR problem space, are strongly encouraged to submit their work to the preliminarily accepted Low-Resource ASR Development special session at INTERSPEECH 2022. Please see the Call for Papers.
The first OpenASR Challenge associated with MATERIAL, OpenASR20, was opened for registration in July 2020, with an evaluation period in November 2020. It featured ASR evaluation opportunities for these ten low-resource languages:
It was implemented as a track of NIST’s OpenSAT (Open Speech Analytic Technologies) evaluation series, using the OpenSAT web server for registration, data access, submission, and scoring purposes.
The evaluation plan posted in the Documentation and Resources section below describes the OpenASR20 Challenge in detail.
Registration for the OpenASR20 Challenge is now closed.
The OpenASR20 evaluation was conducted in November 2020. Please see the OpenASR20 Challenge Results page.
OpenASR20 participants were strongly encouraged to submit their work to the OpenASR20 and Low-Resource ASR Development Special Session at INTERSPEECH 2021 (see the Call for Papers). The special session also welcomed contributions from others working in the low-resource ASR problem space who did not participate in OpenASR20.