Speech Analytics

Summary:

The Multimodal Information Group's speech analytics program has a long history of supporting both the development of technologies that extract content from language-based recordings and advances in the associated metrology, primarily through systematic, targeted annual evaluations.

Since 1987, the Multimodal Information Group has coordinated a series of speech transcription technology evaluations covering many aspects of language production, including domain of discourse, source language, transcription, keyword search, speech/non-speech segmentation (speech activity detection), and disfluency detection.

Description:

Current speech analytics work:

  • OpenSAD: A Speech Activity Detection (SAD) system finds the regions of speech in an audio file. The NIST Open Speech-Activity-Detection evaluation (OpenSAD) provides SAD system developers with an independent evaluation of performance on a variety of audio data. OpenSAD is a counterpart of the DARPA RATS SAD evaluations, but is open to all interested participants.
  • OpenKWS: An annual evaluation of keyword search technologies, conducted on a new language each year. The evaluation is an outgrowth of the 2006 Spoken Term Detection evaluation.
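
To make the SAD task described above concrete, the following is a minimal sketch of a speech activity detector that marks a frame as speech when its mean energy exceeds a fixed threshold. It is an illustration only, not the OpenSAD scoring method or any participant's system. The function name detect_speech_regions, the 20 ms frame length, and the threshold value are all assumptions chosen for the example; practical systems typically use trained models and smoothing rather than a single fixed threshold.

```python
from typing import List, Sequence, Tuple


def detect_speech_regions(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 20,              # assumed frame length, not from the source
    energy_threshold: float = 0.01,  # assumed threshold, not from the source
) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) spans whose mean frame energy exceeds the threshold."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    regions: List[Tuple[float, float]] = []
    start = None  # sample index where the currently open speech region began

    for begin in range(0, len(samples), frame_len):
        frame = samples[begin:begin + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        is_speech = energy > energy_threshold
        if is_speech and start is None:
            start = begin  # a speech region opens at this frame
        elif not is_speech and start is not None:
            regions.append((start / sample_rate, begin / sample_rate))
            start = None   # the speech region closed at this frame

    if start is not None:
        # The audio ended while a speech region was still open.
        regions.append((start / sample_rate, len(samples) / sample_rate))
    return regions


# Synthetic input: 0.1 s of silence, 0.2 s of constant "signal", 0.1 s of silence.
audio = [0.0] * 100 + [0.5] * 200 + [0.0] * 100
print(detect_speech_regions(audio, sample_rate=1000))  # [(0.1, 0.3)]
```

The output is a list of time spans, which mirrors the task definition: given an audio file, report where speech occurs.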

Past speech analytics work:

  • Rich Transcription: The Rich Transcription evaluation series promotes and gauges advances in the state of the art in several automatic speech recognition technologies. The goal of the evaluation series is to create recognition technologies that produce transcriptions that are more readable by humans and more useful for machines.

Lead Organizational Unit:

ITL