The 2016 NIST Speaker Recognition Evaluation

Seyed Omid Sadjadi; Timothée N. Kheyrkhah; Audrey N. Tong; Craig S. Greenberg; Douglas A. Reynolds; Elliot Singer; Lisa Mason; Jaime Hernandez-Cordero

Published

August 20, 2017

Abstract

In 2016, NIST conducted the most recent in an ongoing series of speaker recognition evaluations (SREs) to foster research in robust text-independent speaker recognition and to measure the performance of current state-of-the-art systems, targeting in particular domain and language mismatch scenarios. Compared with previous SREs, the 2016 evaluation introduced several new aspects: i) an entirely online evaluation platform, ii) fixed and specified training data, iii) a wider range of test segment durations (uniformly distributed between 10 s and 60 s), and iv) labeled and unlabeled development (a.k.a. validation) sets for system hyperparameter tuning. Both the development and evaluation sets contained conversational telephony speech (CTS) collected outside North America, spoken in Tagalog and Cantonese (referred to as the major languages) as well as Cebuano and Mandarin (referred to as the minor languages). A total of 66 research organizations from industry and academia registered for the 2016 SRE, of which 43 teams submitted 121 valid system outputs that produced scores. The evaluation results indicated that several factors, including domain/channel, language, and duration mismatch, had a significant impact on performance. Effective use of the labeled and unlabeled development sets appeared to be essential for many top-performing systems. Finally, although mega fusion systems achieved the best performance, the top single systems achieved 90% of that performance.

Conference Dates

August 20-24, 2017

Conference Location

Stockholm

Conference Title

Interspeech 2017

Pub Type

Conferences

Keywords

NIST evaluation, NIST SRE, speaker detection, speaker recognition, speaker verification

Image and signal processing, Human language technology, Experiment design and Artificial intelligence

Citation

Sadjadi, S., Kheyrkhah, T., Tong, A., Greenberg, C., Reynolds, D., Singer, E., Mason, L. and Hernandez-Cordero, J. (2017), The 2016 NIST Speaker Recognition Evaluation, Interspeech 2017, Stockholm, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=922849 (Accessed April 19, 2025)

Created August 20, 2017, Updated February 27, 2020
