Curated variation benchmarks for challenging medically relevant autosomal genes

Published

Author(s)

Justin Wagner, Nathanael David Olson, Lindsay Harris, Jennifer McDaniel, Fritz Sedlazeck, Chen-Shan Chin, Justin Zook

Abstract

The repetitive nature and complexity of some medically relevant genes pose a challenge for their accurate analysis in a clinical setting. The Genome in a Bottle Consortium has provided variant benchmark sets, but these exclude nearly 400 medically relevant genes due to their repetitiveness or polymorphic complexity. Here, we characterize 273 of these 395 challenging autosomal genes using a haplotype-resolved whole-genome assembly. This curated benchmark reports over 17,000 single-nucleotide variations, 3,600 insertions and deletions and 200 structural variations each for human genome reference GRCh37 and GRCh38 across HG002. We show that false duplications in either GRCh37 or GRCh38 result in reference-specific, missed variants for short- and long-read technologies in medically relevant genes, including CBS, CRYAA and KCNE1. When masking these false duplications, variant recall can improve from 8% to 100%. Forming benchmarks from a haplotype-resolved whole-genome assembly may become a prototype for future benchmarks covering the whole genome.

Citation

Nature Biotechnology, Volume 40

Keywords

DNA sequencing, genomics, bioinformatics

Citation

Wagner, J., Olson, N., Harris, L., McDaniel, J., Sedlazeck, F., Chin, C. and Zook, J. (2022), Curated variation benchmarks for challenging medically relevant autosomal genes, Nature Biotechnology, [online], https://doi.org/10.1038/s41587-021-01158-1, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=932579 (Accessed January 28, 2025)

Created February 7, 2022, Updated November 29, 2022