Search Publications by: Justin Zook (Fed)


The Platinum Pedigree: A long-read benchmark for genetic variants

October 3, 2024
Author(s)
Zev Kronenberg, Nathanael Olson, Justin Zook, Michael Eberle
Recent advances in genome sequencing have improved variant calling in complex regions of the human genome. However, it is difficult to quantify variant calling performance because existing standards often focus on specificity, neglecting completeness in

Genome-wide profiling of genetic variation at tandem repeats from long reads

July 4, 2024
Author(s)
Helyaneh Jam, Justin Zook, Sara Javadzadeh, Jonghun Park, Aarushi Sehgal, Melissa Gymrek
Tandem repeats are frequent across the human genome, and variation in repeat length has been linked to a variety of traits. Recent improvements in long read sequencing technologies have the potential to greatly improve TR analysis, especially for long or

Cybersecurity of Genomic Data

December 20, 2023
Author(s)
Ronald Pulivarti, Natalia Martin, Frederick R. Byers, Justin Wagner, Justin Zook, Samantha Maragh, Jennifer McDaniel, Kevin Wilson, Martin Wojtyniak, Brett Kreider, Ann-Marie France, Sallie Edwards, Tommy Morris, Jared Sheldon, Scott Ross, Phillip Whitlow
Genomic data has enabled the rapid growth of the U.S. bioeconomy and is valuable to the individual, industry, and government because it has multiple intrinsic properties that in combination make it different from other types of high value data which

The complete sequence of a human Y chromosome

August 23, 2023
Author(s)
Arang Rhie, Sergey Nurk, Monika Cechova, Savannah Hoyt, Dylan Taylor, Nathanael David Olson, Justin Zook, Adam Phillippy
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1,2,3]. As a result, more than half of the Y chromosome is

A Draft Human Pangenome Reference

May 10, 2023
Author(s)
Wen-Wei Liao, Mobin Asri, Jana Ebler, Jennifer McDaniel, Nathanael David Olson, Justin Wagner, Justin Zook, Erik Garrison, Tobias Marschall, Ira Hall, Heng Li, Benedict Paten
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals [1]. These assemblies cover more than 99% of the

Variant calling and benchmarking in an era of complete human genome sequences

April 14, 2023
Author(s)
Nathanael David Olson, Justin Wagner, Nathan Dwarshuis, Karen Miga, Marc L. Salit, Justin Zook
Genetic variant calling from DNA sequencing has enabled understanding of germline variation in hundreds of thousands of humans. Sequencing technologies and variant-calling methods have advanced rapidly, routinely providing reliable variant calls in most of

FixItFelix: improving genomic analysis by fixing reference errors

February 21, 2023
Author(s)
Sairam Behera, Jonathan LeFaive, Peter Orchard, Justin Zook, Fritz Sedlazeck
The current version of the human reference genome, GRCh38, contains a number of errors including 1.2 Mbp of falsely duplicated and 8.04 Mbp of collapsed regions. These errors impact the variant calling of 33 protein-coding genes, including 12 with medical

Semi-automated assembly of high-quality diploid human reference genomes

October 19, 2022
Author(s)
Erich Jarvis, Giulio Formenti, Jennifer McDaniel, Nathanael David Olson, Justin Wagner, Justin Zook, Kerstin Howe, Karen Miga
The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of

Benchmarking challenging small variants with linked and long reads

May 11, 2022
Author(s)
Justin Wagner, Nathanael David Olson, Lindsay Harris, Marc L. Salit, Fritz Sedlazeck, Chunlin Xiao, Justin Zook
Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling and sequencing methods. Here we use accurate linked and long reads to expand benchmarks in 7 samples to include difficult-to-map

A complete reference genome improves analysis of human genetic variation

April 1, 2022
Author(s)
Sergey Aganezov, Stephanie Yan, Daniela Soto, Melanie Kirsche, Samantha Zarate, Justin Wagner, Jennifer McDaniel, Nathanael David Olson, Rajiv McCoy, Megan Dennis, Justin Zook, Michael Schatz
Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show

Complete genomic and epigenetic maps of human centromeres

April 1, 2022
Author(s)
Nicolas Altemose, Glennis Logsdon, Andrey Bzikadze, Pragya Sidhwani, Sasha Langley, Gina Caldas, Justin Zook, Ivan Alexandrov, Karen Miga
Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a

The complete sequence of a human genome

March 31, 2022
Author(s)
Sergey Nurk, Sergey Koren, Arang Rhie, Mikko Rautiainen, Jennifer McDaniel, Nathanael David Olson, Justin Wagner, Justin Zook, Evan Eichler, Karen Miga, Adam Phillippy
Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T)

Curated variation benchmarks for challenging medically relevant autosomal genes

February 7, 2022
Author(s)
Justin Wagner, Nathanael David Olson, Lindsay Harris, Jennifer McDaniel, Fritz Sedlazeck, Chen-Shan Chin, Justin Zook
The repetitive nature and complexity of some medically relevant genes poses a challenge for their accurate analysis in a clinical setting. The Genome in a Bottle Consortium has provided variant benchmark sets, but these exclude nearly 400 medically

Challenges of Accuracy in Germline Clinical Sequencing Data

July 20, 2021
Author(s)
Justin Zook, Ryan Poplin, Mark DePristo
Physicians are increasingly using clinical sequencing tests to establish diagnoses of patients who might have genetic disorders, which means that accuracy of sequencing and interpretation are important elements in ensuring the benefits of genetic testing

One in seven pathogenic variants can be challenging to detect by NGS: an analysis of 450,000 patients with implications for clinical sensitivity and genetic test implementation

May 18, 2021
Author(s)
Stephen Lincoln, Tina Hambuch, Justin Zook, Sara Bristow, Kathryn Hatchell, Rebecca Truty, Michael Kennemer, Brian Shirts, Andrew Fellowes, Shimul Chowdhury, Eric Klee, Shazia Mahamdallie, Megan Cleveland, Peter Vallone, Yan Ding, Sheila Seal, Wasanthi DeSilva, Farol Tomson, Catherine Huang, Russell Garlick, Nazneen Rahman, Marc L. Salit, Stephen Kingsmore, Matthew Ferber, Swaroop Aradhya, Robert Nussbaum
Next-generation sequencing (NGS) is widely used and cost-effective. However, depending on the specific methods used, NGS can have limitations with certain technically challenging variant types. These types are poorly represented in some validation studies

Chromosome-scale, haplotype-resolved assembly of human genomes

December 7, 2020
Author(s)
Justin Zook, Shilpa Garg, Heng Li
Haplotype-resolved or phased genome assembly provides a complete picture of genomes and their complex genetic variations. However, current algorithms for phased assembly either do not generate chromosome-scale phasing or require pedigree information, which