High-stakes decisions affecting public health and safety are increasingly made using microbial genomic sequencing data. For example, whole genome sequencing of microbial pathogens is used to identify the source of contaminated foods, and in some cases these data have been used to prosecute negligent food manufacturers. As the stakes increase, so does the required level of confidence in the measurement. A microbial genomic DNA reference material, RM 8375 (MG001-MG004), was developed to help advance the measurement assurance of microbial genomic sequencing and of DNA sequencing in general.
RM 8375 is a stable and homogeneous material intended for whole genome sequencing quality control and proficiency testing. The genomic DNA is intended to be analyzed in the same way a laboratory would analyze any other extracted DNA sample, for example with genome assembly or variant calling bioinformatics pipelines (Olson et al. 2015). Because the RM is extracted DNA, it cannot be used to assess pre-analytical steps such as DNA extraction.
RM 8375 consists of vials of genomic DNA from four different bacterial strains, chosen to span a range of genome sizes and GC contents in order to challenge DNA sequencing technologies.
ID | Strain | Biosample | Size | GC (%) |
---|---|---|---|---|
MG001 | Salmonella enterica LT2 | SAMN02854572 | 4.8 Mb | 52 |
MG002 | Staphylococcus aureus | SAMN02854573 | 2.8 Mb | 33 |
MG003 | Pseudomonas aeruginosa | SAMN02854574 | 6.3 Mb | 67 |
MG004 | Clostridium sporogenes | SAMN02854575 | 4.1 Mb | 28 |
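As a quick sanity check, a laboratory could compare the total length and GC content of its own assembly against the values in the table above. The following is a minimal sketch, not part of the RM documentation: the function names and both tolerances are hypothetical, and the expected values are the rounded figures from the table, so the comparison is only approximate.

```python
# Expected values copied from the table above (size in Mb, GC in percent).
EXPECTED = {
    "MG001": {"size_mb": 4.8, "gc_pct": 52},
    "MG002": {"size_mb": 2.8, "gc_pct": 33},
    "MG003": {"size_mb": 6.3, "gc_pct": 67},
    "MG004": {"size_mb": 4.1, "gc_pct": 28},
}


def gc_percent(seq: str) -> float:
    """Return GC content as a percentage of the unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total


def check_assembly(rm_id, contigs, size_tol=0.05, gc_tol=1.0):
    """Compare an assembly (list of contig strings) with the table values.

    size_tol is a relative tolerance and gc_tol is in percentage points;
    both are illustrative assumptions, not certified limits.
    """
    expected = EXPECTED[rm_id]
    size_mb = sum(len(c) for c in contigs) / 1e6
    gc = gc_percent("".join(contigs))
    return {
        "size_mb": size_mb,
        "gc_pct": gc,
        "size_ok": abs(size_mb - expected["size_mb"]) <= size_tol * expected["size_mb"],
        "gc_ok": abs(gc - expected["gc_pct"]) <= gc_tol,
    }
```

In practice the contigs would be read from a FASTA file; a large deviation in either value is a prompt to look for contamination or an incomplete assembly rather than a definitive failure.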
The materials were characterized for genome purity and homogeneity using orthogonal sequencing technologies. First, a genome assembly was constructed from long read sequencing data. The assembly was then validated using optical mapping and short read sequencing data. Next, the short read data were used to characterize the base-level purity of the genome sequence and the genomic purity of the material (the presence of genomic DNA from contaminants). Additionally, the stability of the genomic material was assessed using pulsed-field gel electrophoresis following an extended incubation at 37 °C.
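One simple way to think about base-level purity is as the fraction of short reads at a position that agree with the reference base. The sketch below illustrates that idea only. It is not the method used to characterize RM 8375; the per-position count format, the function names, and the 0.97 threshold are all assumptions made for illustration.

```python
def base_purity(counts: dict, ref_base: str) -> float:
    """Fraction of observed bases at one position that match the reference base."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no read coverage at this position")
    return counts.get(ref_base, 0) / total


def low_purity_positions(pileup, threshold=0.97):
    """Return (position, purity) pairs whose purity falls below the threshold.

    pileup is an iterable of (position, ref_base, counts) tuples, where
    counts maps each observed base to its read count. The threshold is an
    arbitrary illustrative value.
    """
    flagged = []
    for pos, ref_base, counts in pileup:
        purity = base_purity(counts, ref_base)
        if purity < threshold:
            flagged.append((pos, purity))
    return flagged
```

Positions flagged this way could reflect true variation within the material, assembly errors, or systematic sequencing errors, so they would need to be examined with a second, orthogonal data set before drawing conclusions.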
A bioinformatic tool, PEPR (Pipelines for Evaluating Prokaryotic References), was used to ensure that RM 8375 was characterized in a reproducible and transparent manner (Olson et al. 2016).
The material is available as of Fall 2016 and can be obtained from the NIST SRM site, http://www.nist.gov/srm/. We ask that any sequence data generated using the material be submitted to the NCBI Sequence Read Archive with the appropriate BioSample accession number, so that we can use the data to help improve the characterized reference genome.
RM Characterization Methods