Genome editing technologies are transforming the biosciences and biotechnology and are actively used to advance product development, including in medicine. Terms and definitions in this field need to be standardized to support accurate communication of concepts, data, and results.
Standards in the field of genome editing will harmonize and accelerate effective communication, technology development, qualification, and evaluation of genome editing products. This lexicon was developed to provide a unified reference set of terms and technical definitions that standardizes their use and meaning to serve the needs of the biotechnology community. It is expected to improve confidence in and clarify scientific communication, data reporting, and data interpretation in the genome editing field.
It is recognized that in rare instances exceptions may exist for some definitions within specific applications.
This lexicon has been developed into an ISO Standard for Genome Editing Vocabulary: ISO 5058-1:2021 Biotechnology — Genome editing — Part 1: Vocabulary. It is also available as an ontology through BioPortal.
The definitions are worded with the intention that additional context may be added with supplementary language when they are used. It is also recognized that genome editing is a rapidly evolving biotechnology and additional terms and definitions will be needed as genome editing technologies mature.
This document provides a vocabulary that standardizes the use and meaning of terms associated with genome editing. This document is organized into categories and sub-categories as follows:
1. genome editing concepts
2. genome editing tools
3. genome editing outcomes
Terms within categories are listed alphabetically. In the Genome editing tools section, the sub-category “General” contains terms that apply to all types of genome editing tools. Additional sub-categories contain terms specific to a single type of genome editing technology: “CRISPR specific”, “meganuclease specific”, “megaTAL specific”, “TALEN specific” and “ZFN specific”. A glossary listing all terms alphabetically precedes the Terms and definitions.
Term | Term number |
Cas nuclease | 2.2.1 |
Cas nuclease target site | 2.2.2 |
CRISPR associated nuclease | 2.2.1 |
CRISPR RNA | 2.2.3 |
CRISPR target strand | 2.2.8 |
crRNA | 2.2.3 |
Cys2His2 zinc finger | 2.6.1 |
DNA edit | 3.1 |
DNA, RNA or epigenome edit | 3.1 |
edit | 3.1 |
epigenome edit | 3.1 |
gene editing | 1.1 |
genome editing | 1.2 |
genome editing off-target | 1.4 |
genome editing target | 1.6 |
genome editing target specificity | 1.5 |
genome engineering | 1.3 |
gRNA | 2.2.4 |
guide RNA | 2.2.4 |
HDR | 3.2 |
homology-directed repair | 3.2 |
indel | 3.3 |
InDel mutation | 3.3 |
intended edit | 3.4 |
meganuclease | 2.3.1 |
meganuclease linker | 2.3.2 |
meganuclease single chain | 2.3.3 |
meganuclease target site | 2.3.4 |
megaTAL | 2.4.1 |
megaTAL linker | 2.4.2 |
megaTAL target site | 2.4.3 |
microhomology-mediated end joining repair | 3.5 |
MMEJ | 3.5 |
NHEJ | 3.6 |
non-homologous end joining | 3.6 |
off-target | 1.4 |
PAM | 2.2.5 |
protospacer adjacent motif | 2.2.5 |
repair template | 2.1.1 |
repeat variable diresidue | 2.5.1 |
ribonucleoprotein | 2.2.6 |
RNA edit | 3.1 |
RNP | 2.2.6 |
RVDs | 2.5.1 |
sequence-specific nuclease | 2.1.3 |
sgRNA | 2.2.7 |
single-guide RNA | 2.2.7 |
site-directed DNA modification enzyme | 2.1.2 |
site-directed nuclease | 2.1.3 |
specificity | 1.5 |
TALEN | 2.5.2 |
TALEN linker | 2.5.3 |
TALEN target site | 2.5.4 |
target | 1.6 |
target strand | 2.2.8 |
tracrRNA | 2.2.9 |
trans-activating CRISPR RNA | 2.2.9 |
transcription activator-like effector nuclease | 2.5.2 |
unintended edit | 3.7 |
ZFN | 2.6.2 |
ZFN linker | 2.6.3 |
ZFN recognition helix | 2.6.4 |
ZFN target site | 2.6.5 |
ZFP | 2.6.6 |
zinc finger | 2.6.1 |
zinc finger nuclease | 2.6.2 |
zinc finger protein | 2.6.6 |
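Because an abbreviation and its full name share one term number in the glossary above, a lookup table can resolve synonyms to a single entry. The following is a minimal sketch, not part of the lexicon: the dictionary holds only a subset of the glossary rows, and the helper names `term_number` and `synonyms` are hypothetical.

```python
# Subset of the glossary rows above (term -> term number); not the full table.
GLOSSARY = {
    "Cas nuclease": "2.2.1",
    "CRISPR associated nuclease": "2.2.1",
    "gRNA": "2.2.4",
    "guide RNA": "2.2.4",
    "HDR": "3.2",
    "homology-directed repair": "3.2",
    "off-target": "1.4",
    "genome editing off-target": "1.4",
}


def term_number(term: str) -> str:
    """Return the term number for a glossary term (case-insensitive match)."""
    lowered = {name.lower(): number for name, number in GLOSSARY.items()}
    return lowered[term.lower()]


def synonyms(term: str) -> list[str]:
    """Return every glossary term that shares a term number with `term`."""
    number = term_number(term)
    return sorted(name for name, n in GLOSSARY.items() if n == number)
```

Keying the table by term rather than by number mirrors the glossary layout, where several rows can point to the same entry.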
gene editing
techniques for genome engineering (1.3) that involve nucleic acid damage, repair mechanisms, replication and/or recombination for incorporating site-specific modification(s) into a gene or genes
Note 1 to entry: Gene editing is a subclass of genome editing (1.2).
Note 2 to entry: There are various genome editing tools (see category 2, genome editing tools).
genome editing
techniques for genome engineering (1.3) that involve nucleic acid damage, repair mechanisms, replication and/or recombination for incorporating site-specific modification(s) into genomic DNA
Note 1 to entry: Gene editing (1.1) is a subclass of genome editing.
Note 2 to entry: There are various genome editing tools (see category 2, genome editing tools).
genome engineering
process of introducing intentional changes to genomic nucleic acid
Note 1 to entry: Gene editing (1.1) and genome editing (1.2) are techniques used in genome engineering.
off-target
genome editing off-target
genomic position and/or nucleic acid sequence distinct from the target (1.6)
EXAMPLE:
Off-target binding, off-target cleavage, off-target edit, off-target sequence change.
Note 1 to entry: An off-target edit is an example of an unintended edit (3.7).
specificity
genome editing target specificity
extent to which an editing agent or procedure acts only on its intended target (1.6)
Note 1 to entry: When using this term, the procedure is defined, the intended target is defined, the action or outcome is measured and reported, and limits of detection are reported.
target
genome editing target
nucleic acid sequence subject to intentional binding, modification and/or cleavage during a genome editing (1.2) process
Note 1 to entry: See also off-target (1.4), Cas nuclease target site (2.2.2), meganuclease target site (2.3.4), megaTAL target site (2.4.3), TALEN target site (2.5.4) and ZFN target site (2.6.5).
repair template
nucleic acid sequence used to direct cellular DNA repair pathways to incorporate specific DNA sequence changes at or near a target (1.6)
site-directed DNA modification enzyme
enzyme capable of modifying DNA at a specific sequence
EXAMPLE: Site-directed nuclease (2.1.3), site-directed adenosine deaminase.
site-directed nuclease
sequence-specific nuclease
enzyme capable of cleaving the phosphodiester bond between adjacent nucleotides in a nucleic acid polymer at a specific sequence
Cas nuclease
CRISPR associated nuclease
enzyme that is a component of CRISPR systems and is capable of breaking the phosphodiester bonds between nucleotides
EXAMPLE:
Cas3, Cas9, Cas12a, Cas13, CasX.
Note 1 to entry: Some but not all Cas nucleases interact with a gRNA (2.2.4). See also crRNA (2.2.3), sgRNA (2.2.7) and tracrRNA (2.2.9).
Cas nuclease target site
nucleotide sequence comprising the PAM (2.2.5), in most cases, and a region that hybridizes to the target sequence specific guide of a Cas RNP (2.2.6)
crRNA
CRISPR RNA
polyribonucleotide that includes sequence complementarity to the target (1.6) and a sequence that interacts with a Cas protein and optionally tracrRNA (2.2.9)
Note 1 to entry: crRNA is a component of gRNA (2.2.4) or a complete gRNA, depending on the CRISPR system.
Note 2 to entry: In some CRISPR systems, a portion of the crRNA will base-pair with the tracrRNA (e.g. Cas9). Other CRISPR systems lack tracrRNA (e.g. Cas12a/Cpf1). In systems that do not require tracrRNA, the gRNA is called a “CRISPR RNA” or simply “crRNA”.
gRNA
guide RNA
polyribonucleotide containing regions sufficient for productive interaction with a Cas nuclease (2.2.1) or variant to direct interaction with the specific target (1.6)
Note 1 to entry: See crRNA (2.2.3), tracrRNA (2.2.9) and sgRNA (2.2.7).
Note 2 to entry: For Cas9-type proteins, the natural gRNA comprises a crRNA that imparts sequence specificity and the tracrRNA that interacts with and activates the protein. This is sometimes referred to as a “dual guide”. Other Cas proteins can have different gRNA structures.
Note 3 to entry: sgRNAs for Cas9 proteins are non-naturally occurring polyribonucleotides in which the crRNA and tracrRNA are fused with an artificial linker.
Note 4 to entry: In some cases, chemical modifications of the polyribonucleotide are used, such as modifications to the phosphodiester linkages, bases or sugar moieties. These can include substitution of DNA (2′-deoxy) or 2′-methoxy nucleotides for RNA nucleotides, etc.
PAM
protospacer adjacent motif
short nucleotide motif in the targeted region of nucleic acid required for guided Cas nuclease (2.2.1) or variant binding
Note 1 to entry: PAMs are distinct from, but in close proximity to, the nucleic acid sequence targeted by gRNA (2.2.4).
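The relationship between a PAM and the adjacent gRNA-targeted sequence can be illustrated with a simple sequence scan. The sketch below is an assumption-laden example rather than anything specified by this lexicon: it assumes an SpCas9-style layout (a 20-nucleotide targeted sequence immediately followed by an NGG PAM on the scanned strand), scans only the strand supplied, and uses the hypothetical function name `find_candidate_sites`.

```python
import re


def find_candidate_sites(seq: str, protospacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Scan one strand for a targeted sequence followed by an NGG PAM.

    Assumes an SpCas9-style PAM (NGG); other Cas nucleases use other motifs.
    Returns (start index, targeted sequence, PAM) for each candidate.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]
```

A fuller search would also scan the reverse complement and apply the PAM rules of the specific Cas nuclease in use.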
RNP
ribonucleoprotein
complex comprising protein bound to RNA
Note 1 to entry: In the context of CRISPR-based genome editing (1.2), RNP refers to the complex of Cas protein(s) and gRNA (2.2.4).
sgRNA
single-guide RNA
fusion of crRNA (2.2.3) and tracrRNA (2.2.9)
Note 1 to entry: See gRNA (2.2.4).
target strand
CRISPR target strand
single-stranded nucleic acid sequence that is complementary to the gRNA (2.2.4) of a Cas protein or variant
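Complementarity between the gRNA and the target strand can be shown with a reverse-complement calculation. This is a minimal sketch under stated assumptions: only the target-complementary portion of the gRNA is supplied, pairing is strictly Watson-Crick, the target is DNA, and the function name `target_strand_for` is hypothetical.

```python
# RNA base in the guide -> complementary DNA base on the target strand.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def target_strand_for(spacer_rna: str) -> str:
    """Return the DNA target-strand sequence (5'->3') complementary to an
    RNA guide segment written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(spacer_rna.upper()))
```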
tracrRNA
trans-activating CRISPR RNA
polyribonucleotide that base-pairs with the crRNA (2.2.3) and interacts with a Cas nuclease (2.2.1) to enable sequence-specific interaction with the target (1.6)
Note 1 to entry: tracrRNA is an optional component of gRNA (2.2.4).
meganuclease
variant of the LAGLIDADG subtype of homing endonucleases engineered to recognize a 15 to 40 base pair DNA target (1.6) different from the site recognized by the parent endonuclease
Note 1 to entry: The LAGLIDADG consensus sequence represents an alpha helix that serves as a dimerization interface and key component in the DNA cleavage site in this family of meganucleases.
meganuclease linker
natural or artificially derived polypeptide sequence that links two LAGLIDADG domains to one another to form a single polypeptide chain
Note 1 to entry: The LAGLIDADG consensus sequence represents an alpha helix that serves as a dimerization interface and key component in the DNA cleavage site in this family of meganucleases (2.3.1).
meganuclease single chain
meganuclease (2.3.1) composed of two LAGLIDADG domains joined by either a natural or artificially derived polypeptide linker in order to be expressed as a single polypeptide chain
Note 1 to entry: The LAGLIDADG consensus sequence represents an alpha helix that serves as a dimerization interface and key component in the DNA cleavage site in this family of meganucleases.
meganuclease target site
DNA sequence recognized by meganucleases (2.3.1)
Note 1 to entry: Meganuclease target sites are 15 to 40 base pair DNA sequences consisting of two equal-length half sites separated by a 4 base pair middle sequence (also known as “central 4”). Cleavage occurs at the junction of the half sites and the middle sequence on each DNA strand, leaving a 4-nucleotide 3′ overhang.
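The site layout described in the note (half site, 4 base pair middle sequence, half site) can be expressed as a small parsing step. The sketch below is illustrative only; the function name `split_meganuclease_site` is hypothetical, and the sequence in the usage example is a made-up placeholder, not a real meganuclease target.

```python
def split_meganuclease_site(site: str, central_len: int = 4) -> tuple[str, str, str]:
    """Split a target site into (left half site, central sequence, right half site)."""
    if not 15 <= len(site) <= 40:
        raise ValueError("site length should be 15 to 40 base pairs")
    half_total = len(site) - central_len
    if half_total % 2 != 0:
        raise ValueError("half sites must be of equal length")
    half = half_total // 2
    return site[:half], site[half:half + central_len], site[half + central_len:]


# Placeholder 22 bp sequence: 9 bp half site + central 4 + 9 bp half site.
left, central, right = split_meganuclease_site("AAAAAAAAACGTATTTTTTTTT")
```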
megaTAL
artificial chimeric nuclease composed of an array of transcription activator-like (TAL) effector (TALE)[1] DNA binding domains, a megaTAL linker (2.4.2) and a meganuclease (2.3.1)
megaTAL linker
amino acid sequence that links an array of TAL DNA binding domains and a meganuclease (2.3.1)
megaTAL target site
intended DNA binding site of a megaTAL (2.4.1), encompassing the DNA sequence targeted by both the TAL array and the meganuclease (2.3.1)
RVDs
repeat variable diresidue
two-amino-acid sequence in TAL repeats that imparts DNA binding specificity
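How RVDs impart binding specificity can be illustrated by mapping each RVD in a TAL repeat array to the base it preferentially recognizes. The mapping below uses commonly cited RVD-to-base associations that are not part of this lexicon and are simplified (for example, NN is also reported to recognize A); the function name `predicted_target` is hypothetical.

```python
# Commonly cited RVD -> preferred DNA base associations (simplified assumption).
RVD_TO_BASE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "G",  # NN is also reported to recognize A
}


def predicted_target(rvds: list[str]) -> str:
    """Return the DNA sequence predicted to be bound by an ordered RVD array."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)
```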
TALEN
transcription activator-like effector nuclease
artificial nuclease composed of an endodeoxyribonuclease fused to DNA-binding domains of TALEs[1] that cleaves DNA at a defined distance from TALE recognition sequences
Note 1 to entry: A TALEN can refer to a pair of TALE-FokI fusion proteins that dimerize on opposite strands of DNA adjacent to a target (1.6) for cleavage.
TALEN linker
polypeptide sequence that links an array of TAL DNA binding domains and an endodeoxyribonuclease, typically FokI
TALEN target site
DNA sequence recognized by TALENs (2.5.2)
Note 1 to entry: Typical TALEN target sites are recognized by a pair of TALENs and contain a central spacer region flanked by upstream and downstream sequences that are each recognized by one TALEN. This pair is designed in such a way that two TALEN nuclease domains dimerize to cleave DNA within the spacer region.
zinc finger
Cys2His2 zinc finger
DNA binding domain that folds via coordination of zinc into a compact structure consisting of two beta strands and one alpha-helix (β β α)
Note 1 to entry: Zinc finger DNA binding domains typically contain 28 amino acids.
ZFN
zinc finger nuclease
chimeric protein consisting of an array of zinc fingers (2.6.1) linked to a DNA cleavage domain
Note 1 to entry: FokI is commonly used as the DNA cleavage domain linked to the zinc fingers.
Note 2 to entry: Binding of two ZFNs to a pair of appropriately spaced DNA target sites enables nuclease domain dimerization and DNA cleavage between the targets.
ZFN linker
polypeptide sequence that links an array of zinc finger (2.6.1) binding domains and a DNA cleavage domain
Note 1 to entry: FokI is commonly used as the DNA cleavage domain linked to the zinc fingers.
ZFN recognition helix
seven residue positions within a zinc finger (2.6.1) that are most directly responsible for its DNA binding preference
Note 1 to entry: The seven residues comprise the first six residues of the alpha helix, along with the residue immediately preceding the N-terminus of the helix. They are typically referred to as positions +1 to +6 (within the alpha helix) and position −1 (immediately preceding the helix).
ZFN target site
DNA sequence recognized by a pair of ZFNs (2.6.2)
Note 1 to entry: Typical ZFN target sites contain a central spacer region flanked by DNA sequences that are each recognized by an array of zinc fingers (2.6.1) oriented such that the ZFN nuclease domains dimerize and cleave within the spacer.
ZFP
zinc finger protein
DNA binding protein consisting of a tandem array of multiple zinc fingers (2.6.1)
edit
DNA edit
RNA edit
epigenome edit
DNA, RNA or epigenome edit
modification to nucleic acid sequence resulting from the application of genome editing (1.2) components
EXAMPLE:
Insertion, deletion, substitution, deamination, methylation, demethylation.
Note 1 to entry: Genome editing components can include a nuclease and repair template (2.1.1).
HDR
homology-directed repair
mechanism of recombinational DNA repair[2] where repair is templated by a polynucleotide with regions corresponding to sequences flanking the target (1.6)
EXAMPLE:
Single-stranded DNA oligonucleotide templated HDR.
Note 1 to entry: Repair templates (2.1.1) can be exogenously introduced to achieve sequence changes in genome editing (1.2) approaches.
indel
InDel mutation
sequence change caused by the insertion and/or deletion of nucleotides
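An indel can be characterized by comparing an edited sequence with its reference in a gapped pairwise alignment. The sketch below assumes the alignment has already been produced by another tool and that gaps are written as `-`; the function name `count_indels` is hypothetical.

```python
def count_indels(ref_aligned: str, edited_aligned: str) -> dict[str, int]:
    """Count inserted and deleted nucleotides in a gapped pairwise alignment.

    A gap in the reference marks an inserted nucleotide; a gap in the edited
    sequence marks a deleted nucleotide.
    """
    if len(ref_aligned) != len(edited_aligned):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(ref_aligned, edited_aligned))
    inserted = sum(1 for r, e in pairs if r == "-" and e != "-")
    deleted = sum(1 for r, e in pairs if e == "-" and r != "-")
    return {"inserted": inserted, "deleted": deleted}
```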
intended edit
designed modification to a target (1.6) resulting from the application of genome editing (1.2) components
Note 1 to entry: See edit (3.1).
Note 2 to entry: Genome editing components can include a nuclease and repair template (2.1.1).
MMEJ
microhomology-mediated end joining repair
mechanism of DNA end-joining repair[3] where the DNA ends are rejoined to each other using short regions of homology flanking the initiating double-stranded break to align the ends for repair
Note 1 to entry: MMEJ repair of DNA breaks in genome editing (1.2) approaches can result in deletion between pairs of microhomology regions.
Note 2 to entry: Short regions of homology for MMEJ are typically 2 to 25 base pairs.
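The deletion outcome described in Note 1 can be sketched by searching for short repeats on either side of a break and joining the sequence at one retained copy. This is a simplified illustration, not a model of the repair pathway: it ignores end resection and repair kinetics, takes the 2 to 25 base pair range from Note 2 as search bounds, and uses the hypothetical function name `microhomology_deletions`.

```python
def microhomology_deletions(seq: str, cut: int, min_len: int = 2,
                            max_len: int = 25) -> list[tuple[str, str]]:
    """Return (microhomology, repair product) pairs for repeats flanking a break.

    `cut` is the index of the first base to the right of the break.
    """
    left, right = seq[:cut], seq[cut:]
    results = []
    for k in range(min_len, max_len + 1):
        for i in range(len(left) - k + 1):
            motif = left[i:i + k]
            j = right.find(motif)
            while j != -1:
                # Keep the left copy of the repeat; delete the sequence between
                # the two copies along with the right copy.
                results.append((motif, left[:i + k] + right[j + k:]))
                j = right.find(motif, j + 1)
    return results
```

Nested repeats (for example CG and GT inside CGT) yield the same product, so callers may want to deduplicate on the product sequence.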
NHEJ
non-homologous end joining
mechanism of DNA end-joining repair[3] in which DNA ends are joined in a homology-independent manner
Note 1 to entry: NHEJ repair of DNA breaks in genome editing (1.2) workflows can result in indel (3.3) formation.
unintended edit
modification to nucleic acid at the target (1.6) that is not the designed change or at an off-target (1.4) resulting from the application of genome editing (1.2) components
Note 1 to entry: See edit (3.1).
Note 2 to entry: Genome editing components can include a nuclease and repair template (2.1.1).
bp – base pairs
DNA – deoxyribonucleic acid
CRISPR – clustered regularly interspaced short palindromic repeats
RNA – ribonucleic acid
TAL – transcription activator-like
TALE – transcription activator-like effector
[1] U.S. National Library of Medicine. MeSH Descriptor Data 2020: Transcription Activator-Like Effectors. Available from: https://meshb.nlm.nih.gov/record/ui?name=TRANSCRIPTION%20ACTIVATOR-LIKE%20EFFECTORS
[2] U.S. National Library of Medicine. MeSH Descriptor Data 2020: Recombinational DNA Repair. Available from: https://meshb.nlm.nih.gov/record/ui?ui=D059767
[3] U.S. National Library of Medicine. MeSH Descriptor Data 2020: DNA End-Joining Repair. Available from: https://meshb.nlm.nih.gov/record/ui?ui=D059766