Review
Next-Generation Genotoxicology: Using Modern Sequencing Technologies to Assess Somatic Mutagenesis and Cancer Risk
Jesse J. Salk (1, 2) and Scott R. Kennedy (3, *)
1 Department of Medicine, Division of Medical Oncology, University of Washington School of Medicine, Seattle, Washington
2 TwinStrand Biosciences, Seattle, Washington
3 Department of Pathology, University of Washington, Seattle, Washington
Mutations have a profound effect on human health, particularly through an increased risk of carcinogenesis and genetic disease. The strong correlation between mutagenesis and carcinogenesis has been a driving force behind genotoxicity research for more than 50 years. The stochastic and infrequent nature of mutagenesis makes it challenging to observe and to study. Indeed, decades have been spent developing increasingly sophisticated assays and methods to study these low-frequency genetic errors, in hopes of better predicting which chemicals may be carcinogens, understanding their mode of action, and informing guidelines to prevent undue human exposure. While effective, widely used genetic selection-based technologies have a number of limitations that have hampered major advancements in the field of genotoxicity. Emerging new tools, in the form of enhanced next-generation sequencing platforms and methods, are changing this paradigm. In this review, we discuss rapidly evolving sequencing tools and technologies, such as error-corrected sequencing and single-cell analysis, which we anticipate will fundamentally reshape the field. In addition, we consider a variety of emerging applications for these new technologies, including the detection of DNA adducts, inference of mutational processes based on genomic site and local sequence contexts, and evaluation of genome engineering fidelity, as well as other cutting-edge challenges for the next 50 years of environmental and molecular mutagenesis research.
Environ. Mol. Mutagen. 61:135–151, 2020. © 2019 The Authors. Environmental and Molecular Mutagenesis published by Wiley Periodicals, Inc. on behalf of Environmental Mutagen Society.
Key words: chemical carcinogenesis; cancer risk assessment; in vivo mutation; error-corrected NGS; consensus sequencing; single-cell sequencing; single molecule sequencing
INTRODUCTION
Exposure to environmental factors has been known to
alter the genetic makeup of organisms since the seminal
work by Hermann Muller in 1927 showing that exposing
Drosophila to X-rays produced new heritable traits (Muller
1927). Other environmental mutagens, including ultraviolet
light and reactive chemicals, were reported soon after
(Stadler and Sprague 1936; Auerbach et al. 1947). It
wasn't until the publication of the structure of DNA in
1953, and the subsequent description of DNA polymer-
ases that a mechanism linking environmental exposures to
mutagenesis and heritable changes became fully apparent
(Watson and Crick 1953; Bessman et al. 1958; Lehman
et al. 1958). The ensuing years led to a rapid expansion of
studies to catalog and better understand environmental
mutagens. By the mid-1970s, experiments in rodent
models indicated that the majority of known mutagens
were, in fact, carcinogenic (McCann et al. 1975). Because
of this strong link, as well as the desire to save both time
and money, evaluating the mutagenic potential of a
compound has become a de facto surrogate for evaluating
its carcinogenicity (Fig. 1). A detailed treatment of the
regulatory aspects of this important subject area is provided
elsewhere in this issue (Heflich et al. 2020).
Grant sponsor: National Institute of Environmental Health Sciences; Grant number: R44ES030642.
Grant sponsor: National Institute of Justice; Grant number: 2017-DN-BX-0160.
Grant sponsor: Safeway/Albertsons Early Career Award For Cancer Research; Grant number: n/a.
Grant sponsor: U.S. Department of Defense; Grant number: W81XWH-18-1-0339.
*Correspondence to: Scott R. Kennedy, Department of Pathology, University of Washington, Seattle, WA. E-mail: scottrk@uw.edu
Received 9 August 2019; Revised 20 September 2019; Accepted 25 September 2019
DOI: 10.1002/em.22342
Published online 8 October 2019 in Wiley Online Library (wileyonlinelibrary.com).
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any
medium, provided the original work is properly cited.
A number of key technologies have been developed
over the past 50 years to quantify genotoxicity in both
in vitro and in vivo settings. The spontaneous mutation
rate in normal somatic mammalian cells is estimated to
be in the range of 10^-8 to 10^-9 mutations per nucleotide
per cell division (Lynch 2010). Directly detecting these
rare events at the DNA sequence level is technically
challenging (Milholland et al. 2017), the molecular equivalent
of "Where's Waldo?" (Handford 2007). Not only
does one need to screen a very large number of nucleotides
and cells to obtain reasonable statistical confidence in
mutant frequencies, but the method for detecting mutations
must also have an error rate below the true mutant
frequency.
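To give a feel for the scale of screening implied above, the following is a minimal sketch, assuming that mutant counts follow a Poisson distribution and that the assay itself contributes no errors. The function names and the 95% confidence level are illustrative assumptions, not part of any published method.

```python
import math

def probability_of_detection(nucleotides_screened, mutant_frequency):
    """P(at least one mutant observed), assuming Poisson-distributed counts."""
    expected_mutants = nucleotides_screened * mutant_frequency
    return 1.0 - math.exp(-expected_mutants)

def nucleotides_to_screen(mutant_frequency, confidence=0.95):
    """Smallest number of nucleotides giving P(>= 1 mutant) >= confidence."""
    return math.ceil(-math.log(1.0 - confidence) / mutant_frequency)

# Illustrative per-nucleotide frequencies spanning the range quoted in the text.
for frequency in (1e-8, 1e-9):
    needed = nucleotides_to_screen(frequency)
    print(f"frequency {frequency:.0e}: screen about {needed:.2e} nucleotides")
```

Observing a single event is far from a precise frequency estimate, so practical designs would need to screen considerably more than this lower bound.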
To circumvent these challenges, most standard mutagen-
esis assays rely on some means of biological enrichment,
whereby mutations are detected by a selectable phenotype
they create. While the specifics differ, the general approach
relies on exposing bacterial or mammalian cells to a puta-
tive mutagen and then quantifying the ratio of cells harbor-
ing a mutation in a selectable marker to the number of cells
present in the absence of selection. In vitro selection-based
mutagenesis assays include the classic Ames assay and sev-
eral mammalian cell culture-based mutation tests, such as
HPRT and APRT (Ames et al. 1973; Thompson et al.
1980). While highly effective, in vitro assays have several
limitations that make them imperfect surrogates for human
toxicology, including differences in metabolic activation/
inactivation of the tested compound, the use of only a small
number of cell types, and continuous cellular proliferation
that can result in potential "jackpot" events. In vivo assays
include transgenic rodent models, such as the MutaMouse
and the BigBlue mouse/rat assays which involve multistep
transfer of DNA from mutagen-exposed rodents into phage
and then into bacteria (Kohler et al. 1991; Myhr 1991). By
taking advantage of the in vivo context, transgenic animals
solve some of the issues inherent to the in vitro assays. As
a testament to their utility, these selection-based assays are
still widely used decades after their initial development. A
history detailing the importance of these technologies is
provided in this issue by DeMarini (DeMarini 2019).
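The ratio that selection-based assays report can be written out as a short calculation. This is a sketch under stated assumptions: the correction for plating efficiency and all of the counts below are hypothetical illustrations, and individual assays differ in how they estimate the number of viable cells screened.

```python
def plating_efficiency(colonies_without_selection, cells_plated_without_selection):
    """Fraction of plated cells that form colonies when no selection is applied."""
    return colonies_without_selection / cells_plated_without_selection

def mutant_frequency(mutant_colonies, cells_plated_with_selection, efficiency):
    """Mutant colonies divided by the estimated number of viable cells screened."""
    viable_cells_screened = cells_plated_with_selection * efficiency
    return mutant_colonies / viable_cells_screened

# Hypothetical counts, for illustration only.
efficiency = plating_efficiency(120, 200)
frequency = mutant_frequency(12, 2_000_000, efficiency)
print(f"mutant frequency = {frequency:.1e} per viable cell")
```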
While these methods are ubiquitous in both research and
regulatory settings, reliance on selection to quantify muta-
genesis comes at a cost. The nuclear genome is a dynamic
system with spatially heterogeneous levels of biomolecular
activity, such as transcription, chromatin accessibility,
adjacent nucleotide context, and DNA repair, which strongly
modulate susceptibility to mutagenesis across the genome
(Hodgkinson and Eyre-Walker 2011). Most selection-based assays
rely on a single reporter locus that is often artificially
introduced. Furthermore, the number of possible mutations that
render a selectable phenotype may be limited in some
cases, leading to an underestimation bias arising from the
inability to observe variants that result in no phenotypic
changes (eg, synonymous mutations). Lastly, selectable
markers are not always portable between different
experimental systems and are currently limited to a few
common organismal models.
[Figure 1: timeline from 0 to 20 years spanning DNA damage, mutation, self-sufficient growth, resistance to apoptosis, invasion, tumor mass, and metastasis; the corresponding disciplines are genotoxicity, preneoplastic biology, and oncology.]
Fig. 1. The genesis of cancer. Cancer exists on a continuum. Mutations
arise as a result of repair and replication errors due to endogenous
processes and environmental factors. These mutations are the substrate for
neoplastic clonal evolution: those that confer a proliferative or survival
advantage upon the host cell will be naturally selected. Carcinogens
promote tumorigenesis by increasing the rate of mutation or by enhancing
net-positive selection. Given the often impractically long lag-time between
a carcinogenic insult and overt tumor formation, technologies that are able
to sensitively detect DNA damage, mutation induction, and clonal
outgrowths are essential tools in a genetic toxicologist's armamentarium.
Technologies that directly identify mutations in DNA of
primary tissue samples without necessitating a multistep
selection and cloning process would open up opportunities
to identify mutagenic compounds in a more unbiased man-
ner. One such method is the Pig-a assay (Bryce et al.
2008). This assay uses flow cytometry to rapidly screen
millions of cells for those that lack expression of a particu-
lar nonessential surface protein due to inactivating muta-
tions. Helpfully, this approach can be applied to both
humans and model organisms, but generally only to red
blood cells, limiting its applicability to the other tissues in
the body and making it difficult to confirm the exact nature
of the mutations themselves (mature red blood cells are
enucleate).
Several sensitive biochemical assays for mutation detec-
tion have been developed, often based on resistance to
endonuclease cleavage or allele-specific PCR. While
extremely sensitive, these methods are either too low-throughput
or excessively narrow in scope (ie, interrogate
only one or a few bases) to gain wide usage (Parsons and
Heflich 1997; Bielas and Loeb 2005). Thus, until the
advent of modern next-generation sequencing (NGS), also
referred to as massively parallel sequencing, selection-
based assays have been the dominant technology for evalu-
ating mutagenesis.
Beginning in approximately 2005, NGS has revolutionized
many fields of life science, including cancer biology,
population genetics, evolutionary biology, and cellular
biology. There are several commercially available NGS
platforms that differ in their underlying approaches to
obtaining sequence information, but all share the ability to
simultaneously obtain this information from tens of thou-
sands to billions of individual DNA templates. Conse-
quently, it is now possible to obtain data on a genome-
wide scale. In addition, NGS technologies are read-based.
This "digital tabulation" approach differs from conventional
Sanger sequencing methods by obtaining the nucleotide
sequence of many individual DNA molecules, thus
enhancing the ability to detect minor mutant populations
within a heterogeneous DNA mixture which is generally
the context in which somatic mutagenesis occurs (Metzker
2010; Fig. 2).
The distinct advantages offered by NGS will revolution-
ize environmental mutagenesis and toxicology by overcom-
ing past limitations and providing new opportunities for
study. Despite its transformative potential, NGS has only
recently gained attention in this field, as several key technical
hurdles have now been overcome. In this review, we
discuss the advances in modern DNA sequencing technologies
that are enhancing the ability to detect low-frequency
mutagenic events and DNA damage. We review cutting-edge
applications that are currently being facilitated by
these new technologies and others we see on the horizon.
NEXT-GENERATION SEQUENCING TECHNOLOGIES
In genetic toxicology, most applications of NGS to date
have focused on augmenting and enhancing the throughput
of well-established genotoxicity assays, for example,
increasing the throughput of sequencing of mutant shuttle
vectors or plaques from transgenic models (Yuan et al.
2011; Besaratinia et al. 2012; Beal et al. 2015; Chang et al.
2015). Other applications have included non-mutational
assessments of genetic toxicology, such as epigenetic and
transcriptional changes, induced by chemical exposure
(Chauhan et al. 2016; Li et al. 2017a), as well as the whole-
genome detection of environmentally induced de novo muta-
tions in offspring of exposed individuals (Reviewed in
[Marchetti et al. 2019; Godschalk et al. 2019]).
However, neither of these cases fully realizes the aspirational
goal of being able to directly measure genotoxin-
induced DNA mutations in any tissue type of any organ-
ism. This is because modern sequencing platforms are not
without their limitations. Given the random nature of geno-
toxic insults, genetic toxicology assessment in the absence
of biological selection generally necessitates being able to
detect low-frequency somatic mutations in a large popula-
tion of non-mutant DNA molecules. In theory, DNA sub-
populations of any size should be detectable by NGS when
assessing a sufficient number of molecules. However, while
notably better than Sanger sequencing, standard NGS platforms
still generate errors at a substantial rate. Mistakes
arising during DNA preparation, amplification, cluster
generation, and the many steps of sequencing itself typically
result in ~1% artifactual bases, and this background can be
significantly higher in certain sequence contexts (reviewed
in Salk et al. 2018). In contrast, the biological mutation
frequency of even heavily mutagenized animals is on the
order of one mutation per million nucleotides. Therefore, to
detect chemically induced somatic mutations, far more sen-
sitive NGS technologies are needed.
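The mismatch between platform error and biological signal can be made concrete with a short calculation. This is a sketch only: it uses the ~1% artifact rate and the one-per-million mutation frequency quoted above, while the number of bases sequenced is an arbitrary illustrative assumption.

```python
def expected_variant_calls(bases_sequenced, mutation_frequency, error_rate):
    """Expected counts of true mutations and technical artifacts among sequenced bases."""
    true_mutations = bases_sequenced * mutation_frequency
    artifacts = bases_sequenced * error_rate
    return true_mutations, artifacts

# 1e9 bases is a hypothetical sequencing depth chosen for illustration.
true_mutations, artifacts = expected_variant_calls(1e9, 1e-6, 1e-2)
print(f"true mutations: {true_mutations:.0f}, artifacts: {artifacts:.0f}")
print(f"artifacts per true mutation: {artifacts / true_mutations:.0f}")
```

Under these assumptions, artifacts outnumber true mutations by roughly four orders of magnitude, which is the gap that error correction has to close.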
Error-Corrected Next-Generation Sequencing
Several approaches have been employed to improve the
accuracy of NGS. Initial efforts to reduce the technical
error rate of NGS focused on bioinformatic filtering of
low-confidence sequences. For example, a number of variant
calling tools filter the data based on the distribution of
variants within the sequencing reads or require variants to be
seen in multiple independent sequencing reads in both read
orientations (Wang et al. 2013). More recently, statistical
approaches have been specifically developed to improve
variant calling by modeling the error profile of specific
sequencing platforms (Wei et al. 2011; Wilm et al. 2012).
These bioinformatic approaches allow for the detection of
variants down to a mutant fraction of ~0.5%. This level of
sensitivity is effective for clonally expanded mutations (such as
those arising in the germ line or found in tumors) but is
still orders of magnitude above the spontaneous mutant
frequency of DNA (Martincorena et al. 2015, 2017).
Fig. 2. Analog vs. digital DNA sequencing. A common need in genetic
toxicology is to identify mutations in cell populations. The appropriateness
of the sequencing technology depends on mutational clonality. (A) Clonal
mutations are those present in all or most cells in a tissue (gray), whereas
subclonal mutations (colors) are present in only a subset. (B) When DNA is
extracted from a tissue, a mutation's clonality is reflected in the isolated
molecules that are then (C) prepared for sequencing. (D) With traditional
Sanger sequencing, all molecules from the same genomic region are
genotyped together en masse in a capillary system, which produces an
analog output (electropherogram tracing) that is the average of many
different DNA molecules. (E) Generally only substantially clonal mutations
can be reliably detected. (F) In contrast, next-generation sequencing
operates by massively parallel sequencing of millions of individual
molecules digitally. On the widely used Illumina sequencing-by-synthesis
platform, this is accomplished by flowing fluorescently labeled nucleotides
across a surface coated with small biochemically generated colonies of
individual molecules (clusters), and recording the sequence of colors of each
cluster through multiple cycles of addition. (G) The resulting output is not a
single sequence, but millions of individual ones that reflect both clonal and
subclonal mutations down to approximately 1% abundance.
In addition to bioinformatic filtering, enzymatic removal
of DNA damage has been shown to reduce the number of
false variant calls in NGS. For example, 8-oxo-dG and
cytidine deamination, two of the most common DNA-damaging
events, can be biochemically removed with the
damage-specific glycosylases FPG and UDG, respectively.
Combinations of glycosylases with other repair enzymes
can further repair damage-induced artifacts (Chen et al.
2017b), yet not all mutagenic lesions are recognized by
these enzymes, nor is the fidelity of in vitro repair perfect,
and the possibility exists that these approaches introduce
new errors at low levels.
The approach to error-corrected next-generation sequencing
(ecNGS) that has, thus far, proven the most significant
for improving accuracy is consensus-based error correction
(Fig. 3). The technique relies on the general concept of
grouping reads that are copies derived from an original
DNA molecule and then bioinformatically creating a con-
sensus sequence from the related molecules. An important
aspect of this approach is the need to identify related reads,
which can be accomplished by the use of a uniquely identifying
"molecular barcode" (also referred to as a "unique
molecular identifier" (UMI), "single molecule identifier,"
or simply a "tag") for each original DNA fragment that will
be propagated to all daughter molecules during amplification
and sequencing. Molecular barcodes can be composed
of unique fragmentation shear points, exogenously intro-
duced degenerate DNA sequences, or a combination of the
two. Importantly, they must provide enough sequence
diversity to minimize the probability that two independent
molecules will share the same molecular barcode by
chance.
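The diversity requirement above is a version of the birthday problem. The sketch below assumes degenerate barcodes of a fixed length drawn uniformly at random from the four bases, with all molecules competing for the same barcode space (for example, fragments sharing a shear point). Real barcode pools are rarely perfectly uniform, so this should be read as an optimistic estimate; the function name and the example numbers are illustrative.

```python
import math

def barcode_collision_probability(n_molecules, barcode_length):
    """P(at least two of n molecules share a barcode), assuming uniform random barcodes."""
    n_barcodes = 4 ** barcode_length
    if n_molecules > n_barcodes:
        return 1.0  # pigeonhole: more molecules than distinct barcodes
    # Sum log-probabilities that each successive molecule avoids all earlier barcodes.
    log_no_collision = sum(
        math.log1p(-i / n_barcodes) for i in range(n_molecules)
    )
    return 1.0 - math.exp(log_no_collision)

# Hypothetical example: 1,000 molecules labeled with 12-nucleotide barcodes.
print(f"{barcode_collision_probability(1000, 12):.3f}")
```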
Several groups introduced the idea of using molecular
barcodes to correct sequencing-based errors, but these ini-
tial studies focused on non-variant detection applications,
such as read assembly and molecular counting (Hiatt et al.
2010; Casbon et al. 2011; Fu et al. 2011). With the publication
of the SafeSeqS method, Kinde et al. definitively
introduced the idea of using molecular barcoding for
improving the accuracy of mutation detection by applying
single-stranded molecular barcodes in the tails of PCR
primers, reducing the error rate to ~10^-5 (Kinde et al.
2011; Fig. 3A). A number of variations on this concept
have been published, including single-molecule molecular
inversion probes (Hiatt et al. 2013), circular sequencing
(Lou et al. 2013), and CypherSeq (Gregory et al. 2016),
among others. Consensus-making techniques that label just
one strand of original double-stranded molecules or cannot
distinguish the identity of the two strands markedly reduce
sequencer-based artifacts, such as base calling errors and
amplification errors introduced during cluster generation,
thereby reducing the methodological background by two to
three orders of magnitude and making it possible to
confidently identify rare variants at ~0.1% abundance (Salk
et al. 2018).
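The grouping-and-consensus step shared by these single-strand methods can be sketched as follows. This is an illustrative toy rather than the implementation used by any of the published tools: it assumes reads arrive as (barcode, sequence) pairs that are already aligned and of equal length within a family, and the minimum family size and agreement thresholds are arbitrary assumptions.

```python
from collections import Counter, defaultdict

def single_strand_consensus(reads, min_family_size=3, min_agreement=0.9):
    """Collapse reads sharing a molecular barcode into one consensus sequence each."""
    families = defaultdict(list)
    for barcode, sequence in reads:
        families[barcode].append(sequence)

    consensus = {}
    for barcode, sequences in families.items():
        if len(sequences) < min_family_size:
            continue  # too few copies to distinguish errors from real variants
        bases = []
        for column in zip(*sequences):
            base, count = Counter(column).most_common(1)[0]
            # Mask positions where the copies disagree too much.
            bases.append(base if count / len(column) >= min_agreement else "N")
        consensus[barcode] = "".join(bases)
    return consensus

reads = [("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACTT"), ("GGC", "ACGT")]
print(single_strand_consensus(reads, min_agreement=0.6))
```

In this example the lone discordant base in the "AAT" family is outvoted, and the single-read "GGC" family is discarded because it offers no redundancy.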
However, methods relying on single-stranded tagging
are fundamentally limited by the base selectivity of DNA
polymerases, which, at best, have error rates of ~10^-6
(McInerney et al. 2014). Of particular relevance is the
elevated rate of misincorporations at sites of mutagenic DNA
damage. For example, the presence of 8-oxo-dG adducts or
deaminated cytosine bases (dU) dramatically increases the
misincorporation rate of polymerases upon traversal of the
lesion (Shibutani et al. 1991; Lindahl 1993). These
misincorporation events can be propagated to daughter
molecules during PCR, making it difficult to distinguish
between artifacts induced by chemical adducts and bona
fide variants occurring at dC and dG bases. Moreover,
different DNA adducts are repaired with vastly different
efficiencies by the cell (Wood 1996). Thus, with these
methods, experiments involving mutagen exposure run the
risk of detecting the presence of both adducts and true
mutations. Given that mammalian cells are quite adept at
recognizing and repairing adducts in vivo, it is incorrect to
equate adducts with mutations (the vast majority will be
repaired before mutation occurs in vivo). Cumulatively,
these factors contribute to a practical detection limit of
~10^-4 to 10^-5, depending on DNA quality and experimental
conditions (reviewed in Salk et al. 2018). This is
excellent for many applications but does not reach the accuracy
threshold needed for direct mutagenesis assessment.
Some mutagenic compounds are capable of increasing
the mutation frequency of DNA by ~1000-fold or more.
However, because the spontaneous mutation frequency of
the mammalian nuclear genome is normally very low
(on the order of one-per-10-million base pairs), even a
1000-fold increase is still below what is reliably detectable
by single-strand UMI-based methods. Extending the concept
of molecular barcoding to include asymmetric
double-stranded UMIs allows the sequencing information
derived from complementary strands of an original
double-stranded molecule to be compared for an additional level of
error correction. Double-stranded consensus calling requires
uniquely identifying each original DNA molecule (ie, a
unique molecular identifier) and its constituent strands (ie,
a strand-defining element) in a way that allows the
sequences to be related to each other. Duplex Sequencing
was the first method to use double-stranded consensuses to
remove both sequencer-derived and early PCR-derived errors
(Schmitt et al. 2012; Kennedy et al. 2014; Fig. 3B). A
number of derivative approaches, including BiSeqS
(Mattox et al. 2017), muSeq (Kumar et al. 2018), and
BotSeqS (Hoang et al. 2016), have been developed that
establish molecular barcodes and strand-defining elements
via partial bisulfite treatments or random shear points in
conjunction with ultra-low genome coverage. With all these
approaches, the theoretical error rate of double-strand
consensus methods is estimated to be ~10^-9, which roughly
reflects the square of the error rate for single-strand
molecular barcoding methods. Duplex methods have been used
by a number of groups to study the occurrence of mutations
arising from a number of genotoxic species, including
smoking, aflatoxin, aristolochic acid, urethane,
benzo[a]pyrene, and reactive oxygen species (Kennedy et al. 2013;
Hoang et al. 2016; Chawanthayatham et al. 2017).
[Figure 3 panels: (A) Safe Sequencing System (SafeSeqS): tagging by primer extension, PCR amplification, grouping duplicates, consensus making. (B) Duplex Sequencing: ligate duplex-tagged adapters, PCR amplification, group duplicates, single-strand consensus making, duplex consensus making. (C) 2D Nanopore Sequencing: hairpin ligation, nanopore sequencing, comparing duplex strand-pair sequences, consensus making. (D) PacBio Circular Consensus Sequencing: ligate hairpin adapters to circularize, zero-mode waveguide sequencing of closed loop, comparing tandem strand-pair sequences, consensus making. Each panel distinguishes wild-type bases, mutations, and errors.]
Fig. 3. Techniques for error-corrected DNA sequencing (ecNGS). The
highest accuracy NGS methods rely on sequencing-by-consensus, whereby
data from multiple sequence reads derived from an original molecule are
combined to reduce the impact of sequencing or sample preparation errors in
each read. (A) The SafeSeqS approach uses random molecular barcodes
applied to PCR primers to uniquely tag PCR amplicons, which are then further
amplified and sequenced. Variation within the sequence of reads with identical
tags can be discounted as technical artifacts (Xs). Some errors that occur
during the first extension cycle may escape correction (triangles). (B) Duplex
Sequencing relies on ligation to apply molecular barcodes to both strands of
original double-stranded molecules. These are used alone or in combination
with fragmentation points to uniquely label both strands such that derivative
sequence reads from each strand can be directly related back to their founder
strand and compared to those from its complement. The method is significantly
more accurate than single-stranded consensus-making methods but is more
sequencing-intensive. (C) 2D sequencing on nanopore platforms uses physical
linkage of the two strands of an original duplex, which are then sequenced
together without the need for amplification. The method is fast and simple, but
nanopore platforms are lower accuracy and throughput than more widely used
sequencing-by-synthesis platforms. (D) Circular Consensus Sequencing on the
PacBio single-molecule platform similarly links the two strands of an original
double-stranded molecule with hairpins to allow multiple sequencing passes across both
original strands. As with 2D, lower raw platform accuracy and throughput are
drawbacks but very long reads can be obtained.
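The double-strand comparison described above, and the reason its error rate is roughly the square of the single-strand rate, can be sketched as follows. This is a simplified illustration, not the published Duplex Sequencing pipeline: it assumes the two single-strand consensus sequences have already been built and expressed in the same reference orientation, and it treats errors on the two strands as independent.

```python
def duplex_consensus(top_strand, bottom_strand):
    """Keep a base only where both strand consensuses agree; mask it otherwise."""
    if len(top_strand) != len(bottom_strand):
        raise ValueError("strand consensuses must be the same length")
    return "".join(
        a if a == b and a != "N" else "N"
        for a, b in zip(top_strand, bottom_strand)
    )

def rough_duplex_error_rate(single_strand_error_rate):
    """Order-of-magnitude estimate assuming independent errors on each strand."""
    return single_strand_error_rate ** 2

# A change seen on only one strand (for example, a damage-induced artifact) is masked.
print(duplex_consensus("ACGT", "ACTT"))
print(f"{rough_duplex_error_rate(1e-4):.0e}")
```

Because a true mutation is present on both strands while a polymerase misincorporation opposite a lesion arises on only one, requiring agreement removes the artifact without discarding the mutation.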
Single-Cell Sequencing Technologies
Typical NGS protocols rely on fragmenting the genomes
of thousands of cells. The result is a mixture of contribut-
ing cellular genotypes when the underlying population is
heterogeneous. In such situations, ecNGS approaches are
needed to detect these rare variants in the sea of wild-type
sequences if their abundance is below approximately 1%.
However, the creation of a heterogeneous mixture of DNA
fragments from many different genomes eliminates the
ability to assign variants to the same cell, potentially
underestimating the mutagenic potential of a compound
that may only bio-accumulate in certain cell types
(or cell division states). Sequencing the DNA from single
cells overcomes this problem and ensures that observed
mutations came from the same cell.
Typical single-cell sequencing (SCS) protocols require
isolation of individual cells followed by lysis and usually
some form of whole-genome amplification to generate
enough DNA for sequencing (Zong et al. 2012; Fu et al.
2015; Dong et al. 2017; Chen et al. 2017a). Somatic muta-
tions would typically be heterozygous (absent recombina-
tion events or loss of heterozygosity) and expected to be
present in 50% of reads mapping to the genomic position
of interest. SCS methods have been able to successfully
detect structural variants (Wang et al. 2012), copy-number
variations (Navin et al. 2011), and single nucleotide variants
(Dong et al. 2017) on a genome-wide scale. To date,
SCS approaches have not been widely deployed to evalu-
ate genotoxicity at the single-cell level. However, recent
work by the Vijg group demonstrated the ability of SCS
to detect mutations induced by mutagenic exposure with
N-ethyl-N-nitrosourea, indicating its potential utility
(Dong et al. 2017).
Another barrier to deploying SCS for genotoxicity applications
is throughput and, by extension, cost. In response
to the need for more high-throughput methods, microfluidic
sorting of cells (Rinke et al. 2014), nano-well technologies
(Gierahn et al. 2017), and emulsion droplet partitioning
technologies (Klein et al. 2015) have been developed and
have increased throughput up to ~10,000 cells. A promising
new approach to massively parallel SCS, termed combinatorial
cellular indexing, uses intact fixed cells or nuclei
as "reaction vessels" to physically partition the nucleic
acids of interest. A unique combination of DNA sequences
(ie, a cellular index) is enzymatically introduced to all the
nucleic acids present within each cell/nucleus, a technique
sometimes referred to as "combinatorial indexing" or
"split-pool barcoding." Because all sequencing reads
derived from nucleic acids from the same cell share the
same cell-specific index, the sequencing data can be
computationally grouped and assigned to a specific cell. This
approach offers the ability to examine hundreds of thou-
sands of cells without the need for complex single-cell han-
dling equipment and has been used to study structural
variations, transcriptomics, and epigenetics (Cusanovich
et al. 2015; Cao et al. 2017; Vitak et al. 2017; Rosenberg
et al. 2018). The steady improvements in throughput and
cost make SCS increasingly attractive for addressing
important questions about genotoxicity that can only be
answered at the level of individual cells. The efficient
combination of SCS with high-accuracy single-molecule
consensus sequencing methods would be an extremely
powerful tool of the future.
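The capacity of split-pool barcoding follows from simple combinatorics. The sketch below assumes that each cell independently receives a uniformly random barcode in every round; the three rounds of 96 barcodes and the cell count are hypothetical values chosen for illustration, not parameters taken from the cited studies.

```python
import math

def index_combinations(barcodes_per_round):
    """Number of distinct cellular indexes after successive split-pool rounds."""
    return math.prod(barcodes_per_round)

def expected_collision_fraction(n_cells, n_combinations):
    """Expected fraction of cells that share their index with at least one other cell."""
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_cells - 1)

combinations = index_combinations([96, 96, 96])  # hypothetical design
fraction = expected_collision_fraction(10_000, combinations)
print(f"{combinations} indexes; about {fraction:.1%} of cells collide")
```

Adding a round multiplies the index space, which is why the approach scales to large numbers of cells without per-cell handling.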
Direct Single-Molecule Sequencing
Several mutagenesis assays are routinely used to detect
clastogenic compounds, such as the micronucleus and chro-
mosomal aberration assays (Araldi et al. 2015). Although
effective from a risk assessment perspective, these classic
tools do not yield specific sequence information. Modern
sequencing platforms are also able to detect structural variants,
with the added benefit of providing detailed sequence
information and genomic location. While Illumina's
reversible terminator dye technology, with its reasonably good
accuracy and high throughput, is well suited to detect
single-nucleotide changes, it is currently limited to read
lengths of less than 300 bases (600 bases for paired-end).
Short read length significantly hinders the ability to detect
large structural variations and genomic rearrangements.
Therefore, structural variants are bioinformatically detected
by searching for reads spanning a break point or inferred
by read-pairs mapping farther apart than a few kilobases or
to different chromosomes (Alkan et al. 2011). Bioinfor-
matic detection tends to have highly variable sensitivity
and specificity rates due to the size of the structural variant,
occurrence of chimeric PCR products prior to sequencing,
overlapping clusters or read-hopping on the sequencer, or
the occurrence of erroneous read mapping arising from
pseudogene sequences elsewhere in the genome (Alkan
et al. 2011; Kosugi et al. 2019).
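The read-pair inference described above can be sketched as a simple classification rule. This is a toy illustration rather than the logic of any particular structural variant caller: the function name is hypothetical, the 2 kb threshold stands in for "a few kilobases," and real tools also consider read orientation, mapping quality, and the library's insert-size distribution.

```python
def classify_read_pair(chrom1, pos1, chrom2, pos2, max_separation=2000):
    """Flag read pairs whose mapping suggests a structural variant."""
    if chrom1 != chrom2:
        return "interchromosomal"  # mates map to different chromosomes
    if abs(pos2 - pos1) > max_separation:
        return "distant"  # mates map farther apart than the library allows
    return "concordant"

print(classify_read_pair("chr1", 10_000, "chr1", 10_350))
print(classify_read_pair("chr1", 10_000, "chr1", 95_000))
print(classify_read_pair("chr1", 10_000, "chr7", 10_350))
```

Chimeric PCR products and mis-mapped reads produce the same discordant signatures, which is one reason the sensitivity and specificity of this kind of detection vary so widely.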
Direct single-molecule sequencing (SMS) is a relatively
new technology that offers a number of advantages over
short-read sequencing methods. Two different SMS technologies
are currently commercially available: single-molecule
real-time sequencing (SMRT; commercialized by
Pacific Biosciences) and nanopore sequencing (commercialized by
Oxford Nanopore Technologies). van Dijk et al. (2018)
provide a detailed comparison of these two technologies.
Both approaches produce very long reads (10–250 kb) and
directly sequence genomic DNA without the need for
Environmental and Molecular Mutagenesis. DOI 10.1002/em
intermediate PCR amplification. The elimination of PCR
chimeras and the addition of more sequence information
within a single read significantly reduce mis-mappings and
increase the probability of spanning breakpoints, minimizing
false positives.
Although these technologies enhance the ability to detect
structural variants, they exhibit much higher error rates in
the detection of single-nucleotide variants, often as high as
15%–20% (Quail et al. 2012; Ross et al. 2013; Jain et al.
2017). However, these platforms are amenable to platform-specific
variations of consensus sequencing to reduce their
high false-positive rates. For example, in SMRT-based platforms,
circularized original DNA molecules can be
sequenced repeatedly with a highly processive DNA polymerase
and a "circular consensus sequence" made for each
template, improving the accuracy of SNV calls by several
orders of magnitude (Travers et al. 2010; Fig. 3C).
Nanopore-based technologies, however, are not yet amenable
to significant consensus error correction by repeated
sequencing of the same molecule. Currently, a type of
double-strand consensus can be made by affixing a hairpin
adapter to the DNA fragments such that the two strands
can be sequentially sequenced in a reverse complementary
fashion, referred to as "two-directional" sequencing
(Fig. 3D). This approach has been reported to reduce the
error rate to ~3%–5% (Jain et al. 2015; Tyler et al. 2018).
Two recent methods, termed "Rolling-Circle to Concatemeric
Consensus" and "Intramolecular-ligated Nanopore
Consensus Sequencing," offer the possibility of increasing
the accuracy of nanopore-based platforms by implementing
a circular consensus sequencing-like approach, analogous
to what is performed on the PacBio platform (Li et al.
2016; Volden et al. 2018).
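The principle shared by these consensus approaches can be illustrated with a simple per-position majority vote across repeated passes of the same template. The sketch below assumes the passes are already aligned and of equal length, which real data are not (insertions and deletions dominate single-molecule errors); the `consensus` function and its strict-majority rule are assumptions for illustration, not the algorithm used by any vendor's software.

```python
from collections import Counter


def consensus(passes: list[str]) -> str:
    """Majority-vote consensus over pre-aligned, equal-length passes.

    A position is reported only when one base is seen in more than half
    of the passes; otherwise it is masked as "N".
    """
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be aligned to the same length")

    called = []
    for column in zip(*passes):
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count > len(passes) / 2 else "N")
    return "".join(called)
```

Because independent errors rarely recur at the same position in most passes, each additional pass makes a wrong majority less likely, which is why circular templates read many times can reach much higher accuracy than a single pass.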
NEXT-GENERATION SEQUENCING APPLICATIONS
Modern sequencing platforms are rapidly transforming
the ability to detect, quantify, and characterize genomic
DNA at an ever increasing rate and scale. These technolo-
gies open up new potential avenues of research that are
likely to have a profound impact on the study of genomic
toxicology and mutagenesis. We highlight a number of
emerging applications for modern sequencing platforms
that are of high relevance for genotoxicity studies.
Adduct Detection by Sequencing
Genotoxic compounds that induce mutagenesis typically
do so by chemical modification of the DNA that induces
base mis-insertion by DNA polymerases during genome
replication or repair. The majority of damage is effectively
removed by multifaceted cellular repair processes before
mutation occurs (Sancar et al. 2004). However, the level of
DNA damage and efficiency of repair can vary widely by
genomic context and damage type, with some adducts and
genomic locations being essentially unrepaired (Chang
et al. 2015; Perera et al. 2016; Geacintov and Broyde
2017). As such, there is far from a one-to-one relationship
between the presence of an adduct and risk of mutagenesis.
Indeed, this is the impetus behind the widely used comet
assay that grossly quantifies the aggregate presence of
DNA breaks and adducts but has the limitation of not providing
sequence context or genomic location information.
While outside the scope of this review, the presence of
unrepaired DNA adducts has been shown to lead to
increases in transcriptional mutagenesis and significant
physiological consequences, even when the underlying
DNA sequence is unchanged (reviewed in Brégeon and
Doetsch 2011).
A number of approaches have been developed to take
advantage of modern sequencing platforms to assess the
distribution of DNA adducts on a genome-wide scale and,
frequently, at single-nucleotide resolution. Current short-
read technologies, such as the Illumina platform, are typi-
cally unable to directly detect DNA adducts, so the pres-
ence of chemical alterations must be inferred by other
means. One strategy is the detection of read start or termi-
nation positions. This approach relies on the ability of
bulky lesions, such as alkyl groups, to block the DNA
polymerases during the PCR steps used in library prepara-
tion (Hu et al. 2016; Hu et al. 2017; Wu et al. 2018). The
result is that the DNA fragments being sequenced will ter-
minate immediately adjacent to the blocking moiety. The
use of DNA repair enzymes or chemical treatments has also
been employed to specifically cleave DNA at sites of damage,
followed by adapter ligation and sequencing. The result
is similar to the above, whereby the 5′-end of a read
denotes a position immediately adjacent to a site of damage.
This strategy has been used to detect UV (Mao et al. 2016;
Hu et al. 2017), cisplatin (Hu et al. 2016), and bulky alkyl
adducts (Mao et al. 2017; Aloisi et al. 2019). The presence
and location of ribonucleotides in DNA can be similarly
inferred, simply by inducing breaks with alkaline hydroly-
sis (Orebaugh et al. 2018).
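As a rough sketch of how read 5′ ends can be turned into a damage map, the following Python function tallies inferred lesion positions from mapped read starts. The function name, the input tuple layout, and the one-base offset on the read's own strand are all assumptions for illustration; the true offset and direction depend on the specific library chemistry and must be taken from the method being used.

```python
from collections import Counter


def damage_site_counts(read_starts, offset=1):
    """Count inferred damage sites from mapped read 5' ends.

    read_starts: iterable of (chrom, five_prime_pos, strand) tuples,
                 with strand given as "+" or "-".
    offset:      assumed distance between the 5' end and the lesion;
                 the lesion is placed just before the 5' end on the
                 read's own strand.
    """
    counts = Counter()
    for chrom, pos, strand in read_starts:
        if strand == "+":
            site = pos - offset
        elif strand == "-":
            site = pos + offset
        else:
            raise ValueError(f"unexpected strand: {strand!r}")
        counts[(chrom, site, strand)] += 1
    return counts
```

Positions with many supporting read ends relative to an undamaged control would then be candidate damage hot spots.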
Another frequently used strategy to infer DNA damage
employs enrichment for, or depletion of, DNA fragments
containing adducts. Depletion-based approaches make use
of enzymatic removal of adducts that render those DNA
fragments unsequenceable. The readout is a drop in coverage
in areas of the genome prone to DNA damage relative to
undamaged ones (Bryan et al. 2014). This approach
exhibits poor sensitivity when adducts are present in only a
small minority of DNA molecules, as is the case in many
in vivo applications. One solution is to enrich adduct-
containing molecules via immunoprecipitation of DNA
bearing specific adducts or bound repair proteins (e.g., base
excision repair or nucleotide excision repair proteins) (Bryan
et al. 2014; Hu et al. 2016; Hu et al. 2017; Li et al. 2017b).
In an analogous approach, base adducts that are poorly
targeted by immunoprecipitation can be chemically
modified to make them amenable to capture (Wu et al.
2018). Both methods can significantly improve detection of
damage or repair activity on a genome-wide scale.
An advantage of many single-molecule sequencing plat-
forms is that many DNA adducts can be directly detected
without prior manipulation. In the case of the PacBio
SMRT sequencing technology, chemical modifications to
the template base affect the kinetics of dNTP incorporation
by DNA polymerases in a defined way that is relatively
specific to each adduct (Clark et al. 2011). Most studies
have focused on endogenous epigenetic modifications (i.e.,
methylation), but the methods and statistical analysis
employed by these studies could easily be adapted to
genotoxicity applications.
Challenges in detecting blocking lesions are one notable
limitation for this polymerase-based approach. Nanopore
technologies, on the other hand, are well suited for identifying
bulky adducts. Base-calling is accomplished by
observing changes in ionic current/impedance that are
specific to the template base as it passes through the
nanopore structure (reviewed in Deamer et al. 2016).
Base modifications are detectable because they alter this
characteristic profile in an adduct-specific way. Most efforts
have focused on detecting endogenous methylations
(Laszlo et al. 2013; Schreiber et al. 2013), but an increasing
number of reports are beginning to characterize a wider
variety of exogenous DNA adducts more relevant to
genetic toxicology, including pyrimidine dimers, benzo[a]pyrene,
8-oxo-dG, abasic sites, and double-strand cross-links
(An et al. 2012; Wolna et al. 2013; An et al. 2015;
Perera et al. 2015; Zhang et al. 2015).
Characterizing Genotoxicity by Mutational Signatures
One of the primary goals of genotoxicity testing is to
link specific exposures to mutagenesis and, ultimately,
carcinogenesis. Controlled exposure studies in animal models
are currently the gold standard for relating exposure to car-
cinogenicity. However, the linking of mutagenic exposure
to cancer in human populations is far more complex and
largely depends on population level epidemiological studies
(Wild 2008). With some rare exceptions, such as skin cancer
with sun exposure and cervical cancer with human papillomavirus,
definitive attribution of a specific instance of
cancer to a specific genotoxic event is extremely difficult,
especially when compounded by the naturally occurring
accumulation of mutations in cancer-relevant genes during
aging (reviewed in Risques and Kennedy 2018). Tools
that enable detection of genotoxic exposure in humans, and
more directly link that exposure to cancer, would have a
profound impact on clinical medicine and public health, as
well as important legal and ethical implications.
The relative incidence of different types (or spectra) of
single-base substitutions is nonrandom and strongly
depends on the specific nature of the mutagen. On their
own, simple mutation spectra (i.e., A→G vs. C→A) have
limited specificity due to significant overlap between different
mutagens and their predominant mutation type. Local
sequence context, however, strongly influences the frequency
of a given type of mutation. The identity of
flanking nucleotides adds a great deal of additional information
that can be harnessed to better indicate the exact etiology
of observed mutations (Fig. 4).
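To make the idea of sequence context concrete, the sketch below assigns a single-base substitution and its two flanking bases to a trinucleotide channel. It assumes the commonly used convention of reporting every mutation from the strand on which the mutated base is a pyrimidine (C or T), which yields 6 substitution types × 16 flanking contexts = 96 channels. The function names and the `T[C>A]G` label format are choices made for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mutation_channel(five_prime: str, ref: str, alt: str, three_prime: str) -> str:
    """Map a substitution plus flanking bases to a pyrimidine-centered channel."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    context = five_prime + ref + three_prime
    if ref in "AG":
        # A purine reference base is re-expressed on the opposite strand,
        # so the flanks swap sides and every base is complemented.
        context = reverse_complement(context)
        ref = context[1]
        alt = alt.translate(COMPLEMENT)
    return f"{context[0]}[{ref}>{alt}]{context[2]}"
```

For example, a G→T change in the context C-G-A and a C→A change in the context T-C-G are the same event viewed from opposite strands, and both map to the channel `T[C>A]G`. Counting mutations per channel gives the 96-element spectrum that signature analyses take as input.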
Data generated by The Cancer Genome Atlas and other
large-scale sequencing efforts have provided an opportunity
to identify many distinct mutational patterns in a wide vari-
ety of cancer types. By taking into account known cancer
biology and patient medical history, analysis of the tumor
mutation patterns can, in some cases, provide a correlative
link between exposure and the observed mutational pat-
terns; for example, high levels of mutations seen in mela-
noma are consistent with pyrimidine dimers (The Cancer
Genome Atlas 2015). These patterns can be readily
detected in tumors using standard NGS techniques because
of the clonal nature of tumor formation. Mutations present
early in neoplastic transformation are propagated to descendent
tumor cells, where they are easily identified because they occur well
above the background error rate of sequencing (Fig. 4A).
This is in contrast to early genotoxin-associated mutations
in normal tissues, which are present in only a minority of
cells among a larger unmutated population, and where far
more sensitive methods are required.
The primary challenge in performing this type of spectral
analysis has been that somatic tumor mutations are the
result of the cumulative mutational processes incurred by
the founding cancer cell's lineage since embryogenesis. As
such, it is necessary to deconvolute the relative contributions
of each of these mutational processes. Alexandrov
et al. were the first to report the use of nonnegative matrix
factorization, a statistical method developed for decomposition
of multivariate data, to computationally parse out constituent
mutational processes based on both the specific
mutation type (i.e., G→T/C→A) and the identity of the
adjacent 5′ and 3′ bases (Alexandrov et al. 2013). In their
initial work, the authors reported 21 "mutational signatures"
(or trinucleotide signatures) across the TCGA data
set, with some of the signatures exhibiting high tumor-type
specificity (Alexandrov et al. 2013). Recent analysis of
tumor sequencing data, comprising 4645 whole genomes
and 19,184 exomes, has validated the vast majority of the
initially reported signatures, as well as further expanded the
number of mathematically defined signatures to now
include a total of 49 single-base substitution signatures,
11 doublet-base substitution signatures, 4 clustered-base
substitution signatures, and 17 small insertion/deletion signatures
(Alexandrov et al. 2018).
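The decomposition itself can be sketched with a small pure-Python implementation of nonnegative matrix factorization. The example factors a matrix V (mutation channels × samples) into W (channels × signatures) and H (signatures × samples) so that V ≈ WH, with all entries nonnegative. It uses the standard multiplicative update rules for squared error; this is a generic textbook scheme chosen for illustration and is not claimed to be the exact procedure of Alexandrov et al. All function names, the fixed iteration count, and the random initialization are assumptions.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Square root of the summed squared difference between V and WH."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, rank, iterations=500, seed=0, eps=1e-12):
    """Factor V (channels x samples) into W (channels x rank) and H (rank x samples)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive random start so that no entry is stuck at zero.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # Update exposures: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # Update signatures: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h
```

In a real analysis V would have 96 rows (one per trinucleotide channel) and one column per tumor, the columns of W would be normalized and interpreted as signatures, and the number of signatures (`rank`) would itself have to be chosen, for instance by comparing reconstruction error and solution stability across values.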
Mutational signatures have risen to prominence in the
genomic literature over the last 5 years (reviewed in
Phillips 2018), but they are not without limitations. Signatures
are computationally derived, so some portion of the
described signatures could be computational artifacts or
subfeatures within other processes. Furthermore, the bulk
of research on mutational signatures has focused on their
presence in tumors, for the practical reasons described
above. The signatures observed in a tumor may not fully
recapitulate processes in normal tissues. Signatures in
tumors arise from both endogenous and exogenous sources
(Alexandrov et al. 2013; Alexandrov et al. 2015;
Alexandrov et al. 2018) and are an amalgamation of muta-
genic processes that may be somewhat biased by clonal
sweeps that occur during tumor formation when effects
unrelated to exposure-associated mutagenesis are operative.
Recent work using error-corrected sequencing to study
aflatoxin-induced mutations in normal mouse tissue
Fig. 4. Approaches for assessing mutational signatures. Mutational
spectra, particularly polynucleotide mutational signatures, provide
important mechanistic insights into mutational processes. Most of what
we know about these patterns has come from natural or artificial means
of single cell cloning. (A) Exome or whole-genome sequencing of tumor
populations reflects the somatic processes operative in the founding cell
of the most recent clonal sweep. (B) Single cells can be cloned from
cultured populations exposed to known or suspected mutagens to assess
their mutational signatures. (C) The clonal variants present in individuals
that were not present in their parents reflect the state of mutational
processes during gametogenesis or early embryogenesis. (D) Sequencing
of cloned cells or molecules from certain selection-based mutagenicity
assays can be used similarly, although the patterns may be distorted by
the selection system itself. (E) With ecNGS, it is now possible to obtain
mutational spectra by directly sequencing DNA from any tissue of any
organism.
demonstrated the low-frequency signature to be distinctly
different from that observed in the tumor itself. This sug-
gests additional mutagenic processes may have developed
during tumorigenesis that were unrelated to aflatoxin exposure
(Chawanthayatham et al. 2017; Fedeles et al. 2017).
For most genetic toxicologists, a forensic analysis of
the mutational processes that led to clonal tumors is only
useful insofar as the knowledge can be applied for pro-
spectively screening new compounds. Sequencing human
cancers that follow natural exposures, similar to sequenc-
ing of family trios to infer germ line processes that intro-
duce mutations between generations (Fig. 4C), is simply
not a practical tool in this regard. Most conventional gen-
otoxicity assays are not equipped to take advantage of tri-
nucleotide signature analysis due to their reliance on
selective markers with a narrow nucleotide repertoire,
which can significantly bias the observed spectrum (Fig. 4D).
Simple, and even trinucleotide, mutational spectra can
be assessed from transgenic rodent assays by manually
picking hundreds of phage plaques for sequencing, but in
addition to being very labor intensive, the approach is still
complicated by an incomplete repertoire of three base-pair
groups within the small reporter genes and the fact that
synonymous mutations do not result in phenotypic
changes.
A less biased approach for experimentally obtaining
detailed mutational spectra without any biological selec-
tion is cloning of single cells after compound exposure
followed by large-scale sequencing (Fig. 4B). In an outstanding
recent study by the Nik-Zainal group, the authors
carried out whole-genome sequencing on induced pluripo-
tent stem cells that were cloned from populations treated
with nearly 80 known or suspected carcinogens, identify-
ing dozens of distinct signatures (Kucab et al. 2019). This
more than quadrupled the existing collection of signatures
that have been experimentally ascribed to exogenous
sources, a list that will undoubtedly continue to grow
(Chawanthayatham et al. 2017; Huang et al. 2017; Ng
et al. 2017; Boot et al. 2018).
Cultured cells cannot fully recapitulate all the meta-
bolic and distribution complexities of in vivo exposures
and single-cell cloning is not trivial (Blokzijl et al.
2016). However, the extensive signature knowledge and
mathematical methods generated from both this approach
and from genotyping tumors can be readily applied to
the above-described new sequencing technologies. Many
of these have sufficient accuracy to detect low-frequency
genotoxin-induced mutations without need for clonal
expansion of any form (Fig. 4E). This opens the possibil-
ity of being able to assess mutational signatures in any
cell type from any tissue from any species directly from
extracted DNA (Chawanthayatham et al. 2017). Much
remains to be done in this emerging space, but the future
remains bright for its applications in genomic
toxicology.
Neo-Genotoxicity: Genome Engineering Technologies
The classic fields of genetic toxicology and environmental
mutagenesis have typically focused on the effects of
broadly acting DNA-damaging chemicals on
human health. However, the emergence of new genetic
manipulation technologies, what we term "neo-genotoxins,"
presents both new challenges and new opportunities
for the field. A critical aspect of these tools, especially
from a regulatory perspective, is determining their
specificity in altering the genome in the desired way. Like
traditional chemical mutagens, off-target DNA cutting or
gene mis-insertion could increase the risk of cancer by
inadvertently interrupting an oncogene or tumor suppressor.
However, unlike randomly acting small molecules, the
rules for predicting where in the genome this might happen,
and the technical complexities for site-specific screening,
are completely different.
With the development of programmable endonucleases,
such as zinc-finger nucleases, transcription activator-like
effector nucleases, and, most recently, CRISPR/Cas nucleases,
it is now possible to make targeted genomic alterations
in situ (reviewed in Gaj et al. 2013). In theory, the
20–40 bases targeted by these enzymes should be more
than sufficient to ensure complete specificity, but the presence
of pseudogenes, human genetic variation, and a tolerance
for sequence changes in the recognition sequence can
reduce site specificity (Lessard et al. 2017). In silico
methods have been developed to help predict off-target
effects of these nucleases, especially for the CRISPR/Cas
family of endonucleases, but have shown only moderate
concordance with experimental data (reviewed in Chuai
et al. 2017).
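The simplest form of in silico off-target prediction is a scan for genomic sites that differ from the intended recognition sequence by only a few bases. The sketch below is a deliberately naive illustration of that idea: the function names and the mismatch cutoff are assumptions, and it ignores features that published predictors consider, such as the opposite strand, PAM requirements, insertions or deletions, and position-dependent mismatch tolerance.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def scan_off_targets(target: str, genome: str, max_mismatches: int = 3):
    """Return (start, mismatches) for every window within the mismatch cutoff.

    Only the given strand of `genome` is scanned; a window with zero
    mismatches corresponds to a perfect match to the target.
    """
    hits = []
    width = len(target)
    for start in range(len(genome) - width + 1):
        mismatches = hamming_distance(target, genome[start:start + width])
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits
```

Because sequence similarity alone does not determine cutting efficiency, lists produced this way are best treated as candidates for experimental follow-up, consistent with the moderate concordance noted above.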
Using modern sequencing platforms, several unbiased
methods have been developed to detect the presence of
double-strand breaks. A primary concern with these tech-
nologies is the hundreds to thousands of potential off-target
sites that exist across the genome. Further complicating the
issue is that the probability of cutting off-target sites can
vary by several orders of magnitude, which means that
brute-force sequencing may not be sensitive enough to
detect rare off-target events. While the specifics of each
approach differ, they largely depend on
in vitro digestion with the nuclease in question followed by
the introduction of a known universal sequence, via ligation/integration
or the cell's homologous recombination
machinery, that can be selected by PCR or targeted
pulldown. These methods have reported a wide range of
off-target cutting depending on the method used (Fu et al.
2013; Frock et al. 2015; Tsai et al. 2015; Cameron et al.
2017; Tsai et al. 2017). There is a substantial need for more
accurate and sensitive methods to detect off-target cut sites.
A notable limitation of these methods is the inability to
practically assess off-target effects in vivo, which will be
critical for regulatory testing and widespread medical use
of genome-editing technologies. To date, we are aware of
only one published in vivo method, termed "Verification of In Vivo
Off-targets" (VIVO). This
approach combines in vitro off-target detection
with evaluation of the off-target sites identified
in vitro for characteristic deletion events caused by
in vivo expression of CRISPR/Cas9 in mouse liver (Tsai
et al. 2017; Akcakaya et al. 2018). Further complicating
matters is that the highly sequence-dependent nature of
both on-target and off-target effects makes animals untena-
ble surrogates for assessing genotoxicity induced by
human-genome targeted nucleases.
The clinical importance of neo-genotoxins has become
even more apparent with the emergence of cell-based thera-
pies. While cells do not constitute a genotoxin per se, the
genetic engineering and potential for clonal selection of
mutation-harboring subpopulations during their develop-
ment can lead to increased risk of acquiring cancer from
within the transplanted cells. For example, recent studies
have shown that genome editing using CRISPR-Cas9
results in TP53-mediated DNA damage response and cell-
cycle arrest. Consequently, there is a strong selective
advantage for cells harboring inactivating mutations in this
important tumor suppressor (Haapaniemi et al. 2018; Ihry
et al. 2018; Sinha et al. 2018). In other words, the effect of
even perfectly accurate on-target cutting is natural selection
of cells bearing the most common genetic driver in all
human cancers. These issues, and others that have not yet
been discovered, are likely to complicate therapeutic appli-
cations involving genetically engineered cells, such as for
regenerative medicine or CAR-T-based cancer therapies.
Technologies for assessing these risks will need to be
extremely accurate, quickly adaptable to new targets, and
as applicable to in vitro preclinical use as to in vivo
human studies, a tall order by any estimation.
Carcinogenicity vs. Mutagenicity
While essentially all human mutagens are carcinogens,
the reverse is not always true. Mutagenesis is an imperfect
surrogate for cancer risk. Nonmutagenic carcinogens may
drive neoplasia through inflammation, epigenetic modifications,
and endocrine disruption that drives aberrant cellular
proliferation (Ohshima et al. 2003; Baccarelli and Bollati
2009; Soto and Sonnenschein 2010). In these cases, classic
selection-based mutagenesis assays would not easily detect
these compounds as likely carcinogenic, indicating why
2-year rodent studies remain a safety requirement for new
drug approval.
A number of recent reports show that clonal expansion
of cells harboring somatic mutations in cancer-associated
genes is a normal part of aging (reviewed in Risques and
Kennedy 2018). Because non-genotoxic carcinogens are
generally believed to accelerate carcinogenesis by forcing
unregulated cell division, clonal expansions of mutations
could be used as a marker of emerging ability to proliferate
outside the confines of the normal regulated tissue architecture
(Salk and Horwitz 2010). The development of ultra-accurate
ecNGS may offer a way to quantify these expansions
and correlate their presence with environmental exposure
or, potentially, cancer risk. Approaches could involve
the sequencing of large panels of cancer driver genes or
hypermutable portions of the genome for clonal expan-
sions. A similar idea has been used in studying somatic
evolution in dysplastic and cancerous tissue (Salk et al.
2009; Naxerova et al. 2017; Baker et al. 2019). Detection
of very early preneoplastic changes at the cellular level by
observing accelerated growth of small clones could be car-
ried out in conjunction with mutagenesis screening using
the same ecNGS methods. For an in-depth discussion on
this topic, please see the accompanying review by Parsons
and colleagues (Harris et al., 2020).
FUTURE APPLICATIONS AND CONCLUSIONS
The utility of modern sequencing platforms has
expanded well beyond the initial use of sequencing DNA
for genome assembly and germ line variant detection, for
which they were originally developed. Although still in their infancy,
these technologies are ushering in a renaissance for the
study of genotoxicity and somatic mutagenesis. The digital
nature and massive scale at which these technologies operate
are already providing rich data sets that are orders of
magnitude beyond those available to the field's
pioneers.
Ultimately, the technologies and methods that we have
described here will be deployable for direct monitoring of
exposures in human populations, a concept famously
envisioned by William Thilly more than three decades ago
(Sattaur 1985). Widely recognized environmental carcinogens
such as aflatoxin and aristolochic acid cause thousands
of cancer deaths globally per year, but, at the current
time, it is impossible to know which individuals may have
been exposed during their lives and are at the greatest risk
(Ng et al. 2017). From the point of view of an individual,
routine screening in at-risk populations could identify those
who would most benefit from close clinical surveillance.
From a public health perspective, population testing
could aid in identifying regional exposure hot spots where
source control efforts could be most effective. Numerous
statistically defined "cancer clusters" have been described,
frequently near industrial sites (Thun and Sinks 2004).
New tools that more directly link chemical exposure of
individuals to an instance of cancer could empower com-
munities with objective data to more effectively demand
cleanup and provide local governments and regulators with
early detection tools to prevent clusters in the first place.
Due to the generalizability of NGS technologies to any
source of DNA, surveying native organisms for mutagenic
signatures in their genome would allow for environmental
monitoring for the presence of mutagens. An amusing, yet
entirely appropriate, analogy is the proverbial canary in a
coal mine; in this modern rendition, it is the canary's
genome that serves as a biosensor for mutagenic coal dust
(Fig. 5). We envision that many of the varieties and appli-
cations of the new technologies outlined in this review can
be combined to obtain a more complete picture of gen-
otoxicity and cancer risk both in model systems and
humans. The use-cases described herein are likely to be
only the beginning of our needs as we look toward engag-
ing with mutagenic new environments, such as inter-
planetary space, and consider new high-risk medical
frontiers, such as gene editing of the germ line. The full
breadth of applications for these new tools remains to be
seen, but their use will undoubtedly offer new avenues of
research and further drive development of technologies that
will carry us through the next 50 years.
AUTHOR CONTRIBUTIONS
S.R.K. and J.J.S. conceptualized the review topics.
S.R.K. wrote the initial manuscript draft. S.R.K. and
J.J.S. contributed to the figures and manuscript.
Conflict of Interest
J.J.S. is an employee and equity holder at TwinStrand
Biosciences. S.R.K. is a paid consultant and equity holder
Fig. 5. Canary-in-a-coal-mine: a century later. A hundred years ago, at
the suggestion of John Scott Haldane, caged canaries were routinely
brought into British coal mines as an early warning sign of human-relevant
toxic gases. Although their routine use ceased in the 1980s, the
broader concept of using sentinel species to infer the presence of
environmental hazards remains highly germane in modern genetic
toxicology. Should it have been possible to collect and analyze a DNA
sample from one of Haldane's birds using modern ecNGS techniques, it is
quite likely that the mutagenic signature of benzo[a]pyrene could have
been identified and used to inform efforts to mitigate the environmental
cancer risk. Other naturally present sentinel organisms, including humans
themselves, can be similarly used.
at TwinStrand Biosciences and a paid consultant for Wil-
cox & Savage, PC.
Acknowledgments
We would like to thank Dr. Penny M. Faires for scientific
editing, Clint Valentine for mutational spectra
graphics, and the scientific team at TwinStrand Biosciences
and members of the HESI Genetic Toxicology Technical
Committee (GTTC) for inspiration and championing new
genomic technologies. This work was supported in part by
DOD/CDMRP grant W81XWH-18-1-0339, NIJ grant
2017-DN-BX-0160, and Safeway/Albertsons Early Career
Award in Cancer Research to S.R.K. and NIH/NIEHS
grant R44ES030642 to J.J.S.
REFERENCES
Akcakaya P, Bobbin ML, Guo JA, Malagon-Lopez J, Clement K,
Garcia SP, Fellows MD, Porritt MJ, Firth MA, Carreras A, et al.
2018. In vivo CRISPR editing with no detectable genome-wide
off-target mutations. Nature 561:416–419.
Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S,
Biankin AV, Bignell GR, Bolli N, Borg A, Børresen-Dale A-L,
et al. 2013. Signatures of mutational processes in human cancer.
Nature 500:415–421.
Alexandrov LB, Jones PH, Wedge DC, Sale JE, Campbell PJ, Nik-Zainal S,
Stratton MR. 2015. Clock-like mutational processes in
human somatic cells. Nat Genet 47:1402–1407.
Alexandrov L, Kim J, Haradhvala NJ, Huang MN, Ng AWT, Boot A,
Covington KR, Gordenin DA, Bergstrom E, Lopez-Bigas N, et al.
2018. The repertoire of mutational signatures in human cancer.
bioRxiv 322859. https://doi.org/10.1101/322859.
Alkan C, Coe BP, Eichler EE. 2011. Genome structural variation discovery
and genotyping. Nat Rev Genet 12:363–376.
Aloisi CMN, Sturla SJ, Gahlon HL. 2019. A gene-targeted polymerase-mediated
strategy to identify O6-methylguanine damage. Chem
Commun 55:3895–3898.
Ames BN, Lee FD, Durston WE. 1973. An improved bacterial test system
for the detection and classification of mutagens and carcinogens.
Proc Natl Acad Sci USA 70:782–786.
An N, Fleming AM, White HS, Burrows CJ. 2012. Crown ether-electrolyte
interactions permit nanopore detection of individual
DNA abasic sites in single molecules. Proc Natl Acad Sci USA
109:11504–11509.
An N, Fleming AM, White HS, Burrows CJ. 2015. Nanopore detection of
8-oxoguanine in the human telomere repeat sequence. ACS Nano
9:4296–4307.
Araldi RP, de Melo TC, Mendes TB, de Sa Junior PL, Nozima BHN,
Ito ET, de Carvalho RF, de Souza EB, de Cassia Stocco R. 2015.
Using the comet and micronucleus assays for genotoxicity studies:
A review. Biomed Pharmacother 72:74–82.
Auerbach C, Robson JM, Carr JG. 1947. The chemical production of
mutations. Science 105:243–247.
Baccarelli A, Bollati V. 2009. Epigenetics and environmental chemicals.
Curr Opin Pediatr 21:243–251.
Baker KT, Nachmanson D, Kumar S, Emond MJ, Ussakli C,
Brentnall TA, Kennedy SR, Risques RA. 2019. Mitochondrial
DNA mutations are associated with ulcerative colitis preneoplasia
but tend to be negatively selected in cancer. Mol Cancer Res
17:488–498.
Beal MA, Gagné R, Williams A, Marchetti F, Yauk CL. 2015. Character-
izing Benzo[a]pyrene-induced lacZ mutation spectrum in trans-
genic mice using next-generation sequencing. BMC Genomics
16:812.
Besaratinia A, Li H, Yoon J-I, Zheng A, Gao H, Tommasi S. 2012. A
high-throughput next-generation sequencing-based method for
detecting the mutational ngerprint of carcinogens. Nucleic Acids
Res 40: e116.
Bessman MJ, Lehman IR, Simms ES, Kornberg A. 1958. Enzymatic syn-
thesis of deoxyribonucleic acid II. General properties of the reac-
tion. J Biol Chem 233:171177.
Bielas JH, Loeb LA. 2005. Quantication of random genomic mutations.
Nat Methods 2:285290.
Blokzijl F, de Ligt J, Jager M, Sasselli V, Roerink S, Sasaki N, Huch M,
Boymans S, Kuijk E, Prins P, et al. 2016. Tissue-specic mutation
accumulation in human adult stem cells during life. Nature 538:
260264.
Boot A, Huang MN, Ng AWT, Ho S-C, Lim JQ, Kawakami Y,
Chayama K, Teh BT, Nakagawa H, Rozen SG. 2018. In-depth char-
acterization of the cisplatin mutational signature in human cell lines
and in esophageal and liver tumors. Genome Res 28:654665.
Brégeon D, Doetsch PW. 2011. Transcriptional mutagenesis: Causes and
involvement in tumour development. Nat Rev Cancer 11:218227.
Bryan DS, Ransom M, Adane B, York K, Hesselberth JR. 2014. High res-
olution mapping of modied DNA nucleobases using excision
repair enzymes. Genome Res 24:15341542.
Bryce SM, Bemis JC, Dertinger SD. 2008. In vivo mutation assay based
on the endogenous Pig-a locus. Environ Mol Mutagen 49:
256264.
Cameron P, Fuller CK, Donohoue PD, Jones BN, Thompson MS,
Carter MM, Gradia S, Vida B, Garner E, Slorach EM, et al. 2017.
Mapping the genomic landscape of CRISPRCas9 cleavage. Nat
Methods 14:600606.
Cao J, Packer JS, Ramani V, Cusanovich DA, Huynh C, Daza R, Qiu X,
Lee C, Furlan SN, Steemers FJ, et al. 2017. Comprehensive single-
cell transcriptional proling of a multicellular organism. Science
357:661667.
Casbon JA, Osborne RJ, Brenner S, Lichtenstein CP. 2011. A method for
counting PCR template molecules with application to next-
generation sequencing. Nucleic Acids Res 39: e81.
Chang S, Fedeles BI, Wu J, Delaney JC, Li D, Zhao L, Christov PP,
Yau E, Singh V, Jost M, et al. 2015. Next-generation sequencing
reveals the biological signicance of the N2,3-ethenoguanine
lesion in vivo. Nucleic Acids Res 43:54895500.
Chauhan V, Kuo B, McNamee JP, Wilkins RC, Yauk CL. 2016. Tran-
scriptional benchmark dose modeling: Exploring how advances in
chemical risk assessment may be applied to the radiation eld:
BMD and radiation risk assessment. Environ Mol Mutagen 57:
589604.
Chawanthayatham S, Valentine CC, Fedeles BI, Fox EJ, Loeb LA,
Levin SS, Slocu SL, Wogan GN, Croy RG, Essigmann JM. 2017.
Mutational spectra of aatoxin B
1
in vivo establish biomarkers of
exposure for human hepatocellular carcinoma. Proc Natl Acad Sci
USA 114:E3101E3109.
Chen C, Xing D, Tan L, Li H, Zhou G, Huang L, Xie XS. 2017a. Single-
cell whole-genome analyses by linear amplication via transposon
insertion (LIANTI). Science 356:189194.
Chen L, Liu P, Evans TC, Ettwiller LM. 2017b. DNA damage is a perva-
sive cause of sequencing errors, directly confounding variant iden-
tication. Science 355:752756.
Environmental and Molecular Mutagenesis. DOI 10.1002/em
148 Salk and Kennedy
Chuai G, Wang Q-L, Liu Q. 2017. In silico meets in vivo : Towards com-
putational CRISPR-based sgRNA design. Trends in Biotechnology
35:1221.
Clark TA, Spittle KE, Turner SW, Korlach J. 2011. Direct detection and
sequencing of damaged DNA bases. Genome Integr 2:10.
Cusanovich DA, Daza R, Adey A, Pliner HA, Christiansen L,
Gunderson KL, Steemers FJ, Trapnell C, Shendure J. 2015. Multi-
plex single-cell proling of chromatin accessibility by combinato-
rial cellular indexing. Science 348:910914.
Deamer D, Akeson M, Branton D. 2016. Three decades of nanopore
sequencing. Nat Biotechnol 34:518524.
DeMarini DM. 2019. The mutagenesis moonshot: The propitious begin-
nings of the environmental mutagenesis and genomics society.
Environ Mol Mutagen This issue.
van Dijk EL, Jaszczyszyn Y, Naquin D, Thermes C. 2018. The third
revolution in sequencing technology. Trends Genet 34:
666681.
Dong X, Zhang L, Milholland B, Lee M, Maslov AY, Wang T, Vijg J.
2017. Accurate identication of single-nucleotide variants in
whole-genome-amplied single cells. Nat Methods 14:491493.
Fedeles BI, Chawanthayatham S, Croy RG, Wogan GN, Essigmann JM.
2017. Early detection of the aatoxin B
1
mutational ngerprint: A
diagnostic tool for liver cancer. Mol Cell Oncol 4: e1329693.
Frock RL, Hu J, Meyers RM, Ho Y-J, Kii E, Alt FW. 2015. Genome-wide
detection of DNA double-stranded breaks induced by engineered
nucleases. Nat Biotechnol 33:179186.
Fu GK, Hu J, Wang P-H, Fodor SPA. 2011. Counting individual DNA
molecules by the stochastic attachment of diverse labels. Proc Natl
Acad Sci USA 108:90269031.
Fu Y, Foden JA, Khayter C, Maeder ML, Reyon D, Joung JK, Sander JD.
2013. High-frequency off-target mutagenesis induced by CRISPR-
Cas nucleases in human cells. Nat Biotechnol 31:822826.
Fu Y, Li C, Lu S, Zhou W, Tang F, Xie XS, Huang Y. 2015. Uniform
and accurate single-cell sequencing based on emulsion whole-
genome amplication. Proc Natl Acad Sci USA 112:1192311928.
Gaj T, Gersbach CA, Barbas CF. 2013. ZFN, TALEN, and CRISPR/Cas-
based methods for genome engineering. Trends Biotechnol 31:
397405.
Geacintov NE, Broyde S. 2017. Repair-resistant DNA lesions. Chem Res
Toxicol 30:15171548.
Gierahn TM, Ii MHW, Hughes TK, Bryson BD, Butler A, Satija R,
Fortune S, Love JC, Shalek AK. 2017. Seq-well: Portable, low-
cost RNA sequencing of single cells at high throughput. Nat
Methods 14:395398.
Godschalk RWL, Yauk CL, van Benthem J, Douglas G, Marchetti F,
2019. In utero exposure to genotoxins leading to genetic mosai-
cisms: A forgotten window of susceptibility in genetic toxicology
testing? Environ Mol Mutagen This issue
Gregory MT, Bertout JA, Ericson NG, Taylor SD, Mukherjee R,
Robins HS, Drescher CW, Bielas JH. 2016. Targeted single mole-
cule mutation detection with massively parallel sequencing.
Nucleic Acids Res 44: e22.
Haapaniemi E, Botla S, Persson J, Schmierer B, Taipale J. 2018.
CRISPRCas9 genome editing induces a p53-mediated DNA dam-
age response. Nat Med 24:927930.
Handford M. 2007. Wheres Waldo? First U.S. Paperback Edition, Vol.
2007. Somerville, MA: Candlewick Press ©1997.
Harris KL, Myers MB, McKim KL, Elespuru RK, Parsons BL. 2020.
Rationale and roadmap for developing panels of hotspot cancer
driver gene mutations as biomarkers of cancer risk. Environ Mol
Mutagen 61:152175.
Heich R, Johnson G, Zeller A, Francesco M, Douglas G, Witt K,
Gollapudi BB, White P. 2020. Mutation as a toxicological endpoint
for regulatory decision-making. Environ Mol Mutagen 61:3441.
Hiatt JB, Patwardhan RP, Turner EH, Lee C, Shendure J. 2010. Parallel,
tag-directed assembly of locally derived short sequence reads. Nat
Methods 7:119122.
Hiatt JB, Pritchard CC, Salipante SJ, ORoak BJ, Shendure J. 2013. Single
molecule molecular inversion probes for targeted, high-accuracy
detection of low-frequency variation. Genome Res 23:843854.
Hoang ML, Kinde I, Tomasetti C, McMahon KW, Rosenquist TA,
Grollman AP, Kinzler KW, Vogelstein B, Papadopoulos N. 2016.
Genome-wide quantication of rare somatic mutations in normal
human tissues using massively parallel sequencing. Proc Natl Acad
Sci USA 113:98469851.
Hodgkinson A, Eyre-Walker A. 2011. Variation in the mutation rate
across mammalian genomes. Nat Rev Genet 12:756766.
Hu J, Lieb JD, Sancar A, Adar S. 2016. Cisplatin DNA damage and repair
maps of the human genome at single-nucleotide resolution. Proc
Natl Acad Sci USA 113:1150711512.
Hu J, Adebali O, Adar S, Sancar A. 2017. Dynamic maps of UV damage
formation and repair for the human genome. Proc Natl Acad Sci
USA 114:67586763.
Huang MN, Yu W, Teoh WW, Ardin M, Jusakul A, Ng AWT, Boot A,
Abedi-Ardekani B, Villar S, Myint SS, et al. 2017. Genome-scale
mutational signatures of aatoxin in cells, mice, and human
tumors. Genome Res 27:14751486.
Ihry RJ, Worringer KA, Salick MR, Frias E, Ho D, Theriault K,
Kommineni S, Chen J, Sondey M, Ye C, et al. 2018. p53 inhibits
CRISPRCas9 engineering in human pluripotent stem cells. Nat
Med 24:939946.
Jain M, Fiddes IT, Miga KH, Olsen HE, Paten B, Akeson M. 2015.
Improved data analysis for the MinION nanopore sequencer. Nat
Methods 12:351356.
Jain M, Tyson JR, Loose M, Ip CLC, Eccles DA, OGrady J, Malla S,
Leggett RM, Wallerman O, Jansen HJ, et al. 2017. MinION analy-
sis and reference consortium: Phase 2 data release and analysis of
R9.0 chemistry. F1000Res 6:760.
Kennedy SR, Salk JJ, Schmitt MW, Loeb LA. 2013. Ultra-sensitive
sequencing reveals an age-related increase in somatic mitochon-
drial mutations that are inconsistent with oxidative damage. PLoS
Genet 9: e1003794.
Kennedy SR, Schmitt MW, Fox EJ, Kohrn BF, Salk JJ, Ahn EH,
Prindle MJ, Kuong KJ, Shen J-C, Risques R-A, et al. 2014.
Detecting ultralow-frequency mutations by duplex sequencing. Nat
Protoc 9:25862606.
Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. 2011. Detec-
tion and quantication of rare mutations with massively parallel
sequencing. Proc Natl Acad Sci USA 108:95309535.
Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V,
Peshkin L, Weitz DA, Kirschner MW. 2015. Droplet barcoding for
single-cell transcriptomics applied to embryonic stem cells. Cell
161:11871201.
Kohler SW, Provost GS, Fieck A, Kretz PL, Bullock WO, Sorge JA,
Putman DL, Short JM. 1991. Spectra of spontaneous and mutagen-
induced mutations in the lacI gene in transgenic mice. Proc Natl
Acad Sci USA 88:79587962.
Kosugi S, Momozawa Y, Liu X, Terao C, Kubo M, Kamatani Y. 2019.
Comprehensive evaluation of structural variation detection algo-
rithms for whole genome sequencing. Genome Biol 20:117.
Kucab JE, Zou X, Morganella S, Joel M, Nanda AS, Nagy E, Gomez C,
Degasperi A, Harris R, Jackson SP, et al. 2019. A compendium of
Environmental and Molecular Mutagenesis. DOI 10.1002/em
14 9Next-Generation Genotoxicology
mutational signatures of environmental agents. Cell 177:
821836.e16.
Kumar V, Rosenbaum J, Wang Z, Forcier T, Ronemus M, Wigler M,
Levy D. 2018. Partial bisulte conversion for unique template
sequencing. Nucleic Acids Res 46: e10.
Laszlo AH, Derrington IM, Brinkerhoff H, Langford KW, Nova IC,
Samson JM, Bartlett JJ, Pavlenok M, Gundlach JH. 2013. Detection
and mapping of 5-methylcytosine and 5-hydroxymethylcytosine with
nanopore MspA. Proc Natl Acad Sci USA 110:1890418909.
Lehman IR, Bessman MJ, Simms ES, Kornberg A. 1958. Enzymatic syn-
thesis of deoxyribonucleic acid I. preparation of substrates and par-
tial purication of an enzyme from Escherichia coli. J Biol Chem
233:163170.
Lessard S, Francioli L, Alfoldi J, Tardif J-C, Ellinor PT, MacArthur DG,
Lettre G, Orkin SH, Canver MC. 2017. Human genetic variation
alters CRISPR-Cas9 on- and off-targeting specicity at therapeuti-
cally implicated loci. Proc Natl Acad Sci USA 114:
E11257E11266.
Li C, Chng KR, Boey EJH, Ng AHQ, Wilm A, Nagarajan N. 2016. INC-
Seq: Accurate single molecule reads using nanopore sequencing.
GigaSci 5:34.
Li H-H, Chen R, Hyduke DR, Williams A, Frötschl R, Ellinger-
Ziegelbauer H, OLone R, Yauk CL, Aubrecht J, Fornace AJ.
2017a. Development and validation of a high-throughput trans-
criptomic biomarker to address 21st century genetic toxicology
needs. Proc Natl Acad Sci USA 114:E10881E10889.
Li W, Hu J, Adebali O, Adar S, Yang Y, Chiou Y-Y, Sancar A. 2017b.
Human genome-wide repair map of DNA damage caused by the
cigarette smoke carcinogen benzo[a]pyrene. Proc Natl Acad Sci
USA 114:67526757.
Lindahl T. 1993. Instability and decay in the primary structure of DNA.
Nature 362:709715.
Lou DI, Hussmann JA, McBee RM, Acevedo A, Andino R, Press WH,
Sawyer SL. 2013. High-throughput DNA sequencing errors are
reduced by orders of magnitude using circle sequencing. Proc Natl
Acad Sci USA 110:1987219877.
Lynch M. 2010. Rate, molecular spectrum, and consequences of human
mutation. Proc Natl Acad Sci USA 107:961968.
Mao P, Smerdon MJ, Roberts SA, Wyrick JJ. 2016. Chromosomal land-
scape of UV damage formation and repair at single-nucleotide res-
olution. Proc Natl Acad Sci USA 113:90579062.
Mao P, Brown AJ, Malc EP, Mieczkowski PA, Smerdon MJ, Roberts SA,
Wyrick JJ. 2017. Genome-wide maps of alkylation damage, repair,
and mutagenesis in yeast reveal mechanisms of mutational hetero-
geneity. Genome Res 27:16741684.
Marchetti F, Douglas GR, Yauk CL. 2019. A return to the origin of the
EMGS: Rejuvenating the quest for human germ cell mutagens and
determining the risk to future generations. Environ Mol Mutagen
This issue.
Martincorena I, Roshan A, Gerstung M, Ellis P, Van Loo P, McLaren S,
Wedge DC, Fullam A, Alexandrov LB, Tubio JM, et al. 2015.
High burden and pervasive positive selection of somatic mutations
in normal human skin. Science 348:880886.
Martincorena I, Raine KM, Gerstung M, Dawson KJ, Haase K, Van
Loo P, Davies H, Stratton MR, Campbell PJ. 2017. Universal pat-
terns of selection in cancer and somatic tissues. Cell 171:1029.
e211041.e21.
Mattox AK, Wang Y, Springer S, Cohen JD, Yegnasubramanian S,
Nelson WG, Kinzler KW, Vogelstein B, Papadopoulos N. 2017.
Bisulte-converted duplexes for the strand-specic detection and
quantication of rare mutations. Proc Natl Acad Sci USA 114:
47334738.
McCann J, Choi E, Yamasaki E, Ames BN. 1975. Detection of carcino-
gens as mutagens in the Salmonella/microsome test: Assay of
300 chemicals. Proc Natl Acad Sci USA 72:51355139.
McInerney P, Adams P, Hadi MZ. 2014. Error rate comparison during
polymerase chain reaction by DNA polymerase. Mol Biol Int
2014:18.
Metzker ML. 2010. Sequencing technologiesThe next generation. Nat
Rev Genet 11:3146.
Milholland B, Dong X, Zhang L, Hao X, Suh Y, Vijg J. 2017. Differences
between germline and somatic mutation rates in humans and mice.
Nat Commun 8: 15183.
Muller HJ. 1927. Articial transmutation of the gene. Science 66:8487.
Myhr BC. 1991. Validation studies with Mutamouse: A transgenic
mouse model for detecting mutations in vivo. Environ Mol Muta-
gen 18:308315.
Navin N, Kendall J, Troge J, Andrews P, Rodgers L, McIndoo J, Cook K,
Stepansky A, Levy D, Esposito D, et al. 2011. Tumour evolution
inferred by single-cell sequencing. Nature 472:9094.
Naxerova K, Reiter JG, Brachtel E, Lennerz JK, van de Wetering M,
Rowan A, Cai T, Clevers H, Swanton C, Nowak MA, et al. 2017.
Origins of lymphatic and distant metastases in human colorectal
cancer. Science 357:5560.
Ng AWT, Poon SL, Huang MN, Lim JQ, Boot A, Yu W, Suzuki Y,
Thangaraju S, Ng CCY, Tan P, et al. 2017. Aristolochic acids and
their derivatives are widely implicated in liver cancers in Taiwan
and throughout Asia. Sci Transl Med 9: eaan6446.
Ohshima H, Tatemichi M, Sawa T. 2003. Chemical basis of inammation-
induced carcinogenesis. Arch Biochem Biophys 417:311.
Orebaugh CD, Lujan SA, Burkholder AB, Clausen AR, Kunkel TA. 2018.
Mapping ribonucleotides incorporated into DNA by hydrolytic
end-sequencing. Methods Mol Biol 1672:329345.
Parsons BL, Heich RH. 1997. Genotypic selection methods for the direct
analysis of point mutations. Mutat Res 387:97121.
Perera RT, Fleming AM, Johnson RP, Burrows CJ, White HS. 2015.
Detection of benzo[a]pyrene-guanine adducts in single-stranded
DNA using the α-hemolysin nanopore. Nanotechnol 26: 074002.
Perera D, Poulos RC, Shah A, Beck D, Pimanda JE, Wong JWH. 2016.
Differential DNA repair underlies mutation hotspots at active pro-
moters in cancer genomes. Nature 532:259263.
Phillips DH. 2018. Mutational spectra and mutational signatures: Insights
into cancer aetiology and mechanisms of DNA damage and repair.
DNA Repair 71:611.
Quail M, Smith ME, Coupland P, Otto TD, Harris SR, Connor TR,
Bertoni A, Swerdlow HP, Gu Y. 2012. A tale of three next genera-
tion sequencing platforms: Comparison of Ion Torrent, Pacic Bio-
sciences and Illumina MiSeq sequencers. BMC Genomics 13:341.
Rinke C, Lee J, Nath N, Goudeau D, Thompson B, Poulton N,
Dmitrieff E, Malmstrom R, Stepanauskas R, Woyke T. 2014.
Obtaining genomes from uncultivated environmental microorgan-
isms using FACSbased single-cell genomics. Nat Protoc 9:
10381048.
Risques RA, Kennedy SR. 2018. Aging and the rise of somatic cancer-
associated mutations in normal tissues. PLoS Genet 14: e1007108.
Rosenberg AB, Roco CM, Muscat RA, Kuchina A, Sample P, Yao Z,
Graybuck LT, Peeler DJ, Mukherjee S, Chen W, et al. 2018. Sin-
gle-cell proling of the developing mouse brain and spinal cord
with split-pool barcoding. Science 360:176182.
Ross MG, Russ C, Costello M, Hollinger A, Lennon NJ, Hegarty R,
Nusbaum C, Jaffe DB. 2013. Characterizing and measuring bias in
sequence data. Genome Biol 14:R51.
Salk JJ, Horwitz MS. 2010. Passenger mutations as a marker of clonal cell
lineages in emerging neoplasia. Semin Cancer Biol 20:294303.
Environmental and Molecular Mutagenesis. DOI 10.1002/em
150 Salk and Kennedy
Salk JJ, Salipante SJ, Risques RA, Crispin DA, Li L, Bronner MP,
Brentnall TA, Rabinovitch PS, Horwitz MS, Loeb LA. 2009.
Clonal expansions in ulcerative colitis identify patients with neo-
plasia. Proc Natl Acad Sci USA 106:2087120876.
Salk JJ, Schmitt MW, Loeb LA. 2018. Enhancing the accuracy of next-
generation sequencing for detecting rare and subclonal mutations.
Nat Rev Genet 19:269285.
Sancar A, Lindsey-Boltz LA, Ünsal-Kaçmaz K, Linn S. 2004. Molecular
mechanisms of mammalian DNA repair and the DNA damage
checkpoints. Annu Rev Biochem 73:3985.
Sattaur O. 1985. Mutation spectra from a drop of blood. New Scientist 31:20.
Schmitt MW, Kennedy SR, Salk JJ, Fox EJ, Hiatt JB, Loeb LA. 2012.
Detection of ultra-rare mutations by next-generation sequencing.
Proc Natl Acad Sci USA 109:1450814513.
Schreiber J, Wescoe ZL, Abu-Shumays R, Vivian JT, Baatar B,
Karplus K, Akeson M. 2013. Error rates for nanopore discrimina-
tion among cytosine, methylcytosine, and hydroxymethylcytosine
along individual DNA strands. Proc Natl Acad Sci USA 110:
1891018915.
Shibutani S, Takeshita M, Grollman AP. 1991. Insertion of specic bases
during DNA synthesis past the oxidation-damaged base 8-oxodG.
Nature 349:431434.
Sinha S, Cheng K, Leiserson MD, Wilson DM, Ryan BM, Lee JS,
Ruppin E. 2018. A systematic genome-wide mapping of the onco-
genic risks associated with CRISPR-Cas9 editing. bioRxiv 407767.
https://doi.org/10.1101/407767.
Soto AM, Sonnenschein C. 2010. Environmental causes of cancer:
Endocrine disruptors as carcinogens. Nat Rev Endocrinol 6:
363370.
Stadler LJ, Sprague GF. 1936. Genetic effects of ultra-violet radiation in
maize. I. Unltered radiation. Proc Natl Acad Sci USA 22:572578.
The Cancer Genome Atlas. 2015. Genomic classication of cutaneous
melanoma. Cell 161:16811696.
Thompson LH, Fong S, Brookman K. 1980. Validation of conditions for
efcient detection of HPRT and APRT mutations in suspension-
cultured chinese hamster ovary cells. Mutat Res 74:2136.
Thun MJ, Sinks T. 2004. Understanding cancer clusters. CA Cancer J Clin
54:273280.
Travers KJ, Chin C-S, Rank DR, Eid JS, Turner SW. 2010. A exible and
efcient template format for circular consensus sequencing and
SNP detection. Nucleic Acids Res 38: e159.
Tsai SQ, Zheng Z, Nguyen NT, Liebers M, Topkar VV, Thapar V,
Wyvekens N, Khayter C, Iafrate AJ, Le LP, et al. 2015. GUIDE-
seq enables genome-wide proling of off-target cleavage by
CRISPR-Cas nucleases. Nat Biotechnol 33:187197.
Tsai SQ, Nguyen NT, Malagon-Lopez J, Topkar VV, Aryee MJ,
Joung JK. 2017. CIRCLE-seq: A highly sensitive in vitro screen
for genome-wide CRISPRCas9 nuclease off-targets. Nat Methods
14:607614.
Tyler AD, Mataseje L, Urfano CJ, Schmidt L, Antonation KS,
Mulvey MR, Corbett CR. 2018. Evaluation of Oxford Nanopores
MinION sequencing device for microbial whole genome sequenc-
ing applications. Sci Rep 8: 10931.
Vitak SA, Torkenczy KA, Rosenkrantz JL, Fields AJ, Christiansen L,
Wong MH, Carbone L, Steemers FJ, Adey A. 2017. Sequencing
thousands of single-cell genomes with combinatorial indexing. Nat
Methods 14:302308.
Volden R, Palmer T, Byrne A, Cole C, Schmitz RJ, Green RE, Vollmers C.
2018. Improving nanopore read accuracy with the R2C2 method
enables the sequencing of highly multiplexed full-length single-cell
cDNA. Proc Natl Acad Sci USA 115:97269731.
Wang J, Fan HC, Behr B, Quake SR. 2012. Genome-wide single-cell anal-
ysis of recombination activity and de novo mutation rates in human
sperm. Cell 150:402412.
Wang Q, Jia P, Li F, Chen H, Ji H, Hucks D, Dahlman K, Pao W,
Zhao Z. 2013. Detecting somatic point mutations in cancer genome
sequencing data: A comparison of mutation callers. Genome Med
5:91.
Watson JD, Crick FHC. 1953. Molecular structure of nucleic acids: A
structure for deoxyribose nucleic acid. Nature 171:737738.
Wei Z, Wang W, Hu P, Lyon GJ, Hakonarson H. 2011. SNVer: A statis-
tical tool for variant calling in analysis of pooled or individual
next-generation sequencing data. Nucleic Acids Research 39:
e132.
Wild CP. 2008. Environmental exposure measurement in cancer epidemi-
ology. Mutagenesis 24:117125.
Wilm A, Aw PPK, Bertrand D, Yeo GHT, Ong SH, Wong CH, Khor CC,
Petric R, Hibberd ML, Nagarajan N. 2012. LoFreq: A sequence-
quality aware, ultra-sensitive variant caller for uncovering cell-
population heterogeneity from high-throughput sequencing
datasets. Nucleic Acids Res 40:1118911201.
Wolna AH, Fleming AM, An N, He L, White HS, Burrows CJ. 2013.
Electrical current signatures of DNA base modications in single
molecules immobilized in the α-hemolysin ion channel. Isr J Chem
53:417430.
Wood RD. 1996. DNA repair in eukaryotes. Ann Rev Biochem 65:
135167.
Wu J, McKeague M, Sturla SJ. 2018. Nucleotide-resolution genome-wide
mapping of oxidative DNA damage by click-code-Seq. J Am
Chem Soc 140:97839787.
Yuan B, Wang J, Cao H, Sun R, Wang Y. 2011. High-throughput analysis
of the mutagenic and cytotoxic properties of DNA lesions by next-
generation sequencing. Nucleic Acids Res 39:59455954.
Zhang X, Price NE, Fang X, Yang Z, Gu L-Q, Gates KS. 2015. Character-
ization of interstrand DNADNA cross-links using the
α-hemolysin protein nanopore. ACS Nano 9:1181211819.
Zong C, Lu S, Chapman AR, Xie XS. 2012. Genome-wide detection of
single-nucleotide and copy-number variations of a single human
cell. Science 338:16221626.
Accepted by C. Yauk
Environmental and Molecular Mutagenesis. DOI 10.1002/em
151Next- Generation Genotoxicology
... We propose a global research programme that uses a range of organisms as biosentinels (organisms to assay mutations induced by pollution), where the species chosen would vary in their relevance to humans, prevalence in urban areas, generation time and genomic resources (Fig. 3). A biosentinel programme could detect mutagenic effects even when specific mutagens are difficult to identify 129,130 . Bacteria, plants and human cell lines have all been proposed as urban biosentinels 131 . ...
... ecNGS has potential advantages over classic mutagenesis assays that rely upon reporter genes and genetically modified animals, and been proposed as a promising alternative approach in replacing the traditional mutagenicity assays recommended by the OECD (Dodge et al. 2023). Currently, several ecNGS technologies are available, e.g., the Safe Sequencing System (SafeSeqS) that uses single-stranded molecular barcodes in the tails of PCR primers, Duplex Sequencing (DS) that ligates molecular barcodes to both strands of double-stranded molecules, and PacBio High-Fidelity (HiFi) Sequencing that uses long-read sequencing platforms to repeatedly determine the sequence of both DNA strands of circularized molecules (Revollo et al. 2021;Salk and Kennedy 2020). DS targets an optimized set of twenty 2.4-kb representative genomic regions (a total target size of 48 kb) spread across the autosomes of the human genome and sequences at relatively high depth; whereas HiFi Sequencing examines the majority of the human genome for mutation and sequences at relatively low depth (Cho et al. 2023;Miranda et al. 2022). ...
Article
Full-text available
Human liver-derived metabolically competent HepaRG cells have been successfully employed in both two-dimensional (2D) and 3D spheroid formats for performing the comet assay and micronucleus (MN) assay. In the present study, we have investigated expanding the genotoxicity endpoints evaluated in HepaRG cells by detecting mutagenesis using two error-corrected next generation sequencing (ecNGS) technologies, Duplex Sequencing (DS) and High-Fidelity (HiFi) Sequencing. Both HepaRG 2D cells and 3D spheroids were exposed for 72 h to N-nitrosodimethylamine (NDMA), followed by an additional incubation for the fixation of induced mutations. NDMA-induced DNA damage, chromosomal damage, and mutagenesis were determined using the comet assay, MN assay, and ecNGS, respectively. The 72-h treatment with NDMA resulted in concentration-dependent increases in cytotoxicity, DNA damage, MN formation, and mutation frequency in both 2D and 3D cultures, with greater responses observed in the 3D spheroids compared to 2D cells. The mutational spectrum analysis showed that NDMA induced predominantly A:T → G:C transitions, along with a lower frequency of G:C → A:T transitions, and exhibited a different trinucleotide signature relative to the negative control. These results demonstrate that the HepaRG 2D cells and 3D spheroid models can be used for mutagenesis assessment using both DS and HiFi Sequencing, with the caveat that severe cytotoxic concentrations should be avoided when conducting DS. With further validation, the HepaRG 2D/3D system may become a powerful human-based metabolically competent platform for genotoxicity testing.
... There is currently a lot of interest among Indonesian people in increasingly sophisticated technology (Cheng dkk., 2019;O'Kane dkk., 2018;Valentine dkk., 2020). What makes educators and students today or what is called the 5.0 era (Archer-Brown dkk., 2018;Bradford, 2018;Salk & Kennedy, 2020). At this time technology has become an inseparable part of all aspects of human life. ...
Article
Full-text available
Along with the many advances in technology in the world of education, especially in website-based learning. Student learning achievement can be improved by using the Quizlet application. Quizlet is a platform used to make learning evaluations using the internet network. The purpose of this research is to determine the benefits of the Quizlet application as an interactive quiz to improve student learning achievement. This research uses quantitative methods using surveys and in-depth interviews, the survey was conducted online. The results of this research explain that the Quizlet platform can be used to create online-based interactive quizzes and can improve student achievement. The conclusion of this research explains that in using the applicationQuizlet really helps educators and students in the teaching and learning process, especially in implementing quizzes in learning. The limitation of this research is that the researcher only uses the Quizlet platform as an interactive quiz and the researcher hopes that future researchers can carry out the same research but with a more interesting Quiz application or can be used offline.
... Over the last couple of decades, novel in vitro and in vivo methods and techniques were developed in the scientific discipline genotoxicology, enabling investigators to quantify genotoxicity attributed to exposure to certain compounds [4,5]. Acute or chronic exposure to environmental contaminants is known to be associated with several adverse health conditions, including cancer, impaired immune and reproductive function, as well as imbalanced gastrointestinal microbiota, which regulates a range of host metabolic and immune processes. ...
Article
Full-text available
Humans and animals may be exposed on a continuous daily basis to a mixture of environmental contaminants that may act on several organ systems through differing mechanisms [...]
... Modern ecNGS approaches improve sequencing accuracy and mutation scoring by employing methods for error correction, including DNA molecule labeling with a unique molecular barcode to achieve error-corrected consensus sequencing. Duplex sequencing (DS) (Salk et al., 2018;Salk & Kennedy, 2020;Schmitt et al., 2012) is one of the ultra-high-accuracy ecNGS technologies that is emerging as an alternative approach to conventional assays based on detection of a mutant phenotype. In contrast to other mutation assays, ecNGS can be applied to detect and quantify mutations in any cell or tissue, from any species, and in any genomic location. ...
Article
Quantitative risk assessments of chemicals are routinely performed using in vivo data from rodents; however, there is growing recognition that non‐animal approaches can be human‐relevant alternatives. There is an urgent need to build confidence in non‐animal alternatives given the international support to reduce the use of animals in toxicity testing where possible. In order for scientists and risk assessors to prepare for this paradigm shift in toxicity assessment, standardization and consensus on in vitro testing strategies and data interpretation will need to be established. To address this issue, an Expert Working Group (EWG) of the 8 th International Workshop on Genotoxicity Testing (IWGT) evaluated the utility of quantitative in vitro genotoxicity concentration‐response data for risk assessment. The EWG first evaluated available in vitro methodologies and then examined the variability and maximal response of in vitro tests to estimate biologically relevant values for the critical effect sizes considered adverse or unacceptable. Next, the EWG reviewed the approaches and computational models employed to provide human‐relevant dose context to in vitro data. Lastly, the EWG evaluated risk assessment applications for which in vitro data are ready for use and applications where further work is required. The EWG concluded that in vitro genotoxicity concentration‐response data can be interpreted in a risk assessment context. However, prior to routine use in regulatory settings, further research will be required to address the remaining uncertainties and limitations. This article is protected by copyright. All rights reserved.
Article
Exposure levels without appreciable human health risk may be determined by dividing a point of departure on a dose–response curve (e.g., benchmark dose) by a composite adjustment factor (AF). An “effect severity” AF (ESAF) is employed in some regulatory contexts. An ESAF of 10 may be incorporated in the derivation of a health‐based guidance value (HBGV) when a “severe” toxicological endpoint, such as teratogenicity, irreversible reproductive effects, neurotoxicity, or cancer was observed in the reference study. Although mutation data have been used historically for hazard identification, this endpoint is suitable for quantitative dose–response modeling and risk assessment. As part of the 8th International Workshops on Genotoxicity Testing, a sub‐group of the Quantitative Analysis Work Group (WG) explored how the concept of effect severity could be applied to mutation. To approach this question, the WG reviewed the prevailing regulatory guidance on how an ESAF is incorporated into risk assessments, evaluated current knowledge of associations between germline or somatic mutation and severe disease risk, and mined available data on the fraction of human germline mutations expected to cause severe disease. Based on this review and given that mutations are irreversible and some cause severe human disease, in regulatory settings where an ESAF is used, a majority of the WG recommends applying an ESAF value between 2 and 10 when deriving a HBGV from mutation data. This recommendation may need to be revisited in the future if direct measurement of disease‐causing mutations by error‐corrected next generation sequencing clarifies selection of ESAF values.
Increased risk for the development of hepatocellular carcinoma (HCC) is driven by a number of etiological factors including hepatitis viral infection and dietary exposures to foods contaminated with aflatoxin-producing molds. Intracellular metabolic activation of aflatoxin B1 (AFB1) to a reactive epoxide generates highly mutagenic AFB1-Fapy-dG adducts. Previously, we demonstrated that repair of AFB1-Fapy-dG adducts can be initiated by the DNA glycosylase NEIL1 and that male Neil1−/− mice were significantly more susceptible to AFB1-induced HCC relative to wild-type mice. To investigate the mechanisms underlying this enhanced carcinogenesis, WT and Neil1−/− mice were challenged with a single, 4 mg/kg dose of AFB1 and frequencies and spectra of mutations were analyzed in liver DNAs 2.5 months post-injection using duplex sequencing. The analyses of DNAs from AFB1-challenged mice revealed highly elevated mutation frequencies in the nuclear genomes of both males and females, but not the mitochondrial genomes. In both WT and Neil1−/− mice, mutation spectra were highly similar to the AFB1-specific COSMIC signature SBS24. Relative to wild-type, the NEIL1 deficiency increased AFB1-induced mutagenesis with concomitant elevated HCCs in male Neil1−/− mice. Our data establish a critical role of NEIL1 in limiting AFB1-induced mutagenesis and ultimately carcinogenesis.
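The abstract does not say how similarity to SBS24 was scored. Cosine similarity between normalized mutation spectra is one common choice, sketched here under that assumption. Both vectors are made-up six-class proportions for illustration only; they are not the published SBS24 profile or the study's data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of classes")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectra must contain at least one non-zero entry")
    return dot / (norm_a * norm_b)


# Substitution classes: C>A, C>G, C>T, T>A, T>C, T>G
# Hypothetical proportions; both are dominated by C>A purely for illustration.
observed_spectrum = [0.62, 0.06, 0.14, 0.08, 0.06, 0.04]
reference_signature = [0.66, 0.05, 0.12, 0.07, 0.06, 0.04]

score = cosine_similarity(observed_spectrum, reference_signature)
print(f"cosine similarity = {score:.3f}")
```

A score near 1 means the two spectra have nearly the same shape regardless of the total number of mutations, which makes the measure suitable for comparing samples with different mutation burdens.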
This article describes a range of high‐dimensional data visualization strategies that we have explored for their ability to complement machine learning algorithm predictions derived from MultiFlow® assay results. For this exercise, we focused on seven biomarker responses resulting from the exposure of TK6 cells to each of 126 diverse chemicals over a range of concentrations. Obviously, challenges associated with visualizing seven biomarker responses were further complicated whenever there was a desire to represent the entire 126 chemical data set as opposed to results from a single chemical. Scatter plots, spider plots, parallel coordinate plots, hierarchical clustering, principal component analysis, toxicological prioritization index, multidimensional scaling, t‐distributed stochastic neighbor embedding, and uniform manifold approximation and projection are each considered in turn. Our report provides a comparative analysis of these techniques. In an era where multiplexed assays and machine learning algorithms are becoming the norm, stakeholders should find some of these visualization strategies useful for efficiently and effectively interpreting their high‐dimensional data.
Somatic mutations in cancer genomes are caused by multiple mutational processes, each of which generates a characteristic mutational signature [1]. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [2] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we characterized mutational signatures using 84,729,690 somatic mutations from 4,645 whole-genome and 19,184 exome sequences that encompass most types of cancer. We identified 49 single-base-substitution, 11 doublet-base-substitution, 4 clustered-base-substitution and 17 small insertion-and-deletion signatures. The substantial size of our dataset, compared with previous analyses [3–15], enabled the discovery of new signatures, the separation of overlapping signatures and the decomposition of signatures into components that may represent associated—but distinct—DNA damage, repair and/or replication mechanisms. By estimating the contribution of each signature to the mutational catalogues of individual cancer genomes, we revealed associations of signatures to exogenous or endogenous exposures, as well as to defective DNA-maintenance processes. However, many signatures are of unknown cause. This analysis provides a systematic perspective on the repertoire of mutational processes that contribute to the development of human cancer. The characterization of 4,645 whole-genome and 19,184 exome sequences, covering most types of cancer, identifies 81 single-base substitution, doublet-base substitution and small-insertion-and-deletion mutational signatures, providing a systematic overview of the mutational processes that contribute to cancer development.
In utero development represents a sensitive window for the induction of mutations. These mutations may subsequently expand clonally to populate entire organs or anatomical structures. Although not all adverse mutations will affect tissue structure or function, there is growing evidence that clonally expanded genetic mosaics contribute to various monogenic and complex diseases, including cancer. We posit that genetic mosaicism is an underestimated potential health problem that is not fully addressed in the current regulatory genotoxicity testing paradigm. Genotoxicity testing focuses exclusively on adult exposures and may thus not capture the complexity of genetic mosaicisms that contribute to human disease. Numerous studies have shown that conversion of genetic damage into mutations during early developmental exposures can result in much higher mutation burdens than equivalent exposures in adults in certain tissues. Therefore, we assert that analysis of genetic effects caused by in utero exposures should be considered in the current regulatory testing paradigm, which is possible by harmonization with current reproductive/developmental toxicology testing strategies. This is particularly important given the recent proposed paradigm change from simple hazard identification to quantitative mutagenicity assessment. Recent developments in sequencing technologies offer practical tools to detect mutations in any tissue or species. In addition to mutation frequency and spectrum, these technologies offer the opportunity to characterize the extent of genetic mosaicism following exposure to mutagens. Such integration of new methods with existing toxicology guideline studies offers the genetic toxicology community a way to modernize their testing paradigm and to improve risk assessment for vulnerable populations.
Mutations induced in somatic cells and germ cells are responsible for a variety of human diseases, and mutation per se has been considered an adverse health concern since the early part of the 20th Century. Although in vitro and in vivo somatic cell mutation data are most commonly used by regulatory agencies for hazard identification, that is, determining whether or not a substance is a potential mutagen and carcinogen, quantitative mutagenicity dose–response data are being used increasingly for risk assessments. Efforts are currently underway to both improve the measurement of mutations and to refine the computational methods used for evaluating mutation data. We recommend continuing the development of these approaches with the objective of establishing consensus regarding the value of including the quantitative analysis of mutation per se as a required endpoint for comprehensive assessments of toxicological risk.
Fifty years ago, the Environmental Mutagen Society (now Environmental Mutagenesis and Genomics Society) was founded with a laser‐focus on germ cell mutagenesis and the protection of “our most vital assets” – the sperm and egg genomes. Yet, five decades on, despite the fact that many agents have been demonstrated to induce inherited changes in the offspring of exposed laboratory rodents, there is no consensus on whether human germ cell mutagens exist. We argue that it is time to reevaluate the available data and conclude that we already have evidence for the existence of environmental exposures that impact human germ cells. What is missing are definite data to demonstrate a significant increase in de novo mutations in the offspring of exposed parents. We believe that with over two decades of research advancing knowledge and technologies in genomics, we are at the cusp of generating data to conclusively show that environmental exposures cause heritable de novo changes in the human offspring. We call on the research community to harness our technologies, synergize our efforts, and return to our Founders' original focus. The next 50 years must involve collaborative work between clinicians, epidemiologists, genetic toxicologists, genomics experts and bioinformaticians to precisely define how environmental exposures impact germ cell genomes. It is time for the research and regulatory communities to prepare to interpret the coming outpouring of data and develop a framework for managing, communicating and mitigating the risk of exposure to human germ cell mutagens.
Cancer driver mutations (CDMs) are necessary and causal for carcinogenesis and have advantages as reporters of carcinogenic risk. However, little progress has been made toward developing measurements of CDMs as biomarkers for use in cancer risk assessment. Impediments for using a CDM‐based metric to inform cancer risk include the complexity and stochastic nature of carcinogenesis, technical difficulty in quantifying low‐frequency CDMs and lack of established relationships between cancer driver mutant fractions and tumor incidence. Through literature review and database analyses, this review identifies the most promising targets to investigate as biomarkers of cancer risk. Mutational hotspots were discerned within the 20 most mutated genes across the ten deadliest cancers. Forty genes were identified that encompass 108 mutational hotspot codons overrepresented in the COSMIC database; 424 different mutations within these hotspot codons account for approximately 63,000 tumors and their prevalence across tumor types is described. The review summarizes literature on the prevalence of CDMs in normal tissues and suggests such mutations are direct and indirect substrates for chemical carcinogenesis, which occurs in a spatially‐stochastic manner. Evidence that hotspot CDMs (hCDMs) frequently occur as tumor subpopulations is presented, indicating COSMIC data may underestimate mutation prevalence. Analyses of online databases show that genes containing hCDMs are enriched in functions related to intercellular communication. In its totality, the review provides a roadmap for the development of tissue‐specific, CDM‐based biomarkers of carcinogenic potential, which are comprised of batteries of hCDMs and can be measured by error‐corrected next‐generation sequencing.
Background: Structural variations (SVs) or copy number variations (CNVs) greatly impact the functions of the genes encoded in the genome and are responsible for diverse human diseases. Although a number of existing SV detection algorithms can detect many types of SVs using whole genome sequencing (WGS) data, no single algorithm can call every type of SVs with high precision and high recall. Results: We comprehensively evaluate the performance of 69 existing SV detection algorithms using multiple simulated and real WGS datasets. The results highlight a subset of algorithms that accurately call SVs depending on specific types and size ranges of the SVs and that accurately determine breakpoints, sizes, and genotypes of the SVs. We enumerate potential good algorithms for each SV category, among which GRIDSS, Lumpy, SVseq2, SoftSV, Manta, and Wham are better algorithms in deletion or duplication categories. To improve the accuracy of SV calling, we systematically evaluate the accuracy of overlapping calls between possible combinations of algorithms for every type and size range of SVs. The results demonstrate that both the precision and recall for overlapping calls vary depending on the combinations of specific algorithms rather than the combinations of methods used in the algorithms. Conclusion: These results suggest that careful selection of the algorithms for each type and size range of SVs is required for accurate calling of SVs. The selection of specific pairs of algorithms for overlapping calls promises to effectively improve the SV detection accuracy.
Whole-genome-sequencing (WGS) of human tumors has revealed distinct mutation patterns that hint at the causative origins of cancer. We examined mutational signatures in 324 WGS human-induced pluripotent stem cells exposed to 79 known or suspected environmental carcinogens. Forty-one yielded characteristic substitution mutational signatures. Some were similar to signatures found in human tumors. Additionally, six agents produced double-substitution signatures and eight produced indel signatures. Investigating mutation asymmetries across genome topography revealed fully functional mismatch and transcription-coupled repair pathways. DNA damage induced by environmental mutagens can be resolved by disparate repair and/or replicative pathways, resulting in an assortment of signature outcomes even for a single agent. This compendium of experimentally induced mutational signatures permits further exploration of roles of environmental agents in cancer etiology and underscores how human stem cell DNA is directly vulnerable to environmental agents. Video Abstract: The effects of a range of environmental mutagens in terms of the kinds of mutations they induce and how these are repaired by the cell is presented in the form of a resource.
A mutagenesis moonshot addressing the influence of the environment on our genetic wellbeing was launched just two months before astronauts landed on the moon. Its impetus included the discovery that X‐rays (Muller, 1927) and chemicals (Auerbach and Robson, 1947) were germ‐cell mutagens, the introduction of a growing number of untested chemicals into the environment after World War II, and an increasing awareness of the role of environmental pollution on human health. Due to mounting concern from influential scientists that germ‐cell mutagens might be ubiquitous in the environment, Alexander Hollaender and colleagues founded in 1969 the Environmental Mutagen Society (EMS), now the Environmental Mutagenesis and Genomics Society (EMGS); Frits Sobels founded the European EMS in 1970. As Fred de Serres noted, such societies were necessary because protecting populations from environmental mutagens could not be addressed by existing scientific societies, and new multi‐disciplinary alliances were required to spearhead this movement. The nascent EMS gathered policy makers and scientists from government, industry, and academia who became advocates for laws requiring genetic toxicity testing of pesticides and drugs and helped implement those laws. They created an electronic database of the mutagenesis literature; established a peer‐reviewed journal; promoted basic and applied research in DNA repair and mutagenesis; and established training programs that expanded the science worldwide. Despite these successes, one objective remains unfulfilled: identification of human germ‐cell mutagens. After 50 years, the voyage continues, and a vibrant EMGS is needed to bring the mission to its intended target of protecting populations from genetic hazards.