
Next‐Generation Genotoxicology: Using Modern Sequencing Technologies to Assess Somatic Mutagenesis and Cancer Risk

October 2019
Environmental and Molecular Mutagenesis 61(1)


License
CC BY 4.0


Mutations have a profound effect on human health, particularly through an increased risk of carcinogenesis and genetic disease. The strong correlation between mutagenesis and carcinogenesis has been a driving force behind genotoxicity research for more than 50 years. The stochastic and infrequent nature of mutagenesis makes it challenging to observe and to study. Indeed, decades have been spent developing increasingly sophisticated assays and methods to study these low-frequency genetic errors, in hopes of better predicting which chemicals may be carcinogens, understanding their mode of action, and informing guidelines to prevent undue human exposure. While effective, widely used genetic selection-based technologies have a number of limitations that have hampered major advancements in the field of genotoxicity. Emerging new tools, in the form of enhanced next-generation sequencing platforms and methods, are changing this paradigm. In this review, we discuss rapidly evolving sequencing tools and technologies, such as error-corrected sequencing and single-cell analysis, that we anticipate will fundamentally reshape the field. In addition, we consider a variety of emerging applications for these new technologies, including the detection of DNA adducts, inference of mutational processes based on genomic site and local sequence contexts, and evaluation of genome engineering fidelity, as well as other cutting-edge challenges for the next 50 years of environmental and molecular mutagenesis research.


Approaches for assessing mutational signatures. Mutational spectra, particularly polynucleotide mutational signatures, provide important mechanistic insights into mutational processes. Most of what we know about these patterns has come from natural or artificial means of single cell cloning. (A) Exome or whole-genome sequencing of tumor populations reflects the somatic processes operative in the founding cell of the most recent clonal sweep. (B) Single cells can be cloned from cultured populations exposed to known or suspected mutagens to assess their mutational signatures. (C) The clonal variants present in individuals that were not present in their parents reflect the state of mutational processes during gametogenesis or early embryogenesis. (D) Sequencing of cloned cells or molecules from certain selection-based mutagenicity assays can be used similarly, although the patterns may be distorted by the selection system itself. (E) With ecNGS, it is now possible to obtain mutational spectra by directly sequencing DNA from any tissue of any organism.


Canary-in-a-coal-mine: a century later. A hundred years ago, at the suggestion of John Scott Haldane, caged canaries were routinely brought into British coal mines as an early warning sign of human-relevant toxic gases. Although their routine use ceased in the 1980s, the broader concept of using sentinel species to infer the presence of environmental hazards remains highly germane in modern genetic toxicology. Had it been possible to collect and analyze a DNA sample from one of Haldane's birds using modern ecNGS techniques, it is quite likely that the mutagenic signature of benzo[a]pyrene could have been identified and used to inform efforts to mitigate the environmental cancer risk. Other naturally present sentinel organisms, including humans themselves, can be similarly used.


Review

Next-Generation Genotoxicology: Using Modern Sequencing Technologies to Assess Somatic Mutagenesis and Cancer Risk

Jesse J. Salk (1, 2) and Scott R. Kennedy (3)

1. Department of Medicine, Division of Medical Oncology, University of Washington School of Medicine, Seattle, Washington

2. TwinStrand Biosciences, Seattle, Washington

3. Department of Pathology, University of Washington, Seattle, Washington


Environmental and Molecular Mutagenesis published by Wiley Periodicals, Inc. on behalf of Environmental Mutagen Society.

Key words: chemical carcinogenesis; cancer risk assessment; in vivo mutation; error-corrected NGS; consensus sequencing; single-cell sequencing; single molecule sequencing

INTRODUCTION

Exposure to environmental factors has been known to alter the genetic makeup of organisms since the seminal work by Hermann Muller in 1927 showing that exposing Drosophila to X-rays produced new heritable traits (Muller 1927). Other mutagenic environmental factors, including ultraviolet light and reactive chemicals, were reported soon after (Stadler and Sprague 1936; Auerbach et al. 1947). It was not until the publication of the structure of DNA in 1953, and the subsequent description of DNA polymerases, that a mechanism linking environmental exposures to mutagenesis and heritable changes became fully apparent (Watson and Crick 1953; Bessman et al. 1958; Lehman et al. 1958). The ensuing years led to a rapid expansion of studies to catalog and better understand environmental mutagens. By the mid-1970s, experiments in rodent models indicated that the majority of known mutagens were, in fact, carcinogenic (McCann et al. 1975). Because of this strong link, as well as the desire to save both time and money, evaluating the mutagenic potential of a compound has become a de facto surrogate for carcinogenicity (Fig. 1). A detailed treatment of the regulatory aspects of this important subject area is provided elsewhere in this issue (Heflich et al. 2020).

Grant sponsor: National Institute of Environmental Health Sciences; Grant number: R44ES030642.

Grant sponsor: National Institute of Justice; Grant number: 2017-DN-BX-0160.

Grant sponsor: Safeway/Albertsons Early Career Award For Cancer Research; Grant number: n/a.

Grant sponsor: U.S. Department of Defense; Grant number: W81XWH-18-1-0339.

*Correspondence to: Scott R. Kennedy, Department of Pathology, University of Washington, Seattle, WA. E-mail: scottrk@uw.edu

Received 9 August 2019; Revised 20 September 2019; Accepted 25 September 2019

DOI: 10.1002/em.22342

Published online 8 October 2019 in Wiley Online Library (wileyonlinelibrary.com).

Environmental and Molecular Mutagenesis 61:135-151 (2020)

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

A number of key technologies have been developed over the past 50 years to quantify genotoxicity in both in vitro and in vivo settings. The spontaneous mutation rate in normal somatic mammalian cells is estimated to be in the range of $10^{-8}$ to $10^{-9}$ mutations per nucleotide per cell division (Lynch 2010). Directly detecting these rare events at the DNA sequence level is technically challenging (Milholland et al. 2017), the molecular equivalent of “Where’s Waldo?” (Handford 2007). Not only does one need to screen a very large number of nucleotides and cells to obtain reasonable statistical confidence in mutant frequencies, but the method for detecting mutations must also have an error rate below the true mutant frequency.
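The scale of this screening problem can be made concrete with a back-of-the-envelope Poisson calculation. The sketch below is illustrative only: the function names are hypothetical, and the example mutant frequency of one per 10 million nucleotides and the example assay error rate are assumptions chosen for the arithmetic, not values prescribed by any assay.

```python
import math


def bases_for_expected_count(expected_mutations: float, mutant_freq: float) -> float:
    """Nucleotides that must be screened to expect a given number of mutations."""
    return expected_mutations / mutant_freq


def prob_at_least_one(n_bases: float, mutant_freq: float) -> float:
    """Poisson probability of observing one or more true mutations in n_bases."""
    return 1.0 - math.exp(-n_bases * mutant_freq)


def expected_false_calls(n_bases: float, assay_error_rate: float) -> float:
    """Artifactual variant calls expected from the detection method alone."""
    return n_bases * assay_error_rate


if __name__ == "__main__":
    freq = 1e-7  # assumed mutant frequency (hypothetical example value)
    n = bases_for_expected_count(100, freq)
    print(f"bases to screen for ~100 mutations: {n:.2e}")
    print(f"P(>=1 mutation in 1e7 bases): {prob_at_least_one(1e7, freq):.3f}")
    print(f"false calls at an assumed 1e-3 error rate: {expected_false_calls(n, 1e-3):.2e}")
```

Under these assumptions, the artifactual calls outnumber the true mutations by several orders of magnitude, which is the quantitative form of the requirement that the assay error rate sit below the true mutant frequency.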

To circumvent these challenges, most standard mutagenesis assays rely on some means of biological enrichment, whereby mutations are detected by a selectable phenotype they create. While the specifics differ, the general approach relies on exposing bacterial or mammalian cells to a putative mutagen and then quantifying the ratio of cells harboring a mutation in a selectable marker to the number of cells present in the absence of selection. In vitro selection-based mutagenesis assays include the classic Ames assay and several mammalian cell culture-based mutation tests, such as HPRT and APRT (Ames et al. 1973; Thompson et al. 1980). While highly effective, in vitro assays have several limitations that make them imperfect surrogates for human toxicology, including differences in metabolic activation/inactivation of the tested compound, the use of only a small number of cell types, and continuous cellular proliferation that can result in potential “jackpot” events. In vivo assays include transgenic rodent models, such as the MutaMouse and the BigBlue mouse/rat assays, which involve multistep transfer of DNA from mutagen-exposed rodents into phage and then into bacteria (Kohler et al. 1991; Myhr 1991). By taking advantage of the in vivo context, transgenic animals solve some of the issues inherent to the in vitro assays. As a testament to their utility, these selection-based assays are still widely used decades after their initial development. A history detailing the importance of these technologies is provided in this issue by DeMarini (DeMarini 2019).
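The ratio described above, mutant cells under selection relative to viable cells without selection, reduces to simple arithmetic. The following is a minimal sketch with hypothetical function names and made-up plate counts; it assumes the common convention of correcting the number of cells plated under selection by the cloning efficiency measured on non-selective plates, which individual assay protocols may define differently.

```python
def mutant_frequency(mutant_colonies: int,
                     cells_plated_selective: int,
                     colonies_nonselective: int,
                     cells_plated_nonselective: int) -> float:
    """Mutants per viable cell screened, corrected for cloning efficiency."""
    # Fraction of plated cells able to form a colony without selection.
    cloning_efficiency = colonies_nonselective / cells_plated_nonselective
    # Cells under selection that could have formed a colony had they been mutant.
    viable_cells_screened = cells_plated_selective * cloning_efficiency
    return mutant_colonies / viable_cells_screened


def fold_induction(treated_frequency: float, control_frequency: float) -> float:
    """Mutant frequency in exposed cells relative to the unexposed control."""
    return treated_frequency / control_frequency


if __name__ == "__main__":
    # Made-up example counts, for illustration only.
    treated = mutant_frequency(12, 2_000_000, 160, 200)
    control = mutant_frequency(3, 2_000_000, 200, 200)
    print(f"treated: {treated:.2e}, control: {control:.2e}, "
          f"fold induction: {fold_induction(treated, control):.1f}")
```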

While these methods are ubiquitous in both research and regulatory settings, reliance on selection to quantify mutagenesis comes at a cost. The nuclear genome is a dynamic system with spatially heterogeneous levels of biomolecular activity, such as transcription, chromatin accessibility, adjacent nucleotide context, and DNA repair, which strongly modulate susceptibility to mutagenesis across the genome (Hodgkinson and Eyre-Walker 2011). Most selection-based assays rely on a single reporter locus that is often artificially introduced. Furthermore, the number of possible mutations that render a selectable phenotype may be limited in some cases, leading to an underestimation bias arising from the inability to observe variants that result in no phenotypic changes (eg, synonymous mutations). Lastly, selectable markers are not always portable between different experimental systems and are currently limited to a few common organismal models.

[Figure 1 graphic. Time axis: 0 to 20 years. Phase labels: Genotoxicity, Preneoplastic biology, Oncology. Stage labels: DNA Damage, Mutation, Self-Sufficient Growth, Resistance to apoptosis, Invasion, Tumor Mass, Metastasis.]

Fig. 1. The genesis of cancer. Cancer exists on a continuum. Mutations arise as a result of repair and replication errors due to endogenous processes and environmental factors. These mutations are the substrate for neoplastic clonal evolution: those that confer a proliferative or survival advantage upon the host cell will be naturally selected. Carcinogens promote tumorigenesis by increasing the rate of mutation or by enhancing net-positive selection. Given the often impractically long lag-time between a carcinogenic insult and overt tumor formation, technologies that are able to sensitively detect DNA damage, mutation induction, and clonal outgrowths are essential tools in a genetic toxicologist’s armamentarium.

Technologies that directly identify mutations in DNA of primary tissue samples without necessitating a multistep selection and cloning process would open up opportunities to identify mutagenic compounds in a more unbiased manner. One such method is the Pig-a assay (Bryce et al. 2008). This assay uses flow cytometry to rapidly screen millions of cells for those that lack expression of a particular nonessential surface protein due to inactivating mutations. Helpfully, this approach can be applied to both humans and model organisms, but generally only to red blood cells, limiting its applicability to the other tissues in the body and making it difficult to confirm the exact nature of the mutations themselves (mature red blood cells are enucleate).

Several sensitive biochemical assays for mutation detection have been developed, often based on resistance to endonuclease cleavage or allele-specific PCR. While extremely sensitive, these methods are either too low-throughput or excessively narrow in scope (ie, interrogate only one or a few bases) to gain wide usage (Parsons and Heflich 1997; Bielas and Loeb 2005). Thus, until the advent of modern next-generation sequencing (NGS), also referred to as massively parallel sequencing, selection-based assays were the dominant technology for evaluating mutagenesis.

Beginning in approximately 2005, NGS has revolutionized many fields of life science, including cancer biology, population genetics, evolutionary biology, and cellular biology. There are several commercially available NGS platforms that differ in their underlying approaches to obtaining sequence information, but all share the ability to simultaneously obtain this information from tens of thousands to billions of individual DNA templates. Consequently, it is now possible to obtain data on a genome-wide scale. In addition, NGS technologies are read-based. This “digital tabulation” approach differs from conventional Sanger sequencing methods by obtaining the nucleotide sequence of many individual DNA molecules, thus enhancing the ability to detect minor mutant populations within a heterogeneous DNA mixture, which is generally the context in which somatic mutagenesis occurs (Metzker 2010; Fig. 2).
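The difference between a digital tabulation and an averaged readout can be illustrated with a toy example. The sketch below is not a variant caller: it assumes error-free reads of equal length that all start at the same reference coordinate, and the function names are hypothetical. A majority-base call stands in, very crudely, for an analog signal in which a minor population is invisible.

```python
from collections import Counter


def allele_fractions(reads: list[str], position: int) -> dict[str, float]:
    """Tabulate each read individually and report per-base fractions at one site."""
    counts = Counter(read[position] for read in reads if position < len(read))
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


def majority_call(reads: list[str], position: int) -> str:
    """Crude stand-in for an averaged readout: only the dominant base is reported."""
    return Counter(read[position] for read in reads).most_common(1)[0][0]


if __name__ == "__main__":
    # 990 wild-type molecules and 10 molecules carrying a G>T change at index 2.
    reads = ["ACGT"] * 990 + ["ACTT"] * 10
    print(majority_call(reads, 2))
    print(allele_fractions(reads, 2))
```

Counting molecules one at a time exposes the 1% subpopulation that the majority call hides, although in real data that signal would sit at the level of the raw sequencing error rate.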

The distinct advantages offered by NGS will revolutionize environmental mutagenesis and toxicology by overcoming past limitations and providing new opportunities for study. Despite its transformative potential, NGS has only recently gained attention in this field, as several key technical hurdles have now been overcome. In this review, we discuss the advances in modern DNA sequencing technologies that are enhancing the ability to detect low-frequency mutagenic events and DNA damage. We review cutting-edge applications that are currently being facilitated by these new technologies and others we see on the horizon.

NEXT-GENERATION SEQUENCING TECHNOLOGIES

In genetic toxicology, most applications of NGS to date have focused on augmenting and enhancing the throughput of well-established genotoxicity assays, for example, increasing the throughput of sequencing of mutant shuttle vectors or plaques from transgenic models (Yuan et al. 2011; Besaratinia et al. 2012; Beal et al. 2015; Chang et al. 2015). Other applications have included non-mutational assessments of genetic toxicology, such as epigenetic and transcriptional changes induced by chemical exposure (Chauhan et al. 2016; Li et al. 2017a), as well as the whole-genome detection of environmentally induced de novo mutations in offspring of exposed individuals (reviewed in Marchetti et al. 2019; Godschalk et al. 2019).

However, none of these applications fully realizes the aspirational goal of being able to directly measure genotoxin-induced DNA mutations in any tissue type of any organism. This is because modern sequencing platforms are not without their limitations. Given the random nature of genotoxic insults, genetic toxicology assessment in the absence of biological selection generally necessitates being able to detect low-frequency somatic mutations in a large population of non-mutant DNA molecules. In theory, DNA subpopulations of any size should be detectable by NGS when assessing a sufficient number of molecules. However, while notably better than Sanger sequencing at resolving minor subpopulations, standard NGS platforms still generate errors at a substantial rate. Mistakes arising during DNA preparation, amplification, cluster generation, and the many steps of sequencing itself typically result in ~1% artifactual bases, and this background can be significantly higher in certain sequence contexts (reviewed in Salk et al. 2018). In contrast, the biological mutation frequency of even heavily mutagenized animals is on the order of one mutation per million nucleotides. Therefore, to detect chemically induced somatic mutations, far more sensitive NGS technologies are needed.
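The mismatch between a ~1% background and a one-per-million signal can be expressed as the fraction of variant calls that are real. The sketch below uses a deliberately simple model (hypothetical function name; true mutations and artifacts are treated as independent, non-overlapping per-base events), so the numbers are indicative only.

```python
def true_call_fraction(mutant_freq: float, error_rate: float) -> float:
    """Fraction of all variant base calls expected to be genuine mutations."""
    return mutant_freq / (mutant_freq + error_rate)


if __name__ == "__main__":
    mutant_freq = 1e-6  # heavily mutagenized tissue, per the text above
    # Error rates: raw NGS (from the text), plus two lower values for comparison.
    for error_rate in (1e-2, 1e-5, 1e-9):
        fraction = true_call_fraction(mutant_freq, error_rate)
        print(f"error rate {error_rate:.0e}: {fraction:.4%} of calls are true")
```

With these inputs, only about one call in ten thousand is genuine at a 1% error rate, and the calls become predominantly true only once the background falls well below the mutation frequency itself.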

Error-Corrected Next-Generation Sequencing

Several approaches have been employed to improve the accuracy of NGS. Initial efforts to reduce the technical error rate of NGS focused on bioinformatic filtering of low-confidence sequences. For example, a number of variant calling tools filter the data based on the distribution of variants within the sequencing reads or require variants to be seen in multiple independent sequencing reads in both read orientations (Wang et al. 2013). More recently, statistical approaches have been specifically developed to improve variant calling by modeling the error profile of specific sequencing platforms (Wei et al. 2011; Wilm et al. 2012). These bioinformatic approaches allow for the detection of variants down to a mutant fraction of ~0.5%. This level of sensitivity is effective for clonally expanded mutations (such as those arising in the germ line or found in tumors) but is still orders of magnitude above the spontaneous mutant frequency of DNA (Martincorena et al. 2015, 2017).

Fig. 2. Analog vs. digital DNA sequencing. A common need in genetic toxicology is to identify mutations in cell populations. The appropriateness of the sequencing technology depends on mutational clonality. (A) Clonal mutations are those present in all or most cells in a tissue (gray), whereas subclonal mutations (colors) are present in only a subset. (B) When DNA is extracted from a tissue, a mutation’s clonality is reflected in the isolated molecules that are then (C) prepared for sequencing. (D) With traditional Sanger sequencing, all molecules from the same genomic region are genotyped together en masse in a capillary system, which produces an analog output (electropherogram tracing) that is the average of many different DNA molecules. (E) Generally only substantially clonal mutations can be reliably detected. (F) In contrast, next-generation sequencing operates by massively parallel sequencing of millions of individual molecules digitally. On the widely used Illumina sequencing-by-synthesis platform, this is accomplished by flowing fluorescently labeled nucleotides across a surface coated with small biochemically generated colonies of individual molecules (clusters), and recording the sequence of colors of each cluster through multiple cycles of addition. (G) The resulting output is not a single sequence, but millions of individual ones that reflect both clonal and subclonal mutations down to approximately 1% abundance.

In addition to bioinformatic filtering, enzymatic removal of DNA damage has been shown to reduce the number of false variant calls in NGS. For example, 8-oxo-dG and deaminated cytosine, the products of two of the most common DNA-damaging events, can be biochemically removed with the damage-specific glycosylases FPG and UDG, respectively. Combinations of glycosylases with other repair enzymes can further reduce damage-induced artifacts (Chen et al. 2017b), yet not all mutagenic lesions are recognized by these enzymes, nor is the fidelity of in vitro repair perfect, and the possibility exists that these approaches introduce new errors at low levels.

The approach to error-corrected next-generation sequencing (ecNGS) that has, thus far, proven the most significant for improving accuracy is consensus-based error correction (Fig. 3). The technique relies on the general concept of grouping reads that are copies derived from an original DNA molecule and then bioinformatically creating a consensus sequence from the related reads. An important aspect of this approach is the need to identify related reads, which can be accomplished by the use of a uniquely identifying “molecular barcode” (also referred to as a “unique molecular identifier” (UMI), “single molecule identifier”, or simply a “tag”) for each original DNA fragment that will be propagated to all daughter molecules during amplification and sequencing. Molecular barcodes can consist of unique fragmentation shear points, exogenously introduced degenerate DNA sequences, or a combination of the two. Importantly, they must provide enough sequence diversity to minimize the probability that two independent molecules will share the same molecular barcode by chance.
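The diversity requirement in the last sentence is a birthday-problem calculation. The sketch below is a rough illustration under stated assumptions: barcodes are fully degenerate sequences with four equally likely bases per position, molecules draw barcodes independently, and the function names are hypothetical. In practice only molecules covering the same genomic position compete for a barcode, and shear points add further diversity, so this overstates the collision risk for a whole library.

```python
import math


def barcode_space(tag_length: int) -> int:
    """Number of distinct degenerate barcodes of a given length (4 bases per position)."""
    return 4 ** tag_length


def collision_probability(n_molecules: int, tag_length: int) -> float:
    """Approximate probability that any two of n molecules share a barcode."""
    space = barcode_space(tag_length)
    # Standard birthday-problem approximation: 1 - exp(-n(n-1) / 2D).
    return 1.0 - math.exp(-n_molecules * (n_molecules - 1) / (2.0 * space))


if __name__ == "__main__":
    for length in (8, 12, 16):
        p = collision_probability(1000, length)
        print(f"{length}-nt tag, 1000 molecules at one locus: P(collision) = {p:.4f}")
```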

Several groups introduced the idea of using molecular barcodes to correct sequencing-based errors, but these initial studies focused on non-variant detection applications, such as read assembly and molecular counting (Hiatt et al. 2010; Casbon et al. 2011; Fu et al. 2011). With the publication of the SafeSeqS method, Kinde et al. definitively introduced the idea of using molecular barcoding for improving the accuracy of mutation detection by applying single-stranded molecular barcodes in the tails of PCR primers, reducing the error rate to approximately $10^{-5}$ (Kinde et al. 2011; Fig. 3A). A number of variations on this concept have been published, including single-molecule molecular inversion probes (Hiatt et al. 2013), circular sequencing (Lou et al. 2013), and CypherSeq (Gregory et al. 2016), among others. Consensus-making techniques that label just one strand of original double-stranded molecules or cannot distinguish the identity of the two strands markedly reduce sequencer-based artifacts, such as base calling errors and amplification errors introduced during cluster generation, thereby reducing the methodological background by two to three orders of magnitude and making it possible to confidently identify rare variants at ~0.1% abundance (Salk et al. 2018).
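The grouping-and-consensus idea behind single-strand barcoding methods can be sketched in a few lines. This is a simplified illustration, not the SafeSeqS pipeline or any published tool: it assumes reads arrive as (barcode, sequence) pairs that are already aligned and of equal length within a family, and the function name and both thresholds are hypothetical choices.

```python
from collections import Counter, defaultdict


def single_strand_consensus(tagged_reads, min_family_size=3, min_agreement=0.6):
    """Collapse reads sharing a barcode into one consensus sequence per family."""
    families = defaultdict(list)
    for barcode, sequence in tagged_reads:
        families[barcode].append(sequence)

    consensus = {}
    for barcode, sequences in families.items():
        if len(sequences) < min_family_size:
            continue  # too few copies to out-vote a sequencing error
        bases = []
        for column in zip(*sequences):  # one column per read position
            base, count = Counter(column).most_common(1)[0]
            # Keep the majority base only if enough family members agree.
            bases.append(base if count / len(column) >= min_agreement else "N")
        consensus[barcode] = "".join(bases)
    return consensus


if __name__ == "__main__":
    reads = [
        ("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACTT"),  # one read has an error
        ("GGC", "ATGT"), ("GGC", "ATGT"), ("GGC", "ATGT"),  # variant in every copy
        ("TTA", "ACGT"),                                    # singleton, discarded
    ]
    print(single_strand_consensus(reads))
```

A change present in only one copy is out-voted, whereas a change present in every copy of a family survives, which is also why an error made in the first amplification cycle can escape correction.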

However, methods relying on single-stranded tagging are fundamentally limited by the base selectivity of DNA polymerases, which, at best, have error rates of approximately $10^{-6}$ (McInerney et al. 2014). Of particular relevance is the elevated rate of misincorporations at sites of mutagenic DNA damage. For example, the presence of 8-oxo-dG adducts or deaminated cytosine bases (dU) dramatically increases the misincorporation rate of polymerases upon traversal of the lesion (Shibutani et al. 1991; Lindahl 1993). These misincorporation events can be propagated to daughter molecules during PCR, making it difficult to distinguish between artifacts induced by chemical adducts and bona fide variants occurring at dC and dG bases. Moreover, different DNA adducts are repaired with vastly different efficiencies by the cell (Wood 1996). Thus, with these methods, experiments involving mutagen exposure run the risk of detecting the presence of both adducts and true mutations. Given that mammalian cells are quite adept at recognizing and repairing adducts in vivo, it is incorrect to equate adducts with mutations (the vast majority will be repaired before mutation occurs in vivo). Cumulatively, these factors contribute to a practical detection limit of approximately $10^{-4}$ to $10^{-5}$, depending on DNA quality and experimental conditions (reviewed in Salk et al. 2018). This is excellent for many applications but does not reach the accuracy threshold needed for direct mutagenesis assessment.

Some mutagenic compounds are capable of increasing the mutation frequency of DNA by ~1000-fold or more. However, because the spontaneous mutation frequency of the mammalian nuclear genome is normally very low (on the order of one per 10 million base pairs), even a 1000-fold increase is still below what is reliably detectable by single-strand UMI-based methods. Extending the concept of molecular barcoding to include asymmetric double-stranded UMIs allows for the sequencing information derived from complementary strands of original double-stranded molecules to be compared for an additional level of error correction. Double-stranded consensus calling requires uniquely identifying each original DNA molecule (ie, a unique molecular identifier) and its constituent strands (ie, a strand-defining element) in a way that allows the sequences to be related to each other. Duplex Sequencing was the first method to use double-stranded consensuses to remove both sequencer and early PCR-derived errors (Schmitt et al. 2012; Kennedy et al. 2014; Fig. 3B). A number of derivative approaches, including BiSeqS (Mattox et al. 2017), muSeq (Kumar et al. 2018), and BotSeqS (Hoang et al. 2016), have been developed that establish molecular barcodes and strand-defining elements via partial bisulfite treatments or random shear points in conjunction with ultra-low genome coverage. With all these approaches, the theoretical error rate of double-strand consensus methods is estimated to be approximately $10^{-9}$, which roughly reflects the square of the error rate for single-strand molecular barcoding methods. Duplex methods have been used by a number of groups to study the occurrence of mutations arising from a number of genotoxic agents, including smoking, aflatoxin, aristolochic acid, urethane, benzo[a]pyrene, and reactive oxygen species (Kennedy et al. 2013; Hoang et al. 2016; Chawanthayatham et al. 2017).

[Figure 3 graphic. Panel A, Safe Sequencing System (SafeSeqS): tagging by primer-extension, PCR amplification, grouping duplicates, consensus making. Panel B, Duplex Sequencing: starting population, ligate Duplex-tagged adapters, PCR amplification, group duplicates, single-strand consensus making, duplex consensus making. Panel C, 2D Nanopore Sequencing: starting population, hairpin ligation, nanopore sequencing, comparing duplex strand-pair sequences, consensus making. Panel D, PacBio Circular Consensus Sequencing: ligate hairpin adapters to circularize, zero mode waveguide sequencing of closed loop (fluorescence), comparing tandem strand-pair sequences, consensus making. Molecules are marked as wild-type, mutation, or error.]

Fig. 3. Techniques for error-corrected DNA sequencing (ecNGS). The highest accuracy NGS methods rely on sequencing-by-consensus, whereby data from multiple sequence reads derived from an original molecule are combined to reduce the impact of sequencing or sample preparation errors in each read. (A) The SafeSeqS approach uses random molecular barcodes applied to PCR primers to uniquely tag PCR amplicons, which are then further amplified and sequenced. Variation within the sequence of reads with identical tags can be discounted as technical artifacts (X’s). Some errors that occur during the first extension cycle may escape correction (triangles). (B) Duplex Sequencing relies on ligation to apply molecular barcodes to both strands of original double-stranded molecules. These are used alone or in combination with fragmentation points to uniquely label both strands such that derivative sequence reads from each strand can be directly related back to their founder strand and compared to those from its complement. The method is significantly more accurate than single-stranded consensus-making methods but is more sequencing-intensive. (C) 2D sequencing on nanopore platforms uses physical linkage of the two strands of an original duplex, which are then sequenced together without the need for amplification. The method is fast and simple, but nanopore platforms are lower accuracy and throughput than more widely used sequencing-by-synthesis platforms. (D) Circular Consensus Sequencing on the PacBio single-molecule platform similarly links the two strands of an original double-stranded molecule with hairpins to allow multiple sequencing passes across both original strands. As with 2D, lower raw platform accuracy and throughput are drawbacks but very long reads can be obtained.
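The additional correction gained from comparing complementary strands can be sketched as a position-by-position agreement check. This is a minimal illustration rather than the published Duplex Sequencing software: it assumes each strand has already been collapsed to its own single-strand consensus and that both are expressed in the same reference orientation, and the function names are hypothetical. The second function simply restates the "square of the single-strand error rate" argument as arithmetic.

```python
def duplex_consensus(top_strand_consensus: str, bottom_strand_consensus: str) -> str:
    """Keep a base only where both strands of the original molecule agree."""
    if len(top_strand_consensus) != len(bottom_strand_consensus):
        raise ValueError("strand consensuses must cover the same positions")
    return "".join(
        top if top == bottom and top != "N" else "N"
        for top, bottom in zip(top_strand_consensus, bottom_strand_consensus)
    )


def approximate_duplex_error(single_strand_error: float) -> float:
    """Rough error estimate if both strands must err independently at one site."""
    return single_strand_error ** 2


if __name__ == "__main__":
    # A lesion on one strand yields a change in that strand's family only.
    print(duplex_consensus("ACTT", "ACGT"))
    # A true mutation is present on both strands and is retained.
    print(duplex_consensus("ATGT", "ATGT"))
    for error in (1e-4, 1e-5):
        print(f"single-strand {error:.0e} -> duplex ~{approximate_duplex_error(error):.0e}")
```

Masking disagreements, rather than choosing one strand, is the design choice that lets damage-induced artifacts, which affect only one strand, be separated from mutations fixed in both.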

Single-Cell Sequencing Technologies

Typical NGS protocols rely on fragmenting the genomes of thousands of cells. The result is a mixture of contributing cellular genotypes when the underlying population is heterogeneous. In such situations, ecNGS approaches are needed to detect rare variants in the sea of wild-type sequences if their abundance is below approximately 1%. However, the creation of a heterogeneous mixture of DNA fragments from many different genomes eliminates the ability to assign variants to the same cell, potentially underestimating the mutagenic potential of a compound that may only bioaccumulate in certain cell types (or cell division states). Sequencing the DNA from single cells overcomes this problem and ensures that observed mutations came from the same cell.

Typical single-cell sequencing (SCS) protocols require

isolation of individual cells followed by lysis and usually

some form of whole-genome ampliﬁcation to generate

enough DNA for sequencing (Zong et al. 2012; Fu et al.

2015; Dong et al. 2017; Chen et al. 2017a). Somatic muta-

tions would typically be heterozygous (absent recombina-

tion events or loss of heterozygosity) and expected to be

present in 50% of reads mapping to the genomic position

of interest. SCS methods have been able to successfully

detect structural variants (Wang et al. 2012), copy-number

variations (Navin et al. 2011), and single nucleotide vari-

ants (Dong et al. 2017) on a genome wide scale. To date,

SCS approaches have not been widely deployed to evalu-

ate genotoxicity at the single-cell level. However, recent

work by the Vijg group demonstrated the ability of SCS

to detect mutations induced by mutagenic exposure with

N-ethyl-N-nitrosourea, indicating its potential utility

(Dong et al. 2017).
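
The 50% expectation for a heterozygous somatic mutation can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the cited studies: it computes a variant allele fraction from read counts and an exact two-sided binomial p-value against an expected fraction of 0.5. The function names and read counts are hypothetical, and real single-cell data often deviate from 0.5 because whole-genome amplification can be uneven between alleles.

```python
from math import comb


def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("need 0 <= alt_reads <= total_reads and total_reads > 0")
    return alt_reads / total_reads


def binomial_two_sided_p(alt_reads: int, total_reads: int, expected: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed count, assuming each read independently carries the variant
    with probability `expected` (an idealization of real sequencing data).
    """
    probs = [
        comb(total_reads, k) * expected**k * (1 - expected) ** (total_reads - k)
        for k in range(total_reads + 1)
    ]
    observed = probs[alt_reads]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


# Hypothetical read counts at one genomic position in one cell.
print(variant_allele_fraction(14, 30))
print(binomial_two_sided_p(14, 30))
```

A small p-value here only says that the read counts are unlikely under a clean 50% model; it does not by itself distinguish an amplification artifact from a genuine subclonal or homozygous event.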

Another barrier to deploying SCS for genotoxicity appli-

cations is throughput and, by extension, cost. In response

to the need for more high-throughput methods, microﬂuidic

sorting of cells (Rinke et al. 2014), nano-well technologies

(Gierahn et al. 2017), and emulsion droplet partitioning

technologies (Klein et al. 2015) have been developed and

have increased throughput up to ~10,000 cells. A promis-

ing new approach to massively parallel SCS, termed com-

binatorial cellular indexing, uses intact ﬁxed cells or nuclei

as “reaction vessels” to physically partition the nucleic

acids of interest. A unique combination of DNA sequences

(ie, a cellular index) is enzymatically introduced into all the

nucleic acids present within each cell/nucleus, a technique

sometimes referred to as “combinatorial indexing” or

“split-pool barcoding.” Because all sequencing reads

derived from nucleic acids from the same cell share the

same cell-speciﬁc index, the sequencing data can be com-

putationally grouped and assigned to a speciﬁc cell. This

approach offers the ability to examine hundreds of thou-

sands of cells without the need for complex single-cell han-

dling equipment and has been used to study structural

variations, transcriptomics, and epigenetics (Cusanovich

et al. 2015; Cao et al. 2017; Vitak et al. 2017; Rosenberg

et al. 2018). The steady improvements in throughput and

cost make SCS increasingly attractive for addressing

important hypotheses about genotoxicity that can only be

answered at the level of individual cells. The efﬁcient com-

bination of SCS with high-accuracy single-molecule con-

sensus sequencing methods would be an extremely

powerful tool of the future.
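
The computational grouping step used in combinatorial indexing can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline of any cited method: reads are assumed to arrive already parsed into two hypothetical index fields (one per split-pool round) plus the read sequence, and the index names are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (round-1 index, round-2 index, read sequence); the layout is an assumption.
Read = Tuple[str, str, str]


def group_reads_by_cell(reads: Iterable[Read]) -> Dict[Tuple[str, str], List[str]]:
    """Assign each read to a cell using its combination of indexes."""
    cells = defaultdict(list)
    for index1, index2, sequence in reads:
        cells[(index1, index2)].append(sequence)
    return dict(cells)


def possible_cell_indexes(indexes_per_round: int, rounds: int) -> int:
    """Number of distinct index combinations available after all rounds."""
    return indexes_per_round ** rounds


reads = [
    ("A01", "B07", "ACGT"),
    ("A01", "B07", "TTGA"),
    ("A02", "B07", "GGCA"),
]
cells = group_reads_by_cell(reads)
print(len(cells))                    # number of distinct cellular indexes seen
print(possible_cell_indexes(96, 2))  # combinations from two rounds of 96 indexes
```

Because the number of combinations grows exponentially with the number of rounds, a modest set of indexes per round can label a very large number of cells, which is the reason no single-cell handling equipment is needed.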

Direct Single-Molecule Sequencing

Several mutagenesis assays are routinely used to detect

clastogenic compounds, such as the micronucleus and chro-

mosomal aberration assays (Araldi et al. 2015). Although

effective from a risk assessment perspective, these classic

tools do not yield speciﬁc sequence information. Modern

sequencing platforms are able to detect structural variants,

but with the added beneﬁt of providing detailed sequence

information and genomic location. While Illumina’s revers-

ible terminator dye technology, with its reasonably good

accuracy and high throughput, is well suited to detect

single-nucleotide changes, it is currently limited to read

lengths of less than 300 bases (600 bases for paired-end).

Short read length signiﬁcantly hinders the ability to detect

large structural variations and genomic rearrangements.

Therefore, structural variants are bioinformatically detected

by searching for reads spanning a break point or inferred

by read-pairs mapping farther apart than a few kilobases or

to different chromosomes (Alkan et al. 2011). Bioinfor-

matic detection tends to have highly variable sensitivity

and speciﬁcity rates due to the size of the structural variant,

occurrence of chimeric PCR products prior to sequencing,

overlapping clusters or read-hopping on the sequencer, or

the occurrence of erroneous read mapping arising from

pseudogene sequences elsewhere in the genome (Alkan

et al. 2011; Kosugi et al. 2019).
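
The read-pair inference described above can be illustrated with a minimal sketch. It assumes each pair has already been reduced to two mapped coordinates; the `ReadPair` type, the function name, and the 2,000 bp threshold are hypothetical choices for illustration (the text only says "a few kilobases"), and real callers also consider orientation, mapping quality, and clustering of supporting pairs.

```python
from typing import NamedTuple


class ReadPair(NamedTuple):
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def is_discordant(pair: ReadPair, max_insert: int = 2000) -> bool:
    """Flag pairs that map to different chromosomes or too far apart."""
    if pair.chrom1 != pair.chrom2:
        return True
    return abs(pair.pos2 - pair.pos1) > max_insert


pairs = [
    ReadPair("chr1", 1000, "chr1", 1350),     # consistent with a normal fragment
    ReadPair("chr1", 1000, "chr1", 250000),   # mates far apart on one chromosome
    ReadPair("chr1", 1000, "chr8", 5000),     # mates on different chromosomes
]
flags = [is_discordant(p) for p in pairs]
print(flags)
```

A single discordant pair is weak evidence, since chimeric PCR products and mis-mapping produce the same pattern; this is one reason sensitivity and specificity vary so widely.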

Direct single-molecule sequencing (SMS) is a relatively

new technology that offers a number of advantages over

short read sequencing methods. Two different SMS tech-

nologies are currently commercially available: single-

molecule real-time sequencing (SMRT; commercialized by

Paciﬁc Biosciences) and nanopore (commercialized by

Oxford Nanopore Technologies). van Dijk et al. (2018)

provide a detailed comparison of these two technologies.

Both approaches produce very long reads (10–250 kb) and

directly sequence genomic DNA without the need for

intermediate PCR ampliﬁcation. The elimination of PCR

chimeras and the addition of more sequence information

within a single read signiﬁcantly reduce mis-mappings and

increase the probability of spanning breakpoints, minimizing

false positives.

Although these technologies enhance the ability to detect

structural variants, they exhibit much higher error rates in

the detection of single nucleotide variants, often as high as

15%–20% (Quail et al. 2012; Ross et al. 2013; Jain et al.

2017). However, these platforms are amenable to platform

speciﬁc variations of consensus sequencing to reduce their

high false-positive rates. For example, in SMRT-based plat-

forms, circularized original DNA molecules can be

sequenced repeatedly with a highly processive DNA polymerase

and a “circular consensus sequence” made for each

template, improving the accuracy of SNV calls by several

orders of magnitude (Travers et al. 2010; Fig. 3C).

Nanopore-based technologies, however, are not yet amenable

to significant consensus error correction by repeated

sequencing of the same molecule. Currently, a type of

double-strand consensus can be made by afﬁxing a hairpin

adapter to the DNA fragments such that the two strands

can be sequentially sequenced in a reverse complementary

fashion, referred to as “two-directional” sequencing

(Fig. 3D). This approach has been reported to reduce the

error rate to ~3%–5% (Jain et al. 2015; Tyler et al. 2018).

Two recent methods, termed Rolling-Circle to Con-

catameric Consensus and Intramolecular-ligated Nanopore

Consensus Sequencing, offer the possibility of increasing

the accuracy of nanopore-based platforms by implementing

a circular consensus sequencing-like approach, analogous

to what is performed on the PacBio platform (Li et al.

2016; Volden et al. 2018).
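
The idea behind circular consensus sequencing, in which repeated passes over the same template are combined into one call per position, can be sketched as a per-position majority vote. This is a simplified illustration, not the vendor's algorithm: it assumes the passes have already been aligned to equal length (real consensus callers must handle insertions and deletions), and the function name and `min_fraction` parameter are hypothetical.

```python
from collections import Counter
from typing import Sequence


def majority_consensus(passes: Sequence[str], min_fraction: float = 0.5) -> str:
    """Call each position by majority vote across aligned passes.

    A position is reported as 'N' unless the most common base is seen in
    strictly more than `min_fraction` of the passes.
    """
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to equal length")
    called = []
    for column in zip(*passes):
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count / len(column) > min_fraction else "N")
    return "".join(called)


# Three hypothetical passes over one template; the second has one error.
print(majority_consensus(["ACGT", "ACGA", "ACGT"]))
```

Errors that occur independently in different passes are outvoted, which is why accuracy improves with the number of passes, whereas an error present in the template molecule itself would be reproduced in every pass.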

NEXT-GENERATION SEQUENCING APPLICATIONS

Modern sequencing platforms are rapidly transforming

the ability to detect, quantify, and characterize genomic

DNA at an ever increasing rate and scale. These technolo-

gies open up new potential avenues of research that are

likely to have a profound impact on the study of genomic

toxicology and mutagenesis. We highlight a number of

emerging applications for modern sequencing platforms

that are of high relevance for genotoxicity studies.

Adduct Detection by Sequencing

Genotoxic compounds that induce mutagenesis typically

do so by chemical modiﬁcation of the DNA that induces

base mis-insertion by DNA polymerases during genome

replication or repair. The majority of damage is effectively

removed by multifaceted cellular repair processes before

mutation occurs (Sancar et al. 2004). However, the level of

DNA damage and efﬁciency of repair can vary widely by

genomic context and damage type, with some adducts and

genomic locations being essentially unrepaired (Chang

et al. 2015; Perera et al. 2016; Geacintov and Broyde

2017). As such, there is far from a one-to-one relationship

between the presence of an adduct and risk of mutagenesis.

Indeed, this is the impetus behind the widely used comet

assay that grossly quantiﬁes the aggregate presence of

DNA breaks and adducts but has the limitation of not providing

sequence context or genomic location information.

While outside the scope of this review, the presence of

unrepaired DNA adducts has been shown to lead to

increases in transcriptional mutagenesis and signiﬁcant

physiological consequences, even when the underlying

DNA sequence is unchanged (reviewed in (Brégeon and

Doetsch 2011)).

A number of approaches have been developed to take

advantage of modern sequencing platforms to assess the

distribution of DNA adducts on a genome wide scale and,

frequently, at single-nucleotide resolution. Current short-

read technologies, such as the Illumina platform, are typi-

cally unable to directly detect DNA adducts, so the pres-

ence of chemical alterations must be inferred by other

means. One strategy is the detection of read start or termi-

nation positions. This approach relies on the ability of

bulky lesions, such as alkyl groups, to block the DNA

polymerases during the PCR steps used in library prepara-

tion (Hu et al. 2016; Hu et al. 2017; Wu et al. 2018). The

result is that the DNA fragments being sequenced will ter-

minate immediately adjacent to the blocking moiety. The

use of DNA repair enzymes or chemical treatments has also

been employed to speciﬁcally cleave DNA at sites of dam-

age followed by adapter ligation and sequencing. The result

is similar to the above, whereby the 5′ end of a read

denotes a site immediately adjacent to a site of damage.

This strategy has been used to detect UV (Mao et al. 2016;

Hu et al. 2017), cisplatin (Hu et al. 2016), and bulky alkyl

adducts (Mao et al. 2017; Aloisi et al. 2019). The presence

and location of ribonucleotides in DNA can be similarly

inferred, simply by inducing breaks with alkaline hydroly-

sis (Orebaugh et al. 2018).
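
The read-start strategy can be sketched as follows. The geometry used here is an assumption for illustration: reads are given as half-open reference intervals with a strand, and the inferred site is taken to be the base immediately upstream of the read's 5′ end. The exact offset differs between published protocols, so the function names and offsets below are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

# (start, end, strand) with a half-open [start, end) reference interval.
AlignedRead = Tuple[int, int, str]


def inferred_damage_site(start: int, end: int, strand: str) -> int:
    """Reference position immediately upstream of the read's 5' end."""
    if strand == "+":
        return start - 1   # 5' end is the leftmost aligned base
    if strand == "-":
        return end         # 5' end is the rightmost aligned base
    raise ValueError("strand must be '+' or '-'")


def count_damage_sites(reads: Iterable[AlignedRead]) -> Counter:
    """Tally how many reads point at each inferred damage position."""
    return Counter(inferred_damage_site(*read) for read in reads)


reads = [(100, 150, "+"), (100, 148, "+"), (60, 110, "-")]
site_counts = count_damage_sites(reads)
print(site_counts.most_common())
```

Positions supported by many independent read ends are candidate damage hot spots; positions supported by a single read are hard to separate from random fragmentation.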

Another frequently used strategy to infer DNA damage

employs enrichment for, or depletion of, DNA fragments

containing adducts. Depletion-based approaches make use

of enzymatic removal of adducts that render those DNA

fragments unsequenceable. The readout is a drop in coverage

in areas of the genome prone to DNA damage relative to

undamaged ones (Bryan et al. 2014). This approach

exhibits poor sensitivity when adducts are present in only a

small minority of DNA molecules, as is the case in many

in vivo applications. One solution is to enrich adduct-

containing molecules via immunoprecipitation of DNA

bearing speciﬁc adducts or bound repair proteins (ie, base

excision repair or nucleotide excision repair, etc.) (Bryan

et al. 2014; Hu et al. 2016; Hu et al. 2017; Li et al. 2017b).

In an analogous approach, base adducts that are poorly

targeted by immunoprecipitation can be chemically

modified to make them amenable to capture (Wu et al.

2018). Both methods can signiﬁcantly improve detection of

damage or repair activity on a genome-wide scale.
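
The depletion readout, a relative drop in coverage where damage is frequent, can be expressed as a per-bin log ratio between a treated and a control library. The sketch below is a generic illustration rather than the analysis used in the cited work; the function name, the binning, and the pseudocount are assumptions.

```python
from math import log2
from typing import List, Sequence


def log2_coverage_ratio(
    treated: Sequence[int],
    control: Sequence[int],
    pseudocount: float = 1.0,
) -> List[float]:
    """Per-bin log2 ratio of depth-normalized coverage (treated / control).

    Negative values mark bins that are under-represented after treatment.
    A positive pseudocount avoids taking the logarithm of zero.
    """
    if len(treated) != len(control):
        raise ValueError("both libraries must use the same bins")
    t = [x + pseudocount for x in treated]
    c = [x + pseudocount for x in control]
    t_total, c_total = sum(t), sum(c)
    return [log2((ti / t_total) / (ci / c_total)) for ti, ci in zip(t, c)]


# Hypothetical read counts in two genomic bins.
ratios = log2_coverage_ratio(treated=[50, 100], control=[100, 100])
print(ratios)
```

When only a small minority of molecules carry an adduct, the expected ratio is very close to zero and is easily lost in sampling noise, which is the sensitivity problem that motivates enrichment-based methods.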

An advantage of many single-molecule sequencing plat-

forms is that many DNA adducts can be directly detected

without prior manipulation. In the case of the PacBio

SMRT sequencing technology, chemical modiﬁcations to

the template base affect the kinetics of dNTP incorporation

by DNA polymerases in a deﬁned way that is relatively

speciﬁc to each adduct (Clark et al. 2011). Most studies

have focused on endogenous epigenetic modiﬁcations (ie,

methylation), but the methods and statistical analysis

employed by these studies could easily be adapted to gen-

otoxicity applications.

Challenges in detecting blocking lesions are one notable

limitation for this polymerase-based approach. Nanopore

technologies, on the other hand, are well suited for identi-

fying bulky adducts. Base-calling is accomplished by

observing changes in ionic current/impedance that are

speciﬁc to the template base as it passes through the

nanopore structure (reviewed in (Deamer et al. 2016)).

Base modiﬁcations are detectable because they alter this

characteristic proﬁle in an adduct-speciﬁc way. Most efforts

have focused on detecting endogenous methylations

(Laszlo et al. 2013; Schreiber et al. 2013), but an increasing

number of reports are beginning to characterize a wider

variety of exogenous DNA adducts more relevant to

genetic toxicology, including pyrimidine dimers, benzo[a]pyrene,

8-oxo-dG, abasic sites, and double-strand crosslinks

(An et al. 2012; Wolna et al. 2013; An et al. 2015;

Perera et al. 2015; Zhang et al. 2015).

Characterizing Genotoxicity by Mutational Signatures

One of the primary goals of genotoxicity testing is to

link speciﬁc exposures to mutagenesis and, ultimately, car-

cinogenesis. Controlled exposure studies in animal models

are currently the gold standard for relating exposure to car-

cinogenicity. However, the linking of mutagenic exposure

to cancer in human populations is far more complex and

largely depends on population level epidemiological studies

(Wild 2008). With some rare exceptions, such as skin can-

cer with sun exposure and cervical cancer with human pap-

illomavirus, deﬁnitive attribution of a speciﬁc instance of

cancer to a speciﬁc genotoxic event is extremely difﬁcult,

especially when compounded with the naturally occurring

accumulation of mutations in cancer relevant genes during

aging (reviewed in (Risques and Kennedy 2018)). Tools

that enable detection of genotoxic exposure in humans, and

more closely link its relationship to cancer, would have a

profound impact on clinical medicine and public health, as

well as important legal and ethical implications.

The relative incidence of different types (or spectra) of

single-base substitutions is nonrandom and strongly

depends on the speciﬁc nature of the mutagen. On their

own, simple mutation spectra (ie, A→G vs. C→A) have

limited speciﬁcity due to signiﬁcant overlap between differ-

ent mutagens and their predominant mutation type. Local

sequence context, however, strongly inﬂuences the fre-

quency of a given type of mutation. The identity of

ﬂanking nucleotides adds a great deal of additional infor-

mation that can be harnessed to better indicate the exact eti-

ology of observed mutations (Fig. 4).

Data generated by The Cancer Genome Atlas and other

large-scale sequencing efforts have provided an opportunity

to identify many distinct mutational patterns in a wide vari-

ety of cancer types. By taking into account known cancer

biology and patient medical history, analysis of the tumor

mutation patterns can, in some cases, provide a correlative

link between exposure and the observed mutational pat-

terns; for example, high levels of mutations seen in mela-

noma are consistent with pyrimidine dimers (The Cancer

Genome Atlas 2015). These patterns can be readily

detected in tumors using standard NGS techniques because

of the clonal nature of tumor formation. Mutations present

early in neoplastic transformation are propagated to descen-

dent tumor cells, where they are easily identiﬁed as well

above the background error rate of sequencing (Fig. 4A).

This is in contrast to early genotoxin-associated mutations

in normal tissues, which are present in only a minority of

cells among a larger unmutated population, and where far

more sensitive methods are required.

The primary challenge in performing this type of spectral

analysis has been that somatic tumor mutations are the

result of the cumulative mutational processes incurred by

the founding cancer cell’s lineage since embryogenesis. As

such, it is necessary to deconvolute the relative contribu-

tions of each of these mutational processes. Alexandrov

et al. were the ﬁrst to report the use of nonnegative matrix

factorization, a statistical method developed for decomposition

of multivariate data, to computationally parse out constituent

mutational processes based on both the specific

mutation type (ie, G→T/C→A) and the identity of the

adjacent 5′ and 3′ bases (Alexandrov et al. 2013). In their

initial work, the authors reported 21 “mutational signatures”

(or “trinucleotide signatures”) across the TCGA data

set, with some of the signatures exhibiting high tumor-type

speciﬁcity (Alexandrov et al. 2013). Recent analysis of

tumor sequencing data, comprising 4645 whole genomes

and 19,184 exomes, has validated the vast majority of the

initially reported signatures, as well as further expanded the

number of mathematically deﬁned signatures to now

include a total of 49 single-base substitution signatures,

11 doublet-base substitution signatures, 4 clustered-base

substitution signatures, and 17 small insertion/deletion sig-

natures (Alexandrov et al. 2018).
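
For readers unfamiliar with nonnegative matrix factorization, the following is a minimal sketch of the general technique, not the published signature-extraction pipeline. It factors a nonnegative count matrix V (mutation classes by samples) into W (classes by signatures) and H (signatures by samples) using the standard multiplicative update rules for squared error. The tiny matrix, the number of signatures `k`, and all function names are assumptions for illustration; production tools add normalization, resampling, and model selection.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Sum of squared differences between V and the product W x H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, k, iterations=200, seed=0, eps=1e-12):
    """Factor nonnegative V into nonnegative W and H by multiplicative updates."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n_rows)]
    return w, h


# Hypothetical counts: 4 mutation classes (rows) in 3 samples (columns).
v = [[5, 1, 3], [0, 4, 2], [5, 5, 5], [10, 2, 6]]
w, h = nmf(v, k=2)
print(round(frobenius_error(v, w, h), 4))
```

Each column of W is a candidate signature and each column of H gives its contribution to one sample. The result depends on the chosen `k` and on the random start, which is one reason some reported signatures may be computational artifacts.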

Mutational signatures have risen to prominence in the

genomic literature over the last 5 years (reviewed in

(Phillips 2018)), but they are not without limitations. Sig-

natures are computationally derived. Some portion of the

described signatures could be computational artifacts or

subfeatures within other processes. Furthermore, the bulk

of research on mutational signatures has focused on their

presence in tumors, for the practical reasons described

above. The signatures observed in a tumor may not fully

recapitulate processes in normal tissues. Signatures in

tumors arise from both endogenous and exogenous sources

(Alexandrov et al. 2013; Alexandrov et al. 2015;

Alexandrov et al. 2018) and are an amalgamation of muta-

genic processes that may be somewhat biased by clonal

sweeps that occur during tumor formation when effects

unrelated to exposure-associated mutagenesis are operative.

Recent work using error-corrected sequencing to study

aﬂatoxin-induced mutations in normal mouse tissue

[Figure 4 graphic: panels A–E; plot axes list the trinucleotide contexts ACA through TCT and ATA through TTT, with axis ticks at 0.00, 0.05, and 0.10.]

Fig. 4. Approaches for assessing mutational signatures. Mutational

spectra, particularly polynucleotide mutational signatures, provide

important mechanistic insights into mutational processes. Most of what

we know about these patterns has come from natural or artiﬁcial means

of single cell cloning. (A) Exome or whole-genome sequencing of tumor

populations reﬂects the somatic processes operative in the founding cell

of the most recent clonal sweep. (B) Single cells can be cloned from

cultured populations exposed to known or suspected mutagens to assess

their mutational signatures. (C) The clonal variants present in individuals

that were not present in their parents reflect the state of mutational

processes during gametogenesis or early embryogenesis. (D) Sequencing

of cloned cells or molecules from certain selection-based mutagenicity

assays can be used similarly, although the patterns may be distorted by

the selection system itself. (E) With ecNGS, it is now possible to obtain

mutational spectra by directly sequencing DNA from any tissue of any

organism.

demonstrated the low-frequency signature to be distinctly

different from that observed in the tumor itself. This sug-

gests additional mutagenic processes may have developed

during tumorigenesis that were unrelated to aﬂatoxin expo-

sure (Chawanthayatham et al. 2017; Fedeles et al. 2017).

For most genetic toxicologists, a forensic analysis of

the mutational processes that led to clonal tumors is only

useful insofar as the knowledge can be applied for pro-

spectively screening new compounds. Sequencing human

cancers that follow natural exposures, similar to sequenc-

ing of family trios to infer germ line processes that intro-

duce mutations between generations (Fig. 4C), is simply

not a practical tool in this regard. Most conventional gen-

otoxicity assays are not equipped to take advantage of tri-

nucleotide signature analysis due to their reliance on

selective markers with a narrow nucleotide repertoire

which can significantly bias the observed spectrum (Fig. 4D).

Simple, and even trinucleotide, mutational spectra can

be assessed from transgenic rodent assays by manually

picking hundreds of phage plaques for sequencing, but in

addition to being very labor intensive, the approach is still

complicated by an incomplete repertoire of three base-pair

groups within the small reporter genes and the fact that

synonymous mutations do not result in phenotypic

changes.

A less biased approach for experimentally obtaining

detailed mutational spectra without any biological selec-

tion is cloning of single cells after compound exposure

followed by large-scale sequencing (Fig. 4B). In an outstanding

recent study by the Nik-Zainal group, the authors

carried out whole-genome sequencing on induced pluripo-

tent stem cells that were cloned from populations treated

with nearly 80 known or suspected carcinogens, identify-

ing dozens of distinct signatures (Kucab et al. 2019). This

more than quadrupled the existing collection of signatures

that have been experimentally ascribed to exogenous

sources—a list which will undoubtedly continue to grow

(Chawanthayatham et al. 2017; Huang et al. 2017; Ng

et al. 2017; Boot et al. 2018).

Cultured cells cannot fully recapitulate all the meta-

bolic and distribution complexities of in vivo exposures

and single-cell cloning is not trivial (Blokzijl et al.

2016). However, the extensive signature knowledge and

mathematical methods generated from both this approach

and from genotyping tumors can be readily applied to

the above-described new sequencing technologies. Many

of these have sufﬁcient accuracy to detect low-frequency

genotoxin-induced mutations without need for clonal

expansion of any form (Fig. 4E). This opens the possibil-

ity of being able to assess mutational signatures in any

cell type from any tissue from any species directly from

extracted DNA (Chawanthayatham et al. 2017). Much

remains to be done in this emerging space, but the future

remains bright for its applications in genomic

toxicology.

Neo-Genotoxicity: Genome Engineering Technologies

The classic ﬁelds of genetic toxicology and environmen-

tal mutagenesis have typically focused on the effects of

broadly acting DNA damaging chemicals and their effects

on human health. However, the emergence of new genetic

manipulation technologies, what we term “neo-genotoxins,”

presents both new challenges and new opportunities

for the field. A critical aspect of these tools, especially

from a regulatory perspective, is determining their

speciﬁcity in altering the genome in the desired way. Like

traditional chemical mutagens, off-target DNA cutting or

gene mis-insertion could increase the risk of cancer by

inadvertently interrupting an oncogene or tumor suppres-

sor. However, unlike randomly acting small molecules, the

rules for predicting where in the genome this might hap-

pen, and the technical complexities for site-speciﬁc screen-

ing, are completely different.

With the development of programmable endonucleases,

such as zinc-ﬁnger nucleases, transcription activator-like

effector nucleases, and, most recently, CRISPR/Cas nucle-

ases, it is now possible to make targeted genomic alter-

ations in situ (reviewed in (Gaj et al. 2013)). In theory, the

20–40 bases targeted by these enzymes should be more

than sufﬁcient to ensure complete speciﬁcity, but the pres-

ence of pseudogenes, human genetic variation, and a toler-

ance for sequence changes in the recognition sequence, can

reduce site speciﬁcity (Lessard et al. 2017). In silico

methods have been developed to help predict off-target

effects of these nucleases, especially for the CRISPR/Cas

family of endonucleases, but have shown only moderate

concordance with experimental data (reviewed in (Chuai

et al. 2017)).
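
The simplest form of in silico off-target prediction is a mismatch-tolerant search for sites that resemble the intended recognition sequence. The sketch below illustrates only that idea and is not any published tool: it scans one strand of a sequence for windows within a chosen Hamming distance of a target, and it ignores PAM requirements, bulges, position-dependent mismatch tolerance, and chromatin context. The function names, the example sequences, and the mismatch threshold are hypothetical.

```python
from typing import List, Tuple


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def candidate_sites(genome: str, target: str, max_mismatches: int = 3) -> List[Tuple[int, int]]:
    """Return (position, mismatches) for every window close to the target."""
    hits = []
    width = len(target)
    for start in range(len(genome) - width + 1):
        mismatches = hamming_distance(genome[start:start + width], target)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Toy example with a 4-base target; real recognition sequences are 20-40 bases.
sites = candidate_sites("TTACGTTTACCTTT", "ACGT", max_mismatches=1)
print(sites)
```

A search of this kind lists where cutting could occur but says nothing about how often it does, which is consistent with the moderate agreement between predictions and experimental data noted above.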

Using modern sequencing platforms, several unbiased

methods have been developed to detect the presence of

double-strand breaks. A primary concern with these tech-

nologies is the hundreds to thousands of potential off-target

sites that exist across the genome. Further complicating the

issue is that the probability of cutting off-target sites can

vary by several orders of magnitude which means that

brute force sequencing may not be sensitive enough to

detect rare off-target events. While the speciﬁcs of each

approach are different, they largely depend on using

in vitro digestion with the nuclease in question followed by

the introduction of a known universal sequence via liga-

tion/integration or the cell’s homologous recombination

machinery that can be selected by PCR or targeted

pulldown. These methods have reported a wide range of

off-target cutting depending on the method used (Fu et al.

2013; Frock et al. 2015; Tsai et al. 2015; Cameron et al.

2017; Tsai et al. 2017). There is a substantial need for more

accurate and sensitive methods to detect off-target cut sites.

A notable limitation of these methods is the inability to

practically assess off-target effects in vivo, which will be

critical for regulatory testing and widespread medical use

of genome-editing technologies. To date, we are aware of

only one in vivo method, termed “Veriﬁcation of in vivo

Off-targets” (VIVO), that has been published. This

approach uses a combination of in vitro off-target detection

with evaluating the observed off-target sites seen in the

in vitro data for characteristic deletion events caused by

in vivo expression of CRISPR/Cas9 in mouse liver (Tsai

et al. 2017; Akcakaya et al. 2018). Further complicating

matters is that the highly sequence-dependent nature of

both on-target and off-target effects makes animals untena-

ble surrogates for assessing genotoxicity induced by

human-genome targeted nucleases.

The clinical importance of neo-genotoxins has become

even more apparent with the emergence of cell-based thera-

pies. While cells do not constitute a genotoxin per se, the

genetic engineering and potential for clonal selection of

mutation-harboring subpopulations during their develop-

ment can lead to increased risk of acquiring cancer from

within the transplanted cells. For example, recent studies

have shown that genome editing using CRISPR-Cas9

results in TP53-mediated DNA damage response and cell-

cycle arrest. Consequently, there is a strong selective

advantage for cells harboring inactivating mutations in this

important tumor suppressor (Haapaniemi et al. 2018; Ihry

et al. 2018; Sinha et al. 2018). In other words, the effect of

even perfectly accurate on-target cutting is natural selection

of cells bearing the most common genetic driver in all

human cancers. These issues, and others that have not yet

been discovered, are likely to complicate therapeutic appli-

cations involving genetically engineered cells, such as for

regenerative medicine or CAR-T-based cancer therapies.

Technologies for assessing these risks will need to be

extremely accurate, quickly adaptable to new targets, and

equally applicable to in vitro preclinical usage as to in vivo

human studies—a tall order by any estimation.

Carcinogenicity vs. Mutagenicity

While essentially all human mutagens are carcinogens,

the reverse is not always true. Mutagenesis is an imperfect

surrogate for cancer risk. Nonmutagenic carcinogens may

drive neoplasia through inﬂammation, epigenetic modiﬁca-

tions, and endocrine disruption that drives aberrant cellular

proliferation (Ohshima et al. 2003; Baccarelli and Bollati

2009; Soto and Sonnenschein 2010). In these cases, classic

selection-based mutagenesis assays would not easily detect

these compounds as likely carcinogenic, indicating why

2-year rodent studies remain a safety requirement for new

drug approval.

A number of recent reports show that clonal expansion

of cells harboring somatic mutations in cancer-associated

genes is a normal part of aging (reviewed in (Risques and

Kennedy 2018)). Because non-genotoxic carcinogens are

generally believed to accelerate carcinogenesis by forcing

unregulated cell division, clonal expansions of mutations

could be used as a marker of emerging ability to proliferate

outside the conﬁnes of the normal regulated tissue architec-

ture (Salk and Horwitz 2010). The development of ultra-

accurate ecNGS may offer a way to quantify these expan-

sions and correlate their presence with environmental expo-

sure or potentially cancer risk. Approaches could involve

the sequencing of large panels of cancer driver genes or

hypermutable portions of the genome for clonal expan-

sions. A similar idea has been used in studying somatic

evolution in dysplastic and cancerous tissue (Salk et al.

2009; Naxerova et al. 2017; Baker et al. 2019). Detection

of very early preneoplastic changes at the cellular level by

observing accelerated growth of small clones could be car-

ried out in conjunction with mutagenesis screening using

the same ecNGS methods. For an in-depth discussion on

this topic, please see the accompanying review by Parsons

and colleagues (Harris et al., 2020).

FUTURE APPLICATIONS AND CONCLUSIONS

The utility of modern sequencing platforms has

expanded well beyond the initial use of sequencing DNA

for genome assembly and germ line variant detection, for

which they were originally developed. While in their infancy,

these technologies are ushering in a renaissance for the

study of genotoxicity and somatic mutagenesis. The digital

nature and massive scale at which these technologies operate

are already providing rich data sets that are orders of

magnitude beyond that which was available to the ﬁeld’s

pioneers.

Ultimately, the technologies and methods that we have

described here will be deployable for direct monitoring of

exposures in human populations—a concept famously

envisioned by William Thilly more than three decades ago

(Sattaur 1985). Widely recognized environmental carcinogens

such as aflatoxin and aristolochic acid cause thousands

of cancer deaths globally per year, but, at the current

time, it is impossible to know which individuals may have

been exposed during their lives and are at the greatest risk

(Ng et al. 2017). From the point of view of an individual,

routine screening in at-risk populations could identify those

who would most beneﬁt from close clinical surveillance.

From a public health perspective, population testing

could aid in identifying regional exposure hot spots where

source control efforts could be most effective. Numerous

statistically defined “cancer clusters” have been described,

frequently near industrial sites (Thun and Sinks 2004).

New tools that more directly link chemical exposure of

individuals to an instance of cancer could empower com-

munities with objective data to more effectively demand

cleanup and provide local governments and regulators with

early detection tools to prevent clusters in the ﬁrst place.

Due to the generalizability of NGS technologies to any

source of DNA, surveying native organisms for mutagenic

signatures in their genome would allow for environmental

monitoring for the presence of mutagens. An amusing, yet

entirely appropriate, analogy is the proverbial canary-in-a-

coal mine; in this modern rendition, it is the canary’s

genome that serves as a biosensor for mutagenic coal dust

(Fig. 5). We envision that many of the varieties and appli-

cations of the new technologies outlined in this review can

be combined to obtain a more complete picture of gen-

otoxicity and cancer risk both in model systems and

humans. The use-cases described herein are likely to be

only the beginning of our needs as we look toward engag-

ing with mutagenic new environments, such as inter-

planetary space, and consider new high-risk medical

frontiers, such as gene editing of the germ line. The full

breadth of applications for these new tools remains to be

seen, but their use will undoubtedly offer new avenues of

research and further drive development of technologies that

will carry us through the next 50 years.
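
As a rough illustration of how a sentinel organism's mutations might be screened for a mutagenic signature, the sketch below tallies single-base substitutions into pyrimidine-normalized trinucleotide channels and compares the resulting spectrum with a reference by cosine similarity. It is a minimal sketch under stated assumptions, not the method of any study cited here: the function names, the input format (a trinucleotide context paired with the alternate base), and the reference values are all hypothetical, and the toy reference is not a published signature.

```python
from collections import Counter
from math import sqrt

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def channel(context, alt):
    """Label a substitution by its pyrimidine-normalized trinucleotide context.

    `context` is the reference trinucleotide centered on the mutated base and
    `alt` is the new base. Purine references are reported on the opposite
    strand, so TGA with G>T becomes 'T[C>A]A'.
    """
    context, alt = context.upper(), alt.upper()
    if context[1] in "AG":
        context, alt = revcomp(context), revcomp(alt)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def spectrum(mutations):
    """Convert (context, alt) pairs into channel proportions."""
    counts = Counter(channel(c, a) for c, a in mutations)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse spectra (dicts of proportions)."""
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in set(p) | set(q))
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


# Toy input: hypothetical mutations observed in a sentinel sample.
observed = [("CGG", "T"), ("CCG", "A"), ("ACG", "T")]
# Made-up reference proportions for illustration only.
reference = {"C[C>A]G": 0.7, "A[C>T]G": 0.3}
print(round(cosine_similarity(spectrum(observed), reference), 3))
```

A real analysis would also normalize for the trinucleotide composition of the sequenced region and would need enough mutations for the spectrum to be stable; both are omitted here for brevity.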

AUTHOR CONTRIBUTIONS

S.R.K. and J.J.S. conceptualized the review topics.

S.R.K. wrote the initial manuscript draft. S.R.K. and

J.J.S. contributed to the ﬁgures and manuscript.

Conflict of Interest

J.J.S. is an employee and equity holder at TwinStrand

Biosciences. S.R.K. is a paid consultant and equity holder

at TwinStrand Biosciences and a paid consultant for Wilcox & Savage, PC.

Fig. 5. Canary-in-a-coal-mine: a century later. A hundred years ago, at the suggestion of John Scott Haldane, caged canaries were routinely brought into British coal mines as an early warning sign of human-relevant toxic gases. Although their routine use ceased in the 1980s, the broader concept of using sentinel species to infer the presence of environmental hazards remains highly germane in modern genetic toxicology. Should it have been possible to collect and analyze a DNA sample from one of Haldane’s birds using modern ecNGS techniques, it is quite likely that the mutagenic signature of benzo[a]pyrene could have been identified and used to inform efforts to mitigate the environmental cancer risk. Other naturally present sentinel organisms, including humans themselves, can be similarly used.

Acknowledgments

We would like to thank Dr. Penny M. Faires for scien-

tiﬁc editing, Clint Valentine for mutational spectra

graphics, and the scientiﬁc team at TwinStrand Biosciences

and members of the HESI Genetic Toxicology Technical

Committee (GTTC) for inspiration and championing new

genomic technologies. This work was supported in part by

DOD/CDMRP grant W81XWH-18-1-0339, NIJ grant

2017-DN-BX-0160, and Safeway/Albertsons Early Career

Award in Cancer Research to S.R.K. and NIH/NIEHS

grant R44ES030642 to J.J.S.

REFERENCES

Akcakaya P, Bobbin ML, Guo JA, Malagon-Lopez J, Clement K,

Garcia SP, Fellows MD, Porritt MJ, Firth MA, Carreras A, et al.

2018. In vivo CRISPR editing with no detectable genome-wide

off-target mutations. Nature 561:416–419.

Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S,

Biankin AV, Bignell GR, Bolli N, Borg A, Børresen-Dale A-L,

et al. 2013. Signatures of mutational processes in human cancer.

Nature 500:415–421.

Alexandrov LB, Jones PH, Wedge DC, Sale JE, Campbell PJ, Nik-

Zainal S, Stratton MR. 2015. Clock-like mutational processes in

human somatic cells. Nat Genet 47:1402–1407.

Alexandrov L, Kim J, Haradhvala NJ, Huang MN, Ng AWT, Boot A,

Covington KR, Gordenin DA, Bergstrom E, Lopez-Bigas N, et al.

2018. The repertoire of mutational signatures in human cancer. bio-

Rxiv 322859. https://doi.org/10.1101/322859.

Alkan C, Coe BP, Eichler EE. 2011. Genome structural variation discov-

ery and genotyping. Nat Rev Genet 12:363–376.

Aloisi CMN, Sturla SJ, Gahlon HL. 2019. A gene-targeted polymerase-

mediated strategy to identify O6-methylguanine damage. Chem

Commun 55:3895–3898.

Ames BN, Lee FD, Durston WE. 1973. An improved bacterial test system

for the detection and classiﬁcation of mutagens and carcinogens.

Proc Natl Acad Sci USA 70:782–786.

An N, Fleming AM, White HS, Burrows CJ. 2012. Crown ether-

electrolyte interactions permit nanopore detection of individual

DNA abasic sites in single molecules. Proc Natl Acad Sci USA

109:11504–11509.

An N, Fleming AM, White HS, Burrows CJ. 2015. Nanopore detection of

8-oxoguanine in the human telomere repeat sequence. ACS Nano

9:4296–4307.

Araldi RP, de Melo TC, Mendes TB, de Sa Junior PL, Nozima BHN,

Ito ET, de Carvalho RF, de Souza EB, de Cassia Stocco R. 2015.

Using the comet and micronucleus assays for genotoxicity studies:

A review. Biomed Pharmacother 72:74–82.

Auerbach C, Robson JM, Carr JG. 1947. The chemical production of

mutations. Science 105:243–247.

Baccarelli A, Bollati V. 2009. Epigenetics and environmental chemicals.

Curr Opin Pediatr 21:243–251.

Baker KT, Nachmanson D, Kumar S, Emond MJ, Ussakli C,

Brentnall TA, Kennedy SR, Risques RA. 2019. Mitochondrial

DNA mutations are associated with ulcerative colitis preneoplasia

but tend to be negatively selected in cancer. Mol Cancer Res 17:

488–498.

Beal MA, Gagné R, Williams A, Marchetti F, Yauk CL. 2015. Character-

izing Benzo[a]pyrene-induced lacZ mutation spectrum in trans-

genic mice using next-generation sequencing. BMC Genomics

16:812.

Besaratinia A, Li H, Yoon J-I, Zheng A, Gao H, Tommasi S. 2012. A

high-throughput next-generation sequencing-based method for

detecting the mutational ﬁngerprint of carcinogens. Nucleic Acids

Res 40: e116.

Bessman MJ, Lehman IR, Simms ES, Kornberg A. 1958. Enzymatic syn-

thesis of deoxyribonucleic acid II. General properties of the reac-

tion. J Biol Chem 233:171–177.

Bielas JH, Loeb LA. 2005. Quantiﬁcation of random genomic mutations.

Nat Methods 2:285–290.

Blokzijl F, de Ligt J, Jager M, Sasselli V, Roerink S, Sasaki N, Huch M,

Boymans S, Kuijk E, Prins P, et al. 2016. Tissue-speciﬁc mutation

accumulation in human adult stem cells during life. Nature 538:

260–264.

Boot A, Huang MN, Ng AWT, Ho S-C, Lim JQ, Kawakami Y,

Chayama K, Teh BT, Nakagawa H, Rozen SG. 2018. In-depth char-

acterization of the cisplatin mutational signature in human cell lines

and in esophageal and liver tumors. Genome Res 28:654–665.

Brégeon D, Doetsch PW. 2011. Transcriptional mutagenesis: Causes and

involvement in tumour development. Nat Rev Cancer 11:218–227.

Bryan DS, Ransom M, Adane B, York K, Hesselberth JR. 2014. High res-

olution mapping of modiﬁed DNA nucleobases using excision

repair enzymes. Genome Res 24:1534–1542.

Bryce SM, Bemis JC, Dertinger SD. 2008. In vivo mutation assay based

on the endogenous Pig-a locus. Environ Mol Mutagen 49:

256–264.

Cameron P, Fuller CK, Donohoue PD, Jones BN, Thompson MS,

Carter MM, Gradia S, Vida B, Garner E, Slorach EM, et al. 2017.

Mapping the genomic landscape of CRISPR–Cas9 cleavage. Nat

Methods 14:600–606.

Cao J, Packer JS, Ramani V, Cusanovich DA, Huynh C, Daza R, Qiu X,

Lee C, Furlan SN, Steemers FJ, et al. 2017. Comprehensive single-

cell transcriptional proﬁling of a multicellular organism. Science

357:661–667.

Casbon JA, Osborne RJ, Brenner S, Lichtenstein CP. 2011. A method for

counting PCR template molecules with application to next-

generation sequencing. Nucleic Acids Res 39: e81.

Chang S, Fedeles BI, Wu J, Delaney JC, Li D, Zhao L, Christov PP,

Yau E, Singh V, Jost M, et al. 2015. Next-generation sequencing

reveals the biological signiﬁcance of the N2,3-ethenoguanine

lesion in vivo. Nucleic Acids Res 43:5489–5500.

Chauhan V, Kuo B, McNamee JP, Wilkins RC, Yauk CL. 2016. Tran-

scriptional benchmark dose modeling: Exploring how advances in

chemical risk assessment may be applied to the radiation ﬁeld:

BMD and radiation risk assessment. Environ Mol Mutagen 57:

589–604.

Chawanthayatham S, Valentine CC, Fedeles BI, Fox EJ, Loeb LA,

Levin SS, Slocum SL, Wogan GN, Croy RG, Essigmann JM. 2017.

Mutational spectra of aflatoxin B1 in vivo establish biomarkers of

exposure for human hepatocellular carcinoma. Proc Natl Acad Sci

USA 114:E3101–E3109.

Chen C, Xing D, Tan L, Li H, Zhou G, Huang L, Xie XS. 2017a. Single-

cell whole-genome analyses by linear ampliﬁcation via transposon

insertion (LIANTI). Science 356:189–194.

Chen L, Liu P, Evans TC, Ettwiller LM. 2017b. DNA damage is a perva-

sive cause of sequencing errors, directly confounding variant iden-

tiﬁcation. Science 355:752–756.

Chuai G, Wang Q-L, Liu Q. 2017. In silico meets in vivo: Towards com-

putational CRISPR-based sgRNA design. Trends Biotechnol

35:12–21.

Clark TA, Spittle KE, Turner SW, Korlach J. 2011. Direct detection and

sequencing of damaged DNA bases. Genome Integr 2:10.

Cusanovich DA, Daza R, Adey A, Pliner HA, Christiansen L,

Gunderson KL, Steemers FJ, Trapnell C, Shendure J. 2015. Multi-

plex single-cell proﬁling of chromatin accessibility by combinato-

rial cellular indexing. Science 348:910–914.

Deamer D, Akeson M, Branton D. 2016. Three decades of nanopore

sequencing. Nat Biotechnol 34:518–524.

DeMarini DM. 2019. The mutagenesis moonshot: The propitious begin-

nings of the environmental mutagenesis and genomics society.

Environ Mol Mutagen This issue.

van Dijk EL, Jaszczyszyn Y, Naquin D, Thermes C. 2018. The third

revolution in sequencing technology. Trends Genet 34:

666–681.

Dong X, Zhang L, Milholland B, Lee M, Maslov AY, Wang T, Vijg J.

2017. Accurate identiﬁcation of single-nucleotide variants in

whole-genome-ampliﬁed single cells. Nat Methods 14:491–493.

Fedeles BI, Chawanthayatham S, Croy RG, Wogan GN, Essigmann JM.

2017. Early detection of the aflatoxin B1 mutational fingerprint: A

diagnostic tool for liver cancer. Mol Cell Oncol 4: e1329693.

Frock RL, Hu J, Meyers RM, Ho Y-J, Kii E, Alt FW. 2015. Genome-wide

detection of DNA double-stranded breaks induced by engineered

nucleases. Nat Biotechnol 33:179–186.

Fu GK, Hu J, Wang P-H, Fodor SPA. 2011. Counting individual DNA

molecules by the stochastic attachment of diverse labels. Proc Natl

Acad Sci USA 108:9026–9031.

Fu Y, Foden JA, Khayter C, Maeder ML, Reyon D, Joung JK, Sander JD.

2013. High-frequency off-target mutagenesis induced by CRISPR-

Cas nucleases in human cells. Nat Biotechnol 31:822–826.

Fu Y, Li C, Lu S, Zhou W, Tang F, Xie XS, Huang Y. 2015. Uniform

and accurate single-cell sequencing based on emulsion whole-

genome ampliﬁcation. Proc Natl Acad Sci USA 112:11923–11928.

Gaj T, Gersbach CA, Barbas CF. 2013. ZFN, TALEN, and CRISPR/Cas-

based methods for genome engineering. Trends Biotechnol 31:

397–405.

Geacintov NE, Broyde S. 2017. Repair-resistant DNA lesions. Chem Res

Toxicol 30:1517–1548.

Gierahn TM, Wadsworth MH II, Hughes TK, Bryson BD, Butler A, Satija R,

Fortune S, Love JC, Shalek AK. 2017. Seq-well: Portable, low-

cost RNA sequencing of single cells at high throughput. Nat

Methods 14:395–398.

Godschalk RWL, Yauk CL, van Benthem J, Douglas G, Marchetti F.

2019. In utero exposure to genotoxins leading to genetic mosai-

cisms: A forgotten window of susceptibility in genetic toxicology

testing? Environ Mol Mutagen This issue.

Gregory MT, Bertout JA, Ericson NG, Taylor SD, Mukherjee R,

Robins HS, Drescher CW, Bielas JH. 2016. Targeted single mole-

cule mutation detection with massively parallel sequencing.

Nucleic Acids Res 44: e22.

Haapaniemi E, Botla S, Persson J, Schmierer B, Taipale J. 2018.

CRISPR–Cas9 genome editing induces a p53-mediated DNA dam-

age response. Nat Med 24:927–930.

Handford M. 2007. Where’s Waldo? First U.S. Paperback Edition, Vol.

Harris KL, Myers MB, McKim KL, Elespuru RK, Parsons BL. 2020.

Rationale and roadmap for developing panels of hotspot cancer

driver gene mutations as biomarkers of cancer risk. Environ Mol

Mutagen 61:152–175.

Heflich R, Johnson G, Zeller A, Marchetti F, Douglas G, Witt K,

Gollapudi BB, White P. 2020. Mutation as a toxicological endpoint

for regulatory decision-making. Environ Mol Mutagen 61:34–41.

Hiatt JB, Patwardhan RP, Turner EH, Lee C, Shendure J. 2010. Parallel,

tag-directed assembly of locally derived short sequence reads. Nat

Methods 7:119–122.

Hiatt JB, Pritchard CC, Salipante SJ, O’Roak BJ, Shendure J. 2013. Single

molecule molecular inversion probes for targeted, high-accuracy

detection of low-frequency variation. Genome Res 23:843–854.

Hoang ML, Kinde I, Tomasetti C, McMahon KW, Rosenquist TA,

Grollman AP, Kinzler KW, Vogelstein B, Papadopoulos N. 2016.

Genome-wide quantiﬁcation of rare somatic mutations in normal

human tissues using massively parallel sequencing. Proc Natl Acad

Sci USA 113:9846–9851.

Hodgkinson A, Eyre-Walker A. 2011. Variation in the mutation rate

across mammalian genomes. Nat Rev Genet 12:756–766.

Hu J, Lieb JD, Sancar A, Adar S. 2016. Cisplatin DNA damage and repair

maps of the human genome at single-nucleotide resolution. Proc

Natl Acad Sci USA 113:11507–11512.

Hu J, Adebali O, Adar S, Sancar A. 2017. Dynamic maps of UV damage

formation and repair for the human genome. Proc Natl Acad Sci

USA 114:6758–6763.

Huang MN, Yu W, Teoh WW, Ardin M, Jusakul A, Ng AWT, Boot A,

Abedi-Ardekani B, Villar S, Myint SS, et al. 2017. Genome-scale

mutational signatures of aﬂatoxin in cells, mice, and human

tumors. Genome Res 27:1475–1486.

Ihry RJ, Worringer KA, Salick MR, Frias E, Ho D, Theriault K,

Kommineni S, Chen J, Sondey M, Ye C, et al. 2018. p53 inhibits

CRISPR–Cas9 engineering in human pluripotent stem cells. Nat

Med 24:939–946.

Jain M, Fiddes IT, Miga KH, Olsen HE, Paten B, Akeson M. 2015.

Improved data analysis for the MinION nanopore sequencer. Nat

Methods 12:351–356.

Jain M, Tyson JR, Loose M, Ip CLC, Eccles DA, O’Grady J, Malla S,

Leggett RM, Wallerman O, Jansen HJ, et al. 2017. MinION analy-

sis and reference consortium: Phase 2 data release and analysis of

R9.0 chemistry. F1000Res 6:760.

Kennedy SR, Salk JJ, Schmitt MW, Loeb LA. 2013. Ultra-sensitive

sequencing reveals an age-related increase in somatic mitochon-

drial mutations that are inconsistent with oxidative damage. PLoS

Genet 9: e1003794.

Kennedy SR, Schmitt MW, Fox EJ, Kohrn BF, Salk JJ, Ahn EH,

Prindle MJ, Kuong KJ, Shen J-C, Risques R-A, et al. 2014.

Detecting ultralow-frequency mutations by duplex sequencing. Nat

Protoc 9:2586–2606.

Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. 2011. Detec-

tion and quantiﬁcation of rare mutations with massively parallel

sequencing. Proc Natl Acad Sci USA 108:9530–9535.

Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V,

Peshkin L, Weitz DA, Kirschner MW. 2015. Droplet barcoding for

single-cell transcriptomics applied to embryonic stem cells. Cell

161:1187–1201.

Kohler SW, Provost GS, Fieck A, Kretz PL, Bullock WO, Sorge JA,

Putman DL, Short JM. 1991. Spectra of spontaneous and mutagen-

induced mutations in the lacI gene in transgenic mice. Proc Natl

Acad Sci USA 88:7958–7962.

Kosugi S, Momozawa Y, Liu X, Terao C, Kubo M, Kamatani Y. 2019.

Comprehensive evaluation of structural variation detection algo-

rithms for whole genome sequencing. Genome Biol 20:117.

Kucab JE, Zou X, Morganella S, Joel M, Nanda AS, Nagy E, Gomez C,

Degasperi A, Harris R, Jackson SP, et al. 2019. A compendium of

mutational signatures of environmental agents. Cell 177:

821–836.e16.

Kumar V, Rosenbaum J, Wang Z, Forcier T, Ronemus M, Wigler M,

Levy D. 2018. Partial bisulﬁte conversion for unique template

sequencing. Nucleic Acids Res 46: e10.

Laszlo AH, Derrington IM, Brinkerhoff H, Langford KW, Nova IC,

Samson JM, Bartlett JJ, Pavlenok M, Gundlach JH. 2013. Detection

and mapping of 5-methylcytosine and 5-hydroxymethylcytosine with

nanopore MspA. Proc Natl Acad Sci USA 110:18904–18909.

Lehman IR, Bessman MJ, Simms ES, Kornberg A. 1958. Enzymatic syn-

thesis of deoxyribonucleic acid I. Preparation of substrates and par-

tial purification of an enzyme from Escherichia coli. J Biol Chem

233:163–170.

Lessard S, Francioli L, Alfoldi J, Tardif J-C, Ellinor PT, MacArthur DG,

Lettre G, Orkin SH, Canver MC. 2017. Human genetic variation

alters CRISPR-Cas9 on- and off-targeting speciﬁcity at therapeuti-

cally implicated loci. Proc Natl Acad Sci USA 114:

E11257–E11266.

Li C, Chng KR, Boey EJH, Ng AHQ, Wilm A, Nagarajan N. 2016. INC-

Seq: Accurate single molecule reads using nanopore sequencing.

GigaSci 5:34.

Li H-H, Chen R, Hyduke DR, Williams A, Frötschl R, Ellinger-

Ziegelbauer H, O’Lone R, Yauk CL, Aubrecht J, Fornace AJ.

2017a. Development and validation of a high-throughput trans-

criptomic biomarker to address 21st century genetic toxicology

needs. Proc Natl Acad Sci USA 114:E10881–E10889.

Li W, Hu J, Adebali O, Adar S, Yang Y, Chiou Y-Y, Sancar A. 2017b.

Human genome-wide repair map of DNA damage caused by the

cigarette smoke carcinogen benzo[a]pyrene. Proc Natl Acad Sci

USA 114:6752–6757.

Lindahl T. 1993. Instability and decay in the primary structure of DNA.

Nature 362:709–715.

Lou DI, Hussmann JA, McBee RM, Acevedo A, Andino R, Press WH,

Sawyer SL. 2013. High-throughput DNA sequencing errors are

reduced by orders of magnitude using circle sequencing. Proc Natl

Acad Sci USA 110:19872–19877.

Lynch M. 2010. Rate, molecular spectrum, and consequences of human

mutation. Proc Natl Acad Sci USA 107:961–968.

Mao P, Smerdon MJ, Roberts SA, Wyrick JJ. 2016. Chromosomal land-

scape of UV damage formation and repair at single-nucleotide res-

olution. Proc Natl Acad Sci USA 113:9057–9062.

Mao P, Brown AJ, Malc EP, Mieczkowski PA, Smerdon MJ, Roberts SA,

Wyrick JJ. 2017. Genome-wide maps of alkylation damage, repair,

and mutagenesis in yeast reveal mechanisms of mutational hetero-

geneity. Genome Res 27:1674–1684.

Marchetti F, Douglas GR, Yauk CL. 2019. A return to the origin of the

EMGS: Rejuvenating the quest for human germ cell mutagens and

determining the risk to future generations. Environ Mol Mutagen

This issue.

Martincorena I, Roshan A, Gerstung M, Ellis P, Van Loo P, McLaren S,

Wedge DC, Fullam A, Alexandrov LB, Tubio JM, et al. 2015.

High burden and pervasive positive selection of somatic mutations

in normal human skin. Science 348:880–886.

Martincorena I, Raine KM, Gerstung M, Dawson KJ, Haase K, Van

Loo P, Davies H, Stratton MR, Campbell PJ. 2017. Universal pat-

terns of selection in cancer and somatic tissues. Cell 171:

1029–1041.e21.

Mattox AK, Wang Y, Springer S, Cohen JD, Yegnasubramanian S,

Nelson WG, Kinzler KW, Vogelstein B, Papadopoulos N. 2017.

Bisulﬁte-converted duplexes for the strand-speciﬁc detection and

quantiﬁcation of rare mutations. Proc Natl Acad Sci USA 114:

4733–4738.

McCann J, Choi E, Yamasaki E, Ames BN. 1975. Detection of carcino-

gens as mutagens in the Salmonella/microsome test: Assay of

300 chemicals. Proc Natl Acad Sci USA 72:5135–5139.

McInerney P, Adams P, Hadi MZ. 2014. Error rate comparison during

polymerase chain reaction by DNA polymerase. Mol Biol Int

2014:1–8.

Metzker ML. 2010. Sequencing technologies—The next generation. Nat

Rev Genet 11:31–46.

Milholland B, Dong X, Zhang L, Hao X, Suh Y, Vijg J. 2017. Differences

between germline and somatic mutation rates in humans and mice.

Nat Commun 8: 15183.

Muller HJ. 1927. Artiﬁcial transmutation of the gene. Science 66:84–87.

Myhr BC. 1991. Validation studies with Muta™mouse: A transgenic

mouse model for detecting mutations in vivo. Environ Mol Muta-

gen 18:308–315.

Navin N, Kendall J, Troge J, Andrews P, Rodgers L, McIndoo J, Cook K,

Stepansky A, Levy D, Esposito D, et al. 2011. Tumour evolution

inferred by single-cell sequencing. Nature 472:90–94.

Naxerova K, Reiter JG, Brachtel E, Lennerz JK, van de Wetering M,

Rowan A, Cai T, Clevers H, Swanton C, Nowak MA, et al. 2017.

Origins of lymphatic and distant metastases in human colorectal

cancer. Science 357:55–60.

Ng AWT, Poon SL, Huang MN, Lim JQ, Boot A, Yu W, Suzuki Y,

Thangaraju S, Ng CCY, Tan P, et al. 2017. Aristolochic acids and

their derivatives are widely implicated in liver cancers in Taiwan

and throughout Asia. Sci Transl Med 9: eaan6446.

Ohshima H, Tatemichi M, Sawa T. 2003. Chemical basis of inﬂammation-

induced carcinogenesis. Arch Biochem Biophys 417:3–11.

Orebaugh CD, Lujan SA, Burkholder AB, Clausen AR, Kunkel TA. 2018.

Mapping ribonucleotides incorporated into DNA by hydrolytic

end-sequencing. Methods Mol Biol 1672:329–345.

Parsons BL, Heﬂich RH. 1997. Genotypic selection methods for the direct

analysis of point mutations. Mutat Res 387:97–121.

Perera RT, Fleming AM, Johnson RP, Burrows CJ, White HS. 2015.

Detection of benzo[a]pyrene-guanine adducts in single-stranded

DNA using the α-hemolysin nanopore. Nanotechnol 26: 074002.

Perera D, Poulos RC, Shah A, Beck D, Pimanda JE, Wong JWH. 2016.

Differential DNA repair underlies mutation hotspots at active pro-

moters in cancer genomes. Nature 532:259–263.

Phillips DH. 2018. Mutational spectra and mutational signatures: Insights

into cancer aetiology and mechanisms of DNA damage and repair.

DNA Repair 71:6–11.

Quail M, Smith ME, Coupland P, Otto TD, Harris SR, Connor TR,

Bertoni A, Swerdlow HP, Gu Y. 2012. A tale of three next genera-

tion sequencing platforms: Comparison of Ion Torrent, Paciﬁc Bio-

sciences and Illumina MiSeq sequencers. BMC Genomics 13:341.

Rinke C, Lee J, Nath N, Goudeau D, Thompson B, Poulton N,

Dmitrieff E, Malmstrom R, Stepanauskas R, Woyke T. 2014.

Obtaining genomes from uncultivated environmental microorgan-

isms using FACS–based single-cell genomics. Nat Protoc 9:

1038–1048.

Risques RA, Kennedy SR. 2018. Aging and the rise of somatic cancer-

associated mutations in normal tissues. PLoS Genet 14: e1007108.

Rosenberg AB, Roco CM, Muscat RA, Kuchina A, Sample P, Yao Z,

Graybuck LT, Peeler DJ, Mukherjee S, Chen W, et al. 2018. Sin-

gle-cell proﬁling of the developing mouse brain and spinal cord

with split-pool barcoding. Science 360:176–182.

Ross MG, Russ C, Costello M, Hollinger A, Lennon NJ, Hegarty R,

Nusbaum C, Jaffe DB. 2013. Characterizing and measuring bias in

sequence data. Genome Biol 14:R51.

Salk JJ, Horwitz MS. 2010. Passenger mutations as a marker of clonal cell

lineages in emerging neoplasia. Semin Cancer Biol 20:294–303.

Salk JJ, Salipante SJ, Risques RA, Crispin DA, Li L, Bronner MP,

Brentnall TA, Rabinovitch PS, Horwitz MS, Loeb LA. 2009.

Clonal expansions in ulcerative colitis identify patients with neo-

plasia. Proc Natl Acad Sci USA 106:20871–20876.

Salk JJ, Schmitt MW, Loeb LA. 2018. Enhancing the accuracy of next-

generation sequencing for detecting rare and subclonal mutations.

Nat Rev Genet 19:269–285.

Sancar A, Lindsey-Boltz LA, Ünsal-Kaçmaz K, Linn S. 2004. Molecular

mechanisms of mammalian DNA repair and the DNA damage

checkpoints. Annu Rev Biochem 73:39–85.

Sattaur O. 1985. Mutation spectra from a drop of blood. New Scientist 31:20.

Schmitt MW, Kennedy SR, Salk JJ, Fox EJ, Hiatt JB, Loeb LA. 2012.

Detection of ultra-rare mutations by next-generation sequencing.

Proc Natl Acad Sci USA 109:14508–14513.

Schreiber J, Wescoe ZL, Abu-Shumays R, Vivian JT, Baatar B,

Karplus K, Akeson M. 2013. Error rates for nanopore discrimina-

tion among cytosine, methylcytosine, and hydroxymethylcytosine

along individual DNA strands. Proc Natl Acad Sci USA 110:

18910–18915.

Shibutani S, Takeshita M, Grollman AP. 1991. Insertion of speciﬁc bases

during DNA synthesis past the oxidation-damaged base 8-oxodG.

Nature 349:431–434.

Sinha S, Cheng K, Leiserson MD, Wilson DM, Ryan BM, Lee JS,

Ruppin E. 2018. A systematic genome-wide mapping of the onco-

genic risks associated with CRISPR-Cas9 editing. bioRxiv 407767.

https://doi.org/10.1101/407767.

Soto AM, Sonnenschein C. 2010. Environmental causes of cancer:

Endocrine disruptors as carcinogens. Nat Rev Endocrinol 6:

363–370.

Stadler LJ, Sprague GF. 1936. Genetic effects of ultra-violet radiation in

maize. I. Unﬁltered radiation. Proc Natl Acad Sci USA 22:572–578.

The Cancer Genome Atlas. 2015. Genomic classiﬁcation of cutaneous

melanoma. Cell 161:1681–1696.

Thompson LH, Fong S, Brookman K. 1980. Validation of conditions for

efficient detection of HPRT and APRT mutations in suspension-

cultured Chinese hamster ovary cells. Mutat Res 74:21–36.

Thun MJ, Sinks T. 2004. Understanding cancer clusters. CA Cancer J Clin

54:273–280.

Travers KJ, Chin C-S, Rank DR, Eid JS, Turner SW. 2010. A ﬂexible and

efﬁcient template format for circular consensus sequencing and

SNP detection. Nucleic Acids Res 38: e159.

Tsai SQ, Zheng Z, Nguyen NT, Liebers M, Topkar VV, Thapar V,

Wyvekens N, Khayter C, Iafrate AJ, Le LP, et al. 2015. GUIDE-

seq enables genome-wide proﬁling of off-target cleavage by

CRISPR-Cas nucleases. Nat Biotechnol 33:187–197.

Tsai SQ, Nguyen NT, Malagon-Lopez J, Topkar VV, Aryee MJ,

Joung JK. 2017. CIRCLE-seq: A highly sensitive in vitro screen

for genome-wide CRISPR–Cas9 nuclease off-targets. Nat Methods

14:607–614.

Tyler AD, Mataseje L, Urfano CJ, Schmidt L, Antonation KS,

Mulvey MR, Corbett CR. 2018. Evaluation of Oxford Nanopore’s

MinION sequencing device for microbial whole genome sequenc-

ing applications. Sci Rep 8: 10931.

Vitak SA, Torkenczy KA, Rosenkrantz JL, Fields AJ, Christiansen L,

Wong MH, Carbone L, Steemers FJ, Adey A. 2017. Sequencing

thousands of single-cell genomes with combinatorial indexing. Nat

Methods 14:302–308.

Volden R, Palmer T, Byrne A, Cole C, Schmitz RJ, Green RE, Vollmers C.

2018. Improving nanopore read accuracy with the R2C2 method

enables the sequencing of highly multiplexed full-length single-cell

cDNA. Proc Natl Acad Sci USA 115:9726–9731.

Wang J, Fan HC, Behr B, Quake SR. 2012. Genome-wide single-cell anal-

ysis of recombination activity and de novo mutation rates in human

sperm. Cell 150:402–412.

Wang Q, Jia P, Li F, Chen H, Ji H, Hucks D, Dahlman K, Pao W,

Zhao Z. 2013. Detecting somatic point mutations in cancer genome

sequencing data: A comparison of mutation callers. Genome Med

5:91.

Watson JD, Crick FHC. 1953. Molecular structure of nucleic acids: A

structure for deoxyribose nucleic acid. Nature 171:737–738.

Wei Z, Wang W, Hu P, Lyon GJ, Hakonarson H. 2011. SNVer: A statis-

tical tool for variant calling in analysis of pooled or individual

next-generation sequencing data. Nucleic Acids Res 39: e132.

Wild CP. 2008. Environmental exposure measurement in cancer epidemi-

ology. Mutagenesis 24:117–125.

Wilm A, Aw PPK, Bertrand D, Yeo GHT, Ong SH, Wong CH, Khor CC,

Petric R, Hibberd ML, Nagarajan N. 2012. LoFreq: A sequence-

quality aware, ultra-sensitive variant caller for uncovering cell-

population heterogeneity from high-throughput sequencing

datasets. Nucleic Acids Res 40:11189–11201.

Wolna AH, Fleming AM, An N, He L, White HS, Burrows CJ. 2013.

Electrical current signatures of DNA base modiﬁcations in single

molecules immobilized in the α-hemolysin ion channel. Isr J Chem

53:417–430.

Wood RD. 1996. DNA repair in eukaryotes. Ann Rev Biochem 65:

135–167.

Wu J, McKeague M, Sturla SJ. 2018. Nucleotide-resolution genome-wide

mapping of oxidative DNA damage by click-code-Seq. J Am

Chem Soc 140:9783–9787.

Yuan B, Wang J, Cao H, Sun R, Wang Y. 2011. High-throughput analysis

of the mutagenic and cytotoxic properties of DNA lesions by next-

generation sequencing. Nucleic Acids Res 39:5945–5954.

Zhang X, Price NE, Fang X, Yang Z, Gu L-Q, Gates KS. 2015. Character-

ization of interstrand DNA–DNA cross-links using the

α-hemolysin protein nanopore. ACS Nano 9:11812–11819.

Zong C, Lu S, Chapman AR, Xie XS. 2012. Genome-wide detection of

single-nucleotide and copy-number variations of a single human

cell. Science 338:1622–1626.

Accepted by C. Yauk

Environmental and Molecular Mutagenesis. DOI 10.1002/em

151Next- Generation Genotoxicology

Effects of urban-induced mutations on ecology, evolution and health

Article

Apr 2024
Nat. Ecol. Evol.

Evaluating the mutagenicity of N-nitrosodimethylamine in 2D and 3D HepaRG cell cultures using error-corrected next generation sequencing

Article

Full-text available

Apr 2024
ARCH TOXICOL

Human liver-derived metabolically competent HepaRG cells have been successfully employed in both two-dimensional (2D) and 3D spheroid formats for performing the comet assay and micronucleus (MN) assay. In the present study, we have investigated expanding the genotoxicity endpoints evaluated in HepaRG cells by detecting mutagenesis using two error-corrected next generation sequencing (ecNGS) technologies, Duplex Sequencing (DS) and High-Fidelity (HiFi) Sequencing. Both HepaRG 2D cells and 3D spheroids were exposed for 72 h to N-nitrosodimethylamine (NDMA), followed by an additional incubation for the fixation of induced mutations. NDMA-induced DNA damage, chromosomal damage, and mutagenesis were determined using the comet assay, MN assay, and ecNGS, respectively. The 72-h treatment with NDMA resulted in concentration-dependent increases in cytotoxicity, DNA damage, MN formation, and mutation frequency in both 2D and 3D cultures, with greater responses observed in the 3D spheroids compared to 2D cells. The mutational spectrum analysis showed that NDMA induced predominantly A:T → G:C transitions, along with a lower frequency of G:C → A:T transitions, and exhibited a different trinucleotide signature relative to the negative control. These results demonstrate that the HepaRG 2D cells and 3D spheroid models can be used for mutagenesis assessment using both DS and HiFi Sequencing, with the caveat that severe cytotoxic concentrations should be avoided when conducting DS. With further validation, the HepaRG 2D/3D system may become a powerful human-based metabolically competent platform for genotoxicity testing.

Utilization of Quizlet as an Interactive Quiz to Improve Student Learning Achievement

Article

Full-text available

Dec 2023

Refka Rahim

Along with the many advances in technology in the world of education, especially in website-based learning. Student learning achievement can be improved by using the Quizlet application. Quizlet is a platform used to make learning evaluations using the internet network. The purpose of this research is to determine the benefits of the Quizlet application as an interactive quiz to improve student learning achievement. This research uses quantitative methods using surveys and in-depth interviews, the survey was conducted online. The results of this research explain that the Quizlet platform can be used to create online-based interactive quizzes and can improve student achievement. The conclusion of this research explains that in using the applicationQuizlet really helps educators and students in the teaching and learning process, especially in implementing quizzes in learning. The limitation of this research is that the researcher only uses the Quizlet platform as an interactive quiz and the researcher hopes that future researchers can carry out the same research but with a more interesting Quiz application or can be used offline.

Environmental Toxicology and Human Health

Article

Full-text available

Dec 2023
INT J MOL SCI

Humans and animals may be exposed on a continuous daily basis to a mixture of environmental contaminants that may act on several organ systems through differing mechanisms [...]

Interpretation of In Vitro Concentration‐Response Data for Risk Assessment and Regulatory Decision‐making: Report from the 2022 IWGT Quantitative Analysis Expert Working Group Meeting

Article

Dec 2023

Quantitative risk assessments of chemicals are routinely performed using in vivo data from rodents; however, there is growing recognition that non‐animal approaches can be human‐relevant alternatives. There is an urgent need to build confidence in non‐animal alternatives given the international support to reduce the use of animals in toxicity testing where possible. In order for scientists and risk assessors to prepare for this paradigm shift in toxicity assessment, standardization and consensus on in vitro testing strategies and data interpretation will need to be established. To address this issue, an Expert Working Group (EWG) of the 8 th International Workshop on Genotoxicity Testing (IWGT) evaluated the utility of quantitative in vitro genotoxicity concentration‐response data for risk assessment. The EWG first evaluated available in vitro methodologies and then examined the variability and maximal response of in vitro tests to estimate biologically relevant values for the critical effect sizes considered adverse or unacceptable. Next, the EWG reviewed the approaches and computational models employed to provide human‐relevant dose context to in vitro data. Lastly, the EWG evaluated risk assessment applications for which in vitro data are ready for use and applications where further work is required. The EWG concluded that in vitro genotoxicity concentration‐response data can be interpreted in a risk assessment context. However, prior to routine use in regulatory settings, further research will be required to address the remaining uncertainties and limitations. This article is protected by copyright. All rights reserved.

Severity of effect considerations regarding the use of mutation as a toxicological endpoint for risk assessment: A report from the 8th International Workshop on Genotoxicity Testing (IWGT)

Article

Jun 2024
ENVIRON MOL MUTAGEN

Exposure levels without appreciable human health risk may be determined by dividing a point of departure on a dose–response curve (e.g., benchmark dose) by a composite adjustment factor (AF). An “effect severity” AF (ESAF) is employed in some regulatory contexts. An ESAF of 10 may be incorporated in the derivation of a health‐based guidance value (HBGV) when a “severe” toxicological endpoint, such as teratogenicity, irreversible reproductive effects, neurotoxicity, or cancer, was observed in the reference study. Although mutation data have been used historically for hazard identification, this endpoint is suitable for quantitative dose–response modeling and risk assessment. As part of the 8th International Workshops on Genotoxicity Testing, a sub‐group of the Quantitative Analysis Work Group (WG) explored how the concept of effect severity could be applied to mutation. To approach this question, the WG reviewed the prevailing regulatory guidance on how an ESAF is incorporated into risk assessments, evaluated current knowledge of associations between germline or somatic mutation and severe disease risk, and mined available data on the fraction of human germline mutations expected to cause severe disease. Based on this review and given that mutations are irreversible and some cause severe human disease, in regulatory settings where an ESAF is used, a majority of the WG recommends applying an ESAF value between 2 and 10 when deriving a HBGV from mutation data. This recommendation may need to be revisited in the future if direct measurement of disease‐causing mutations by error‐corrected next generation sequencing clarifies selection of ESAF values.
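
The arithmetic described in this abstract (a point of departure divided by a composite adjustment factor) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the interspecies and intraspecies factors, and every numeric input are hypothetical and do not come from the report; only the ESAF range of 2 to 10 is taken from the abstract.

```python
def health_based_guidance_value(point_of_departure, adjustment_factors):
    """Divide a point of departure by the product of the adjustment factors.

    `adjustment_factors` maps a label to a positive factor; their product
    plays the role of the composite adjustment factor.
    """
    composite = 1.0
    for factor in adjustment_factors.values():
        if factor <= 0:
            raise ValueError("adjustment factors must be positive")
        composite *= factor
    return point_of_departure / composite


# Hypothetical benchmark dose in mg/kg/day (illustrative value only).
bmd = 10.0

# Hypothetical factors; only the ESAF bounds (2 and 10) come from the abstract.
factors_low_esaf = {"interspecies": 10.0, "intraspecies": 10.0, "effect_severity": 2.0}
factors_high_esaf = dict(factors_low_esaf, effect_severity=10.0)

hbgv_low_esaf = health_based_guidance_value(bmd, factors_low_esaf)
hbgv_high_esaf = health_based_guidance_value(bmd, factors_high_esaf)

print(f"HBGV with ESAF = 2:  {hbgv_low_esaf:.3f} mg/kg/day")
print(f"HBGV with ESAF = 10: {hbgv_high_esaf:.3f} mg/kg/day")
```

With all other inputs held fixed, moving the ESAF from 2 to 10 lowers the resulting guidance value fivefold, which is why the choice within that range matters for the derived exposure level.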

Frequencies and spectra of aflatoxin B1-induced mutations in liver genomes of NEIL1-deficient mice as revealed by duplex sequencing

Article

May 2024

Increased risk for the development of hepatocellular carcinoma (HCC) is driven by a number of etiological factors including hepatitis viral infection and dietary exposures to foods contaminated with aflatoxin-producing molds. Intracellular metabolic activation of aflatoxin B1 (AFB1) to a reactive epoxide generates highly mutagenic AFB1-Fapy-dG adducts. Previously, we demonstrated that repair of AFB1-Fapy-dG adducts can be initiated by the DNA glycosylase NEIL1 and that male Neil1−/− mice were significantly more susceptible to AFB1-induced HCC relative to wild-type mice. To investigate the mechanisms underlying this enhanced carcinogenesis, WT and Neil1−/− mice were challenged with a single, 4 mg/kg dose of AFB1 and frequencies and spectra of mutations were analyzed in liver DNAs 2.5 months post-injection using duplex sequencing. The analyses of DNAs from AFB1-challenged mice revealed highly elevated mutation frequencies in the nuclear genomes of both males and females, but not the mitochondrial genomes. In both WT and Neil1−/− mice, mutation spectra were highly similar to the AFB1-specific COSMIC signature SBS24. Relative to wild-type, the NEIL1 deficiency increased AFB1-induced mutagenesis with concomitant elevated HCCs in male Neil1−/− mice. Our data establish a critical role of NEIL1 in limiting AFB1-induced mutagenesis and ultimately carcinogenesis.

Visualization strategies to aid interpretation of high-dimensional genotoxicity data

Article

May 2024
ENVIRON MOL MUTAGEN

This article describes a range of high‐dimensional data visualization strategies that we have explored for their ability to complement machine learning algorithm predictions derived from MultiFlow® assay results. For this exercise, we focused on seven biomarker responses resulting from the exposure of TK6 cells to each of 126 diverse chemicals over a range of concentrations. Obviously, challenges associated with visualizing seven biomarker responses were further complicated whenever there was a desire to represent the entire 126 chemical data set as opposed to results from a single chemical. Scatter plots, spider plots, parallel coordinate plots, hierarchical clustering, principal component analysis, toxicological prioritization index, multidimensional scaling, t‐distributed stochastic neighbor embedding, and uniform manifold approximation and projection are each considered in turn. Our report provides a comparative analysis of these techniques. In an era where multiplexed assays and machine learning algorithms are becoming the norm, stakeholders should find some of these visualization strategies useful for efficiently and effectively interpreting their high‐dimensional data.

Liver-on-chip Model and Application in Predictive Genotoxicity and Mutagenicity of Drugs

Article

Apr 2024

Application of duplex sequencing to evaluate mutagenicity of aristolochic acid and methapyrilene in Fisher 344 rats

Article

Feb 2024
FOOD CHEM TOXICOL

Enzymatic Synthesis of Deoxyribonucleic Acid

Article

Jul 1958

The repertoire of mutational signatures in human cancer

Article

Feb 2020
NATURE

Somatic mutations in cancer genomes are caused by multiple mutational processes, each of which generates a characteristic mutational signature. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we characterized mutational signatures using 84,729,690 somatic mutations from 4,645 whole-genome and 19,184 exome sequences that encompass most types of cancer. We identified 49 single-base-substitution, 11 doublet-base-substitution, 4 clustered-base-substitution and 17 small insertion-and-deletion signatures. The substantial size of our dataset, compared with previous analyses, enabled the discovery of new signatures, the separation of overlapping signatures and the decomposition of signatures into components that may represent associated, but distinct, DNA damage, repair and/or replication mechanisms. By estimating the contribution of each signature to the mutational catalogues of individual cancer genomes, we revealed associations of signatures to exogenous or endogenous exposures, as well as to defective DNA-maintenance processes. However, many signatures are of unknown cause. This analysis provides a systematic perspective on the repertoire of mutational processes that contribute to the development of human cancer. The characterization of 4,645 whole-genome and 19,184 exome sequences, covering most types of cancer, identifies 81 single-base substitution, doublet-base substitution and small-insertion-and-deletion mutational signatures, providing a systematic overview of the mutational processes that contribute to cancer development.

In utero exposure to genotoxins leading to genetic mosaicism: an overlooked window of susceptibility in genetic toxicology testing?

Article

Nov 2019

In utero development represents a sensitive window for the induction of mutations. These mutations may subsequently expand clonally to populate entire organs or anatomical structures. Although not all adverse mutations will affect tissue structure or function, there is growing evidence that clonally expanded genetic mosaics contribute to various monogenic and complex diseases, including cancer. We posit that genetic mosaicism is an underestimated potential health problem that is not fully addressed in the current regulatory genotoxicity testing paradigm. Genotoxicity testing focuses exclusively on adult exposures and may thus not capture the complexity of genetic mosaicisms that contribute to human disease. Numerous studies have shown that conversion of genetic damage into mutations during early developmental exposures can result in much higher mutation burdens than equivalent exposures in adults in certain tissues. Therefore, we assert that analysis of genetic effects caused by in utero exposures should be considered in the current regulatory testing paradigm, which is possible by harmonization with current reproductive/developmental toxicology testing strategies. This is particularly important given the recent proposed paradigm change from simple hazard identification to quantitative mutagenicity assessment. Recent developments in sequencing technologies offer practical tools to detect mutations in any tissue or species. In addition to mutation frequency and spectrum, these technologies offer the opportunity to characterize the extent of genetic mosaicism following exposure to mutagens. Such integration of new methods with existing toxicology guideline studies offers the genetic toxicology community a way to modernize their testing paradigm and to improve risk assessment for vulnerable populations.

Commentary: Mutation as a toxicological endpoint for regulatory decision‐making

Article

Oct 2019

Mutations induced in somatic cells and germ cells are responsible for a variety of human diseases, and mutation per se has been considered an adverse health concern since the early part of the 20th Century. Although in vitro and in vivo somatic cell mutation data are most commonly used by regulatory agencies for hazard identification, that is, determining whether or not a substance is a potential mutagen and carcinogen, quantitative mutagenicity dose–response data are being used increasingly for risk assessments. Efforts are currently underway to both improve the measurement of mutations and to refine the computational methods used for evaluating mutation data. We recommend continuing the development of these approaches with the objective of establishing consensus regarding the value of including the quantitative analysis of mutation per se as a required endpoint for comprehensive assessments of toxicological risk.

A Return to the Origin of the EMGS: Rejuvenating the Quest for Human Germ Cell Mutagens and Determining the Risk to Future Generations

Article

Aug 2019

Fifty years ago, the Environmental Mutagen Society (now Environmental Mutagenesis and Genomics Society) was founded with a laser‐focus on germ cell mutagenesis and the protection of “our most vital assets” – the sperm and egg genomes. Yet, five decades on, despite the fact that many agents have been demonstrated to induce inherited changes in the offspring of exposed laboratory rodents, there is no consensus on whether human germ cell mutagens exist. We argue that it is time to reevaluate the available data and conclude that we already have evidence for the existence of environmental exposures that impact human germ cells. What is missing are definite data to demonstrate a significant increase in de novo mutations in the offspring of exposed parents. We believe that with over two decades of research advancing knowledge and technologies in genomics, we are at the cusp of generating data to conclusively show that environmental exposures cause heritable de novo changes in the human offspring. We call on the research community to harness our technologies, synergize our efforts, and return to our Founders' original focus. The next 50 years must involve collaborative work between clinicians, epidemiologists, genetic toxicologists, genomics experts and bioinformaticians to precisely define how environmental exposures impact germ cell genomes. It is time for the research and regulatory communities to prepare to interpret the coming outpouring of data and develop a framework for managing, communicating and mitigating the risk of exposure to human germ cell mutagens.

Rationale and Roadmap for Developing Panels of Hotspot Cancer Driver Gene Mutations as Biomarkers of Cancer Risk

Article

Aug 2019

Cancer driver mutations (CDMs) are necessary and causal for carcinogenesis and have advantages as reporters of carcinogenic risk. However, little progress has been made toward developing measurements of CDMs as biomarkers for use in cancer risk assessment. Impediments for using a CDM‐based metric to inform cancer risk include the complexity and stochastic nature of carcinogenesis, technical difficulty in quantifying low‐frequency CDMs and lack of established relationships between cancer driver mutant fractions and tumor incidence. Through literature review and database analyses, this review identifies the most promising targets to investigate as biomarkers of cancer risk. Mutational hotspots were discerned within the 20 most mutated genes across the ten deadliest cancers. Forty genes were identified that encompass 108 mutational hotspot codons overrepresented in the COSMIC database; 424 different mutations within these hotspot codons account for approximately 63,000 tumors and their prevalence across tumor types is described. The review summarizes literature on the prevalence of CDMs in normal tissues and suggests such mutations are direct and indirect substrates for chemical carcinogenesis, which occurs in a spatially‐stochastic manner. Evidence that hotspot CDMs (hCDMs) frequently occur as tumor subpopulations is presented, indicating COSMIC data may underestimate mutation prevalence. Analyses of online databases show that genes containing hCDMs are enriched in functions related to intercellular communication. In its totality, the review provides a roadmap for the development of tissue‐specific, CDM‐based biomarkers of carcinogenic potential, which are composed of batteries of hCDMs and can be measured by error‐corrected next‐generation sequencing.

Comprehensive evaluation of structural variation detection algorithms for whole genome sequencing

Article

Jun 2019
GENOME BIOL

Background: Structural variations (SVs) or copy number variations (CNVs) greatly impact the functions of the genes encoded in the genome and are responsible for diverse human diseases. Although a number of existing SV detection algorithms can detect many types of SVs using whole genome sequencing (WGS) data, no single algorithm can call every type of SVs with high precision and high recall. Results: We comprehensively evaluate the performance of 69 existing SV detection algorithms using multiple simulated and real WGS datasets. The results highlight a subset of algorithms that accurately call SVs depending on specific types and size ranges of the SVs and that accurately determine breakpoints, sizes, and genotypes of the SVs. We enumerate potential good algorithms for each SV category, among which GRIDSS, Lumpy, SVseq2, SoftSV, Manta, and Wham are better algorithms in deletion or duplication categories. To improve the accuracy of SV calling, we systematically evaluate the accuracy of overlapping calls between possible combinations of algorithms for every type and size range of SVs. The results demonstrate that both the precision and recall for overlapping calls vary depending on the combinations of specific algorithms rather than the combinations of methods used in the algorithms. Conclusion: These results suggest that careful selection of the algorithms for each type and size range of SVs is required for accurate calling of SVs. The selection of specific pairs of algorithms for overlapping calls promises to effectively improve the SV detection accuracy.

A Compendium of Mutational Signatures of Environmental Agents

Article

Apr 2019
CELL

Whole-genome-sequencing (WGS) of human tumors has revealed distinct mutation patterns that hint at the causative origins of cancer. We examined mutational signatures in 324 WGS human-induced pluripotent stem cells exposed to 79 known or suspected environmental carcinogens. Forty-one yielded characteristic substitution mutational signatures. Some were similar to signatures found in human tumors. Additionally, six agents produced double-substitution signatures and eight produced indel signatures. Investigating mutation asymmetries across genome topography revealed fully functional mismatch and transcription-coupled repair pathways. DNA damage induced by environmental mutagens can be resolved by disparate repair and/or replicative pathways, resulting in an assortment of signature outcomes even for a single agent. This compendium of experimentally induced mutational signatures permits further exploration of roles of environmental agents in cancer etiology and underscores how human stem cell DNA is directly vulnerable to environmental agents. Video Abstract: The effects of a range of environmental mutagens in terms of the kinds of mutations they induce and how these are repaired by the cell is presented in the form of a resource.

The Mutagenesis Moonshot: The Propitious Beginnings of the Environmental Mutagenesis and Genomics Society

Article

Jul 2019

David DeMarini

A mutagenesis moonshot addressing the influence of the environment on our genetic wellbeing was launched just two months before astronauts landed on the moon. Its impetus included the discovery that X‐rays (Muller, 1927) and chemicals (Auerbach and Robson, 1947) were germ‐cell mutagens, the introduction of a growing number of untested chemicals into the environment after World War II, and an increasing awareness of the role of environmental pollution on human health. Due to mounting concern from influential scientists that germ‐cell mutagens might be ubiquitous in the environment, Alexander Hollaender and colleagues founded in 1969 the Environmental Mutagen Society (EMS), now the Environmental Mutagenesis and Genomics Society (EMGS); Frits Sobels founded the European EMS in 1970. As Fred de Serres noted, such societies were necessary because protecting populations from environmental mutagens could not be addressed by existing scientific societies, and new multi‐disciplinary alliances were required to spearhead this movement. The nascent EMS gathered policy makers and scientists from government, industry, and academia who became advocates for laws requiring genetic toxicity testing of pesticides and drugs and helped implement those laws. They created an electronic database of the mutagenesis literature; established a peer‐reviewed journal; promoted basic and applied research in DNA repair and mutagenesis; and established training programs that expanded the science worldwide. Despite these successes, one objective remains unfulfilled: identification of human germ‐cell mutagens. After 50 years, the voyage continues, and a vibrant EMGS is needed to bring the mission to its intended target of protecting populations from genetic hazards.
