
Next Generation Sequencing: Implications in Personalized Medicine and Pharmacogenomics

A breakthrough in next generation sequencing (NGS) in the last decade provided an unprecedented opportunity to investigate genetic variations in humans and their roles in health and disease. NGS offers regional genomic sequencing, such as whole exome sequencing of the coding regions of all genes, as well as whole genome sequencing. RNA-seq sequences the entire transcriptome, and ChIP-seq allows sequencing of the epigenetic architecture of the genome. Identifying genetic variations in individuals can be used to predict disease risk, with the potential to halt or slow disease progression. NGS can also be used to predict the response to, or adverse effects of, drugs or to calculate an appropriate drug dosage. Such personalized medicine also makes it possible to treat diseases based on the genetic makeup of the patient. Here, we review the basics of NGS technologies and their application in human diseases to foster human healthcare and personalized medicine.
1818 | Mol. BioSyst., 2016, 12, 1818–1830 | This journal is © The Royal Society of Chemistry 2016
Cite this: Mol. BioSyst., 2016, 12, 1818
Bahareh Rabbani (a), Hirofumi Nakaoka (b), Shahin Akhondzadeh (c), Mustafa Tekin (d) and Nejat Mahdieh* (a)
Introduction
Recent advances in medical genetics have significantly facilitated efforts to unravel the mechanisms of human diseases. Progress in genetic testing has the potential to make significant changes in the way medications are utilized. First, in many cases, the genotype affects the phenotypic consequences of disorders. Exploring the causal variants ultimately leads to the development of new methods of medical treatment. Second, translating pharmacogenetics research into clinical practice could provide insights into personalized medicine (PM). In other words, genetic testing could improve health care in society and help the administration of effective drugs and therapy for patients, groups of individuals and populations [1–5].
The Human Genome Project (HGP), an international scientific research effort, aimed to sequence and map human genes; using the Sanger sequencing method, the final draft of the human genome sequence was published in 2003 [6–8]. It is the source of detailed information about the structure, organization and function of human genes. In general, the Sanger approach, while having excellent accuracy and reasonable read length, has a very low throughput and requires highly expensive equipment to sequence all genes. These difficulties rule it out as a robust method for clinical applications that sequence the entire genome or many genes at once. A breakthrough in next generation sequencing (NGS) platforms throughout this decade now provides affordable and reliable high-throughput sequencing for evaluation of functional DNA variations in many diseases, including both monogenic and polygenic phenotypes such as diabetes, cancer, cardiovascular and neurological disorders, as well as in the regulation of physiological conditions such as height, blood pressure, and body mass index [9–14]. Thus, there is increasing interest in applying individual genome sequencing for predicting disease risk, life-long well-being of individuals, medical care and response to drugs in the era of PM.

(a) Cardiogenetic Research Center, Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Niayesh-Vali asr Intersection, Tehran, Iran. E-mail: nmahdieh@gmail.com; Tel: +98 21 23922294
(b) Division of Human Genetics, Department of Integrated Genetics, National Institute of Genetics, Yata 1111, Mishima, Shizuoka 411-8540, Japan
(c) Psychiatric Research Center, Roozbeh Psychiatric Hospital, Tehran University of Medical Sciences, Tehran, Iran
(d) John P Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA

Bahareh Rabbani, PhD, an Assistant Professor in the Cardiogenetic research lab, Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran.
Hirofumi Nakaoka, PhD, an Assistant Professor in the Division of Human Genetics, National Institute of Genetics.

Received 16th February 2016, Accepted 29th March 2016
DOI: 10.1039/c6mb00115g
www.rsc.org/molecularbiosystems

DNA sequencing has identified several million DNA variants in different populations, mainly single nucleotide variants (SNVs) deposited in databases [15,16]. Approximately 38 million SNVs, 1.4 million bi-allelic insertions or deletions (indels) and 14,000 large deletions have been discovered [16]. Also, the haplotype map of the human genome, HapMap, was used to establish the methodology of genome-wide association studies (GWAS). With the help of GWAS, a large number of SNVs associated with common diseases (e.g. type 1 [17] and type 2 diabetes [18], cancer [19,20], asthma [21]) and continuous traits (e.g. height [22], fat mass [23]) have been elucidated. A comprehensive set of SNVs from NGS, such as in the 1000 Genomes Project, has allowed us to fine-map disease-associated loci through imputation techniques.
One of the major known classes of variation contributing to human diseases is structural change. This is due to the different functional effects of variations depending on their size, localization, and type. One common structural change, referred to as the copy number variant (CNV), a segment of DNA that is one kilobase or larger and present at a variable copy number in comparison with a reference genome [24], has been shown to play roles in human phenotypes such as birth defects, psychiatric disorders and cancer [25–27]. CNVs affect 6–19% of each chromosome [28]. The non-redundant CNV mean coverage per chromosome is 71.48% [29] (http://dgv.tcag.ca/dgv/app/). Different individuals with the same structural variation could show different severity and/or heterogeneity of rare and common phenotypes. CNVs could also affect responses to drugs. Since the publication of the initial studies on CNVs by Iafrate et al. and Sebat et al. in 2004, these variants have been discovered rapidly, and up to March 2015, 353,126–3,024,212 CNVs have been identified (http://dgv.tcag.ca/dgv/app/) [29–31]. The causal role of de novo SNVs and CNVs has also been elucidated under clinical conditions [32]. Studies show that identification of de novo DNA variants is easier via NGS technology [33].
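The CNV definition given above (a segment of at least one kilobase whose copy number differs from the reference) can be expressed as a small check. The following is only a sketch: the `Segment` type, the `classify_segment` function and the diploid baseline of two copies are assumptions made for illustration and are not taken from any CNV-calling tool.

```python
from dataclasses import dataclass

REFERENCE_COPIES = 2        # assumed diploid autosomal baseline
MIN_CNV_LENGTH_BP = 1_000   # "one kilobase or larger", per the definition above


@dataclass(frozen=True)
class Segment:
    """Hypothetical container for one genomic segment in a sample."""
    chrom: str
    start: int    # 0-based, inclusive
    end: int      # 0-based, exclusive
    copies: int   # estimated copy number in the sample

    @property
    def length(self) -> int:
        return self.end - self.start


def classify_segment(seg: Segment) -> str:
    """Return 'gain', 'loss' or 'not a CNV' for a single segment."""
    if seg.length < MIN_CNV_LENGTH_BP or seg.copies == REFERENCE_COPIES:
        return "not a CNV"
    return "gain" if seg.copies > REFERENCE_COPIES else "loss"
```

Real CNV callers estimate the copy number from read depth or array intensity and must handle sex chromosomes and noise; none of that is modeled here.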
Genotype-specific therapy aims to develop care for disease phenotypes associated with DNA variants. Much work is needed to identify and characterize the risk variants. Clinicians and health professionals need to know the associated genotypes for better management of the disorders. These variants are important for individual responses to medications, which is a major part of PM. PM tries to treat each patient based on her/his genetic makeup [34]. In this regard, genotyping of SNPs and CNVs, together with gene expression and proteomic profiling data of the patient, is considered. In particular, genomic information has a significant role in certain aspects of PM, for instance in pharmacogenomics. In general, DNA variations in metabolic pathways affect individual responses to drugs [35]; hence, therapeutic and adverse effects differ between people. Prescribing a drug based on an individual's genetic variations provides an outstanding opportunity to improve the safety and efficacy of drug usage [36,37]. As a result, NGS has the potential to revolutionize the diagnosis, treatment, prevention and even therapy of disease in humans. Here, the basics of NGS technologies and their applications in human health care, PM, personalized genomics, personalized oncology, and pharmacogenomics are discussed.

Mustafa Tekin, MD, a Professor in the Dr John T. Macdonald Foundation Department of Human Genetics, is a board certified clinical and molecular geneticist and expert on the phenotypic and genotypic characterization of a variety of Mendelian disorders. His laboratory has discovered numerous genes that cause human diseases when disrupted by mutations.
Nejat Mahdieh, PhD, an Assistant Professor in the Cardiogenetic research lab, Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran.
Shahin Akhondzadeh, PharmD, PhD, FBPharmacolS., DSc, a Professor of Neuroscience in the Tehran University of Medical Sciences.

Next generation sequencing: basics and applications
The first DNA sequencing methods were invented and published in the 1970s: DNA sequencing with chain-terminating inhibitors by Frederick Sanger, and DNA sequencing by chemical degradation by Walter Gilbert and Allan Maxam [38,39]. The Sanger approach subsequently came into routine use in many institutes and laboratories throughout the world. It remained the gold standard for over three decades because of its excellent accuracy and reasonable read length, so that the HGP was finished using this approach; its very low throughput and high expense, however, make it unsuitable for sequencing the human genome in routine applications. During those three decades the method was refined rather than replaced. In 2005, the first NGS technology (2nd generation sequencing) was developed and published [40]. Soon after, several NGS platforms were introduced [40–43] (Table 1). In addition, third generation sequencing techniques emerged to sequence single molecules of DNA and overcome the difficulties of second generation sequencing technologies [44–48]; the sequencing technologies of Nanopore (http://www.nanoporetech.com/), Helicos BioSciences (http://www.helicosbio.com/), Pacific Biosciences (http://www.pacificbiosciences.com/) and Complete Genomics (http://www.completegenomics.com) promise higher throughput, greater accuracy, longer read lengths, faster turnaround time and lower costs. Platforms developed by Pacific Biosciences can sequence reads of tens of kilobases and Nanopore can sequence hundreds of kilobases, which is useful for genome assembly and phasing. The pros and cons of these technologies have been described in detail elsewhere [49–51]. The Illumina HiSeq has almost cornered the NGS market, but 3rd generation devices are in hot pursuit, and this might change in the next few years.
The basics of NGS technologies have recently been reviewed extensively [10,14,52,53]. In general, the DNA sample is extracted, fragmented and end-repaired, and adaptors are ligated to the ends of the fragments. The resulting fragment library is then clonally amplified, followed by sequencing of the fragments in a massively parallel mode (Fig. 1). Applications of this technology include whole genome sequencing (WGS), whole exome sequencing (WES), targeted sequencing, RNA-seq and ChIP-seq (Table 2). In WES or targeted sequencing, the regions of interest (the exome for WES) are captured by hybridization (Table 2; Fig. 1). RNA-seq can be used to determine the expression profile in normal and diseased cells and tissues. Chromatin immunoprecipitation (ChIP) was traditionally used to understand DNA–protein binding interactions [54]. Using NGS approaches, ChIP-seq is applied to analyze protein interactions with DNA in a high-throughput manner, so that NGS can determine regulatory protein binding sites. Using the Illumina platform, Mikkelsen et al. showed that chromatin changes could lead to changes in gene expression [55].

Table 1 NGS steps, developments in the biochemical steps, machine types, and the mechanism by which each step is performed for complete massively parallel sequencing. Only some machines and software are listed, since it is out of the scope of this paper to include all of them. The 3rd generation sequencing systems differ in library preparation, ultra long reads and streaming algorithms

NGS step | Development in biochemical steps | Mechanism | Ref.
Template preparation | Constructing a library of gDNA/cDNA fragments | Random fragmentation of DNA into 100–850 bp fragments (depending on the platform) |
Template preparation | Ligating adapter sequences onto the ends of the DNA fragments | Fragmented double-stranded DNA is repaired with an end-repair enzyme, and then adaptors are ligated |
Template preparation | 3rd generation library preparation | Library preparation similar to 2nd generation; an adapter with a hairpin structure is ligated to one or both ends of the dsDNA |
Sequencing and imaging | 454 GS FLX/Roche (2nd GS) | Emulsion PCR & pyrosequencing | 42 and 119
Sequencing and imaging | HiSeq2000/Illumina (2nd GS) | Immobilization on a solid surface & cyclic reversible termination method with four fluorescent colors | 41 and 120
Sequencing and imaging | SOLiD v4/Applied Biosystems (2nd GS) | Emulsion PCR & sequencing by ligation | 121 and 122
Sequencing and imaging | Ion Proton sequencer/Ion Torrent/Life Technologies (2nd GS) | Ion semiconductor sequencing; based on the detection of hydrogen ions that are released during the polymerization of DNA (a method of sequencing by synthesis) | 123 and 124
Sequencing and imaging | HeliScope single molecule sequencer/Helicos BioSciences Corporation (3rd GS) | Shearing and hybridizing DNA onto a flow cell surface for sequencing-by-synthesis; cyclic reversible termination method with one fluorescent color | 48
Sequencing and imaging | Single-molecule real-time (SMRT) sequencing/PacBio RS (3rd GS) | Ligation of bubble adapters to randomly fragmented DNA molecules and repeated sequencing of the circular DNA by synthesis in real time | 125
Sequencing and imaging | Complete Genomics' sequencing instrument/Complete Genomics (3rd GS) | Nanoball (rolling circle) amplification & combinatorial probe anchor-ligation technology | 126
Sequencing and imaging | Nanopore sequencing/Oxford Nanopore (3rd GS) | Linear DNA plus a "motor" protein passes through the nanopore and alters the ionic current | 46 and 47
Data analysis | Remove adapter sequences and low-quality reads | fastqc, PRINSEQ, FASTX-toolkit, RSeQC, RNA-SeQC |
Data analysis | Mapping of the data | To a reference genome: Eland, Stampy, BWA, MAQ. De novo assembly: overlap-layout-consensus (OLC); de Bruijn assembly, ALLPATHS, SOAPdenovo, ABySS; string graph | 127–130
Data analysis | Analysis of the sequence data; variant calling and annotation | GATK, joint SNVMix, MuTect, Samtools, Platypus, BreakDancer, Pindel, Genome STRiP, Varscan2, SegSeq, Somatic Sniper, ANNOVAR | 129–135

Fig. 1 NGS steps to identify genetic variants in human disorders. Overview of the WGS and WES workflow including three major steps: (A) sample preparation, (B) sequencing process, and (C) data analysis and identification of causal variants. MAF: minor allele frequency.

Table 2 Main NGS applications

Application | Purpose | Methodology | Ref.
WGS | Finding structural abnormalities and SNPs in both coding and non-coding regions | DNA extraction, fragmentation, sequencing, data analysis, and variant finding | 120
WES | Identifying mutations that occur in coding regions | DNA extraction, fragmentation, exome capturing, sequencing, data analysis, and variant finding | 136
Targeted seq | Mutation screening involving a large number of genes | DNA extraction, fragmentation, targeted region capturing, sequencing, data analysis, and variant finding | 137 and 138
RNA-seq | Transcriptome profiling and finding fusion genes | RNA extraction, cDNA synthesis, fragmentation, sequencing, data analysis, and variant finding | 139
ChIP-seq | Analyzing protein interactions with DNA (epigenetic) | DNA fragmentation, adding bead-attached antibodies against the protein of interest, DNA purification, sequencing, and mapping to the genome | 122 and 140

Mapping, variant calling, annotation and disease gene filtering
Briefly, the output of NGS platforms is usually in the standard FASTQ format, a text file of short DNA reads together with a list of the corresponding quality scores. Currently, the rapid run mode of the Illumina HiSeq2500 can produce reads of up to 2 × 250 base pairs (bp). DNA sequences are aligned to a reference genome, or de novo assembly may be performed for organisms without a reference genome [56]. Base recalibration, based on empirical mismatches to the reference genome at loci without known genetic variation in databases (e.g. dbSNP), and realignment around known indels are performed to improve the mapping step [57].
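As a minimal sketch of what the FASTQ output described above contains, the following Python reads four-line records and converts each quality character to a Phred score Q, where the estimated base-call error probability is 10^(-Q/10). The names `FastqRecord`, `read_fastq` and `error_probability` are illustrative, and the ASCII offset of 33 assumes the Sanger/Illumina 1.8+ encoding; older pipelines may use a different offset.

```python
import io
from typing import Iterator, NamedTuple, TextIO

PHRED_OFFSET = 33  # assumption: Sanger / Illumina 1.8+ quality encoding


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: list  # one Phred score (int) per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream with exactly four lines per read."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:          # end of file (or a trailing blank line)
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()       # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        scores = [ord(ch) - PHRED_OFFSET for ch in quality]
        yield FastqRecord(header[1:], sequence, scores)


def error_probability(phred: int) -> float:
    """Convert a Phred score to the estimated probability of a wrong base call."""
    return 10 ** (-phred / 10)


# Usage with an in-memory example read (made-up data).
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
records = list(read_fastq(example))
```

In practice an established parser would normally be preferred; the point here is only to show how the sequence and its per-base quality scores travel together in one record.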
Sites that differ from the reference genome are called in order to find the pathogenic variants. Several databases, e.g. the 1000 Genomes Project, dbSNP, and HapMap, are used for filtering and annotating the variants. Variants are annotated based on defined features (e.g. amino acid change), severity (e.g. conservation), and protein structures. By considering the mode of inheritance, the population frequency from databases such as the 1000 Genomes Project, and pathway analysis, the causative variant(s) can be predicted [10].
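The filtering logic described above can be sketched as a few successive tests on annotated variants. Everything in this example is an assumption for illustration: the `Variant` fields, the set of consequences treated as potentially damaging, the 1% frequency cut-off and the gene names are hypothetical, and the recessive mode deliberately ignores compound heterozygotes.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumed set of consequence labels treated as potentially damaging.
DAMAGING_CONSEQUENCES = frozenset(
    {"missense", "stop_gained", "frameshift", "splice_site"}
)


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str       # annotation label, e.g. "missense"
    population_maf: float  # minor allele frequency in a reference database
    zygosity: str          # "het" or "hom" in the affected individual


def candidate_variants(
    variants: Iterable[Variant], max_maf: float = 0.01, mode: str = "recessive"
) -> list:
    """Keep rare, potentially damaging variants that fit the inheritance mode."""
    if mode not in {"dominant", "recessive"}:
        raise ValueError(f"unsupported inheritance mode: {mode}")
    kept = []
    for v in variants:
        if v.population_maf > max_maf:
            continue  # too common in the population
        if v.consequence not in DAMAGING_CONSEQUENCES:
            continue  # unlikely to alter the protein
        if mode == "recessive" and v.zygosity != "hom":
            continue  # simplified: compound heterozygotes are not considered
        kept.append(v)
    return kept


calls = [
    Variant("GENE_A", "missense", 0.0005, "hom"),
    Variant("GENE_B", "synonymous", 0.0001, "hom"),
    Variant("GENE_C", "stop_gained", 0.20, "het"),
    Variant("GENE_D", "frameshift", 0.002, "het"),
]
```

With these made-up calls, the recessive filter keeps only GENE_A, while the dominant filter keeps GENE_A and GENE_D.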
Rare Mendelian diseases are caused by rare DNA variants with high penetrance. Identification of causal variants is performed according to the mode of inheritance, i.e. dominant, recessive and de novo. Discovering variants with incomplete penetrance is much more challenging. Whereas some diseases are likely caused by common variants at many loci with low penetrance, others are potentially caused by the combined effects of rare and common variants [58].
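For the de novo mode of inheritance mentioned above, a common first-pass test in a parent-child trio is that the child carries an alternate allele that neither parent carries. The sketch below assumes genotypes are already encoded as alternate-allele counts (0, 1 or 2); the function names and site labels are hypothetical, and real pipelines additionally weigh read depth and genotype quality.

```python
def is_de_novo(child_alt: int, mother_alt: int, father_alt: int) -> bool:
    """True when the child has an alternate allele absent from both parents."""
    return child_alt > 0 and mother_alt == 0 and father_alt == 0


def de_novo_sites(trio_calls: dict) -> list:
    """Return the sorted site labels whose (child, mother, father) counts look de novo."""
    return sorted(
        site
        for site, (child, mother, father) in trio_calls.items()
        if is_de_novo(child, mother, father)
    )


# Made-up example: alternate-allele counts as (child, mother, father).
trio = {
    "chr1:1000A>G": (1, 0, 0),  # present only in the child
    "chr2:500C>T": (1, 1, 0),   # inherited from the mother
    "chr3:42G>A": (0, 0, 0),    # absent in all three
}
```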
Health care and NGS
The health care system is moving from a public to an individualized view. Health includes physical, mental, and social well-being and the absence of disease. To some extent all these characteristics have a genetic basis; here we focus on how science and new technologies could improve health with regard to the genetic basis of individuals.
Life span is enhanced by risk assessment, prognosis, and diagnosis of a disease state, and by defining the genetic susceptibility of each individual. Nowadays, a focus on genetic variations constitutes part of health and disease programs for populations, families, and individuals. Similarly, social and environmental attributes are components of health management. Because of genetic heterogeneity, a cost-efficient and faster technology to define the variants is needed. Standard genetic tests should be introduced in laboratories for detecting genetic variations in patients/individuals. A standard test should be able to (1) measure the molecular target of interest accurately and reliably (analytical validity), (2) detect or predict the associated disorder, as expressed by clinical sensitivity, clinical specificity and positive and negative predictive values (clinical validity), and (3) affect clinical decisions and improve patient outcomes in clinical practice (clinical utility) (http://osp.od.nih.gov/). However, the most crucial factor in adopting a test in a health care system is whether the test will lead to a better outcome. A large number of potential diagnostic and prognostic markers could be determined with remarkable speed using NGS technologies as a genetic test in clinics [9]. Also, knowing susceptibility variants may aid better clinical management. Here, we exemplify how different markers can be utilized for clinical care.
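The clinical validity measures named above (sensitivity, specificity, and positive and negative predictive values) follow directly from the counts of true and false test results. The sketch below uses the standard definitions; the `ConfusionCounts` type and the example numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Test results cross-tabulated against true disease status."""
    tp: int  # affected, test positive
    fp: int  # unaffected, test positive
    tn: int  # unaffected, test negative
    fn: int  # affected, test negative


def clinical_validity(c: ConfusionCounts) -> dict:
    """Compute the four standard validity measures from the counts."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # affected people detected
        "specificity": c.tn / (c.tn + c.fp),  # unaffected people cleared
        "ppv": c.tp / (c.tp + c.fp),          # positives that are truly affected
        "npv": c.tn / (c.tn + c.fn),          # negatives that are truly unaffected
    }


# Made-up cohort: 100 affected and 1000 unaffected individuals.
metrics = clinical_validity(ConfusionCounts(tp=90, fp=50, tn=950, fn=10))
```

Note that the predictive values depend on how common the disorder is in the tested group, whereas sensitivity and specificity do not, which is one reason a test validated in patients may perform differently in population screening.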
The phenotypic variance among humans is due to the cumulative effects of genetic variants and possible interactions among them, in addition to environmental factors. Genetic information and predictive biomarkers could support risk prediction, prognosis, diagnosis, prevention, screening tests, and clinical management of patients. These markers could be DNA sequences, RNA, proteins, or metabolites. The germline DNA sequence does not change over time and is therefore a useful marker of susceptibility to a disease; somatic DNA change in cancer is an exception. Also, the gene expression profile (transcriptome), the proteome, and the metabolome of humans correlate with the level of disease and health. One notable example is the human leukocyte antigen (HLA) system. Several loci and genes encode the HLA; this group of genes encodes the major histocompatibility complex (MHC class I–III) in humans. HLAs play many potential roles in disease defense, organ transplant rejection, protection against cancers, autoimmune disease, and even in some human behaviors such as mate selection [59,60]. With high throughput DNA sequencing, HLA typing dilemmas could be overcome, and the genetic effects could be investigated.
The cellular state is reflected in metabolic products, which are used as markers for diagnosis and therapy. These molecules show the biochemical activity and pathways of the cell. They can also be used as biomarkers for identifying diseases and treatments [61]. Unlike other biomarkers, they are not directly encoded by genes; rather, they reflect ongoing changes throughout the cell and result from the metabolism and biochemical conversion of organic compounds. Global profiling of metabolites in cells, tissues, and organs is used for determining drug toxicity, gene function, therapeutic targets and the diagnosis of diseases such as cancers [62,63] and diabetes [64]. Similarly, epigenetic analysis shows methylation changes in various diseases. For example, hypomethylation of oncogenes [65] and hypermethylation of tumor suppressor genes are hallmarks of tumors [65,66] and of other diseases [67] such as type 2 diabetes [68], cardiovascular disease [69], and systemic lupus erythematosus [70].
Proteins are assayed as biomarkers to guide targeted therapies. Some signaling proteins are candidate targets in pathways for specific treatments. In addition, some proteins act as markers that confirm the associated pathological state and are used for disease diagnosis and prognosis. Integrated technologies such as mass spectrometry and protein and DNA arrays, in addition to NGS, have opened a new era in cancer analysis. Characterizing signaling pathways, molecules and tumor progression is challenging because of the alterations that occur in cancer. Because of this complexity and the changing states of cancers, extensive investigation is needed [71].
The gene expression profile changes depending on environmental effects; bacterial or viral contamination may affect the expression pattern of human genes. Similarly, the microbiome of the gut is essential for health monitoring and PM, since every person has a unique flora [72]. The pathological effect of gastrointestinal microbes in blood pressure, coronary heart disease [73], obesity [72], inflammatory bowel diseases (IBDs) [74] and autism [75] has been investigated. Also, high throughput sequencing of B-cell and T-cell receptors, which reflect the host immune response, referred to as repertoire sequencing (Rep-Seq), has been used to understand the immune response and immune behavior; antigen-specific/monoclonal antibodies are produced as therapeutic measures for autoimmune diseases and cancer [76]. In one study, an individual's genome, transcriptome, proteome, metabolome, autoantibody and miRNA profiles were followed for 14 months. During two periods of viral infection, alternatively spliced isoforms, heteroallelic expression, RNA edits, and microRNA changes were observed [77]. In another study, the genome, epigenome and RNA sequence of a monozygotic twin pair confirmed the environmental effect in the pathogenesis of multiple sclerosis [78]. All these examples show that different biological markers, made accessible by new and advanced technologies, are valuable for defining and monitoring individuals' health.
Understanding variability in human behavior, even within the same family, is more complex than for many other traits. For example, individuals' food preferences, approaches to solving problems and habits are, at least in part, due to many genetic variations with minor effects. These can be investigated and profiled only by NGS technology. In other words, genetic elements in interaction with the environment create human traits and habits, so one can say that human health concerns and disease are affected by these minor factors.
Personalized medicine
Drugs, whether herbal or mineral, have been administered according to age, body size, and gender since ancient times; i.e. a personalized approach has long been considered. Hippocrates, Empedocles, and others believed that there were four elements, air, earth, fire and water, corresponding to blood, black bile, yellow bile, and phlegm, respectively. In Iranian traditional medicine, Ibn Sina (also known as Avicenna) formulated in his book Canon [79] the opinion that a disease was the result of an imbalance in the composition of these basic body fluids. Temperament was based on these humors. Blood was designated as warm and wet, yellow bile as warm and dry, melancholy as cold and dry, and phlegm as cold and damp. Foods, drugs, and nutrition belonged to one of four groups: moist and warm, moist and cold, dry and warm, and dry and cold. People consumed foods that were appropriate for their nature; people who had a moist and warm temperament used dry and cold foods and vice versa. Over several hundred years of progress, the term PM has come to encompass all sorts of personalized measures, although it was first coined in the context of genetics. Needless to say, an in-depth understanding of genomes (interactions among the genes), biological molecules, metabolic pathways and immunological processes is essential to apply the personalized concept of medicine. Broadly speaking, the clinical, genetic, genomic, and environmental characterization of individuals is investigated in this field for disease therapy and interventions. Genetic information is useful in several aspects of personalized medicine, e.g. pharmacogenomics.
PM will lead to new therapeutic approaches for different diseases; likewise, personalized health care targets the areas of prevention and diagnosis. Predicting the risk of a disease, optimizing screening programs and identifying disease at early stages address health concerns of individuals, families and populations. NGS makes it easier to find causal genes and related variations for the management of patients. Of course, the first application of NGS is research (Fig. 2). Our knowledge of the genes and variations is then used in pharmacogenomics. Developing new therapies is the expectation of PM, and its full potential is increasingly being understood.
Personalized genomics
NGS, as a revolution in molecular biology, has made personal genomes affordable. There is growing optimism that the successful application of NGS approaches to understanding the biology of diseases will provide treatment choices based on the individual genetic background. In traditional medical approaches, all patients with a given disease usually receive the same treatment, based on the results of studies performed on similar patients (Fig. 3). However, the genetics and biology of individuals are not identical; hence, different treatments might be required for patients based on their genetic information (Fig. 3).
Ashley et al. performed WGS of a patient with a family history of vascular disease and sudden early death but no clinically significant medical record, to predict possible risk for coronary artery disease and causes of sudden cardiac death [80]. They found rare variants in three genes, TMEM43 (MIM# 612048), DSP (MIM# 125647), and MYBPC3 (MIM# 600958), that are clinically associated with sudden cardiac death. The patient was also heterozygous for a null mutation in the CYP2C19 (MIM# 124020) gene, suggesting probable clopidogrel resistance. The authors proposed that WGS could provide useful and clinically relevant information for individual patients; knowing the pharmacogenetic variants could be significant for the future personalized medical care of the patient. As a whole, personal genomic analysis is a field of genomics that ultimately provides practical medical therapy for individuals based on genome analysis. Predictive and preventive care will lead to more advanced health care.
Studies have shown that the additive effects of a series of low-frequency genetic variants with dominant and independent action could explain the causality of most complex chronic diseases (rare-variant hypothesis) [81,82]. As mentioned previously, more than 95% of the variants identified using HapMap and the 1000 Genomes Project are common ones. WES and WGS may provide a promising approach to identify low-frequency (1–5%) and rare (<1%) variants. In GWAS, SNPs are compared between disease cases and controls, and the variants associated with diseases are investigated. Using a genome-wide approach, the relevant drug response loci can be found rather than a single gene [83]. To comprehend the complex set of genetic factors modifying the response to drugs, many more studies need to be performed. In addition, the genetic polymorphisms of different ethnic groups should be determined. NGS technologies are promising for achieving these goals. The genetics of common diseases requires studying a large number of subjects for each condition. Furthermore, problems of drug response need to be investigated for each drug, each disease, and each population. Pharmacogenomics includes the study of drug response in relation to an individual's genome. The allele-specific response of a drug-metabolizing enzyme should be investigated for dose adjustment, e.g. gene-based dose adjustment for CYP2D6 (MIM# 124030) [84]. NGS technologies make it possible to establish detailed candidate gene investigations, genome-wide studies, and metabolic pathway analyses within the cell or tissue, in order to build biomarker panels of genetic polymorphisms for personalized purposes in each disease; an appropriate treatment protocol could then be chosen based on these panels of polymorphisms. The Clinical Pharmacogenetics Implementation Consortium (CPIC) develops gene/drug pair guidelines that help clinical professionals to interpret genotype data [85].
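Two ideas from the paragraph above lend themselves to a short sketch: binning variants by minor allele frequency (rare below 1%, low-frequency from 1 to 5%, common above 5%) and comparing allele counts between cases and controls, here summarized as an allelic odds ratio. The function names and the example counts are hypothetical, and a real GWAS would add a significance test and correction for multiple testing.

```python
def frequency_class(maf: float) -> str:
    """Bin a minor allele frequency using the thresholds quoted in the text."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency must lie between 0 and 0.5")
    if maf < 0.01:
        return "rare"
    if maf <= 0.05:
        return "low-frequency"
    return "common"


def allelic_odds_ratio(
    case_alt: int, case_ref: int, control_alt: int, control_ref: int
) -> float:
    """Odds of carrying the alternate allele in cases relative to controls."""
    return (case_alt * control_ref) / (case_ref * control_alt)


# Made-up allele counts for one SNP: 30/100 alternate alleles in cases,
# 10/100 alternate alleles in controls.
odds_ratio = allelic_odds_ratio(case_alt=30, case_ref=70, control_alt=10, control_ref=90)
```

An odds ratio above 1 indicates that the alternate allele is enriched in cases; for these invented counts it is 27/7, roughly 3.9.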
Pharmacogenomics
The term “pharmacogenetics” was first used by the German
geneticist Friedrich Vogel in 1959. It is defined as the
study of variability in drug response among people
caused by DNA variations. Favism is the first example of a human
trait illustrating pharmacogenetics. A deficiency of glucose-6-
phosphate dehydrogenase (G6PD) leads to non-immune hemolytic
anemia in response to triggers such as infection or
exposure to certain medications or fava beans. Glucose-6-
phosphate dehydrogenase is a metabolic enzyme of
the pentose phosphate pathway that produces NADPH and is especially
important in red blood cell metabolism. Phenotypic assessment
of drug-metabolizing enzyme capacity has been used to understand
differences in the acetylation and hydroxylation capacity of
enzymes such as CYP2D6 among populations.
86
Nowadays,
the effect of single gene–drug interactions is considered
in pharmacogenetics. The emergence of advanced molecular
techniques has opened a new era in pharmacogenetics, and
the role of DNA variations is highlighted. For example, using
these techniques, more than 80 variants have been reported for
the CYP2D6 gene.
87,88
Many drugs act through a complex interaction of many genes
and environmental factors; therefore a combination of genotyping
effects (a genetic profile) must be considered, which is referred to
as pharmacogenomics. Drugs change gene expression in
cells. Therefore, the relationship between the transcriptome,
proteome, and metabolome is noteworthy.
86
Patients’ genetic
variability in pharmacokinetics (drug metabolizing enzymes
and transporters) and pharmacodynamics (drug targets and
associated pathways) is a likely source of variable responses. As
mentioned, detecting a large number of genes and variations
requires advanced tools such as NGS.
Briefly, responses to drugs may fall into one of four groups:
inefficient, efficient, resistant, and toxic. A person may show a
particular response to a drug or even to its dosage (Fig. 3). For
example, valproate is known to have a different effect
on patients with bipolar disorder depending on their genotype at
XBP1-116C/G.
89
Similarly, therapeutic dosages of warfarin vary
among individuals; plasma concentrations of the drug
may vary up to 30-fold between patients;
90,91
variations in the
CYP2C9 and vitamin K epoxide reductase (VKOR) genes lead
to different metabolic abilities of the encoded enzymes.
92,93
Fig. 2 Scheme showing how NGS is advancing personalized medicine
(PM). The half apple symbolizes health management, including prevention
and prediction. The overall aim is better treatment and therapy, which is
the goal of PM (consider the circles turning around). Next generation
sequencing (NGS) is the first step on this path. New technologies, such as
NGS, are advancing genetic testing and are also used in research,
screening and diagnosis. Each of these applications could be used for
individuals, families and populations, and may cover different diseases such as
Mendelian diseases, complex traits, common disorders, and cancers. The
first level of medicine is research to identify genes and
variations, to help find the mechanism of disease and to test or identify drugs
(new/old) for the susceptible variations, known as “pharmacogenomics” in
medicine. Functional studies are performed in research to
evaluate new genes/variations and to fully understand the therapeutic
effects. Then, drugs/medications are tested for different variations in clinical
trials. Finally they could be used in diagnosis and screening programs.
Consequently, basic knowledge is translated into clinical practice. Screening
programs lead to prevention strategies, which are part of health management.
At the level of diagnosis, NGS could be applied in PND (Prenatal Diagnosis),
PGD (Preimplantation Genetic Diagnosis), PSD (Presymptomatic Genetic
Diagnosis), etc.; therefore better treatment or prevention strategies could be applied
in clinics. All these levels, applied to the therapy of individuals, are known as
“personalized medicine”, which is part of health systems.
In short, drug safety has also been shown to vary from person to
person
94
and efficacy rates of drugs are in the range of 20–80%,
while up to 20% of patients are poor responders.
94,95
Another example is trastuzumab (Herceptin) in breast cancer.
The human epidermal growth factor receptor type 2 (HER2) gene is overexpressed
in about one-quarter of patients with breast cancer, and
Her2 is concentrated in cancer cells; Herceptin blocks this receptor
and so suppresses tumors expressing Her2.
96,97
The HER2
gene is characterized by its complex and unusual genomic structure.
Therefore, intensive investigation of
this region is needed to understand its complexity, which could be
a cause of cancer heterogeneity and drug resistance. As discussed
previously, drug responses vary with genetic background,
so other drugs (third generation) may be effective for different
individuals.
In summary, clinical decisions should be adjusted to
each patient’s genetic information. Genetic polymorphisms can
affect the pharmacodynamics and/or pharmacokinetics of drugs; from
absorption to excretion, a drug is affected by genetic polymorphisms.
Various isoforms of proteins such as ion channels, enzymes,
and receptors show different drug responses.
Drug metabolization pathway and genes
Upon intake, a drug starts to be metabolized by a specialized
enzymatic system. Xenobiotic (substances not normally
found in the human body) metabolism consists of a series
of biochemical changes that yield more readily excreted products. In brief,
three phases of metabolism are usually considered: modification,
conjugation, and excretion, which are made
possible by a variety of enzymes encoded by the genes. Drugs
are converted into reactive electrophilic metabolites in the first
step, known as nonsynthetic reactions. Various modifications
may occur, such as oxidation, hydrolysis, reduction, cyclization,
and decyclization, which are carried out by specific enzymes.
The cytochrome P450
98
superfamily is one of the most common
groups of phase I enzymes. Subsequently, phase II enzymes
conjugate these metabolites with components such as glutathione
(GSH), glucuronic acid, or sulfate, so that their activity is decreased. A
large group of transferases such as glutathione S-transferases catalyze
these reactions. The metabolites may undergo further metabolic reactions
or be excreted. Proteins such as the multidrug resistance protein
family act as transporters of the xenobiotic conjugates. Each step of the
enzymatic pathway from metabolism to excretion is susceptible to
genetic variation.
Three drug metabolizing enzymes, CYP2D6, CYP2C9 and
CYP2C19, are clinically significant and show an enormous number
of genetic polymorphisms. Some SNPs may affect the activity of
the proteins more than other SNPs. There are, however, more drugs to be
mentioned.
Alexanderson et al. in 1969 showed that the metabolism of
nortriptyline (a tricyclic antidepressant) depends on
the genetic background of individuals.
99
In the following years, other researchers studied
Fig. 3 Classic and new approaches to drug therapy in human disease. (A) In the classic approach clinicians used the same drug for the same phenotypic
character. In addition, the same dose was used for different individuals. (B) In the new approach the genetic background of each individual is
the key; this is what we call personalized medicine. Individuals with the same phenotype but different genetic backgrounds may receive different
drugs at different dosages because of genetic variations. We call this approach “first genotype, next therapy”. It is
cost-effective; indeed adverse drug responses are decreased because the right drug is used for the right patient.
the role of genetic polymorphisms of CYP2D6 (debrisoquine
hydroxylase or sparteine oxidase) in the variable oxidation of
debrisoquine/sparteine among individuals.
100,101
It has been
determined that four types of nortriptyline metabolizers may
exist in each population: poor, intermediate,
extensive and ultra-rapid metabolizers, depending on CYP2D6
gene polymorphisms. In addition to the genetic background,
other factors such as age, drug interactions, and the nature of the disease
could lead to variability in drug responses. Furthermore, the
expression of noncoding RNAs such as siRNAs and microRNAs
influences the phenotype. The RNA expression level (transcriptome)
is used to predict disease subsets and the level of
pathogenicity.
102
In fact, the importance of RNA-seq for drug
development is becoming more apparent to clinicians and
drug developers.
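The four metabolizer classes described above can be illustrated by summing a per-allele activity value over the gene copies a person carries. The sketch below is an assumption-laden illustration, not clinical guidance: the allele function classes, their numeric values, and the score thresholds are all hypothetical placeholders, and real genotype-to-phenotype translation should follow published guidelines.

```python
from typing import Iterable

# Hypothetical activity value per allele function class (illustration only).
ALLELE_ACTIVITY = {"null": 0.0, "reduced": 0.5, "functional": 1.0}


def activity_score(alleles: Iterable[str]) -> float:
    """Sum the activity values of all gene copies (two, or more if duplicated)."""
    return sum(ALLELE_ACTIVITY[allele] for allele in alleles)


def metabolizer_phenotype(alleles: Iterable[str]) -> str:
    """Map an activity score to one of the four metabolizer classes.

    The thresholds below are hypothetical and chosen only to show the idea.
    """
    score = activity_score(alleles)
    if score == 0:
        return "poor"
    if score <= 1.0:
        return "intermediate"
    if score <= 2.0:
        return "extensive"
    return "ultra-rapid"  # e.g. extra functional copies from a gene duplication


if __name__ == "__main__":
    print(metabolizer_phenotype(["null", "null"]))
    print(metabolizer_phenotype(["functional", "functional", "functional"]))
```

Representing the genotype as a list of allele function classes, rather than a fixed pair, lets the same function handle gene duplications, which are one way an ultra-rapid phenotype can arise.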
NGS technologies and pharmacogenomics
Identification of DNA variants might be necessary to
explain the different responses of individuals affected by
the same disease. To date, several million DNA variants have
been found.
15,16,103–107
The majority of these are common variants;
SNPs with a minor allele frequency of at least 5–10% are important
to be employed in pharmacogenomics.
108,109
Characterizing the
rare variants enhances pharmacogenetic
studies because uncommon variants are being
identified more frequently and could cause differences in responses
to drugs.
110,111
Interactions between SNPs and other components,
such as other SNPs, proteins, and non-coding RNAs, should
be determined to find out their exact role. When these variations
occur in non-coding regions, they could have a regulatory role in
affecting the expression of genes and/or a structural role, acting
as a scaffold for proteins. The Pharmacogenomics
Knowledgebase (PharmGKB: http://www.pharmgkb.org/) is
a comprehensive collection of clinical information, including dosing guidelines
and drug labels, that annotates genetic variants and
gene–drug–disease relationships.
A large number of markers have been reported to be
appropriate targets for therapies in human disease and cancer.
112–114
Garralda et al. studied patients with advanced solid tumors to find
genomic alterations and to test proposed treatment strategies
in Avatar models. Using WES they found three substitution
mutations: one, p.F909C, lies in the catalytic domain of the
phosphoinositide 3-kinase protein (PI3K) and causes a volume change
(from bulky F to a smaller C) that allows possible S–S bonds; the two
others were p.F909L and p.F909S, found in addition to GNG11
amplification. They treated the patient Avatar model with a PI3K
inhibitor alone and with a combination of PI3K and MEK inhibitors.
Based on their results, the combination of PI3K and MEK
inhibitors could be a useful approach.
115
NGS has been used in several reports to analyze the drug
resistance in cell lines or different disorders.
116–119
In a drug
resistance analysis, Mohamed et al. studied 50 HIV-1 patients
experiencing virological failure to detect genetic variations
using ultra-deep sequencing (UDS) and Sanger sequencing.
131
They found a total of 643 and 224 clinically relevant changes by
UDS and Sanger sequencing, respectively; three resistance
variations with more than 20% prevalence were detected solely
by UDS. These researchers concluded that a cut-off of 1%
identifies more resistance variations, improves drug-resistance
interpretation, and provides a better characterization of
the viral population. In another study, Ji et al. compared the
results of 454 pyrosequencing with Sanger sequencing; they
pooled and pyrosequenced 48 specimens from an HIV drug
resistance study. They found a high concordance
between Sanger sequencing and pyrosequencing for detecting
the variants.
117
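The effect of the reporting cut-off discussed above can be sketched by filtering variants on their read frequency. This is an illustrative sketch only: the function name, the variant labels and the read counts are hypothetical, and the 20% threshold is used as an assumed stand-in for the sensitivity of conventional population sequencing rather than a value taken from the cited studies.

```python
from typing import Dict, List, Tuple


def detectable_variants(
    read_counts: Dict[str, Tuple[int, int]], cutoff: float
) -> List[str]:
    """Return variants whose read frequency is at or above the cut-off.

    read_counts maps a variant label to (variant_reads, total_reads).
    """
    called = []
    for name, (variant_reads, total_reads) in read_counts.items():
        if total_reads > 0 and variant_reads / total_reads >= cutoff:
            called.append(name)
    return sorted(called)


if __name__ == "__main__":
    # Hypothetical read counts at three positions, for illustration only.
    counts = {
        "variant_A": (450, 1000),  # 45% of reads
        "variant_B": (30, 1000),   # 3% of reads
        "variant_C": (5, 1000),    # 0.5% of reads
    }
    print(detectable_variants(counts, cutoff=0.20))
    print(detectable_variants(counts, cutoff=0.01))
```

Lowering the cut-off reports more minority variants, but it also moves closer to the sequencing error rate, so any such threshold has to be validated for the platform in use.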
NGS would enable rapid and cost-effective analysis of a vast
number of HIV sequences from different populations and regions.
It could also determine the effect of minority variants. However,
the lack of an enrichment method for the virus genome that covers the full HIV
phylogenetic range, together with subtypes and recombination, makes the
analysis of HIV laborious. In addition, short read
assembly of RNA is difficult because of error-prone RNA polymerases,
reverse transcriptases, and replication rates, although
methods are improving towards a universal approach
that can handle the heterogeneity of sequences.
120
Of course, viral genome
diversity and drug resistance may act as a limitation for NGS.
In silico analysis of gene expression profiles of certain
disorders has been used for drug repositioning to investigate
the relationship between drug efficacy and genomic information.
Sirota et al. used a systematic computational pipeline for predicting
novel therapeutic indications based on comprehensive testing of
molecular biomarkers in drug–disease pairs.
121
Gene expression
profiles of 100 diseases and 164 drug compounds were investigated,
and a novel disease–drug match between the antiulcer drug
cimetidine and lung cancer was successfully identified. In another
study, a match between the antiepileptic drug topiramate and
inflammatory bowel disease was determined using in silico
analysis.
122
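The idea behind expression-based drug repositioning can be sketched with a simple sign comparison: a drug is a candidate for a disease if it tends to push gene expression in the opposite direction to the disease. The sketch below is a deliberately simplified stand-in and not the scoring method of the cited studies; the function name, gene labels and fold-change values are hypothetical.

```python
from typing import Dict


def reversal_score(disease_sig: Dict[str, float], drug_sig: Dict[str, float]) -> float:
    """Score how strongly a drug signature opposes a disease signature.

    Both signatures map gene -> log fold-change. The score lies in [-1, 1]:
    positive when most shared genes change in opposite directions.
    """
    shared = [gene for gene in disease_sig if gene in drug_sig]
    if not shared:
        raise ValueError("signatures share no genes")
    # A negative product means the two changes have opposite signs.
    opposite = sum(1 for g in shared if disease_sig[g] * drug_sig[g] < 0)
    same = sum(1 for g in shared if disease_sig[g] * drug_sig[g] > 0)
    return (opposite - same) / len(shared)


if __name__ == "__main__":
    # Hypothetical signatures, for illustration only.
    disease = {"GENE_A": 2.0, "GENE_B": -1.5, "GENE_C": 0.8}
    drug = {"GENE_A": -1.0, "GENE_B": 1.2, "GENE_C": 0.3, "GENE_D": 5.0}
    print(round(reversal_score(disease, drug), 3))
```

Scoring every drug against every disease signature and ranking the pairs gives a list of repositioning hypotheses, each of which would still need experimental validation.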
In summary, new approaches to drug therapy can decrease
side effects of drugs among patients, save expenses and
increase the efficacy and safety of treatment for individuals. The classical
approach to drug treatment follows a “same disease–same
drug” concept: one drug is used to treat all patients with
the same disease. The new approach adopts another concept,
“same disease–different drugs”, based on the genetic background
of each person; first, the genetic biomarkers of individuals
are determined, and then an appropriate drug is prescribed to
obtain higher efficacy and safety (Fig. 3).
Future perspectives
Progress in molecular technologies is providing opportunities
to change health care practice; determining the genetic make-up
will soon become an integral part of routine health assessment
in medicine. Health care practitioners require knowledge about
the interpretation of genetic information and more details of
genetic phenomena in humans. NGS is revolutionizing and
being incorporated into all specialties in medicine. A wide variety of
dreams and goals in PM have emerged with the rise of these
technologies. Increasing our knowledge about genes, their
variations and functions, the transcriptome, the metabolome, drug
compositions and the interactions among them will enable us to
provide pharmacogenomic profiles of individuals as well as
individual-specific therapeutic protocols. Prescribing the right
drug in the right dosage at the right place to the right patient,
which may originate from an old idea in medicine, i.e.
treatment based on the four basic humors of humans, is an
achievable goal using modern medical approaches in the near
future. Scientific content will cover basic cancer research and a
variety of novel or emerging therapeutic approaches with sessions
on cancer heterogeneity, genetics, epigenetics, molecular
mechanisms, targeted therapies, immune therapies and
translational/clinical research. A health and genomic profile card
issued to each person at birth, containing all of his/her
genomic variations (highlighting beneficial and at-risk variants) and
specific drug names depending on the genetic variation, might
be one of the attainable dreams for forward-looking therapeutic
action when and where required. Of course, the prerequisite for
providing such a card is overcoming the problems of transferring
NGS to the clinic and understanding the role of genomic
biomarkers and the metabolic pathways of drugs and nutritional
compounds, as well as solving the ethical, legal, and societal
issues. In summary, we are at the beginning of a rapid evolution
in the screening, diagnosis, prevention and therapy of multiple
diseases, not envisaged before the Human Genome Project.
Conclusion
Variant profiles in many people with different diseases need to
be analysed to determine molecular biomarkers for each disease.
While the cost of NGS is going down, it is still too expensive for
some countries. Advances in technology and informatics will
eventually bring down the overall cost of genomic analysis. Other
methods of genotyping such as microarrays or bead arrays will
likely find their way into clinical use because of their low cost. A
current limitation of NGS is sequence errors in homopolymer
regions that are produced by particular NGS platforms. Furthermore,
data analysis is still relatively time-consuming, and
specialized knowledge of bioinformatics is needed to analyze
the sequence data.
Although drug safety and efficacy
must be approved under strict regulations before marketing, a drug may not show
the expected efficacy and safety in some individuals because of
genetic variability among human beings. Unraveling the
human genomic makeup could provide practical guidance
for prescribing drugs to each person; this is attainable with NGS
technologies. Drugs would therefore be prescribed in an
individualized manner, decreasing their adverse effects
and preventing their excessive consumption. With genomic
data, more efficient, cheaper, and safer new drugs can be
developed; thus, pharmacogenomics plays a central role in
personalized medicine.
Acknowledgements
We express our sincere appreciation to Dr Gholam Reza
Walizadeh for critical reading of the manuscript.
References
1 S. Kubrick, A. C. Clarke, K. Dullea, G. Lockwood, W. Sylvester,
Copyright Collection (Library of Congress), et al. (1988). 2001, a
space odyssey. In Criterion collection Criterion collection edit.,
pp. 3 videodiscs of 3 (optical) (149 min). The Voyager
Company, USA.
2 A. Albrechtsen, N. Grarup, Y. Li, T. Sparso, G. Tian and
H. Cao, et al.,Diabetologia, 2012, 56, 298–310.
3 I. Goodhead, P. Capewell, J. W. Bailey, T. Beament,
M. Chance and S. Kay, et al.,mBio, 2013, 4, e00197.
4 L. Farberov, A. Gilam, O. Isakov and N. Shomron, Genet.
Res., 2013, 95, 53–56.
5 G. Girault, Y. Blouin, G. Vergnaud and S. Derzelle, BMC
Genomics, 2014, 15, 288.
6 International Human Genome C. Sequencing, Nature,
2004, 431, 931–945.
7 V. G. Cheung, N. Nowak, W. Jang, I. R. Kirsch, S. Zhao and
X. N. Chen, et al.,Nature, 2001, 409, 953–958.
8 J. C. Venter, M. D. Adams, E. W. Myers, P. W. Li, R. J. Mural
and G. G. Sutton, et al.,Science, 2001, 291, 1304–1351.
9 B. Rabbani, M. Tekin and N. Mahdieh, J. Hum. Genet.,
2014, 59, 5–15.
10 B. Rabbani, N. Mahdieh, K. Hosomichi, H. Nakaoka and
I. Inoue, J. Hum. Genet., 2012, 57, 621–632.
11 O. Diaz-Horta, D. Duman, J. Foster, 2nd, A. Sirmaci,
M. Gonzalez and N. Mahdieh, et al., PLoS One, 2012, 7, e50628.
12 S. F. Kingsmore and C. J. Saunders, Sci. Transl. Med., 2011,
3, 87ps23.
13 M. J. Bamshad, S. B. Ng, A. W. Bigham, H. K. Tabor,
M. J. Emond and D. A. Nickerson, et al.,Nat. Rev. Genet.,
2011, 12, 745–755.
14 M. L. Metzker, Nat. Rev. Genet., 2010, 11, 31–46.
15 D. M. Altshuler, R. A. Gibbs, L. Peltonen, E. Dermitzakis,
S. F. Schaffner and F. Yu, et al.,Nature, 2010, 467, 52–58.
16 C. Genomes Project, G. R. Abecasis, A. Auton, L. D. Brooks,
M. A. DePristo and R. M. Durbin, et al.,Nature, 2012, 491,
56–65.
17 J. A. Todd, N. M. Walker, J. D. Cooper, D. J. Smyth, K. Downes
and V. Plagnol, et al.,Nat. Genet., 2007, 39, 857–864.
18 R. Sladek, G. Rocheleau, J. Rung, C. Dina, L. Shen and
D. Serre, et al.,Nature, 2007, 445, 881–885.
19 S. N. Stacey, A. Manolescu, P. Sulem, T. Rafnar,
J. Gudmundsson and S. A. Gudjonsson, et al.,Nat. Genet.,
2007, 39, 865–869.
20 D. F. Easton, K. A. Pooley, A. M. Dunning, P. D. Pharoah,
D. Thompson and D. G. Ballinger, et al.,Nature, 2007, 447,
1087–1093.
21 M. F. Moffatt, M. Kabesch, L. Liang, A. L. Dixon,
D. Strachan and S. Heath, et al.,Nature, 2007, 448,
470–473.
22 S. Sanna, A. U. Jackson, R. Nagaraja, C. J. Willer, W. M.
Chen and L. L. Bonnycastle, et al.,Nat. Genet., 2008, 40,
198–203.
23 T. M. Frayling, N. J. Timpson, M. N. Weedon, E. Zeggini,
R. M. Freathy and C. M. Lindgren, et al.,Science, 2007, 316,
889–894.
24 L. Feuk, A. R. Carson and S. W. Scherer, Nat. Rev. Genet.,
2006, 7, 85–97.
25 J. S. Beckmann, X. Estivill and S. E. Antonarakis, Nat. Rev.
Genet., 2007, 8, 639–646.
26 S. E. Antonarakis and J. S. Beckmann, Nat. Rev. Genet.,
2006, 7, 277–282.
27 S. A. McCarroll and D. M. Altshuler, Nat. Genet., 2007, 39,
S37–S42.
28 M. Zarrei, J. R. MacDonald, D. Merico and S. W. Scherer,
Nat. Rev. Genet., 2015, 16, 172–183.
29 J. R. MacDonald, R. Ziman, R. K. Yuen, L. Feuk and
S. W. Scherer, Nucleic Acids Res., 2014, 42, D986–D992.
30 A. J. Iafrate, L. Feuk, M. N. Rivera, M. L. Listewnik, P. K.
Donahoe and Y. Qi, et al.,Nat. Genet., 2004, 36, 949–951.
31 J. Sebat, B. Lakshmi, J. Troge, J. Alexander, J. Young and
P. Lundin, et al.,Science, 2004, 305, 525–528.
32 K. Inoue and J. R. Lupski, Annu. Rev. Genomics Hum. Genet.,
2002, 3, 199–242.
33 P. H. Sudmant, T. Rausch, E. J. Gardner, R. E. Handsaker,
A. Abyzov and J. Huddleston, et al.,Nature, 2015, 526,
75–81.
34 E. H. Weiss, F. M. Merchant, A. d’Avila, L. Foley, V. Y.
Reddy and J. P. Singh, et al.,Circ.: Arrhythmia Electrophysiol.,
2011, 4, 407–417.
35 K. P. Weber, S. De, I. Kozarewa, D. J. Turner, M. M. Babu
and M. de Bono, PLoS One, 2010, 5, e13922.
36 M. Nazem, A. B. Amouee, M. Eidy, I. A. Khan and H. A. Javed,
Afr. J. Paediatr. Surg., 2010, 7, 203–205.
37 V. L. Corbellini, B. R. Lara dos Santos, B. S. Ojeda, L. M.
Gerhart, O. R. Eidt and S. C. Stein, et al.,Rev. Bras. Enferm.,
2010, 63, 555–560.
38 F. Sanger, S. Nicklen and A. R. Coulson, Proc. Natl. Acad.
Sci. U. S. A., 1977, 74, 5463–5467.
39 A. M. Maxam and W. Gilbert, Proc. Natl. Acad. Sci. U. S. A.,
1977, 74, 560–564.
40 M. Margulies, M. Egholm, W. E. Altman, S. Attiya, J. S.
Bader and L. A. Bemben, et al.,Nature, 2005, 437, 376–380.
41 C. Luo, D. Tsementzi, N. Kyrpides, T. Read and K. T.
Konstantinidis, PLoS One, 2012, 7, e30087.
42 D. R. Bentley, S. Balasubramanian, H. P. Swerdlow, G. P.
Smith, J. Milton and C. G. Brown, et al.,Nature, 2008, 456,
53–59.
43 S. Bennett, Pharmacogenomics, 2004, 5, 433–438.
44 H. Bayley, Nature, 2010, 467, 164–165.
45 E. E. Schadt, S. Turner and A. Kasarskis, Hum. Mol. Genet.,
2010, 19, R227–R240.
46 J. Clarke, H. C. Wu, L. Jayasinghe, A. Patel, S. Reid and
H. Bayley, Nat. Nanotechnol., 2009, 4, 265–270.
47 S. Garaj, W. Hubbard, A. Reina, J. Kong, D. Branton and
J. A. Golovchenko, Nature, 2010, 467, 190–193.
48 J. F. Thompson and K. E. Steinmann, Curr. Protoc. Mol.
Biol., 2010, ch. 7, Unit 7 10.
49 L. Liu, Y. Li, S. Li, N. Hu, Y. He and R. Pong, et al.,
J. Biomed. Biotechnol., 2012, 2012, 251364.
50 J. W. Hinrichs, W. T. van Blokland, M. J. Moons, R. D.
Radersma, J. H. Radersma-van Loon and C. M. de Voijs,
et al.,Am. J. Clin. Pathol., 2015, 143, 573–578.
51 E. Samorodnitsky, J. Datta, B. M. Jewell, R. Hagopian,
J. Miya and M. R. Wing, et al.,J. Mol. Diagn., 2015, 17,
64–75.
52 D. C. Koboldt, L. Ding, E. R. Mardis and R. K. Wilson,
Briefings Bioinf., 2010, 11, 484–498.
53 J. Shendure and H. Ji, Nat. Biotechnol., 2008, 26,
1135–1145.
54 M. F. Carey, C. L. Peterson and S. T. Smale, Cold Spring
Harb Protoc, 2009, 2009, pdb prot5279.
55 T. S. Mikkelsen, M. Ku, D. B. Jaffe, B. Issac, E. Lieberman
and G. Giannoukos, et al.,Nature, 2007, 448, 553–560.
56 M. J. Chaisson, R. K. Wilson and E. E. Eichler, Nat. Rev.
Genet., 2015, 16, 627–640.
57 M. A. DePristo, E. Banks, R. Poplin, K. V. Garimella,
J. R. Maguire and C. Hartl, et al.,Nat. Genet., 2011, 43,
491–498.
58 G. Gibson, Nat. Rev. Genet., 2011, 13, 135–145.
59 W. Galbraith, M. C. Wagner, J. Chao, M. Abaza, L. A. Ernst
and M. A. Nederlof, et al.,Cytometry, 1991, 12, 579–596.
60 P. A. Brennan and K. M. Kendrick, Philos. Trans. R. Soc.
London, Ser. B, 2006, 361, 2061–2078.
61 G. J. Patti, O. Yanes and G. Siuzdak, Nat. Rev. Mol. Cell
Biol., 2012, 13, 263–269.
62 J. L. Griffin and J. P. Shockcor, Nat. Rev. Cancer, 2004, 4,
551–561.
63 J. K. Nicholson, J. Connelly, J. C. Lindon and E. Holmes,
Nat. Rev. Drug Discovery, 2002, 1, 153–161.
64 C. B. Newgard, J. An, J. R. Bain, M. J. Muehlbauer,
R. D. Stevens and L. F. Lien, et al.,Cell Metab., 2009, 9,
311–326.
65 A. P. Feinberg and B. Vogelstein, Biochem. Biophys. Res.
Commun., 1983, 111, 47–54.
66 A. P. Feinberg and B. Vogelstein, Nature, 1983, 301, 89–92.
67 K. D. Robertson, Nat. Rev. Genet., 2005, 6, 597–610.
68 S. U. Devaskar and M. Thamotharan, Rev. Endocr. Metab.
Disord., 2007, 8, 105–113.
69 E. J. Corwin, Biol. Res. Nurs., 2004, 6, 11–16; discussion
21-3.
70 R. Januchowski, J. Prokop and P. P. Jagodzinski, J. Appl.
Genet., 2004, 45, 237–248.
71 S. Hanash and A. Taguchi, Nat. Rev. Cancer, 2010, 10,
652–660.
72 P. J. Turnbaugh, R. E. Ley, M. A. Mahowald, V. Magrini,
E. R. Mardis and J. I. Gordon, Nature, 2006, 444,
1027–1031.
73 E. Holmes, R. L. Loo, J. Stamler, M. Bictash, I. K. Yap and
Q. Chan, et al.,Nature, 2008, 453, 396–400.
74 J. R. Marchesi, E. Holmes, F. Khan, S. Kochhar, P. Scanlan
and F. Shanahan, et al.,J. Proteome Res., 2007, 6, 546–551.
75 S. M. Finegold, Med. Hypotheses, 2008, 70, 508–511.
76 J. Benichou, R. Ben-Hamo, Y. Louzoun and S. Efroni,
Immunology, 2011, 135, 183–191.
77 R. Chen, G. I. Mias, J. Li-Pook-Than, L. Jiang, H. Y. Lam
and E. Miriami, et al.,Cell, 2012, 148, 1293–1307.
78 S. E. Baranzini, J. Mudge, J. C. van Velkinburgh,
P. Khankhanian, I. Khrebtukova and N. A. Miller, et al.,
Nature, 2010, 464, 1351–1356.
79 Avicenna, 1025, Ibn Sinas Canon of Medicine, General
matters relative to the science of medicine 1, AUB
Libraries.
80 E. A. Ashley, A. J. Butte, M. T. Wheeler, R. Chen, T. E. Klein
and F. E. Dewey, et al.,Lancet, 2010, 375, 1525–1535.
81 T. F. Mackay, Annu. Rev. Genet., 2001, 35, 303–339.
82 T. F. Mackay, E. A. Stone and J. F. Ayroles, Nat. Rev. Genet.,
2009, 10, 565–577.
83 A. K. Daly, Nat. Rev. Genet., 2010, 11, 241–246.
84 J. Kirchheiner, K. Brosen, M. L. Dahl, L. F. Gram, S. Kasper
and I. Roots, et al.,Acta Psychiatr. Scand., 2001, 104,
173–192.
85 M. V. Relling and T. E. Klein, Clin. Pharmacol. Ther., 2011,
89, 464–467.
86 U. A. Meyer, Nat. Rev. Genet., 2004, 5, 669–676.
87 R. Ammar, T. A. Paton, D. Torti, A. Shlien and G. D. Bader,
F1000Research, 2015, 4, 17.
88 I. Numanagic, S. Malikic, V. M. Pratt, T. C. Skaar,
D. A. Flockhart and S. C. Sahinalp, Bioinformatics, 2015,
31, i27–i34.
89 B. Kim, S. Kim, R. S. McIntyre, H. J. Park, S. Y. Kim and
Y. H. Joo, Psychiatry Invest., 2009, 6, 78–84.
90 H. Takahashi, G. R. Wilkinson, Y. Caraco, M. Muszkat,
R. B. Kim and T. Kashima, et al.,Clin. Pharmacol. Ther.,
2003, 73, 253–263.
91 M. Wadelius, L. Y. Chen, K. Downes, J. Ghori, S. Hunt and
N. Eriksson, et al.,Pharmacogenomics J., 2005, 5, 262–270.
92 S. Sanderson, J. Emery and J. Higgins, Genet. Med., 2005, 7,
97–104.
93 F. Kamali and H. Wynne, Annu. Rev. Med., 2010, 61, 63–75.
94 H. G. Xie and F. W. Frueh, Pers. Med., 2005, 2, 325–337.
95 B. B. Spear, Epilepsia, 2001, 42(suppl 5), 31–34.
96 M. J. Piccart-Gebhart, Eur. J. Cancer, 2006, 42, 1715–1719.
97 A. Valachis, D. Mauri, N. P. Polyzos, G. Chlouverakis,
D. Mavroudis and V. Georgoulias, Breast, 2011, 20,
485–490.
98 L. Liu, S. Okada, X. F. Kong, A. Y. Kreins, S. Cypowyj and
A. Abhyankar, et al.,J. Exp. Med., 2011, 208, 1635–1648.
99 B. Alexanderson, D. A. Evans and F. Sjoqvist, Br. Med. J.,
1969, 4, 764–768.
100 A. Mahgoub, J. R. Idle, L. G. Dring, R. Lancaster and
R. L. Smith, Lancet, 1977, 2, 584–586.
101 M. Eichelbaum, N. Spannbrucker, B. Steincke and
H. J. Dengler, Eur. J. Clin. Pharmacol., 1979, 16, 183–187.
102 R. Simon, Br. J. Cancer, 2003, 89, 1599–1604.
103 E. S. Lander, L. M. Linton, B. Birren, C. Nusbaum,
M. C. Zody and J. Baldwin, et al.,Nature, 2001, 409,
860–921.
104 R. Sachidanandam, D. Weissman, S. C. Schmidt, J. M.
Kakol, L. D. Stein and G. Marth, et al.,Nature, 2001, 409,
928–933.
105 I. H. Consortium, Nature, 2005, 437, 1299–1320.
106 K. A. Frazer, S. S. Murray, N. J. Schork and E. J. Topol, Nat.
Rev. Genet., 2009, 10, 241–251.
107 G. P. Consortium, Nature, 2010, 467, 1061–1073.
108 A. P. Tharp, M. V. Maffini, P. A. Hunt, C. A. VandeVoort,
C. Sonnenschein and A. M. Soto, Proc. Natl. Acad. Sci.
U. S. A., 2012, 109, 8190–8195.
109 H. S. Nelson, S. T. Weiss, E. R. Bleecker, S. W. Yancey and
P. M. Dorinsky, Chest, 2006, 129, 15–26.
110 V. E. Ortega and D. A. Meyers, J. Allergy Clin. Immunol.,
2014, 133, 16–26.
111 D. Huang, D. W. Kim, A. Kotsakis, S. Deng, P. Lira and
S. N. Ho, et al.,Genomics, 2013, 102, 157–162.
112 I. Cascorbi, O. Bruhn and A. N. Werk, Eur. J. Clin. Pharma-
col., 2013, 69(suppl 1), 17–23.
113 H. Hong, F. Goodsaid, L. Shi and W. Tong, Biomarkers
Med., 2010, 4, 215–225.
114 E. Garralda, K. Paz, P. P. Lopez-Casas, S. Jones, A. Katz and
L. M. Kann, et al.,Clin. Cancer Res., 2014, 20, 2476–2484.
115 M. F. Mohamed, R. F. Frye and T. Y. Langaee, Genet. Test.,
2008, 12, 513–516.
116 H. Ji, Y. Li, M. Graham, B. B. Liang, R. Pilon and S. Tyson,
et al.,Antivir. Ther., 2011, 16, 871–878.
117 P. Jia, H. Jin, C. B. Meador, J. Xia, K. Ohashi and L. Liu,
et al.,Genome Res., 2013, 23, 1434–1445.
118 S. Mohamed, G. Penaranda, D. Gonzalez, C. Camus,
H. Khiri and R. Boulme, et al.,AIDS, 2014, 28, 1315–1324.
119 A. Gall, B. Ferns, C. Morris, S. Watson, M. Cotten and
M. Robinson, et al.,J. Clin. Microbiol., 2012, 50, 3838–3844.
120 M. Sirota, J. T. Dudley, J. Kim, A. P. Chiang, A. A. Morgan
and A. Sweet-Cordero, et al.,Sci. Transl. Med., 2011,
3, 96ra77.
121 J. T. Dudley, M. Sirota, M. Shenoy, R. K. Pai, S. Roedder
and A. P. Chiang, et al.,Sci. Transl. Med., 2011, 3, 96ra76.
122 A. M. Smith, L. E. Heisler, R. P. St Onge, E. Farias-Hesson,
I. M. Wallace and J. Bodeau, et al.,Nucleic Acids Res., 2010,
38, e142.
123 D. R. Bentley, Curr. Opin. Genet. Dev., 2006, 16, 545–552.
124 Y. Liu, D. Shen, F. Zhou, G. Wang and C. An, PLoS One,
2014, 9, e86436.
125 A. Valouev, D. S. Johnson, A. Sundquist, C. Medina,
E. Anton and S. Batzoglou, et al.,Nat. Methods, 2008, 5,
829–834.
126 E. Pennisi, Science, 2010, 327, 1190.
127 N. Rusk, Nat. Methods, 2011, 8, 107.
128 J. Korlach, P. J. Marks, R. L. Cicero, J. J. Gray, D. L. Murphy
and D. B. Roitman, et al.,Proc. Natl. Acad. Sci. U. S. A.,
2008, 105, 1176–1181.
129 R. Drmanac, A. B. Sparks, M. J. Callow, A. L. Halpern,
N. L. Burns and B. G. Kermani, et al.,Science, 2010, 327,
78–81.
130 G. Lunter and M. Goodson, Genome Res., 2011, 21,
936–939.
131 H. Li, J. Ruan and R. Durbin, Genome Res., 2008, 18,
1851–1858.
132 M. Ruffalo, T. LaFramboise and M. Koyuturk, Bioinformatics,
2011, 27, 2790–2796.
133 B. Langmead, C. Trapnell, M. Pop and S. L. Salzberg,
Genome Biol., 2009, 10, R25.
134 A. McKenna, M. Hanna, E. Banks, A. Sivachenko,
K. Cibulskis and A. Kernytsky, et al.,Genome Res., 2010,
20, 1297–1303.
135 D. C. Koboldt, Q. Zhang, D. E. Larson, D. Shen, M. D.
McLellan and L. Lin, et al.,Genome Res., 2012, 22, 568–576.
136 K. Ye, M. H. Schulz, Q. Long, R. Apweiler and Z. Ning,
Bioinformatics, 2009, 25, 2865–2871.
137 K. Wang, M. Li and H. Hakonarson, Nucleic Acids Res.,
2010, 38, e164.
138 M. P. Dolled-Filhart, M. Lee Jr., C. W. Ou-Yang, R. R.
Haraksingh and J. C. Lin, Sci. World J., 2013, 2013, 730210.
139 M. Choi, U. I. Scholl, W. Ji, T. Liu, I. R. Tikhonova and
P. Zumbo, et al.,Proc. Natl. Acad. Sci. U. S. A., 2009, 106,
19096–19101.
140 D. Schmidt, M. D. Wilson, C. Spyrou, G. D. Brown,
J. Hadfield and D. T. Odom, Methods, 2009, 48, 240–248.
... Recently, the integration of deep sequencing technologies into clinical diagnostics has revolutionized personalized medicine. Here, this integration has enabled the swift assessment of a patient's genetic profile, particularly genes relevant to disease susceptibility; furthermore, it has been effective in understanding cellular and molecular effects during treatment responses [16][17][18]. When taken together, deep sequencing is paving the way for important breakthroughs in both scientific research and medical applications [19]. ...
Synthetic antibodies (Abs) represent a category of artificial proteins capable of closely emulating the functions of natural Abs. Their in vitro production eliminates the need for an immunological response, streamlining the process of Ab discovery, engineering, and development. These artificially engineered Abs offer novel approaches to antigen recognition, paratope site manipulation, and biochemical/biophysical enhancements. As a result, synthetic Abs are fundamentally reshaping conventional methods of Ab production. This mirrors the revolution observed in molecular biology and genomics as a result of deep sequencing, which allows for the swift and cost-effective sequencing of DNA and RNA molecules at scale. Within this framework, deep sequencing has enabled the exploration of whole genomes and transcriptomes, including particular gene segments of interest. Notably, the fusion of synthetic Ab discovery with advanced deep sequencing technologies is redefining the current approaches to Ab design and development. Such a combination offers the opportunity to exhaustively explore Ab repertoires, fast-track the Ab discovery process, and enhance synthetic Ab engineering. Moreover, advanced computational algorithms have the capacity to effectively mine big data, helping to identify Ab sequence patterns/features hidden within deep sequencing Ab datasets. In this context, these methods can be utilized to predict novel sequence features, thereby enabling the successful generation of de novo Ab molecules. Hence, the merging of synthetic Ab design, deep sequencing technologies, and advanced computational models heralds a new chapter in Ab discovery, broadening our comprehension of immunology and streamlining the advancement of biological therapeutics.
... In cancers, identification of survival-associated cellular processes will provide more information. Recently, the integration of multi-omics data (such as gene expression, CNVs, mutations, and DNA methylation) has emerged as a promising method to improve patient outcomes (Werner et al., 2014; Rabbani et al., 2016; Lightbody et al., 2019). Proper processing and in-depth analysis of these multidimensional and diverse data make it possible to obtain comprehensive and reliable insights. ...
Background: Liver cancer is a common malignant tumor with an increasing incidence in recent years. We aimed to develop a model by integrating clinical information and multi-omics profiles of genes to predict survival of patients with liver cancer. Methods: The multi-omics data were integrated to identify liver cancer survival-associated signal pathways. Then, a prognostic risk score model was established based on key genes in a specific pathway, followed by the analysis of the relationship between the risk score and clinical features as well as molecular and immunologic characterization of the key genes included in the prediction model. Functional experiments were performed to further elucidate the underlying molecular mechanism. Results: In total, 4 pathways associated with liver cancer patients' survival were identified. In the pathway of integrin cell surface interactions, low expression of COMP and SPP1, and low CNV levels of COL4A2 and ITGAV were significantly related to prognosis. Based on these 4 genes, the risk score model for prognosis was established. Risk score, ITGAV and SPP1 were the most significantly positively related to activated dendritic cells. COL4A2 and COMP were the most significantly positively associated with Type 1 T helper cells and regulatory T cells, respectively. The nomogram (involving T stage and risk score) may better predict short-term survival. The cell assay showed that overexpression of ITGAV promoted tumorigenesis. Conclusion: The risk score model constructed with four genes (COMP, SPP1, COL4A2, and ITGAV) may be used to predict survival in liver cancer patients.
... In this context, it can efficiently identify a patient's genetic factors linked to disease susceptibility and assess potential cellular responses to diverse treatments [17][18][19]. Also, its swift, high-throughput big data acquisition, combined with overall cost reductions, presents opportunities for conducting advanced bioinformatic analyses. ...
Synthetic antibodies (Abs) represent a category of engineered proteins meticulously crafted to replicate the functions of their natural counterparts. Such Abs are generated in vitro, enabling advanced molecular alterations associated with antigen recognition, paratope site engineering, and biochemical refinements. In a parallel realm, deep sequencing has brought about a paradigm shift in molecular biology. It facilitates the prompt and cost-effective high-throughput sequencing of DNA and RNA molecules, enabling the comprehensive big data analysis of Ab transcriptomes, including specific regions of interest. Significantly, the integration of artificial intelligence (AI), based on machine- and deep- learning approaches, has fundamentally transformed our capacity to discern patterns hidden within deep sequencing big data, including distinctive Ab features and protein folding free energy landscapes. Ultimately, current AI advances can generate approximations of the most stable Ab structural configurations, enabling the prediction of de novo synthetic Abs. As a result, this manuscript comprehensively examines the latest and relevant literature concerning the intersection of deep sequencing big data and AI methodologies for the design and development of synthetic Abs. Together, these advancements have accelerated the exploration of antibody repertoires, contributing to the refinement of synthetic Ab engineering and optimizations, and facilitating advancements in the lead identification process.
... With recent advancements in next-generation sequencing (NGS), personalized medical care tailored to individual patients has become possible for conditions that were previously difficult to diagnose and analyze, enabling more effective treatments [13,14]. This study aimed to establish a diagnostic research foundation for multiple CA through clinical, epidemiological, and genomic information collection protocols. ...
Standardized protocols have been designed and developed specifically for clinical information collection and obtaining trio genomic information from infants affected with congenital anomalies (CA) and their parents, as well as securing human biological resources. The protocols include clinical and genomic information collection on multiple CA that were difficult to diagnose using pre-existing screening methods. We obtained human-derived resources and genomic information from 138 cases, including 45 families of infants with CA and their parent trios. For the clinical information collection protocol, criteria for target patient selection and a consent system for collecting and utilizing research resources are crucial. Whole genome sequencing data were generated for all participants, and standardized protocols were developed for resource collection and manufacturing. We recorded the phenotype information according to the Human Phenotype Ontology term, and epidemiological information was collected through an environmental factor questionnaire. Updating and recording of clinical symptoms and genetic information that have been newly added or changed over time are significant. The protocols enabled long-term tracking by including the growth and development status that reflect the important characteristics of newborns. Using these clinical and genetic information collection protocols for CA, an essential platform for early genetic diagnosis and diagnostic research can be established, and new genetic diagnostic guidelines can be presented in the near future.
... PGx is part of personalized medicine, which is a fast-developing field of medicine [2]. The administration of the "appropriate dose of the right drug for the right patient at the right time" is the aim of PGx. Rapid innovations in genetic technology have led to an exponential rise in the identification of diseases that have been linked to specific genetic variables, which has sped up the development of genetic factors as biomarkers. Also, genetic variations in individuals determine their capacity for metabolism [3]. Different subsets of patients have different levels of drug-metabolizing capacity, with implications for drug effectiveness and toxicity risk depending on the type of drug. ...
Aim: Pharmacogenomics (PGx) is part of personalized medicine, which is a fast-developing field of medicine. It is the study of how a patient's genetic makeup affects how they react to drugs. In pregnancy, physiological and genetic changes affect drug efficacy and can cause side effects. The goal of this review is to outline the pharmacogenetics employed in current therapies, to outline recent results and research in obstetric pharmacogenetics, and to outline what is required to assure the future of pharmacogenetics and customized pharmacotherapy in pregnancy. Conclusion: Results from pharmacogenomic tests may help ensure that medications are used safely during pregnancy and after delivery, and they may also provide mothers more assurance that their child will gain from breastfeeding. More efforts are needed to raise knowledge of testing, encourage informed decision-making, and ensure proper utilization and availability, because pharmacogenomic testing is still new to many physicians and patients.
... The application of HTG and genotype imputing has revolutionized genetic research recently. These methods have been useful in identifying genetic risk factors for complicated diseases, characterizing population genetic structures, and identifying pharmacogenetic markers for personalized therapy [213]. Furthermore, HTG and imputation have paved the way for large-scale genomic studies, including GWAS, where millions of genetic markers are analyzed across thousands of individuals [214]. ...
Rapidly rising population and climate changes are two critical issues that require immediate action to achieve sustainable development goals. The rising population is posing increased demand for food, thereby pushing for an acceleration in agricultural production. Furthermore, increased anthropogenic activities have resulted in environmental pollution such as water pollution and soil degradation as well as alterations in the composition and concentration of environmental gases. These changes are affecting not only biodiversity loss but also affecting the physio-biochemical processes of crop plants, resulting in a stress-induced decline in crop yield. To overcome such problems and ensure the supply of food material, consistent efforts are being made to develop strategies and techniques to increase crop yield and to enhance tolerance toward climate-induced stress. Plant breeding evolved after domestication and initially remained dependent on phenotype-based selection for crop improvement. But it has grown through cytological and biochemical methods, and the newer contemporary methods are based on DNA-marker-based strategies that help in the selection of agronomically useful traits. These are now supported by high-end molecular biology tools like PCR, high-throughput genotyping and phenotyping, data from crop morpho-physiology, statistical tools, bioinformatics, and machine learning. After establishing its worth in animal breeding, genomic selection (GS), an improved variant of marker-assisted selection (MAS), has made its way into crop-breeding programs as a powerful selection tool. To develop novel breeding programs as well as innovative marker-based models for genetic evaluation, GS makes use of molecular genetic markers. GS can amend complex traits like yield as well as shorten the breeding period, making it advantageous over pedigree breeding and marker-assisted selection (MAS). 
It reduces the time and resources that are required for plant breeding while allowing for an increased genetic gain of complex attributes. It has been taken to new heights by integrating innovative and advanced technologies such as speed breeding, machine learning, and environmental/weather data to further harness the GS potential, an approach known as integrated genomic selection (IGS). This review highlights the IGS strategies, procedures, integrated approaches, and associated emerging issues, with a special emphasis on cereal crops. In this domain, efforts have been taken to highlight the potential of this cutting-edge innovation to develop climate-smart crops that can endure abiotic stresses with the motive of keeping production and quality at par with the global food demand.
Objective: Matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) is currently used in clinical microbiology laboratories. This study aimed to determine whether dual-polarity time-of-flight mass spectrometry (DP-TOF MS) could be applied to clinical nucleotide detection. Methods: This prospective study included 40 healthy individuals and 110 patients diagnosed with cardiovascular diseases. We used DP-TOF MS and Sanger sequencing to evaluate 17 loci across 11 genes associated with cardiovascular drug responses. In addition, we used DP-TOF MS to test 998 retrospectively collected clinical DNA samples with known results. Results: A, T, and G nucleotide detection by DP-TOF MS and Sanger sequencing revealed 100% concordance, whereas the C nucleotide concordance was 99.86%. Genotyping based on the results of the two methods showed 99.96% concordance. Regarding clinical applications, DP-TOF MS yielded a 99.91% concordance rate for known loci. The minimum detection limit for DNA was 0.4 ng; the inter-assay and intra-assay precision rates were both 100%. Anti-interference analysis showed that aerosol contamination greater than 10¹³ copies/µL in the laboratory environment could influence the results of DP-TOF MS. Conclusions: The DP-TOF MS platform displayed good detection performance, as demonstrated by its 99.96% concordance rate with Sanger sequencing. Thus, it may be applied to clinical nucleotide detection.
DNA barcodes are short, standardized DNA segments that geneticists can use to identify all living taxa. DNA barcoding, in turn, identifies species by analyzing these specific regions against a DNA barcode reference library. In its initial years, DNA barcodes sequenced by Sanger's method were extensively used by taxonomists for the characterization and identification of species. But in recent years, DNA barcoding by next-generation sequencing (NGS) has found broader applications, such as quality control, biomonitoring of protected species, and biodiversity assessment. Technological advancements have also paved the way to metabarcoding, which has enabled massively parallel sequencing of complex bulk samples using high-throughput sequencing techniques. In the future, DNA barcoding along with high-throughput techniques will show stupendous progress in taxonomic classification with reference to available sequence data.
Genome sequencing has rapidly evolved over the last few years and has wide applications in life science research. Study of the structure, function of genes, and functional assembly of many genes within themselves as well as gene protein level needs accuracy in view of vast human genome, of which many structural and functional details are still unknown. Next-generation sequencing (NGS) has revolutionized the sequencing technique, and now nanotechnology-based nanopore sequencing provides details at the single-molecule level like never before, which enhances accuracy and speed. For fast, yet accurate DNA sequencing, it is essential to understand the effect of nanopore material properties as well as nanopore geometry on the subsequent results. In order to ascertain these effects, the use of molecular dynamics (MD) simulations has been explored, and its potential in determining the requisite nanopore configurations leading to efficient DNA sequencing is analyzed. Brief methods of sequencing the DNA and the nanopore sequencing method will be discussed in this chapter. There is a growing list of applications of NGS in various fields ranging from medicine, environment to microorganisms. As huge amounts of data are being generated by these investigations, it is essential to streamline data storage, security, and ethical as well as legal guidelines of interpretation and communication for mankind.
Haplotypes are often critical for the interpretation of genetic laboratory observations into medically actionable findings. Current massively parallel DNA sequencing technologies produce short sequence reads that are often unable to resolve haplotype information. Phasing short read data typically requires supplemental statistical phasing based on known haplotype structure in the population or parental genotypic data. Here we demonstrate that the MinION nanopore sequencer is capable of producing very long reads to resolve both variants and haplotypes of HLA-A, HLA-B and CYP2D6 genes important in determining patient drug response in sample NA12878 of CEPH/UTAH pedigree 1463, without the need for statistical phasing. Long read data from a single 24-hour nanopore sequencing run was used to reconstruct haplotypes, which were confirmed by HapMap data and statistically phased Complete Genomics and Sequenom genotypes. Our results demonstrate that nanopore sequencing is an emerging standalone technology with potential utility in a clinical environment to aid in medical decision-making.
Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.
CYP2D6 is a highly polymorphic gene that encodes the CYP2D6 enzyme, which is involved in the metabolism of 20-25% of all clinically prescribed drugs and other xenobiotics in the human body. CYP2D6 genotyping is recommended prior to treatment decisions involving one or more of the numerous drugs sensitive to CYP2D6 allelic composition. In this context, high-throughput sequencing (HTS) technologies provide a promising time-efficient and cost-effective alternative to currently used genotyping techniques. To achieve accurate interpretation of HTS data, however, one needs to overcome several obstacles such as high sequence similarity and genetic recombinations between CYP2D6 and evolutionarily related pseudogenes CYP2D7 and CYP2D8, high copy number variation among individuals and short read lengths generated by HTS technologies. In this work, we present the first algorithm to computationally infer CYP2D6 genotype at basepair resolution from HTS data. Our algorithm is able to resolve complex genotypes, including alleles that are the products of duplication, deletion and fusion events involving CYP2D6 and its evolutionarily related cousin CYP2D7. Through extensive experiments using simulated and real datasets, we show that our algorithm accurately solves this important problem with potential clinical implications. Cypiripi is available at http://sfu-compbio.github.io/cypiripi. Contact: cenk@sfu.ca. © The Author 2015. Published by Oxford University Press.
A major contribution to the genome variability among individuals comes from deletions and duplications - collectively termed copy number variations (CNVs) - which alter the diploid status of DNA. These alterations may have no phenotypic effect, account for adaptive traits or can underlie disease. We have compiled published high-quality data on healthy individuals of various ethnicities to construct an updated CNV map of the human genome. Depending on the level of stringency of the map, we estimated that 4.8-9.5% of the genome contributes to CNV and found approximately 100 genes that can be completely deleted without producing apparent phenotypic consequences. This map will aid the interpretation of new CNV findings for both clinical and research applications.
Targeted, capture-based DNA sequencing is a cost-effective method to focus sequencing on a coding region or other customized region of the genome. There are multiple targeted sequencing methods available, but none has been systematically investigated and compared. We evaluated four commercially available custom-targeted DNA technologies for next-generation sequencing with respect to on-target sequencing, uniformity, and ability to detect single-nucleotide variations (SNVs) and copy number variations. The technologies that used sonication for DNA fragmentation displayed impressive uniformity of capture, whereas the others had shorter preparation times, but sacrificed uniformity. One of those technologies, which uses transposase for DNA fragmentation, has a drawback requiring sample pooling, and the last one, which uses restriction enzymes, has a limitation depending on restriction enzyme digest sites. Although all technologies displayed some level of concordance for calling SNVs, the technologies that require restriction enzymes or transposase missed several SNVs largely because of the lack of coverage. All technologies performed well for copy number variation calling when compared to single-nucleotide polymorphism arrays. These results enable laboratories to compare these methods to make informed decisions for their intended applications. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
The discovery of genetic variation and the assembly of genome sequences are both inextricably linked to advances in DNA-sequencing technology. Short-read massively parallel sequencing has revolutionized our ability to discover genetic variation but is insufficient to generate high-quality genome assemblies or resolve most structural variation. Full resolution of variation is only guaranteed by complete de novo assembly of a genome. Here, we review approaches to genome assembly, the nature of gaps or missing sequences, and biases in the assembly process. We describe the challenges of generating a complete de novo genome assembly using current technologies and the impact that being able to perfectly sequence the genome would have on understanding human disease and evolution. Finally, we summarize recent technological advances that improve both contiguity and accuracy and emphasize the importance of complete de novo assembly as opposed to read mapping as the primary means to understanding the full range of human genetic variation.
Inherited genetic variation has a critical but as yet largely uncharacterized role in human disease. Here we report a public database of common variation in the human genome: more than one million single nucleotide polymorphisms (SNPs) for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted. These data document the generality of recombination hotspots, a block-like structure of linkage disequilibrium and low haplotype diversity, leading to substantial correlations of SNPs with many of their neighbours. We show how the HapMap resource can guide the design and analysis of genetic association studies, shed light on structural variation and recombination, and identify loci that may have been subject to natural selection during human evolution.
To compare next-generation sequencing (NGS) platforms with mutation-specific analysis platforms in a clinical setting, in terms of sensitivity, mutation specificity, costs, capacity, and ease of use. We analyzed 25 formalin-fixed, paraffin-embedded lung cancer samples of different size and tumor percentage for known KRAS and EGFR hotspot mutations with two dedicated genotyping platforms (cobas [Roche Diagnostics, Almere, The Netherlands] and Rotor-Gene [QIAGEN, Venlo, The Netherlands]) and two NGS platforms (454 Genome Sequencer [GS] junior [Roche Diagnostics] and Ion Torrent Personal Genome Machine [Life Technologies, Bleiswijk, The Netherlands]). All platforms, except the 454 GS junior, detected the mutations originally detected by Sanger sequencing and high-resolution melting prescreening and detected an additional KRAS mutation. The dedicated genotyping platforms outperformed the NGS platforms in speed and ease of use. The large sequencing capacity of the NGS platforms enabled them to deliver all mutation information for all samples at once. Sensitivity for detecting mutations was highly comparable among all platforms. The choice for either a dedicated genotyping platform or an NGS platform is basically a trade-off between speed and genetic information. Copyright© by the American Society for Clinical Pathology.