
Genome-wide identification of natural RNA aptamers in prokaryotes and eukaryotes


RNAs are well-suited to act as cellular sensors that detect and respond to metabolite changes in the environment, due to their ability to fold into complex structures. Here, we introduce a genome-wide strategy called PARCEL that experimentally identifies RNA aptamers in vitro, in a high-throughput manner. By applying PARCEL to a collection of prokaryotic and eukaryotic organisms, we have revealed 58 new RNA aptamers to three key metabolites, greatly expanding the list of natural RNA aptamers. The newly identified RNA aptamers exhibit significant sequence conservation, are highly structured and show an unexpected prevalence in coding regions. We identified a prokaryotic precursor tmRNA that binds vitamin B2 (FMN) to facilitate its maturation, as well as eukaryotic mRNAs that bind and respond to FMN, suggesting FMN as the second RNA-binding ligand to affect eukaryotic expression. PARCEL results show that RNA-based sensing and gene regulation is more widespread than previously appreciated in different organisms.
Fig. 3 Precursor tmRNA can act as an RNA sensor for FMN. a Schematic of the B. subtilis tmRNA genomic locus (top) and predicted secondary structures of the long precursor tmRNA, short precursor tmRNA, and mature tmRNA, using the RNAfold program²⁰ (bottom). b RNA footprinting analysis of the long precursor tmRNA, using a SHAPE-like chemical (NAI), in the presence (lane 4) and absence (lane 3) of 100 µM FMN. Also shown are the A ladder (lane 1) and unmodified RNA (lane 2). The red bar indicates bases that become more single-stranded in the presence of FMN. c Predicted secondary structure of the tmRNA long precursor using the RNAfold program²⁰. The red bases correspond to the positions marked by the red bar in b. d Average footprinting analysis (n = 3, SAFA) of mature (top), short precursor (middle), and long precursor tmRNA (bottom), in the presence (red) and absence (black) of 100 µM FMN, in the dark. The beige box indicates the region of increased flexibility in the precursor tmRNAs in the presence of FMN. The stars indicate bases that show statistically significant changes with FMN (p ≤ 0.05, Student's t-test). e qPCR analysis of the expression level of precursor tmRNA and mature tmRNA, across six biological replicates, after addition of 100 µM of riboflavin to the growth media of B. subtilis. Fold changes are normalized to the negative control Veg gene. The known B. subtilis FMN riboswitch is used as the positive control. p-values were calculated by Student's t-test; the error bars indicate the standard deviation of the replicates.
Fig. 4 PARCEL identifies new RNA aptamers in Candida albicans. a Pie chart of the number of C. albicans RNA aptamers that are located in 5′ UTR, CDS, and 3′ UTR. The majority of C. albicans RNA aptamers are found in CDSs. b Comparison of the distribution of Alifoldz scores for RNA aptamers vs. shuffled counterparts. The upper, middle, and lower bounds of the boxplot represent the 75, 50, and 25th percentile of the values, respectively. A negative score indicates a stable, conserved consensus structure. p-value was obtained using the non-parametric Kolmogorov–Smirnov test. c Nucleotide substitution rates, calculated as the number of substitutions per base-pair, for RNA aptamers (Kr), 3′ UTR (K3UTR), 5′ UTR (K5UTR), synonymous sites (Ks), and non-synonymous sites (Ka). The upper, middle, and lower bounds of the boxplot represent the 75, 50, and 25th percentile of the values, respectively. C. albicans SC5314 was compared to Candida dubliniensis for the calculation. p-value was obtained using the non-parametric Kolmogorov–Smirnov test. d Gel analysis of RPS31 mRNA using in-line probing (left) and RNase V1 (right) in the presence (lane 3) and absence (lane 2) of 100 µM FMN. The A ladder (A, lane 1) is also shown. The black arrows indicate positions along the RNA that changed in the presence of FMN. e A representative Western blot showing RPS31::FLAG (top) and loading control (bottom) protein levels in RPS31::FLAG knock-in strains with WT (left) and fmn1∆ (right) backgrounds cultured at different FMN concentrations (mM). Using a t-test (n = 8), significant p-values of 0.009 and 0.01 (for 2.5 and 5.0 mM against 10.0 mM, respectively) were determined for fmn1∆, but not WT (p-values of 0.2 and 0.5). f Gel analysis of RPS31 mRNA using in-line probing in the presence of 20, 100, or 500 µM of FMN, FAD, or riboflavin. In-line probing of RNA in the absence of metabolite (H2O, lane 2) and the A ladder (A, lane 1) are also shown. g SAFA analysis of WT RPS31 (top) and codon-optimized RPS31 (bottom) in the presence (red line) and absence (black line) of 100 µM FMN. The beige box indicates the region of structural change in WT RPS31 when it interacts with FMN. This structural change is absent in the codon-optimized RPS31. h A representative Western blot showing codon-optimized RPS31::FLAG (top) and loading control (bottom) protein levels in codon-optimized RPS31::FLAG knock-in strains with WT (left) and fmn1∆ (right) backgrounds cultured at different FMN concentrations (mM). Using a t-test (n = 3), the p-values for 2.5 and 5.0 mM were not significant for either fmn1∆ (both 0.7) or WT (0.9 and 0.09).
Sidika Tapsin¹, Miao Sun², Yang Shen², Huibin Zhang³, Xin Ni Lim¹, Teodorus Theo Susanto¹, Siwy Ling Yang¹, Gui Sheng Zeng⁴, Jasmine Lee⁴, Alexander Lezhava⁵, Ee Lui Ang³, Lian Hui Zhang⁴, Yue Wang⁴, Huimin Zhao³,⁶, Niranjan Nagarajan² & Yue Wan¹
DOI: 10.1038/s41467-018-03675-1 OPEN
¹Stem Cell and Development Biology, Genome Institute of Singapore, Singapore 138672, Singapore. ²Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672, Singapore. ³Metabolic Engineering Research Laboratory (MERL), Science and Engineering Institutes, Agency for Science, Technology, and Research (A*STAR), 31 Biopolis Way, Nanos #01-01, Singapore 138669, Singapore. ⁴Institute of Molecular and Cell Biology, Proteos, 61 Biopolis Drive, Singapore 138673, Singapore. ⁵Translational research group, Genome Institute of Singapore, Singapore 138672, Singapore. ⁶Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States. These authors contributed equally: Sidika Tapsin, Miao Sun, Yang Shen, Huibin Zhang. Correspondence and requests for materials should be addressed to N.N. (email: nagarajann@gis.a-star.edu.sg) or to Y.W. (email: wany@gis.a-star.edu.sg)
NATURE COMMUNICATIONS | (2018) 9:1289 |DOI: 10.1038/s41467-018-03675-1 |www.nature.com/naturecommunications 1
Content courtesy of Springer Nature, terms of use apply. Rights reserved
Microorganisms are constantly sensing their environment for changes in temperature, pH, metabolites, and nutrients so as to regulate their gene expression programs to best adapt to different signals for growth, survival, and virulence¹. As such, comprehensive mapping of regulatory networks in microbes under different environmental conditions is crucial to understanding their biology. While many of these regulators have been extensively studied at the protein and DNA levels, RNA's role as a direct sensor and responder remains relatively under-explored. One class of cellular RNA sensors, known as riboswitches, can recognize and respond to specific metabolites by altering gene expression². Upon binding to their ligands, riboswitches undergo conformational changes that result in the regulation of gene expression through diverse means, such as changes in transcription, translation and decay², informing the host of its environmental conditions. The modularity of riboswitches also allows them to be transplanted into different systems, broadening their use as biological sensors.
While in vitro selection methods, such as SELEX, have been applied with variable success to generate new synthetic RNA aptamers³, the ability to comprehensively identify natural RNA aptamers from transcriptomes would expand our toolbox for synthetic biology and deepen our understanding of RNA-based gene regulation in vivo. Currently, potential natural RNA aptamers are mostly identified through computational determination of sequence and structural homology to known riboswitches⁴. However, as RNA can adopt different folds for binding to the same ligand and organisms can diverge greatly in sequence content, computational means of searching for riboswitches through sequence homology have limited scope⁵, and strategies that allow direct experimental detection are highly desirable. One recent strategy, Term-seq, utilizes high-throughput sequencing to detect differential transcription termination events in bacteria under different conditions⁶. However, complementary strategies still need to be developed for detecting riboswitches that act through other mechanisms (such as translation inhibition), as well as riboswitches that bind to metabolites whose intracellular concentrations are not easily altered.
Here, we report an in vitro method for experimentally identifying RNA aptamers in transcriptomes by detecting ligand-induced RNA structural changes using high-throughput sequencing (Fig. 1a). This method allows us to rapidly screen through transcriptomes to identify natural RNA aptamers toward almost any ligand of choice. Specifically, we extract total RNA from organisms grown under different conditions and probe their structures in the presence or absence of different metabolites by using a double-strand-specific nuclease, RNase V1, which recognizes and cleaves at base-paired regions in RNAs. The different cleavage sites, in the presence and absence of metabolites, are cloned into cDNA libraries for deep sequencing. Subtle differences between these two libraries point to the few true ligand-specific, structure-changing RNA elements in the genome, and we developed a sensitive and robust computational analysis pipeline to identify these (Methods). This experimental and computational approach, termed Parallel Analysis of RNA Conformations Exposed to Ligand binding (PARCEL), revealed the breadth of RNA-ligand interactions in prokaryotic and eukaryotic transcriptomes, identifying many new natural RNA aptamers in the process.
Results
Development of PARCEL. To establish PARCEL, we systematically tested different structure-probing strategies to determine the approach that best captures ligand-induced structural changes genome-wide, allowing for a simplified, cost-effective workflow without multiple probing assays⁷. Strategies using double-strand or single-strand specific nucleases (RNase V1 and S1 nuclease, respectively), as well as in-line probing, which probes nucleotide flexibility⁸, were tested for their abilities to detect structural changes using high-throughput sequencing (Fig. 1b, Supplementary Figs. 1, 2, 3). The known thiamine pyrophosphate (TPP) and S-adenosylmethionine (SAM) riboswitches were used as positive controls and other RNA sequences not known to bind TPP or SAM were used as negative controls in this experiment⁹,¹⁰. As expected, the known riboswitches showed large structural changes upon ligand binding (Fig. 1b, Supplementary Figs. 1, 2), while the negative controls did not (Supplementary Fig. 3), indicating that the structure changes captured by nuclease digestion followed by sequencing are highly specific. These structural differences can be less pronounced in libraries prepared using in-line probing (Fig. 1b, Supplementary Fig. 2). This is likely due to the noise introduced by the additional 5′ phosphorylation step that is used in in-line probing library preparation. As degraded cellular RNAs are also phosphorylated and cloned into the library, it is difficult to distinguish in-line probed fragments from degradation fragments. Among the nucleases, we observed a higher degree of correlation between two biological replicates of RNase V1 (R = 0.99) versus S1 nuclease libraries (R = 0.66, Supplementary Fig. 4a), a key feature for differential analysis. The structural changes observed using the nucleases could also be reproduced using low-throughput RNA footprinting and mapped to the secondary structure of the TPP riboswitch (Supplementary Fig. 1b–d). Accordingly, we selected RNase V1 as the probing strategy of choice in all PARCEL experiments to identify natural RNA aptamers genome-wide.
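The replicate agreement quoted above can be checked with a few lines of code. The sketch below assumes that R denotes a Pearson correlation computed over per-base read counts, which the text does not state explicitly; the function name and the read counts are hypothetical and are not taken from the PARCEL data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical per-base read counts from two replicates (not data from the paper).
replicate_1 = [12, 40, 3, 88, 51, 7]
replicate_2 = [10, 44, 5, 90, 47, 9]
r = pearson_r(replicate_1, replicate_2)
print(f"R = {r:.2f}")
```

A low R between replicates means that per-base differences between ligand and control libraries cannot be separated from replicate noise, which is why reproducibility matters for the differential analysis.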
As Bacillus subtilis and Pseudomonas aeruginosa are bacteria for which many riboswitches are known, we performed structure probing in the presence and absence of key metabolites known to interact with RNAs in their transcriptomes. To maximize our chances of finding RNA aptamers which may only be expressed under specific conditions, we grew the bacteria in rich or minimal media to exponential or stationary phases (Methods). We then extracted total RNA from the pooled bacteria, performed ribosomal RNA depletion to enrich for mRNAs, and did structure probing, followed by deep sequencing. We had two biological replicates for each experiment and obtained more than 7 million reads per replicate (Supplementary Table 1). RNA aptamers that bind specifically to one ligand should not recognize other unrelated ligands. As such, we developed a novel computational pipeline to identify contiguous positions of structural change (to increase signal-to-noise ratio) that show statistically different numbers of reads that indicate base pairing in one metabolite condition but not in others (Supplementary Fig. 4b, Methods). This approach aggregates signals of variation in each base pair across conditions to define ligand-responsive regions using dynamic programming, and combines this with the computation of a BLAST-like E-value to identify RNA aptamers with statistical confidence (Methods).
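The region-calling idea described above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the PARCEL pipeline itself (which is specified in Methods): it assumes each base has already been given a score that is positive when V1 reads differ between conditions and negative otherwise, finds the maximal-scoring contiguous segment with a standard dynamic programme, and converts the segment score to an expectation value with the Karlin-Altschul form E = K·N·exp(−λS) familiar from BLAST. The scores and the parameters K and λ are placeholders chosen for illustration.

```python
from math import exp

def best_segment(scores):
    """Return (start, end_exclusive, total) of the maximal-scoring contiguous run.

    Kadane-style dynamic programme: the best run ending at position i either
    extends the best run ending at i - 1 or starts afresh at i.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    best_total, best_start, best_end = scores[0], 0, 1
    run_total, run_start = scores[0], 0
    for i in range(1, len(scores)):
        if run_total < 0:
            # A negative running total can only hurt: restart the run here.
            run_total, run_start = scores[i], i
        else:
            run_total += scores[i]
        if run_total > best_total:
            best_total, best_start, best_end = run_total, run_start, i + 1
    return best_start, best_end, best_total

def e_value(score, search_space, lam=1.0, k=0.1):
    """Karlin-Altschul-style expectation; lam and k are assumed placeholders."""
    return k * search_space * exp(-lam * score)

# Hypothetical per-base scores: positive where reads differ between conditions.
scores = [-1.0, -0.5, 2.0, 1.5, -0.3, 2.2, -2.0, -1.0, 0.4]
start, end, total = best_segment(scores)
print(start, end, round(total, 2), e_value(total, search_space=len(scores)))
```

Scoring a contiguous run rather than single bases is what raises the signal-to-noise ratio: isolated noisy positions rarely accumulate a high segment score, whereas a genuine ligand-responsive region does.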
PARCEL finds known riboswitches in B. subtilis and P. aeruginosa. To evaluate PARCEL, we first determined if we could identify known riboswitches that have been previously reported in the literature. We identified 17 out of 20 known riboswitches that interact with key metabolites (TPP, FMN, and SAM) in B. subtilis and P. aeruginosa, including 4/5 known TPP riboswitches, 2/2 FMN riboswitches, and 9/11 SAM riboswitches in B. subtilis, as well as 1/1 TPP and 1/1 FMN riboswitches in P. aeruginosa¹¹, highlighting the high sensitivity of the method (85%; Fig. 1c, Supplementary Fig. 5a–d). Furthermore, pair-wise analysis of control libraries that were generated using the same metabolite did not identify any candidate RNA aptamers, indicating that the approach is highly specific to the presence of metabolites (Supplementary Fig. 5e). We noted that undetected known riboswitches have low numbers of RNase V1 and S1 nuclease reads along their regulons, indicating that they are either poorly expressed or present in nuclease-inaccessible regions (Supplementary Fig. 5f, g).
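The 85% sensitivity follows directly from the per-ligand counts given above; the short tally below only reproduces that arithmetic.

```python
# (detected, known) riboswitch counts per organism and ligand, as reported in the text.
recovered = {
    ("B. subtilis", "TPP"): (4, 5),
    ("B. subtilis", "FMN"): (2, 2),
    ("B. subtilis", "SAM"): (9, 11),
    ("P. aeruginosa", "TPP"): (1, 1),
    ("P. aeruginosa", "FMN"): (1, 1),
}
detected = sum(d for d, _ in recovered.values())
known = sum(k for _, k in recovered.values())
sensitivity = detected / known
print(f"{detected}/{known} = {sensitivity:.0%}")  # prints 17/20 = 85%
```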
Besides detecting structural changes upon ligand binding, we also determined whether PARCEL can detect RNA-ligand interactions quantitatively. As TPP riboswitches bind most strongly to TPP, followed by thiamine and oxythiamine, we tested the sensitivity of PARCEL in detecting RNA structural changes due to differences in ligand binding affinities⁹. Indeed, we observed the strongest structural change in TPP riboswitches upon binding to TPP, followed by thiamine and then oxythiamine (Fig. 1d, Supplementary Fig. 6a). We also treated the B. subtilis ribosomal RNA-depleted pool with different concentrations of TPP to determine the approximate binding affinity of known TPP riboswitches (Fig. 1e). PARCEL data on the known thiC riboswitch show a graded change in RNA structure under different ligand concentrations, and an approximate K_D of 110 nM in the most ligand-sensitive regions, similar to the previously reported K_D of 100 nM⁹ (Supplementary Fig. 6b). These data collectively demonstrate that PARCEL is quantitative and can be used to approximate relative binding affinities of RNA-ligand interactions.
[Fig. 1 graphic, panels a–e, not reproduced. Recoverable labels: (a) workflow from ligand binding and RNase V1 cleavage through library construction, deep sequencing, and mapping to the transcriptome; (b) normalized sequencing reads vs. base position for the thiM riboswitch with and without TPP (RNase V1, S1 nuclease, in-line probing); (c) number of known riboswitches identified or missed for TPP, FMN, and SAM; (d) normalized reads vs. bases to start codon for water control against 100 µM TPP, thiamine, or oxythiamine; (e) normalized V1 reads vs. base at 0.16, 0.8, 4, 20, 100, and 500 µM TPP, each against H2O.]
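One way to turn a titration series like this into an approximate K_D is to fit a single-site binding isotherm. The sketch below is an illustration under assumptions, not the authors' procedure: it assumes that the structural signal has been rescaled to a 0–1 occupancy, uses the six TPP concentrations of the titration, and fits K_D by a simple grid search. The signals are synthetic values generated from the model itself, not measurements.

```python
def fraction_bound(ligand_um, kd_um):
    """Single-site binding isotherm: occupancy = [L] / (K_D + [L])."""
    return ligand_um / (kd_um + ligand_um)

def fit_kd(concs_um, signals, kd_grid):
    """Pick the K_D on the grid that minimises squared error to the signals."""
    def sse(kd):
        return sum((fraction_bound(c, kd) - s) ** 2
                   for c, s in zip(concs_um, signals))
    return min(kd_grid, key=sse)

# TPP concentrations used in the titration (µM).
concs = [0.16, 0.8, 4, 20, 100, 500]
# Synthetic, noise-free signals generated from the model (NOT measured data);
# 0.11 µM is used here only as the value to be recovered by the fit.
true_kd = 0.11
signals = [fraction_bound(c, true_kd) for c in concs]
# Log-spaced candidate K_D values from 1 nM to 100 µM.
grid = [10 ** (-3 + i * 0.01) for i in range(501)]
kd_hat = fit_kd(concs, signals, grid)
print(f"fitted K_D ~ {kd_hat * 1000:.0f} nM")
```

With a K_D near 0.1 µM, only the lowest concentrations in such a series fall below saturation, so an estimate of this kind is best treated as approximate.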
PARCEL identifies new RNA aptamers in B. subtilis and P. aeruginosa. Beyond known riboswitches, we also identified new aptamers: 17 TPP and 12 FMN RNA aptamers in the B. subtilis and P. aeruginosa transcriptomes, as well as 6 SAM RNA aptamers in the B. subtilis transcriptome (Fig. 2a, Supplementary Table 2). RNA footprinting validated four out of six ligand-induced structure changes identified by PARCEL (Supplementary Fig. 7a–d). While known riboswitches are mostly located upstream of their operons, we observed that 28% and 47% of the newly identified B. subtilis and P. aeruginosa RNA aptamers are found in coding regions, respectively (Fig. 2b, Supplementary Fig. 8). Two of our validated RNA aptamers fall in coding regions (Supplementary Fig. 7a, c), indicating that these coding regions exhibit real structural changes in the presence of TPP.
To study the structural properties of these RNA aptamers, we utilized the program Alifoldz¹² to calculate folding energies across their RNA orthologs in different bacterial species (Methods). The newly identified elements exhibit lower folding energies than their dinucleotide-shuffled controls, indicating that they are more structured (Fig. 2c). As functionally important RNAs frequently show evolutionary constraints in their sequences, we calculated the nucleotide substitution rate of our new RNA aptamers, as compared to synonymous sites and UTRs. Similar to known riboswitches, the novel RNA aptamers show a significant reduction in nucleotide substitution rate (Fig. 2d, e), suggesting that they are evolutionarily conserved and likely to be functional.
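As a rough illustration of a substitution rate expressed as substitutions per aligned base, the sketch below computes the uncorrected proportion of mismatched sites between two aligned sequences. The estimator actually used in the study may apply a correction for multiple substitutions, and the sequences here are toy examples.

```python
def substitution_rate(seq_a, seq_b, gap="-"):
    """Uncorrected substitutions per aligned site, skipping gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore alignment columns that contain a gap
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared

# Toy aligned sequences (hypothetical, not from either genome).
rate = substitution_rate("ACGTTACG-ATT", "ACGTCACGGATA")
print(f"{rate:.3f} substitutions per site")
```

A region under purifying selection would be expected to show a lower value than neutrally evolving sites compared across the same pair of genomes.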
We observed that one of our prokaryotic metabolite-sensitive regions encodes a small non-coding RNA, specifically a tmRNA (Fig. 3a). The B. subtilis tmRNA is transcribed from two promoters to produce both long and short precursor RNAs, which are then processed into the mature tmRNA. To understand the functional role of metabolite sensing in tmRNA, we cloned and in vitro transcribed all three isoforms of tmRNA, then performed structure probing in the presence and absence of FMN in the dark, to avoid FMN-induced photocleavage of RNAs. Interestingly, we observed FMN-induced structure changes in both precursor forms of tmRNA, but not in the mature form (Fig. 3b, c, d, Supplementary Fig. 7d), highlighting that it is the precursor tmRNA that responds to FMN. To determine whether FMN binding influences the processing of precursor tmRNAs, we grew B. subtilis in minimal media with and without the FMN precursor, riboflavin. Addition of riboflavin resulted in a decrease in precursor tmRNA levels and a two-fold increase in mature tmRNA levels (Fig. 3e), supporting the hypothesis that FMN regulates RNA maturation by binding and altering precursor tmRNA structures.
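Fold changes of this kind are commonly derived from qPCR cycle thresholds with the 2^(−ΔΔCt) method, normalizing the target to a reference gene (Veg in Fig. 3e). The sketch below assumes that method and near-100% amplification efficiency; the text above does not specify the calculation, and the Ct values are hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method (assumes ~100% efficiency)."""
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    return 2 ** -(d_ct_treated - d_ct_control)

# Hypothetical Ct values (not from the paper): target vs. reference gene,
# with riboflavin (treated) and without it (control).
fc = fold_change(ct_target_treated=21.0, ct_ref_treated=18.0,
                 ct_target_control=22.0, ct_ref_control=18.0)
print(f"fold change = {fc:.1f}")
```

In this toy case the target crosses threshold one cycle earlier, relative to the reference, when riboflavin is present, which corresponds to a two-fold increase.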
PARCEL finds new eukaryotic FMN aptamers in Candida albicans. To date, only riboswitches that bind to TPP have been found in eukaryotes, and they regulate splicing and 3′ UTR usage¹³,¹⁴. Identifying new eukaryotic riboswitches is important for broadening our understanding of eukaryotic gene regulation. To maximize our chances of finding eukaryotic riboswitches, we screened the fungal pathogen, C. albicans, using a pool of metabolites that correspond to highly abundant classes of riboswitches in bacteria, including FMN, SAM, glycine, lysine, and vitamin B12 (AdoCbl). PARCEL identified 23 new RNA aptamers that exhibited structural changes in the presence of the metabolite pool (Supplementary Table 3). Of the new C. albicans RNA aptamers, 87% reside in coding regions (Fig. 4a, Supplementary Fig. 8), in contrast to known riboswitches and new prokaryotic aptamers, indicating that they may have different functions from classical riboswitches. We validated seven out of nine PARCEL-identified structural changes by performing in-line probing of these novel RNA aptamers in the presence of the pooled metabolites (Supplementary Figs. 9–11), all of which fall in the coding regions of these genes, confirming that the PARCEL-detected structural changes are real. Similar to their prokaryotic counterparts, the eukaryotic RNA aptamers were found to be significantly more structured compared to dinucleotide-shuffled controls, suggesting that structure is likely to be important for their function (Fig. 4b). As many of the new eukaryotic RNA aptamers are located in highly conserved coding regions, we observed an expected increase in conservation of these elements as compared to UTRs, but not a further reduced nucleotide substitution rate compared to other coding sequences (Fig. 4c).
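The comparison with shuffled controls can be read as a z-score: how many standard deviations the observed folding energy lies below the mean of its shuffled counterparts. The sketch below shows only that final step with hypothetical energies; it does not perform the alignment folding or dinucleotide shuffling that Alifoldz carries out.

```python
from statistics import mean, stdev

def folding_z_score(observed_energy, shuffled_energies):
    """z = (observed - mean(shuffled)) / sd(shuffled); negative = more stable."""
    if len(shuffled_energies) < 2:
        raise ValueError("need at least two shuffled controls")
    spread = stdev(shuffled_energies)
    if spread == 0:
        raise ValueError("shuffled energies have zero variance")
    return (observed_energy - mean(shuffled_energies)) / spread

# Hypothetical folding energies in kcal/mol (not values from the paper).
shuffled = [-20.0, -22.0, -18.0, -21.0, -19.0]
z = folding_z_score(-28.0, shuffled)
print(f"z = {z:.2f}")
```

A strongly negative z-score indicates that the native sequence folds more stably than sequences of the same dinucleotide composition, which is the sense in which the aptamers are described as more structured.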
Eukaryotic RNA aptamers undergo gene expression changes with FMN. To better understand the cellular roles of the new eukaryotic RNA aptamers, we performed structure probing in the presence of each individual compound in the metabolite pool on two RNA aptamers identified in the coding regions of the genes RPS31 and ATP1. Interestingly, structure probing of these aptamers revealed that they respond specifically to FMN, and not to other metabolites in the solution (Supplementary Figs. 10a, 11a). Detailed structure probing, in the dark, along the length of these two RNAs identified several regions that changed structure in the presence of FMN (Fig. 4d, Supplementary Fig. 10b), suggesting that FMN binding results in structural remodeling of these RNAs. To determine whether changes in the intracellular concentration
Fig. 1 Measuring RNA-ligand binding by structure probing and deep sequencing. a RNA undergoes structure changes upon ligand binding. This structural change is detected by the double-strand specific nuclease, RNase V1, which cuts at different double-stranded places along the RNA in the presence and absence of the ligand. The cleavage sites are then captured and cloned into a cDNA library for deep sequencing. After mapping the reads to the transcriptome, we can identify which bases have undergone changes in structuredness upon ligand binding (highlighted in beige boxes). b Deep sequencing reveals structure changes of a known TPP riboswitch, thiM, using RNase V1 (top), S1 nuclease (middle), and in-line probing (bottom). The red and black lines indicate the structure profiles of thiM treated with and without 100 µM TPP, respectively. The beige regions highlight regions of structural changes upon ligand binding. c PARCEL identified 85% of known TPP, FMN, and SAM riboswitches in B. subtilis and P. aeruginosa. The black and the white bars indicate the number of known riboswitches that were captured and missed in our study, respectively. d PARCEL sequencing data for the B. subtilis TPP riboswitch, thiT, in the presence and absence of 100 µM TPP (top), 100 µM thiamine (middle), and 100 µM oxythiamine (bottom). PARCEL detected strongest structural change in thiT in the presence of TPP, followed by thiamine and then oxythiamine, which corresponds to the binding affinities of TPP riboswitches for these metabolites⁹. e The plots show normalized V1 read counts of the thiC TPP riboswitch under increasing concentrations of TPP. PARCEL was performed on the B. subtilis transcriptome.
[Fig. 2 graphic, panels a–e, not reproduced. Recoverable labels: (a) number of RNA aptamers per ligand (TPP, FMN, SAM in B. subtilis; TPP, FMN in P. aeruginosa), known riboswitches vs. new RNA aptamers; (b) location of B. subtilis known riboswitches, B. subtilis new RNA aptamers, and P. aeruginosa new RNA aptamers among 5′ UTR, CDS, and 3′ UTR; (c) Z-score (obtained from Alifoldz), observed vs. shuffled, with p = 6.25×10⁻⁶ (B. subtilis) and p = 0.006 (P. aeruginosa); (d, e) nucleotide substitution rate for KrCDS, KrUTR, K5UTR, K3UTR, Ks, and Ka (plus Krknown in one panel), for B. spizizenii vs. B. subtilis and PAO1 vs. PA7, with p-values of 0.001 and 0.002 shown.]
Fig. 2 PARCEL identifies new RNA aptamers in bacterial species. a PARCEL identifies a total of 52 RNA aptamers in B. subtilis and P. aeruginosa. Black and white bars indicate the numbers of known riboswitches and novel aptamers that are identified in our study, respectively. b Distribution of known riboswitches and new RNA aptamers along the 5′ UTR, CDS, and 3′ UTR regions for B. subtilis and P. aeruginosa, showing that a substantial proportion of RNA aptamers are located in the 3′ UTR and CDS regions. c Comparison of score distribution of Alifoldz¹² for RNA aptamers vs. shuffled counterparts. The upper, middle, and lower bounds of the boxplot represent the 75, 50, and 25th percentile of the values, respectively. A negative score indicates a stable, conserved consensus structure. p-value was obtained using the non-parametric Kolmogorov–Smirnov test. d, e Comparison of the nucleotide substitution rate (number of substitutions per base-pair) for new RNA aptamers in coding region (KrCDS), new RNA aptamers in UTR (KrUTR), 3′ UTR (K3UTR), 5′ UTR (K5UTR), synonymous sites (Ks), and non-synonymous sites (Ka). The upper, middle, and lower bounds of the boxplot represent the 75, 50, and 25th percentile of the values, respectively. To calculate nucleotide substitutions, B. subtilis 168 was compared to B. subtilis subsp. spizizenii W23 (d), and P. aeruginosa PAO1 was compared to P. aeruginosa PA7 (e). Note that Krknown denotes the substitution rate of known riboswitches in B. subtilis (15 in total) as annotated in the RegPrecise database¹¹. Krknown was not calculated in P. aeruginosa as there are too few known TPP and FMN riboswitches. p-values were calculated using the non-parametric Kolmogorov–Smirnov test.
of FMN could alter gene expression of RPS31 or ATP1 in vivo, we integrated FLAG-tagged C. albicans RPS31 or ATP1 in a S. cerevisiae FMN1 synthase deletion mutant (fmn1Δ), as S. cerevisiae is known to take up exogenous FMN, unlike C. albicans¹⁵. Both transcript and protein levels of FLAG-tagged C. albicans RPS31 and ATP1 were measured, following growth of the integrated strains under varying FMN concentrations. We found that while transcript levels of FLAG-tagged C. albicans RPS31 and ATP1 in fmn1Δ did not change with increasing FMN concentrations (Supplementary Figs. 10c, 11b), RPS31 and ATP1 protein levels were decreased and increased, respectively (Fig. 4e, Supplementary Figs. 10d–f, 11c), suggesting that FMN sensing could have a regulatory effect on gene expression in vivo, and that
[Fig. 3 graphic omitted. Recoverable labels: B. subtilis tmRNA locus (genome coordinates 3,450,000–3,451,500; BSU33590, BSU33600) with tracks for tmRNA, short precursor, and long precursor; NAI footprinting gel of the long precursor with and without FMN (A-ladder positions 139A–175A); relative NAI probing intensity per base for long precursor, short precursor, and mature tmRNA (water vs. FMN); fold change +/– riboflavin for Veg, FMN riboswitch, precursor tmRNA, and tmRNA (p = 0.04, p = 0.02).]
Fig. 3 Precursor tmRNA can act as an RNA sensor for FMN. a Schematic of the B. subtilis tmRNA genomic locus (top) and predicted secondary structures of the long precursor tmRNA, short precursor tmRNA, and mature tmRNA, using the RNAfold program (ref. 20) (bottom). b RNA footprinting analysis of the long precursor tmRNA, using a SHAPE-like chemical (NAI), in the presence (lane 4) and absence (lane 3) of 100 µM FMN. Also shown are A ladder (lane 1) and unmodified RNA (lane 2). The red bar indicates bases that become more single-stranded in the presence of FMN. c Predicted secondary structure of the tmRNA long precursor using the RNAfold program (ref. 20). The red bases correspond to the positions marked by the red bar in b. d Average footprinting analysis (n = 3, SAFA) of mature (top), short precursor (middle), and long precursor tmRNA (bottom), in the presence (red) and absence (black) of 100 µM FMN, in the dark. The beige box indicates the region of increased flexibility in the precursor tmRNAs in the presence of FMN. The stars indicate bases that show statistically significant changes with FMN (p ≤ 0.05, Student's t-test). e qPCR analysis of the mRNA expression level of precursor tmRNA and mature tmRNA, across six biological replicates, after addition of 100 µM of riboflavin to the growth media of B. subtilis. Fold changes are normalized to the negative control Veg gene. The known B. subtilis FMN riboswitch is used as the positive control. p-values were calculated by Student's t-test; the error bars indicate standard deviation of the replicates
[Fig. 4 graphic omitted. Recoverable labels: distribution of new C. albicans RNA aptamers across 5′UTR, CDS, and 3′UTR; Z-score (obtained from Alifoldz), observed vs. shuffled; nucleotide substitution rate for Ka, Kr, Ks, K_3′UTR, and K_5′UTR (C. alb vs. C. dub); in-line probing and RNase V1 gels of RPS31 with FMN (100 μM); wildtype and codon-optimized RPS31 protein levels with loading controls, WT and fmn1Δ, at 2.5, 5.0, and 10.0 mM FMN; in-line probing with FMN, FAD, and riboflavin at 20, 100, and 500 μM; normalized in-line probing intensity per base (232–258) for WT RPS31 and codon-optimized RPS31, water vs. 100 μM FMN. p-values shown on the panels: 0.016 and 3.4 × 10^–12.]
Fig. 4 PARCEL identifies new RNA aptamers in Candida albicans. a Pie chart of the number of C. albicans RNA aptamers that are located in 5′UTR, CDS, and 3′UTR. The majority of C. albicans RNA aptamers are found in CDSs. b Comparison of the distribution of Alifoldz scores for RNA aptamers vs. shuffled counterparts. The upper, middle, and lower bounds of the boxplot represent the 75th, 50th, and 25th percentile of the values, respectively. A negative score indicates a stable, conserved consensus structure. p-value was obtained using the non-parametric Kolmogorov–Smirnov test. c Nucleotide substitution rates, calculated as the number of substitutions per base-pair, for RNA aptamers (Kr), 3′UTR (K_3′UTR), 5′UTR (K_5′UTR), synonymous sites (Ks), and non-synonymous sites (Ka). The upper, middle, and lower bounds of the boxplot represent the 75th, 50th, and 25th percentile of the values, respectively. C. albicans SC5314 was compared to Candida dubliniensis for the calculation. p-value was obtained using the non-parametric Kolmogorov–Smirnov test. d Gel analysis of RPS31 mRNA using in-line probing (left) and RNase V1 (right) in the presence (lane 3) and absence (lane 2) of 100 µM FMN. The A ladder (A, lane 1) is also shown. The black arrows indicate positions along the RNA that changed in the presence of FMN. e A representative Western blot showing RPS31::FLAG (top) and loading (bottom) protein levels in RPS31::FLAG knock-in strains with WT (left) and fmn1Δ (right) backgrounds cultured at different FMN concentrations (mM). Using t-test (n = 8), significant p-values of 0.009 and 0.01 (for 2.5 and 5.0 mM against 10.0 mM, respectively) were determined for fmn1Δ, but not WT (p-values of 0.2 and 0.5). f Gel analysis of RPS31 mRNA using in-line probing in the presence of 20, 100, or 500 µM of FMN, FAD or riboflavin. In-line probing of RNA in the absence of metabolite (H2O, lane 2) and A ladder (A, lane 1) are also shown. g SAFA analysis of WT RPS31 (top) and codon-optimized RPS31 (bottom) in the presence (red line) and absence (black line) of 100 µM FMN. The beige box indicates the region of structural change in WT RPS31 when it interacts with FMN. This structural change is absent in the codon-optimized RPS31. h A representative Western blot showing codon-optimized RPS31::FLAG (top) and loading (bottom) protein levels in codon-optimized RPS31::FLAG knock-in strains with WT (left) and fmn1Δ (right) backgrounds cultured at different FMN concentrations (mM). Using t-test (n = 3), calculated p-values for 2.5 and 5.0 mM were insignificant for both fmn1Δ (both 0.7) and WT (0.9 and 0.09)
these genes could represent the first known eukaryotic FMN riboswitches.
To further understand ligand binding specificities and affinities of this putative FMN riboswitch, we performed detailed structural studies on the RPS31 transcript. We observed that RPS31 RNA binds specifically to FMN and does not respond to structurally similar analogs, such as riboflavin and FAD (Fig. 4f, Supplementary Fig. 11d). Integrating double-stranded (RNase V1), single-stranded (S1 nuclease) structure probing and in-line probing information along the length of the RPS31 transcript into the RNAfold prediction software showed that RPS31 RNA consists of seven stems around a central loop, and that FMN binding results in extensive structural rearrangements (Supplementary Fig. 12a). The FMN-bound RPS31 aptamer consists of six stems around the central FMN-bound loop, and appears to resemble the prokaryotic FMN riboswitch structure (ref. 16). This attests to the structural plasticity of RNA molecules, whereby different sequences can be utilized to form similar structures for cellular function.
To further test whether the FMN-induced change in RPS31 protein levels is mediated post-transcriptionally, we designed a codon-optimized version of C. albicans RPS31 (by changing nucleotides of synonymous bases) to disrupt RPS31 RNA structure without altering its protein sequence. As expected, codon-optimized RPS31 mRNA is structurally different from wildtype RPS31 mRNA, and does not show structure changes in the presence and absence of FMN (Fig. 4g). Codon-optimized RPS31 maintained similar mRNA and protein levels at varying concentrations of FMN (Fig. 4h, Supplementary Fig. 12b–d), supporting the hypothesis that the binding of FMN to native C. albicans RPS31 RNA results in post-transcriptional regulation of RPS31 protein levels.
Discussion
In summary, we have developed a new strategy named PARCEL that experimentally identifies RNA aptamers transcriptome-wide by detecting ligand-induced structure changes. As PARCEL allows us to rapidly screen through transcriptomes to identify RNA aptamers, we identified a total of 58 novel candidate RNA aptamers in two prokaryotic and one eukaryotic species, including a second class of putative eukaryotic riboswitches. Unlike known riboswitches described in the literature, the newly identified aptamers reside in both UTRs and coding sequences, and are not necessarily linked to the biosynthetic pathways of their respective ligands. Further characterization of three new RNA aptamers showed that they could be riboswitches, as FMN-sensing induces RNA structural changes and regulates transcript levels of tmRNA in B. subtilis, and protein levels of RPS31 and ATP1 in C. albicans. As PARCEL can be readily applied to any transcriptome and ligand, we believe that further application of PARCEL to diverse organisms will result in the identification of many novel natural RNA aptamers in the near future, providing new building blocks for biological sensing and deepening our understanding of RNA-based gene regulation in vivo.
Methods
Bacterial and yeast cultures. P. aeruginosa PAO1 and B. subtilis 168 were grown in LB or minimal media to log (OD600 = 0.6–0.8) or stationary phases (OD600 > 2). Total RNA from P. aeruginosa was extracted using Trizol reagent (Thermo Fisher Scientific). Total RNA from B. subtilis was extracted by first incubating B. subtilis in 4 mg per mL of lysozyme for 15 min before using Trizol LS reagent. Ribosomal depleted RNA, Ribo(−) RNA, was obtained by using Ribo-Zero rRNA Removal Kit (Epicentre) according to manufacturer's instructions. S. cerevisiae S288C was grown in YPD to exponential phase (OD600 = 0.6–0.8). C. albicans strain SC5314 was grown in YPD or GMM (yeast nitrogen base without amino acids and with 2% glucose) to exponential (OD600 = 0.6–0.8) or stationary phases (OD600 > 2). Total RNA from S. cerevisiae or C. albicans was extracted using a slightly modified protocol that uses hot acid phenol (ref. 17). Poly(A)+ RNA was obtained by using the Poly(A) Purist MAG kit (Thermo Fisher Scientific) according to manufacturer's instructions. Poly(A)+ or Ribo(−) RNA were then structure probed in the presence and absence of metabolites.
The fmn1Δ mutant was created by replacement of FMN1 in BY4741 strain with KanMX using homologous recombination (ref. 18). RPS31 from C. albicans, with a C-terminal FLAG-tag (GATTACAAGGACGACGATGACAAG), was integrated together with URA3 at the ura3Δ site to generate the RPS31::FLAG knock-in strains. ATP1 from C. albicans, with an N-terminal FLAG-tag, was integrated together with URA3 at the ura3Δ site to generate the ATP1::FLAG knock-in strains.
RNA structure probing. Briefly, 250 ng of Poly(A)+ or Ribo(−) RNA was heated to 90 °C for 2 min and cooled on ice for 2 min before adding 10X RNA structure buffer (500 mM Tris pH 7.4, 1.5 M NaCl, 100 mM MgCl2) and metabolites to the RNA. The RNA pool was slowly brought to 37 °C for 30 min and structure probed using RNase V1 (1:2000 dilution, AM2275 Life Technologies) or S1 nuclease (1:500 dilution, Fermentas) at 37 °C for 15 min. The nuclease reactions were inactivated using phenol chloroform extraction and ethanol precipitated. In-line probing reactions were performed in 50 mM Tris-HCl (pH 8.3), 20 mM MgCl2, and 100 mM KCl at 25 °C for 40 hours (ref. 8). The in-line probed RNA was phosphorylated using T4 polynucleotide kinase (PNK) in T4 PNK buffer and 1 mM ATP to capture the cleavage sites.
Library preparation. Structure probed RNA was fragmented at 95 °C for 3.5 min in alkaline hydrolysis buffer (Ambion). As fragmentation results in 5′-OH ends, which are ligation incompatible, it does not interfere with the downstream library preparation process. Fragmented RNA was then purified using RiboMinus concentration module (Life Technologies), using the modified protocol for RNAs that are <200 bases. The RNA was eluted in 12 µl of nuclease free water and concentrated to 2 µl using a vacuum centrifuge. The RNA was then ligated to 5′ adapter from NEBNext Multiplex Small RNA Library Prep Set for Illumina using T4 RNA ligase 1 (T4 RNA ligase buffer, 1 mM ATP, 10% PEG, 10% DMSO) at 16 °C overnight. The 5′ adapter ligated RNAs were then purified through a 6% TBE urea PAGE gel and size selected for 50–200 bases. The RNA was then ligated to 3′ adapter, reverse transcribed, and PCR amplified using the NEBNext Multiplex Small RNA Library Prep Set (New England Biolabs) for Illumina using manufacturer's instructions. Eighteen cycles of PCR amplification were typically performed for each library.
RNA footprinting analysis. Cleavage and modification sites along structure probed RNA were identified using primer extension. Briefly, a primer located ~30–50 bases downstream of the structure probed region was labeled with γ-32P ATP using T4 PNK kinase. The labeled primer was then purified using a 15% TBE urea PAGE gel. The labeled primer was incubated with the RNA at 65 °C for 5 min, followed by 35 °C for 5 min, and then cooled at 4 °C. To detect the structure probed sites, we added 3 µl of enzyme mix (4:1:1 of first-strand buffer: DTT: NTP) to the reaction, incubated at 52 °C for 1 min, and Superscript III was added to the reaction at 52 °C for 10 min. To generate a sequencing ladder for the RNA, we added 1 µl of ddNTP (5 mM) to the reaction after the enzyme mix, and before adding Superscript III. 4 M sodium hydroxide was added to the reaction to denature the RNA before the samples were loaded onto a 7 M TBE-Urea PAGE sequencing gel. Gel images were quantified using the software Semi-automated footprinting analysis (SAFA) (ref. 19).
RNA structure models. RNA secondary structure predictions were generated using RNA footprinting data with RNase V1, S1 nuclease, and NAI as constraints, using the program RNAfold (ref. 20) with default parameters.
qPCR and Western blotting for wildtype and codon-optimized RPS31. The RPS31::FLAG and ATP1::FLAG strains were inoculated from single colonies into 2 mL SC-ura media and grown overnight at 30 °C, with shaking. Strains with the fmn1Δ mutation were supplemented with 10 mM FMN and 200 µg per mL G418 in the cultures. The overnight cultures (1:100 dilution of OD600 2.0) were used to seed 50 mL YPD cultures supplemented with 2.5, 5.0 or 10.0 mM FMN. Cells were harvested after 4–6 h of growth at 30 °C, with shaking (when OD600 reaches 0.4). The cultures were split for RNA extraction and Western blotting, pelleted, and washed once with PBS. The resultant cell pellets were frozen and stored at −80 °C.
RNA extraction and qPCR. RNA was extracted from frozen yeast pellets using the hot acidic phenol method and treated with TURBO™ DNase (ThermoFisher Scientific) (ref. 17). We made cDNA using the Transcriptor First Strand cDNA Synthesis Kit (Roche) and qPCR was performed using SYBR Green Master Mix (Roche) on a Light Cycler 96 instrument (Roche). Primers used are listed below. The RPS31 and ATP1 primers are specific for the knock-in C. albicans RPS31 and ATP1, and do not amplify the endogenous S. cerevisiae RPS31 and ATP1. Normalized fold changes were calculated by normalizing against actin (ACT1) and the respective strain cultured at 10 mM FMN.
Primer set                    Forward                                Reverse
RPS31 set 1                   (HZ_pri072) TCCACCAGACCAACAAAGATTG     (HZ_pri073) ACCAAGTGCAAGGTGGATTC
RPS31 set 2                   (HZ_pri023) GAATCCACCTTGCACTTGGTC      (HZ_pri024) GCCAACTTGTGTTTTCTGTGC
RPS31 set 3                   (HZ_pri015) GCACAGAAAACACAAGTTGGC      (HZ_pri016) CCATGAAAATACCGGCACCAC
ACT1                          (HZ_pri051) ATGGATTCTGAGGTTGCTGC       (HZ_pri052) TGGTCTACCGACGATAGATGG
RPS31_codon-optimized set 1   (HZ_pri118) TGGAGGTCGAGTCATCAGATAC     (HZ_pri119) CGTCTTATTTTCGCAGGGAAGC
RPS31_codon-optimized set 2   (HZ_pri122) TTTCGCAGGGAAGCAGTTAG       (HZ_pri123) TTTTTCCCCCACCCCTTAACC
RPS31_codon-optimized set 3   (HZ_pri120) TAGAGAGGTTGAGGCGTGAATG     (HZ_pri121) GGCACTTACCGCAATATTGACG
ATP1 set 1                    (HZ_pri064) TTACGTACTGCTGCTCGTACAG     (HZ_pri065) GGCAGAGGCAAATCTTTGAGC
ATP1 set 2                    (HZ_pri058) AAGTCGGGGTTGTGTTGTTC       (HZ_pri059) TTCTGGACCAATTGGAACGG
ATP1 set 3                    (HZ_pri062) TCGCTGGTGTTAACGGTTTC       (HZ_pri063) CACCCTTGGTCTTAATAGCATCC
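The qPCR normalization described above (against ACT1, and against the same strain cultured at 10 mM FMN) can be sketched as follows. This is only an illustration: the text does not state the exact calculation, so the 2^(−ΔΔCt) formula, the function name `normalized_fold_change`, and the Ct values are assumptions, and perfect amplification efficiency is assumed.

```python
def normalized_fold_change(ct_target, ct_act1, ct_target_10mM, ct_act1_10mM):
    """Fold change of a target transcript relative to the 10 mM FMN culture.

    Assumes the 2^(-ddCt) method (an assumption, not stated in the text):
    each Ct is first normalized to ACT1 in the same sample, then to the
    same strain grown at 10 mM FMN.
    """
    delta_ct_sample = ct_target - ct_act1                # normalize to ACT1
    delta_ct_calibrator = ct_target_10mM - ct_act1_10mM  # 10 mM FMN culture
    ddct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the target crosses threshold one cycle earlier
# (relative to ACT1) at 2.5 mM FMN than at 10 mM FMN.
print(normalized_fold_change(24.0, 18.0, 25.0, 18.0))
```

A value of 1.0 means no change relative to the 10 mM FMN culture.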
Western blotting. The frozen cell pellet was resuspended in 80 µL of lysis buffer (50 mM Tris pH 7.4, 4% SDS) with proteinase inhibitor (one tablet of cOmplete™, Mini, EDTA-free [Roche] in 2.5 mL of lysis buffer). An equal volume of glass beads (425–600 µm, Sigma-Aldrich) was added and cells were lysed in a Mini-Beadbeater-96 (Biospec Products) for two cycles of 15 s with a two minute interval. Cell lysates were centrifuged and supernatants were run on 4–20% Mini-Protean TGX Stain-Free protein gels (Bio-Rad). To determine the relative levels of total protein (loading), the gels were first imaged on a ChemiDoc MP System (Bio-Rad) using the stain-free technology (ref. 21). Following wet transfer and blocking with 5% milk in PBST, RPS31::FLAG and ATP1::FLAG were detected using mouse anti-FLAG M2 primary antibody (1:4000 in PBS at 4 °C overnight, Sigma-Aldrich, F1804) and sheep anti-mouse IgG, HRP-linked secondary antibodies (1:20000 in 1% milk at 25 °C for an hour, GE Healthcare, NA931). All images were taken using the ChemiDoc MP System (Bio-Rad) and analyzed with ImageJ (ref. 22). Corresponding uncropped images of blots (in main figures) can be found in supplementary figures.
Read mapping. Short reads from PARCEL libraries (Illumina HiSeq, 50 bp, single-end) in P. aeruginosa PAO1, B. subtilis 168, and E. coli K12 were aligned to their corresponding reference genomes downloaded from NCBI using the short-read aligner bowtie2 (parameters: -k 1 --local) (ref. 23). In the case of S. cerevisiae S288C and C. albicans SC5314, we extracted the UTR annotation from Bruno et al. (refs. 24, 25) and integrated them into their corresponding transcriptomes before alignment by bowtie2. In both cases, only uniquely mapped reads were used for subsequent analysis.
Identification of RNA aptamers. For each position along the genome or transcriptome, we counted the number of reads whose first mapped base was one base downstream of the inspected position. Higher counts suggest greater accessibility to V1 nuclease, and are more likely to be associated with a double-stranded conformation. In all expressed transcripts, positions with zero count could either be associated with a single-stranded conformation, or come from a heavily folded region that is inaccessible to V1 nuclease.
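The per-position counting rule described above can be illustrated with a minimal sketch. The function name `v1_cut_counts`, the use of 0-based forward-strand coordinates, and the toy read starts are assumptions made for illustration; they are not taken from the authors' pipeline.

```python
from collections import Counter


def v1_cut_counts(read_starts, length):
    """Count, for each 0-based position i, the reads whose first mapped
    base is i + 1, i.e. one base downstream of the inspected position."""
    starts = Counter(read_starts)
    return [starts.get(i + 1, 0) for i in range(length)]


# Hypothetical toy data: 0-based 5' start coordinates of five uniquely
# mapped reads on the forward strand of a 6-nt transcript.
counts = v1_cut_counts([1, 1, 3, 5, 5], length=6)
print(counts)  # [2, 0, 1, 0, 2, 0]
```

Reads on the reverse strand would need the mirrored rule (first mapped base one position upstream in genome coordinates), which this sketch leaves out.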
Since RNA structural changes should typically span across multiple bases, we looked for regions that exhibit differential V1 counts to increase the sensitivity/specificity of detecting RNA aptamers. We first evaluated the significance of differential V1 counts at each nucleotide position using the edgeR package (ref. 26), where we compared samples treated by one specific metabolite (e.g., TPP) to samples from all other conditions. We focused on positions that were generally accessible to the V1 nuclease by applying a minimum abundance threshold (average counts per sample per position, a > 1) and then computed a score $s_i$ for each passed position $i$ based on edgeR-generated p-values ($\mathrm{pval}_i$): $s_i = \ln(0.1) - \ln(\mathrm{pval}_i)$ (in effect, giving negative scores for p-values > 0.1).
The higher the score, $s_i$, the more likely that differential V1 cutting was observed at that specific position. Accordingly, positions that failed to pass the abundance threshold were assigned a penalty score of −10. We then looked for segments of contiguous positions (e.g., a segment from position m to n) with the highest aggregate score $S = \sum_{i=m}^{n} s_i$, by applying the Kadane algorithm (maximal subarray problem). We then determined the significance of these high-scoring segments based on Karlin–Altschul statistics, similar to the approach used in BLAST (ref. 27).
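The scoring and segment search described above can be sketched as follows. The function names, the toy p-values, and the toy abundances are hypothetical. The penalty is taken here as −10, on the reading that a penalty must lower the aggregate score.

```python
import math

PENALTY = -10.0  # assumed sign: positions failing the abundance filter are penalized


def position_scores(pvals, mean_counts, min_abundance=1.0):
    """s_i = ln(0.1) - ln(pval_i) where abundance > threshold, else the penalty."""
    return [
        math.log(0.1) - math.log(p) if a > min_abundance else PENALTY
        for p, a in zip(pvals, mean_counts)
    ]


def best_segment(scores):
    """Kadane's algorithm: return (best_sum, start, end), indices inclusive."""
    best_sum, best_start, best_end = scores[0], 0, 0
    run_sum, run_start = scores[0], 0
    for i in range(1, len(scores)):
        if run_sum < 0:  # a negative running sum can only hurt; restart here
            run_sum, run_start = scores[i], i
        else:
            run_sum += scores[i]
        if run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, i
    return best_sum, best_start, best_end


# Hypothetical edgeR p-values and mean counts for six adjacent positions.
pvals = [0.5, 1e-4, 1e-3, 0.5, 1e-5, 0.9]
abundance = [5, 5, 5, 5, 5, 0.2]
S, m, n = best_segment(position_scores(pvals, abundance))
print(m, n)  # 1 4
```

In this toy case the weak position at index 3 is absorbed into the segment because the strong positions on either side outweigh its negative score. That is the behaviour the aggregate score is meant to provide.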
As described by Karlin and Altschul (ref. 27), the expected value (E-value) of high-scoring segments with an aggregate score of at least S is given by the formula:
$$E_v = K e^{-\lambda S}. \qquad (1)$$
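Equation (1) is a one-line computation. The sketch below plugs in the values of K and λ that the Methods go on to derive (K = 0.0809635, λ = 0.862871). The function name `e_value` is an assumption for illustration.

```python
import math


def e_value(S, K=0.0809635, lam=0.862871):
    """Expected number of segments with aggregate score >= S, per equation (1)."""
    return K * math.exp(-lam * S)


# The E-value falls off exponentially as the aggregate score grows.
for S in (0.0, 10.0, 20.0):
    print(S, e_value(S))
```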
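Therefore, we examined the extreme value distribution of the aggregate score S to estimate the two parameters required, i.e., K and λ. Specifically, λ can be calculated from the formula $\sum_i p_i e^{\lambda s_i} = 1$, where $p_i$ is the corresponding probability of the score $s_i$. As $s_i = \ln(0.1) - \ln(\mathrm{pval}_i)$ and $\mathrm{pval}_i$ approximately follows a uniform distribution (U(0,1)) due to the assumption that the majority of nucleotide positions do not undergo any structural changes, the equation $\sum_i p_i e^{\lambda s_i} = 1$ can be translated to:
$$\int_{\mathrm{pval}_i = 0}^{1} e^{\lambda\left(\ln(0.1) - \ln(\mathrm{pval}_i)\right)} \, d\,\mathrm{pval}_i = 1.$$
This can be solved to $\frac{0.1^{\lambda}}{1 - \lambda} = 1$, and λ = 0.862871. The parameter K is bounded between $K^- = C^* \frac{\lambda\delta}{e^{\lambda\delta} - 1}$ and $K^+ = C^* \frac{\lambda\delta}{1 - e^{-\lambda\delta}}$. Since δ is the smallest span of $s_i$, K is bounded between $\lim_{\delta \to 0} K^-$ and $\lim_{\delta \to 0} K^+$, and $C^*$ is defined by the formula:
$$C^* = \frac{\exp\left(-2 \sum_{k=1}^{\infty} \frac{1}{k} \left( E\left[e^{\lambda S_k}; S_k < 0\right] + \mathrm{Prob}(S_k \ge 0) \right)\right)}{\lambda E\left[S_1 e^{\lambda S_1}\right]}.$$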
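Here, $S_k$ is the random variable representing the sum of k independently chosen $s_i$, i.e., $S_k = \sum s_i = k \ln(0.1) + \sum \left(-\ln \mathrm{pval}_i\right)$. Since $\mathrm{pval}_i \sim U(0,1)$, $\sum \left(-\ln \mathrm{pval}_i\right)$ approximately follows the gamma distribution with shape k and scale 1. Let $X_k = \sum_{i=1}^{k} -\ln(\mathrm{pval}_i)$; it can then be derived that:
$$E\left[e^{\lambda S_k}; S_k < 0\right] = 0.1^{\lambda k} \int_{X_k = 0}^{-k \ln 0.1} \frac{X_k^{k-1} e^{(\lambda - 1) X_k}}{(k-1)!} \, dX_k,$$
$$\mathrm{Prob}(S_k \ge 0) = 1 - \int_{t=0}^{-k \ln 0.1} \frac{t^{k-1} e^{-t}}{(k-1)!} \, dt,$$
and
$$\lambda E\left[S_1 e^{\lambda S_1}\right] = \lambda \, 0.1^{\lambda} \, \frac{1 - (\lambda - 1) \ln 0.1}{(\lambda - 1)^2}.$$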
Taken together, $C^*$ can be solved to take the value of 0.0809635, and the upper and lower bounds of K, $K^-$ and $K^+$, both equal $C^*$, i.e., K = 0.0809635. We then calculated $E_v$ for high-scoring segments by applying equation (1). Segments that pass the $E_v$ threshold of 10 were considered as candidate RNA aptamers that undergo metabolite-responsive conformational changes. Under all conditions, we report candidate regions that have positions with absolute fold-change f > 2, relative abundance greater than median + standard deviation for the transcript (abundance-filter; to avoid segments with lower accessibility) and low Bonferroni-corrected p-value (<10; to avoid segments with no strongly changing position).
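The reported constants can be checked numerically with only the standard library. The sketch below is an independent illustration, not the authors' code. It solves 0.1^λ / (1 − λ) = 1 by bisection, skipping the trivial root at λ = 0. It then evaluates the series for C* using the fact that, for integer k, the gamma tail probability equals a finite Poisson sum. The function names, the bisection bracket, and the truncation at k = 100 are all assumptions.

```python
import math

LN10 = math.log(10.0)


def solve_lambda(lo=0.1, hi=0.999, tol=1e-12):
    """Bisection for the non-trivial root of 0.1**lam / (1 - lam) = 1."""
    def f(lam):
        return 0.1 ** lam - (1.0 - lam)  # negative below the root, positive above
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gamma_upper(k, x):
    """P(Gamma(shape=k, scale=1) >= x) for integer k, as a Poisson sum."""
    term = math.exp(-x)
    total = term
    for j in range(1, k):
        term *= x / j
        total += term
    return total


def c_star(lam, k_max=100):
    """Evaluate C* by truncating the series over k at k_max terms."""
    series = 0.0
    for k in range(1, k_max + 1):
        a = k * LN10  # S_k < 0 is equivalent to X_k < k ln 10
        # Substituting u = (1 - lam) X_k turns the truncated expectation into
        # (0.1**lam / (1 - lam))**k times a lower gamma probability.
        partial = (0.1 ** lam / (1.0 - lam)) ** k * (1.0 - gamma_upper(k, (1.0 - lam) * a))
        tail = gamma_upper(k, a)  # Prob(S_k >= 0)
        series += (partial + tail) / k
    denom = lam * 0.1 ** lam * (1.0 - (lam - 1.0) * math.log(0.1)) / (lam - 1.0) ** 2
    return math.exp(-2.0 * series) / denom


lam = solve_lambda()
print(lam, c_star(lam))
```

Under these assumptions the two printed values should land close to the λ and C* quoted in the text. The series terms decay roughly geometrically, so a few dozen terms already suffice.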
Distribution of RNA aptamers across operons and transcripts. We evaluated the distribution of RNA aptamers across operons in bacteria, and along transcripts in fungi (including a 500 bp window on either side when UTRs were not specified). We plotted the histogram of all RNA aptamer positions, with operons in bacteria and coding regions in fungi being scaled to 1 kbp. There are cases where the same position can be considered as belonging to multiple classes and in such cases, we preferentially assigned positions to the 5′UTR, then to the operon or CDS, and lastly to the 3′UTR.
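The priority rule and the 1 kbp rescaling described above can be sketched as follows. The labels, function names, and coordinates are hypothetical and only illustrate the logic.

```python
# Priority order from the text: 5'UTR first, then operon/CDS, then 3'UTR.
# "CDS" stands in for "operon" when the organism is a bacterium.
PRIORITY = ("5'UTR", "CDS", "3'UTR")


def assign_class(candidate_classes):
    """Pick one class for a position that overlaps several annotations."""
    for label in PRIORITY:
        if label in candidate_classes:
            return label
    return None


def scale_position(pos, region_start, region_end, scaled_length=1000):
    """Map a position inside [region_start, region_end) onto a 1 kbp axis."""
    return (pos - region_start) / (region_end - region_start) * scaled_length


print(assign_class({"3'UTR", "CDS"}))   # CDS
print(scale_position(150, 100, 300))    # 250.0
```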
Sequence conservation of RNA aptamers. We estimated the sequence conservation of identified RNA aptamers by measuring nucleotide substitution rate of these regions to their blastn-identified orthologous sequences. If the identified aptamer regions were shorter than 200 bases in length, we extended them on both sides to a maximum of 200 bases. As highly divergent and highly similar sequences would result in an unreliable estimate of nucleotide substitution rate (ref. 28), we chose fairly divergent, and yet not too divergent species (median Ks ranges from 0.1 to 0.4) for this analysis. The orthologous riboswitches, 3′UTR, 5′UTR, and protein coding regions were identified in other species using blastn for non-coding sequences or genBlastG (ref. 29) for coding sequences. To identify orthologous noncoding sequences in other organisms with high sensitivity, we changed the default blastn parameters as follows: -e 1e-5 -word_size 7 -gapopen 2 -gapextend 1 (ref. 30). We aligned the noncoding sequences using MUSCLE (ref. 31), and the coding sequences using MACSE (ref. 32), to construct the multiple sequence alignment. The nucleotide substitution rates of riboswitches, 3′UTR and 5′UTR were calculated using Kimura's 2-parameter method (ref. 33). The synonymous and non-synonymous substitution rates were calculated using KaKs_Calculator (ref. 34) with the LPB method.
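For the non-coding regions, Kimura's 2-parameter distance has a closed form, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transition and transversion differences between two aligned sequences. The sketch below is a minimal illustration of that formula. The function name `k2p_distance` and the choice to skip gapped or ambiguous columns are assumptions, not a description of the software the authors used.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    Columns containing gaps or ambiguous bases are skipped (an assumption).
    Raises ValueError if no sites are comparable or the pair is saturated.
    """
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    P, Q = transitions / sites, transversions / sites
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


# Hypothetical 10-nt alignment with a single A->G transition.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```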
Calculating the degree of pairedness for RNA aptamers and controls. We searched for orthologous sequences of RNA aptamers identified in B. subtilis, P. aeruginosa, and C. albicans across the Bacillus, Pseudomonas, and Candida genus using blastn (with parameters: -e 1e-5 -word_size 7 -gapopen 2 -gapextend 1). The species that were used in each genus are: B. subtilis XF-1, B. subtilis BSn5, B. malacitensis CR-95, B. natto BEST195, B. licheniformis DSM13, B. subtilis subsp.
spizizenii W23, and B. cereus ATCC-14579 for Bacillus; P. aeruginosa PAO1, P. aeruginosa PA7, P. mendocina 1267_PMEN, P. knackmussii B13, P. oryzihabitans USDA-ARS-USMARC-56511, P. pseudoalcaligenes KF707, P. stutzeri A1501, P. stutzeri A1501, P. mendocina ymp, P. entomophila L48, P. putida F1, P. fluorescens SBW25, and P. syringae pv. tomato str. DC3000 for Pseudomonas; C. albicans SC5314, C. albicans WO-1, and C. dubliniensis for Candida. We then built multiple species alignments for each RNA aptamer region using MUSCLE (ref. 31). We used the program Alifoldz (ref. 12) to calculate the energy and structural stability of the consensus structure. For each RNA aptamer alignment, a shuffled alignment was generated as a control using the shuffle.pl script from the Alifoldz package.
Data availability. All relevant data are available from the authors upon request.
Data has been deposited under GEO accession number GSE106133.
Received: 21 December 2017 Accepted: 5 March 2018
References
1. Gasch, A. P. et al. Genomic expression programs in the response of yeast cells to environmental changes. Mol. Biol. Cell 11, 4241–4257 (2000).
2. Breaker, R. R. Riboswitches and the RNA world. Cold Spring Harb. Perspect. Biol. 4, a003566 (2012).
3. Conrad, R. C., Baskerville, S. & Ellington, A. D. In vitro selection methodologies to probe RNA function and structure. Mol. Divers. 1, 69–78 (1995).
4. Barrick, J. E. & Breaker, R. R. The distributions, mechanisms, and structures of metabolite-binding riboswitches. Genome Biol. 8, R239 (2007).
5. Wan, Y., Kertesz, M., Spitale, R. C., Segal, E. & Chang, H. Y. Understanding the transcriptome through RNA structure. Nat. Rev. Genet. 12, 641–655 (2011).
6. Dar, D. et al. Term-seq reveals abundant ribo-regulation of antibiotics resistance in bacteria. Science 352, aad9822 (2016).
7. Kertesz, M. et al. Genome-wide measurement of RNA secondary structure in yeast. Nature 467, 103–107 (2010).
8. Regulski, E. E. & Breaker, R. R. In-line probing analysis of riboswitches. Methods Mol. Biol. 419, 53–67 (2008).
9. Winkler, W., Nahvi, A. & Breaker, R. R. Thiamine derivatives bind messenger RNAs directly to regulate bacterial gene expression. Nature 419, 952–956 (2002).
10. Winkler, W. C., Nahvi, A., Sudarsan, N., Barrick, J. E. & Breaker, R. R. An mRNA structure that controls gene expression by binding S-adenosylmethionine. Nat. Struct. Biol. 10, 701–707 (2003).
11. Novichkov, P. S. et al. RegPrecise 3.0 – a resource for genome-scale exploration of transcriptional regulation in bacteria. BMC Genomics 14, 745 (2013).
12. Washietl, S. & Hofacker, I. L. Consensus folding of aligned sequences as a new measure for the detection of functional RNAs by comparative genomics. J. Mol. Biol. 342, 19–30 (2004).
13. Li, S. & Breaker, R. R. Eukaryotic TPP riboswitch regulation of alternative splicing involving long-distance base pairing. Nucleic Acids Res. 41, 3022–3031 (2013).
14. Wachter, A. et al. Riboswitch control of gene expression in plants by splicing and alternative 3′ end processing of mRNAs. Plant Cell 19, 3437–3450 (2007).
15. Echt, S. et al. Potential anti-infective targets in pathogenic yeasts: structure and properties of 3,4-dihydroxy-2-butanone 4-phosphate synthase of Candida albicans. J. Mol. Biol. 341, 1085–1096 (2004).
16. Winkler, W. C., Cohen-Chalamish, S. & Breaker, R. R. An mRNA structure that controls gene expression by binding FMN. Proc. Natl Acad. Sci. USA 99, 15908–15913 (2002).
17. Collart, M. A. & Oliviero, S. Preparation of yeast RNA. Curr. Protoc. Mol. Biol. Chapter 13, Unit13.12 (2001).
18. Guldener, U., Heck, S., Fielder, T., Beinhauer, J. & Hegemann, J. H. A new efficient gene disruption cassette for repeated use in budding yeast. Nucleic Acids Res. 24, 2519–2524 (1996).
19. Das, R., Laederach, A., Pearlman, S. M., Herschlag, D. & Altman, R. B. SAFA: semi-automated footprinting analysis software for high-throughput quantification of nucleic acid footprinting experiments. RNA 11, 344–354 (2005).
20. Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
21. Posch, A., Kohn, J., Oh, K., Hammond, M. & Liu, N. V3 stain-free workflow for a practical, convenient, and reliable total protein loading control in western blotting. J. Vis. Exp. 82, 50948 (2013).
22. Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).
23. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
24. Nagalakshmi, U. et al. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320, 1344–1349 (2008).
25. Bruno, V. M. et al. Comprehensive annotation of the transcriptome of the human fungal pathogen Candida albicans using RNA-seq. Genome Res. 20, 1451–1458 (2010).
26. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
27. Karlin, S. & Altschul, S. F. Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl Acad. Sci. USA 87, 2264–2268 (1990).
28. Tzeng, Y. H., Pan, R. & Li, W. H. Comparison of three methods for estimating rates of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 21, 2290–2298 (2004).
29. She, R. et al. genBlastG: using BLAST searches to build homologous gene models. Bioinformatics 27, 2141–2143 (2011).
30. Lu, J. et al. The birth and death of microRNA genes in Drosophila. Nat. Genet. 40, 351–355 (2008).
31. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
32. Ranwez, V., Harispe, S., Delsuc, F. & Douzery, E. J. MACSE: multiple alignment of coding sequences accounting for frameshifts and stop codons. PLoS One 6, e22594 (2011).
33. Kimura, M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16, 111–120 (1980).
34. Zhang, Z. et al. KaKs_Calculator: calculating Ka and Ks through model selection and model averaging. Genom. Proteom. Bioinform. 4, 259–263 (2006).
Acknowledgements
We thank members of the Wan lab, Nagarajan lab, S. Chen, W.F. Burkholder, A. Sim, H.
H. Ng, and B. Lim for discussions. B. subtilis 168 was obtained from the Bacillus Genetic
Stock Center. N. Nagarajan is supported by funding from A*STAR. Y. Wan is supported
by funding from A*STAR, Society in Science - Branco Weiss Fellowship, and EMBO
Young Investigatorship.
Author contributions
Y.W. conceived the project, developed the protocol, and designed the experiments.
N.N. and M.S. designed the computational pipeline. Y.W., S.T., X.N.L., T.T.S., G.S.Z., J.L.,
Y.W., L.H.Z., E.L.A., H.Z. and H.Z. planned and performed all the experiments. N.N.,
S.L.Y., and M.S. planned and conducted the data analysis. A.L. helped with the
sequencing. Y.W. organized and wrote the paper with contributions from H.Z., S.L.Y.,
N.N. and all other authors.
Additional information
Supplementary Information accompanies this paper at https://doi.org/10.1038/s41467-018-03675-1.
Competing interests: The authors declare no competing interests.
Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative
Commons license, and indicate if changes were made. The images or other third party
material in this article are included in the article's Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the
article's Creative Commons license and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2018

... Most potential aptamers were identified in silico by searching for evolutionarily conserved motifs preceding coding regions, followed by analysis of the RNAs for changes in secondary structure in vitro in the presence versus absence of specific ligands (6, 16–18). More recently, analysis of RNA structures in vivo under different physiological conditions or in the presence of high concentrations of a given ligand in vitro revealed aptamers and riboswitches that were not identified by in silico approaches (19–21). In either case, demonstration of ligand binding to more than a small number of candidate aptamers at a time, often using probing with nucleases or spontaneous cleavage (in-line probing), has proven challenging. ...
... Primers for amplifying the thiC, thiM and ilvE DNA templates were designed with a T7 promoter adjacent to the DNA coding for the 5′ end of the aptamer and a 3′ tag where indicated (Supplementary Table S3). For (p)ppGpp-binding aptamer candidates (sequences from (25), Supplementary Tables S1 and S3) and for TPP-binding aptamer candidates (sequences from (19), Supplementary Tables S1 and S3), DNA templates for PCR amplification were purchased as double-stranded DNA gene blocks from Integrated DNA Technologies. The gene blocks contained a T7 promoter adjacent to the DNA coding for the 5′ end of the aptamer. ...
... The results with previously identified ppGpp- and TPP-binding aptamers, as well as the relative ease of application of RNA-DRaCALA to multiple samples, suggested that this method could be used to screen the binding of these or other ligands to large numbers of aptamers. However, TPP aptamer candidates identified in a recent differential structuromics study (19) did not bind 32P-TPP by RNA-DRaCALA, possibly because of differences in the lengths of the RNA sequences and the solution conditions, such as the much higher ligand concentrations used in the structuromics study (Supplementary Figure S1C-E). We focused our further efforts on the analysis of aptamers from the ykkC subtype 2a homologous family of ppGpp binding aptamers. ...
Article
Ligand-binding RNAs (RNA aptamers) are widespread in the three domains of life, serving as sensors of metabolites and other small molecules. When aptamers are embedded within RNA transcripts as components of riboswitches, they can regulate gene expression upon binding their ligands. Previous methods for biochemical validation of computationally predicted aptamers are not well-suited for rapid screening of large numbers of RNA aptamers. Therefore, we utilized DRaCALA (Differential Radial Capillary Action of Ligand Assay), a technique designed originally to study protein-ligand interactions, to examine RNA-ligand binding, permitting rapid screening of dozens of RNA aptamer candidates concurrently. Using this method, which we call RNA-DRaCALA, we screened 30 ykkC family subtype 2a RNA aptamers that were computationally predicted to bind (p)ppGpp. Most of the aptamers bound both ppGpp and pppGpp, but some strongly favored only ppGpp or pppGpp, and some bound neither. Expansion of the number of biochemically verified sites allowed construction of more accurate secondary structure models and prediction of key features in the aptamers that distinguish a ppGpp from a pppGpp binding site. To demonstrate that the method works with other ligands, we also used RNA DRaCALA to analyze aptamer binding by thiamine pyrophosphate.
... Several methods have been proposed to identify SVRs. PARCEL 19 and RASA 20 directly model and compare raw read counts at each nucleotide position, and then identify regions enriched for position-level signals. However, their models are tailored for specific experimental protocols, and it is not straightforward to extend them to accommodate more emerging SP techniques. ...
... Despite the success of existing methods, differential analysis of SP data remains challenging in many aspects. First, SVRs manifest great variation in length, ranging from a few to several dozens of nucleotide positions 19,23,24. As a result, searching with fixed search length can lead to insufficient detection power and inaccurate boundary mapping, when the prespecified search length deviates greatly from the true length. ...
... These data included a total of 13,162 simulated SVRs, with 6,587 being single nucleotide structural variations, 6,081 having lengths between 1 nt and 5 nt, and 494 SVRs with lengths greater than 5 nt (see Supplementary Fig. 4 for further details). Note that these simulated data echo the real world knowledge that the lengths of SVRs vary extensively 13,19,23,24,39. We varied the strength of differential signals of simulated SVRs between "high", "medium", and "low". ...
Preprint
RNAs perform their function by forming specific structures, which can change across cellular conditions. Structure probing experiments combined with next generation sequencing technology have enabled transcriptome-wide analysis of RNA secondary structure in various cellular conditions. Differential analysis of structure probing data in different conditions can reveal the RNA structurally variable regions (SVRs), which is important for understanding RNA functions. Here, we propose DiffScan, a computational framework for normalization and differential analysis of structure probing data in high resolution. DiffScan preprocesses structure probing datasets to remove systematic bias, and then scans the transcripts to identify SVRs and adaptively determines their lengths and locations. The proposed approach is compatible with most structure probing platforms (e.g., icSHAPE, DMS-seq). When evaluated with simulated and benchmark datasets, DiffScan identifies structurally variable regions at nucleotide resolution, with substantial improvement in accuracy compared with existing SVR detection methods. Moreover, the improvement is robust when tested in multiple structure probing platforms. Application of DiffScan in a dataset of multi-subcellular RNA structurome identified multiple regions that form different structures in nucleus and cytoplasm, linking RNA structural variation to regulation of mRNAs encoding mitochondria-associated proteins. This work provides an effective tool for differential analysis of RNA secondary structure, reinforcing the power of structure probing experiments in deciphering the dynamic RNA structurome.
... Several methods have been proposed to identify SVRs. PARCEL 19 and RASA 20 directly model and compare raw read counts at each nucleotide position, and then identify regions enriched for position-level signals. However, their models are tailored for specific experimental protocols, and it is not straightforward to extend them to accommodate more emerging SP techniques. ...
... Continued development of methods to tackle the following challenges will advance insights from SP data. First, SVRs manifest great variation in length, ranging from a few to several dozens of nucleotide positions 19,24,26. As a result, searching with fixed search length can lead to insufficient detection power and inaccurate boundary mapping, when the prespecified search length deviates greatly from the true length. ...
... These data included a total of 38,317 simulated SVRs, with 12,201 being single nucleotide structural variations, 13,048 having lengths between 2 nt and 5 nt, and 13,068 SVRs with lengths greater than 5 nt (see Supplementary Fig. 4 for further details). Note that these simulated data echo the real world knowledge that the lengths of SVRs vary extensively 13,19,24,26,43. We varied the strength of differential signals of simulated SVRs between "high", "medium", and "low". ...
Article
RNAs perform their function by forming specific structures, which can change across cellular conditions. Structure probing experiments combined with next generation sequencing technology have enabled transcriptome-wide analysis of RNA secondary structure in various cellular conditions. Differential analysis of structure probing data in different conditions can reveal the RNA structurally variable regions (SVRs), which is important for understanding RNA functions. Here, we propose DiffScan, a computational framework for normalization and differential analysis of structure probing data in high resolution. DiffScan preprocesses structure probing datasets to remove systematic bias, and then scans the transcripts to identify SVRs and adaptively determines their lengths and locations. The proposed approach is compatible with most structure probing platforms (e.g., icSHAPE, DMS-seq). When evaluated with simulated and benchmark datasets, DiffScan identifies structurally variable regions at nucleotide resolution, with substantial improvement in accuracy compared with existing SVR detection methods. Moreover, the improvement is robust when tested in multiple structure probing platforms. Application of DiffScan in a dataset of multi-subcellular RNA structurome and a subsequent motif enrichment analysis suggest potential links of RNA structural variation and mRNA abundance, possibly mediated by RNA binding proteins such as the serine/arginine rich splicing factors. This work provides an effective tool for differential analysis of RNA secondary structure, reinforcing the power of structure probing experiments in deciphering the dynamic RNA structurome. The authors present DiffScan, an advanced tool for normalization and differential analysis of RNA structure probing experiments, combining their power in deciphering the dynamic RNA structurome and facilitating the discovery of RNA regulatory functions.
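As a rough illustration of why letting the region length adapt to the data matters, the sketch below finds the contiguous stretch of nucleotides with the largest summed differential score using a maximum-sum segment search (Kadane's algorithm). This is a stand-in technique, not DiffScan's actual scan statistic; the function name `best_variable_region`, the per-nucleotide scores, and the `baseline` offset are all assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple


def best_variable_region(
    scores: Sequence[float],
    baseline: float = 1.0,  # hypothetical per-position penalty (assumption)
) -> Optional[Tuple[int, int, float]]:
    """Return (start, end, total) for the contiguous region that maximises
    sum(score - baseline). start is inclusive, end is exclusive.
    Returns None when no position scores above the baseline."""
    best = None
    run_sum = 0.0
    run_start = 0
    for i, s in enumerate(scores):
        # A run whose sum has dropped to zero or below cannot help a later
        # region, so restart the run at the current position.
        if run_sum <= 0.0:
            run_sum = 0.0
            run_start = i
        run_sum += s - baseline
        if run_sum > 0.0 and (best is None or run_sum > best[2]):
            best = (run_start, i + 1, run_sum)
    return best


if __name__ == "__main__":
    # Made-up per-nucleotide differential scores for one transcript.
    demo = [0.1, 0.2, 3.0, 2.5, 0.4, 2.0, 0.1]
    print(best_variable_region(demo))
```

The region boundaries fall out of the scores themselves rather than from a prespecified window length, which is the property the abstract emphasises.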
... For example, in vitro selection for RNA aptamers beginning with pools transcribed from natural genomic DNA sequences from eukaryotes was used to identify numerous RNAs that bind adenosine [104,105], GTP [106,107], or folic acid [108]. An in vivo structure probing method also was used to identify putative eukaryotic aptamers for the coenzyme FMN [109]. However, these findings await the publication of convincing evidence that these structures are used by cells as natural aptamers with a relevant biochemical purpose such as riboswitch function. ...
... Again, the main drawback of these approaches is that the predicted number of novel riboswitch classes per organism studied is simply too small, and many bacteria have none. Furthermore, structure probing methods can yield signatures of RNA structure switching upon binding of the riboswitch ligand either in vivo [132] or in vitro [109] but, to successfully establish switching function, the researcher must choose to test the matching ligand for the riboswitch class in the species under examination. Given the low probability of choosing an organism with a novel riboswitch and testing its corresponding ligand from among hundreds or thousands of candidate ligand choices, it is unlikely that researchers can obtain success with this approach at a scale that will be competitive with bioinformatic search methods. ...
Article
Riboswitches are structured noncoding RNA domains used by many bacteria to monitor the concentrations of target ligands and regulate gene expression accordingly. In the past 20 years over 55 distinct classes of natural riboswitches have been discovered that selectively sense small molecules or elemental ions, and thousands more are predicted to exist. Evidence suggests that some riboswitches might be direct descendants of the RNA-based sensors and switches that were likely present in ancient organisms before the evolutionary emergence of proteins. We provide an overview of the current state of riboswitch research, focusing primarily on the discovery of riboswitches, and speculate on the major challenges facing researchers in the field.
... Also, Term-seq cannot differentiate transcription termination caused by a riboswitch from transcription termination caused by a protein, which could be derived from indirect regulation. Another experimental method that can be used to discover new riboswitches is Parallel Analysis of RNA Conformations Exposed to Ligand Binding (PARCEL) 18. It is an in vitro method to detect structural changes of natural RNA in the presence of a ligand by comparing the sites targeted by RNase V1, an enzyme that cleaves base-paired regions. ...
... It is an in vitro method to detect structural changes of natural RNA in the presence of a ligand by comparing the sites targeted by RNase V1, an enzyme that cleaves base-paired regions. A change in the RNase V1 cleavage pattern between tested conditions is a good indicator that the ligand induces a conformational change in the RNA, which is characteristic of a riboswitch 18. However, this technique is limited to regions that are accessible to the RNase, and only one genome can be analyzed at a time. ...
Preprint
Riboswitches are regulatory sequences composed of an aptamer domain capable of binding a ligand and an expression platform that allows the control of the downstream gene expression based on a conformational change. Current bioinformatic methods for their discovery have various limitations. To circumvent this, we developed an experimental technique to discover new riboswitches called SR-PAGE (Shifted Reverse Polyacrylamide Gel Electrophoresis). A ligand-based regulatory molecule is recognized by exploiting the conformational change of the sequence following binding with the ligand within a native polyacrylamide gel. Known riboswitches were tested with their corresponding ligands to validate our method. SR-PAGE was embedded within a SELEX procedure to enrich switching RNAs from a TPP riboswitch-based degenerate library and change its binding preference from TPP to thiamine. The SR-PAGE technique allows large-scale screening for riboswitches, searching in several organisms, and testing more than one ligand simultaneously.
... A major challenge for discovery of novel bacterial ncRNA classes is that current biochemical, genetic, or bioinformatic methods are limited in their ability to identify candidates efficiently and comprehensively. Most biochemical and genetic methods, such as RNA structural probing (11) or RNA transcriptomics (12), are generally limited to examining the RNAs produced in a single, culturable organism per experiment. Thus, other strategies are needed to effectively uncover additional ncRNA candidates more efficiently. ...
Article
Structured noncoding RNAs (ncRNAs) contribute to many important cellular processes involving chemical catalysis, molecular recognition and gene regulation. Few ncRNA classes are broadly distributed among organisms from all three domains of life, but the list of rarer classes that exhibit surprisingly diverse functions is growing. We previously developed a computational pipeline that enables the near-comprehensive identification of structured ncRNAs expressed from individual bacterial genomes. The regions between protein coding genes are first sorted based on length and the fraction of guanosine and cytidine nucleotides. Long, GC-rich intergenic regions are then examined for sequence and structural similarity to other bacterial genomes. Herein, we describe the implementation of this pipeline on 50 bacterial genomes from varied phyla. More than 4700 candidate intergenic regions with the desired characteristics were identified, which yielded 44 novel riboswitch candidates and numerous other putative ncRNA motifs. Although experimental validation studies have yet to be conducted, this rate of riboswitch candidate discovery is consistent with predictions that many hundreds of novel riboswitch classes remain to be discovered among the bacterial species whose genomes have already been sequenced. Thus, many thousands of additional novel ncRNA classes likely remain to be discovered in the bacterial domain of life.
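The first filtering step described above (sorting intergenic regions by length and GC fraction) can be sketched in Python roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names `gc_fraction` and `select_candidate_igrs` and the cutoffs `min_len=100` and `min_gc=0.5` are assumptions chosen only for the example, and the later comparison against other bacterial genomes is not shown.

```python
from typing import Dict, List, Tuple


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C nucleotides in seq (0.0 if seq is empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def select_candidate_igrs(
    igrs: Dict[str, str],
    min_len: int = 100,   # hypothetical length cutoff (assumption)
    min_gc: float = 0.5,  # hypothetical GC cutoff (assumption)
) -> List[Tuple[str, int, float]]:
    """Keep long, GC-rich intergenic regions and return
    (name, length, gc_fraction) tuples, longest first."""
    hits = [
        (name, len(seq), gc_fraction(seq))
        for name, seq in igrs.items()
        if len(seq) >= min_len and gc_fraction(seq) >= min_gc
    ]
    # Sort by length, then by GC fraction, both descending.
    return sorted(hits, key=lambda h: (-h[1], -h[2]))


if __name__ == "__main__":
    # Toy sequences; real input would come from an annotated genome.
    toy = {"igr1": "GC" * 60, "igr2": "AT" * 60, "igr3": "GCAT" * 50}
    for name, length, gc in select_candidate_igrs(toy):
        print(name, length, round(gc, 2))
```

Regions that pass such a filter would then be the ones examined for sequence and structural similarity in other genomes.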
... Furthermore, in vitro-selected nucleic acid aptamers and engineered riboswitches (12–14) can change their conformation upon ligand binding. Utilizing this property, biosensors can be designed to detect biologically important small molecules, including neurotransmitters (14–19) and hormones (20), metabolites (12,13,21), antibiotics (22), and anticancer drugs (23), for biological mechanism exploration, disease diagnostics, enzyme profiling, and pharmacokinetics studies. In addition, small molecule-sensing aptamers, such as the theophylline aptamer, can be engineered into gene circuits (24–27) and activated through a ligand-triggered conformational transition to program gene expression and gene editing (27,28). ...
Article
Nucleic acids can undergo conformational changes upon binding small molecules. These conformational changes can be exploited to develop new therapeutic strategies through control of gene expression or triggering of cellular responses and can also be used to develop sensors for small molecules such as neurotransmitters. Many analytical approaches can detect dynamic conformational change of nucleic acids, but they need labeling, are expensive, and have limited time resolution. The nanopore approach can provide a conformational snapshot for each nucleic acid molecule detected, but has not been reported to detect dynamic nucleic acid conformational change in response to small-molecule binding. Here we demonstrate a modular, label-free, nucleic acid-docked nanopore capable of revealing time-resolved, small molecule-induced, single nucleic acid molecule conformational transitions with millisecond resolution. By using the dopamine-, serotonin-, and theophylline-binding aptamers as testbeds, we found that these nucleic acid scaffolds can be noncovalently docked inside the MspA protein pore by a cluster of site-specific charged residues. This docking mechanism enables the ion current through the pore to characteristically vary as the aptamer undergoes conformational changes, resulting in a sequence of current fluctuations that report binding and release of single ligand molecules from the aptamer. This nanopore tool can quantify specific ligands such as neurotransmitters, elucidate nucleic acid-ligand interactions, and pinpoint the nucleic acid motifs for ligand binding, showing the potential for small molecule biosensing, drug discovery assayed via RNA and DNA conformational changes, and the design of artificial riboswitch effectors in synthetic biology.
... Novel riboswitches can also be identified by means of experimental approaches. For instance, the binding of a metabolite ligand triggers an RNA conformational change in naturally occurring aptamers, which can be mapped at the transcriptome-wide level in vitro and in vivo [88–90]. ...
Article
Quantification of the concentration of particular cellular metabolites reports on the actual utilization of metabolic pathways in physiological and pathological conditions. Metabolite concentration also constitutes the readout for screening cell factories in metabolic engineering. However, there are no direct approaches that allow for real-time assessment of the levels of intracellular metabolites in single cells. In recent years, the modular architecture of natural bacterial RNA riboswitches has inspired the design of genetically encoded synthetic RNA devices that convert the intracellular concentration of a metabolite into a quantitative fluorescent signal. These so-called RNA-based sensors are composed of a metabolite-binding RNA aptamer as the sensor domain, connected through an actuator segment to a signal-generating reporter domain. However, at present, the variety of available RNA-based sensors for intracellular metabolites is still very limited. Here, we go through natural mechanisms for metabolite sensing and regulation in cells across all kingdoms, focusing on those mediated by riboswitches. We review the design principles underlying currently developed RNA-based sensors and discuss the challenges that hindered the development of novel sensors and recent strategies to address them. We finish by introducing the current and potential applicability of synthetic RNA-based sensors for intracellular metabolites.
... Genomic DNA was extracted using ENAISDK for coculture fermentations. The genomic DNA was used as template for amplification, and the primers used are listed in Supplementary Table S1. qPCR was performed using 2 × ChamQ Universal SYBR Green Master Mix (Vazyme) on a StepOnePlus real-time PCR system (Applied Biosystems, Foster City, USA), based on a previously described method (Tapsin et al., 2018; Wang et al., 2021b). The quantitative equation is as follows: ...
Article
Chameleon-like microbes in the fermentation community are an internal factor that facilitate the transformation of the community to the corresponding homeostasis states under specific environmental conditions. High temperature daqu can form three typical microecologies during the preparation process, making it an ideal system for studying chameleon-like microbes. This study integrated multi-omic methods such as metaproteomics, and determined that Neurospora crassa, Aspergillus nidulans, Bacillus subtilis and Oceanobacillus iheyensis were chameleon-like microbes that regulated the metabolic differences of five-member heterocyclic amino acids in daqu, resulting in microecological differentiation. Synthetic microbial consortia consisting of the four chameleon-like microbes with (T6) and without (T4) the dominant functional bacteria Saccharopolyspora erythraea and Virgibacillus haloimitrificans were fermented under simulated in situ conditions. The community constructed by microorganisms with greater functional diversity (T6) was more robust, and its metabolome was more similar to the in situ system. When exposed to environmental disturbances, the functional diversity helped to maintain the community stability by increasing the dissimilarity of chameleon-like microbes in the community and forming different homeostasis.
Article
Genetically encoded biosensors are the vital components of synthetic biology and metabolic engineering, as they are regarded as powerful devices for the dynamic control of genotype metabolism and evolution/screening of desirable phenotypes. This review summarized the recent advances in the construction and applications of different genetically encoded biosensors, including fluorescent protein-based biosensors, nucleic acid-based biosensors, allosteric transcription factor-based biosensors and two-component system-based biosensors. First, the construction frameworks of these biosensors were outlined. Then, the recent progress of biosensor applications in creating versatile microbial cell factories for the bioproduction of high-value chemicals was summarized. Finally, the challenges and prospects for constructing robust and sophisticated biosensors were discussed. This review provided theoretical guidance for constructing genetically encoded biosensors to create desirable microbial cell factories for sustainable bioproduction.
Article
The western blot is a very useful and widely adopted lab technique, but its execution is challenging. The workflow is often characterized as a "black box" because an experimentalist does not know if it has been performed successfully until the last of several steps. Moreover, the quality of western blot data is sometimes challenged due to a lack of effective quality control tools in place throughout the western blotting process. Here we describe the V3 western workflow, which applies stain-free technology to address the major concerns associated with the traditional western blot protocol. This workflow allows researchers: 1) to run a gel in about 20-30 min; 2) to visualize sample separation quality within 5 min after the gel run; 3) to transfer proteins in 3-10 min; 4) to verify transfer efficiency quantitatively; and most importantly 5) to validate changes in the level of the protein of interest using total protein loading control. This novel approach eliminates the need of stripping and reprobing the blot for housekeeping proteins such as β-actin, β-tubulin, GAPDH, etc. The V3 stain-free workflow makes the western blot process faster, transparent, more quantitative and reliable.
Article
Background: Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. The comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). Description: RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons are available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes.
Conclusions: RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in bacterial genomes. Analytical capabilities include exploration of: regulon content, structure and function; TF binding site motifs; conservation and variations in genome-wide regulatory networks across all taxonomic groups of Bacteria. RegPrecise 3.0 was selected as a core resource on transcriptional regulation of the Department of Energy Systems Biology Knowledgebase, an emerging software and data environment designed to enable researchers to collaboratively generate, test and share new hypotheses about gene and protein functions, perform large-scale analyses, and model interactions in microbes, plants, and their communities.
Thiamin pyrophosphate (TPP) riboswitches are found in organisms from all three domains of life. Examples in bacteria commonly repress gene expression by terminating transcription or by blocking ribosome binding, whereas most eukaryotic TPP riboswitches are predicted to regulate gene expression by modulating RNA splicing. Given the widespread distribution of eukaryotic TPP riboswitches and the diversity of their locations in precursor messenger RNAs (pre-mRNAs), we sought to examine the mechanism of alternative splicing regulation by a fungal TPP riboswitch from Neurospora crassa, which is mostly located in a large intron separating protein-coding exons. Our data reveal that this riboswitch uses a long-distance (∼530-nt separation) base-pairing interaction to regulate alternative splicing. Specifically, a portion of the TPP-binding aptamer can form a base-paired structure with a conserved sequence element (α) located near a 5′ splice site, which greatly increases use of this 5′ splice site and promotes gene expression. Comparative sequence analyses indicate that many fungal species carry a TPP riboswitch with similar intron architecture, and therefore the homologous genes in these fungi are likely to use the same mechanism. Our findings expand the scope of genetic control mechanisms relying on long-range RNA interactions to include riboswitches.
Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis. Fiji uses modern software engineering practices to combine powerful software libraries with a broad range of scripting languages to enable rapid prototyping of image-processing algorithms. Fiji facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system. We propose Fiji as a platform for productive collaboration between computer science and biology research communities.
As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
Secondary structure forms an important intermediate level of description of nucleic acids that encapsulates the dominating part of the folding energy, is often well conserved in evolution, and is routinely used as a basis to explain experimental findings. Based on carefully measured thermodynamic parameters, exact dynamic programming algorithms can be used to compute ground states, base pairing probabilities, as well as thermodynamic properties. The ViennaRNA Package has been a widely used compilation of RNA secondary structure related computer programs for nearly two decades. Major changes in the structure of the standard energy model, the Turner 2004 parameters, the pervasive use of multi-core CPUs, and an increasing number of algorithmic variants prompted a major technical overhaul of both the underlying RNAlib and the interactive user programs. New features include an expanded repertoire of tools to assess RNA-RNA interactions and restricted ensembles of structures, additional output information such as centroid structures and maximum expected accuracy structures derived from base pairing probabilities, or z-scores for locally stable secondary structures, and support for input in fasta format. Updates were implemented without compromising the computational efficiency of the core algorithms and ensuring compatibility with earlier versions. The ViennaRNA Package 2.0, supporting concurrent computations via OpenMP, can be downloaded from http://www.tbi.univie.ac.at/RNA.
Until now, the most efficient solution for aligning nucleotide sequences containing open reading frames was to use indirect procedures that align the amino acid translations before reporting the inferred gap positions at the codon level. There are two important pitfalls with this approach. Firstly, any premature stop codon impedes using such a strategy. Secondly, each sequence is translated with the same reading frame from beginning to end, so that the presence of a single additional nucleotide leads to both aberrant translation and alignment. We present an algorithm that has the same space and time complexity as the classical Needleman-Wunsch algorithm while accommodating sequencing errors and other biological deviations from the coding frame. The resulting pairwise coding sequence alignment method was extended to a multiple sequence alignment (MSA) algorithm implemented in a program called MACSE (Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons). MACSE is the first automatic solution to align protein-coding gene datasets containing non-functional sequences (pseudogenes) without disrupting the underlying codon structure. It has also proved useful in detecting undocumented frameshifts in public database sequences and in aligning next-generation sequencing reads/contigs against a reference coding sequence. MACSE is distributed as an open-source Java executable file with freely available source code and can be used via a web interface at: http://mbb.univ-montp2.fr/macse.
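For orientation, the classical Needleman-Wunsch algorithm that the abstract uses as its complexity baseline can be sketched as below. This is only the textbook global alignment, not the frameshift-aware method implemented in MACSE; the function name and the match, mismatch, and gap scores are illustrative assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Classical global alignment; returns (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not MACSE parameters.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two symbols
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The table has (n + 1) × (m + 1) cells, each filled in constant time, which gives the quadratic time and space cost that the abstract refers to. A single inserted nucleotide, as in `needleman_wunsch("ATGAC", "ATGC")`, is absorbed here as one gap column; a translation-based alignment would instead see a shifted reading frame downstream of it.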
How bacteria switch between tracks. Bacterial riboswitches prevent the formation of full-length messenger RNA, and hence proteins, via transcriptional termination in response to metabolites. However, identifying riboswitches within the genome has previously required comparative analysis, which may miss species- and environmentally specific responses. Dar et al. developed a method called term-seq to document all riboswitches in a bacterial genome, as well as their metabolite counterparts (see the Perspective by Sommer and Suess). The method revealed a role for pathogenic bacterial riboswitches in antibiotic resistance. Thus, transcription may be one way pathogens fend off antibiotic attack. Science, this issue p. 10.1126/science.aad9822; see also p. 144
Three frequently used methods for estimating the synonymous and nonsynonymous substitution rates (Ks and Ka) were evaluated and compared for their accuracies; these methods are denoted by LWL85, LPB93, and GY94, respectively. For this purpose, we used a codon-evolution model to obtain the expected Ka and Ks values for the above three methods and compared the values with those obtained by the three methods. We also proposed some modifications of LWL85 and LPB93 to increase their accuracies. Our computer simulations under the codon-evolution model showed that for sequences less than or equal to 300 codons, the performance of GY94 may not be reliable. For longer sequences, GY94 is more accurate for estimating the Ka/Ks ratio than the modified LPB93 and LWL85 in the majority of the cases studied. This is particularly so when k, the transition/transversion (mutation) rate ratio, is greater than or equal to 3. However, when k is approximately 2 and when the sequence divergence is relatively large, the modified LWL85 performed better than GY94 and the modified LPB93. The inferiority of LPB93 to LWL85 is surprising because LPB93 was intended to improve LWL85. Also, it has been thought that the codon-based method of GY94 is better than the heuristic method of LWL85, but our simulation results showed that in many cases, the opposite was true, even though our simulation was based on the codon-evolution model.
Summary: It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data. Availability: The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org). Contact: mrobinson@wehi.edu.au
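As a brief sketch of what "overdispersed Poisson" means here, the counts can be written as negative binomial, the usual form of such a model. The symbols below are assumed notation rather than taken from the abstract: $Y_{gi}$ is the count for transcript $g$ in sample $i$, $\mu_{gi}$ its mean, and $\phi_g$ the transcript-specific dispersion.

```latex
Y_{gi} \sim \mathrm{NB}(\mu_{gi}, \phi_g), \qquad
\mathbb{E}[Y_{gi}] = \mu_{gi}, \qquad
\mathrm{Var}(Y_{gi}) = \mu_{gi} + \phi_g \, \mu_{gi}^{2}
```

Setting $\phi_g = 0$ recovers the Poisson variance $\mu_{gi}$, which is commonly read as the technical (sampling) component, while the quadratic term $\phi_g \mu_{gi}^{2}$ carries the additional between-replicate variability. The empirical Bayes step described in the abstract moderates the per-transcript estimates of $\phi_g$ by borrowing information across transcripts.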