Machine-learning guided elucidation of contribution of individual steps in the mevalonate pathway and construction of a yeast platform strain for terpenoid production

Metabolic Engineering 74 (2022) 139–149
Available online 29 October 2022
1096-7176/© 2022 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Minakshi Mukherjee a, Rachael Hageman Blair b, Zhen Q. Wang a,*
a Department of Biological Sciences, University at Buffalo, State University of New York, Buffalo, NY 14260, USA
b Department of Biostatistics, University at Buffalo, State University of New York, Buffalo, NY 14260, USA
ARTICLE INFO
Keywords:
Terpene
Saccharomyces cerevisiae
Random forest
Mevalonate kinase
Metabolic engineering
ABSTRACT
The production of terpenoids from engineered microbes contributes markedly to the bioeconomy by providing essential medicines, sustainable materials, and renewable fuels. The mevalonate pathway leading to the synthesis of terpenoid precursors has been extensively targeted for engineering. Nevertheless, the importance of individual pathway enzymes to the overall pathway flux and final terpenoid yield is less known, especially for enzymes that are thought to be non-rate-limiting. To investigate the individual contribution of the five non-rate-limiting enzymes in the mevalonate pathway, we created a combinatorial library of 243 Saccharomyces cerevisiae strains, each having an extra copy of the mevalonate pathway integrated into the genome and expressing the non-rate-limiting enzymes from a unique combination of promoters. High-throughput screening combined with machine learning algorithms revealed that the mevalonate kinase, Erg12p, stands out as the critical enzyme that influences product titer. ERG12 is ideally expressed from a medium-strength promoter, which is the ‘sweet spot’ resulting in high product yield. Additionally, a platform strain was created by targeting the mevalonate pathway to both the cytosol and peroxisomes. The dual localization synergistically increased terpenoid production and implied that some mevalonate pathway intermediates, such as mevalonate, isopentenyl pyrophosphate (IPP), and dimethylallyl pyrophosphate (DMAPP), are diffusible across peroxisome membranes. The platform strain resulted in 94-fold, 60-fold, and 35-fold improved titer of monoterpene geraniol, sesquiterpene α-humulene, and triterpene squalene, respectively. The terpenoid platform strain will serve as a chassis for producing any terpenoids and terpene derivatives.
1. Introduction
Terpenoids are five-carbon isoprene derivatives that constitute the largest class of natural products and are widely used as fuels, medicines, and fragrances (Christianson, 2017; Belcher et al., 2020). However, terpenoid yields from natural biological sources are often low, and chemical synthesis is challenging due to their structural complexity. Engineering microbes, especially baker's yeast, for sustainable terpenoid production has achieved considerable success in the past decade (Ro et al., 2006; Engels et al., 2008). Terpenoid biosynthesis in yeast relies on the mevalonate (MVA) pathway, which produces the universal terpenoid precursors isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP) (Fig. 1A).
Engineered yeast strains for terpenoid production usually overexpress MVA pathway genes to provide sufficient IPP and DMAPP for producing a wide range of terpenoids in the yeast Saccharomyces cerevisiae (Navale et al., 2021). In recent works, all seven genes of the MVA pathway were overexpressed from the yeast genome to increase the concentrations of IPP and DMAPP and, subsequently, the titer of specific terpenoids (Guo et al., 2018; Yuan and Ching, 2014; Yee et al., 2019; Lv et al., 2016; Jiang et al., 2021; Westfall et al., 2012; Peng et al., 2017; Li et al., 2020; Liu et al., 2020; Zhang et al., 2020). The seven genes were usually expressed from strong promoters, and there has been limited attention to balancing the expression of each gene. Unbalanced expression of pathway genes may lead to the accumulation of intermediates that inhibit enzyme activities through feedback regulation (Sauro, 2017). Combinatorial screening of the MVA pathway genes expressed from promoters with various strengths can help identify the
* Corresponding author. Department of Biological Sciences, University at Buffalo, 653 Cooke Hall, Buffalo, NY14260, USA.
E-mail address: zhenw@buffalo.edu (Z.Q. Wang).
https://doi.org/10.1016/j.ymben.2022.10.004
Received 26 April 2022; Received in revised form 16 October 2022; Accepted 23 October 2022
optimal expression of each enzyme for maximized pathway flux and terpenoid production. Such an effort can also reveal the in vivo contribution of each gene in the MVA pathway, especially the five non-rate-limiting enzymes. While there is a consensus that HMG-CoA reductase Hmg1p and IPP isomerase Idi1p are bottlenecks (Han et al., 2018; Zhao et al., 2017; Jiang et al., 2017; Xie et al., 2015; Verwaal et al., 2007), reports differ on the relative contribution of the other five MVA pathway genes (Kwak et al., 2020; Zhou et al., 2018; McClory et al., 2019; Hu et al., 2020; Madsen et al., 2011; Yao et al., 2018; Redding-Johanson et al., 2011; Alonso-Gutierrez et al., 2015). Thus, an exhaustive study elucidating the relative importance of the five non-rate-limiting enzymes in the MVA pathway will deepen our fundamental knowledge of the pathway enzymes and guide future engineering to increase terpenoid titers.
Moreover, creating a yeast platform strain with increased terpenoid precursors can shorten the strain development process to support the high-titer production of terpenoids. A platform strain is a genetically engineered microbe that provides abundant precursors for producing various products (Nielsen, 2015). Developing a platform strain eliminates repetitive engineering of the same precursor pathway for different target molecules. Several yeast platform strains have been developed to access precursors for alkaloids and aromatics (Chen et al., 2013; Rodriguez et al., 2015; Gold et al., 2015; Campbell et al., 2016; Pyne et al., 2020), but no such platform strain exists for terpenoids. Therefore, we aim to build a yeast platform strain that can be used to produce any terpenoid once compound-specific downstream modifications are incorporated.
In this study, we created a combinatorial library of 243 stable transgenic strains with each of the five non-rate-limiting MVA pathway genes under three different promoters. Machine learning algorithms revealed that ERG12, encoding the mevalonate kinase, is the most critical gene, apart from HMG1 and IDI1, that contributes significantly to the productivity of the MVA pathway. We have also created a universal yeast platform strain for producing any terpenoids by dual-targeting the MVA pathway to both the cytosol and peroxisomes. The dual-targeting experiment revealed that some MVA pathway intermediates, including mevalonate and IPP/DMAPP, are diffusible between the cytosol and peroxisomes. The platform strain produced 94-fold higher monoterpene geraniol, 60-fold higher sesquiterpene α-humulene, and 35-fold higher triterpene squalene compared to the wild-type control.
2. Materials and methods
2.1. Strains and growth media
S. cerevisiae strains used to construct the engineered strains, CEN.PK2-1C (MATa; his3D1; leu2-3_112; ura3-52; trp1-289; MAL2-8c; SUC2), CEN.PK2-1D (MATα; his3D1; leu2-3_112; ura3-52; trp1-289; MAL2-8c; SUC2), and CEN.PK2 (MATa/α; his3D1/his3D1; leu2-3_112/leu2-3_112; ura3-52/ura3-52; trp1-289/trp1-289; MAL2-8c/MAL2-8c; SUC2/SUC2), were acquired from Euroscarf, Germany. E. coli strain DH5α was used for cloning and plasmid propagation.
E. coli cells were grown on Luria-Bertani (LB) plates with appropriate antibiotics. Yeast synthetic dropout (SD) media used for integrations, mating, and culturing contained 0.67% (w/v) yeast nitrogen base without amino acids (Difco, Franklin Lakes, NJ), 2% (w/v) dextrose (Fisher Scientific, Waltham, MA), and 0.07% (w/v) synthetic complete amino acid mix (CSM) without certain amino acids (Sunrise Science, Knoxville, TN). SD + 400 μg/ml G418 (pH = 7) (Goldbio, St. Louis, MO), which selects for the plasmid, was used for seed culture preparation. YPD (1% yeast extract, 2% peptone, and 2% dextrose) without antibiotic selection was used for preparing the growth curves in Fig. 4B. YPD + 200 μg/ml G418 was used for compound production (Vickers et al., 2013).
2.2. Gene synthesis, PCR, and cloning
The ERG20ww, tObGES, ZSS1, and CdGeDH genes were codon-optimized and synthesized by IDT (Newark, NJ). PCR amplification was performed using the Phusion High Fidelity DNA Polymerase (NEB, Ipswich, MA) according to the manufacturer's protocol. Gibson assembly (Gibson, 2011) was used to clone the sgRNAs into the pCAS (Fernandes et al., 2007) plasmid for CRISPR-guided genomic integration. Golden Gate assembly (Mukherjee et al., 2021) was performed to assemble all the other constructs. The sequences of all part plasmids were confirmed using Sanger sequencing (GeneWiz, South Plainfield, NJ). A schematic outlining the general strategy for cloning the multi-gene plasmids is included in Fig. S1. All the constructs created and primers used are listed in Tables S1–S8.
Fig. 1. Overexpressing the complete MVA pathway led to increased geraniol production. (A) The MVA pathway leads to geraniol production. Proteins in blue were overexpressed MVA enzymes. Erg10p: acetoacetyl-CoA thiolase; Erg13p: 3-hydroxy-3-methylglutaryl-CoA (HMG-CoA) synthase; tHmg1p: truncated HMG-CoA reductase without the regulatory domain; Erg12p: mevalonate kinase; Erg8p: phosphomevalonate kinase; Erg19p: mevalonate pyrophosphate decarboxylase; Idi1p: isopentenyl-diphosphate isomerase; Erg20wwp: Erg20p (F96W, N127W) mutant acting as a geranyl pyrophosphate (GPP) synthase; tObGES: truncated geraniol synthase from Ocimum basilicum. IPP: isopentenyl pyrophosphate; DMAPP: dimethylallyl pyrophosphate. (B) Schematic showing the genomic integration of seven MVA pathway genes and the tObGES-Erg20wwp fusion protein expressed episomally from a strong constitutive promoter (pPYK001). The two proteins are fused together with a GSG linker. (C) Geraniol yield in engineered strains (MVAc1, MVAc2, MVAc3, and MVAc4). “c” indicates that genes are localized to the cytosol. Fold increase compared to the wild type at each time point is noted at the top of each bar. Data represent the average ± SD of three independent biological replicates.
2.3. Strain construction
Yeast competent cells were co-transformed with the NotI-digested and linearized multi-gene (Lee et al., 2015) and pCAS-sgRNA (Ryan et al., 2014) plasmids using the Frozen-EZ yeast transformation II kit (Zymo Research, Irvine, CA) according to the manufacturer's protocol. The transformed cells were plated on appropriate dropout media for selection and incubated at 30 °C for two days and 37 °C for an additional day to facilitate genomic integration (Ryan et al., 2014). Two pairs of diagnostic primers were used to confirm each integration by polymerase chain reaction (PCR) using the GoTaq Green DNA polymerase (Promega, Madison, WI). For further confirmation of each gene in two-gene inserts at the ROX1 and GAL80 loci, primers were designed such that the forward and reverse primers bind to the first and the second gene, respectively. For three-gene inserts at the GAL1 locus, an additional pair of forward and reverse primers binds to the second and third genes, respectively. All the primers used are listed in Table S8.
2.4. Mating of yeast strains
243 library strains: One colony was picked from each of the 27 GAL1Δ and 9 ROX1ΔGAL80Δ + tObGES-ERG20ww strains from their respective dropout plates (SD-Leu and SD-Ura-Trp-His) and streaked out in vertical and horizontal lines, respectively, on an SD-Leu-Ura-Trp-His plate, followed by incubation at 30 °C for two days (see schematic in Fig. S2). Colonies growing at the intersection of the streaks were further streaked out on a fresh SD-Leu-Ura-Trp-His plate and incubated at 30 °C overnight. They were then screened with diagnostic and gene-specific primers to confirm the integration. For the MVA platform strain, one colony each from MVAc4 and MVAp4 was streaked out as above on an SD-Leu-Ura-Trp + 200 μg/ml Hygromycin (Goldbio, St. Louis, MO) plate and incubated and screened as mentioned above.
2.5. Geraniol production and quantification
2.5.1 Geraniol production: For geraniol production from strains CEN.PK2-1C and MVAc1-MVAc4, yeast colonies transformed with the pPYK1-tObGES-ERG20ww plasmids were grown overnight in 5 ml SD-His at 30 °C with shaking at 200 rpm. The overnight culture was inoculated at an initial OD600 of 0.1 into fresh SD-His and grown at 30 °C with shaking at 200 rpm for 48 h. 1 ml of the culture was collected at 12, 24, and 48 h and pelleted at 16,000 × g for 1 min, and 50 μl of the supernatant was used to quantify geraniol using the geraniol dehydrogenase (GeDH) assay (Lin et al., 2018).
For library screening, seed cultures were set up with three replicates of the wildtype CEN.PK2 and each of the 243 strains by inoculating three colonies of each strain into 200 μl SD-Leu-Ura-Trp-His media in 96-well plates. The overnight culture was inoculated at an initial OD600 of ~0.1 into fresh SD-Leu-Ura-Trp-His media in 96-deep-well plates, with 500 μl of culture per well. The deep-well plates were incubated at 30 °C with shaking at 400 rpm for 12 h. The plates were centrifuged at 3,220 × g for 5 min, and 50 μl of the supernatant was used for the GeDH assay.
For geraniol production from the wildtype CEN.PK2-1C, MVAc4, MVAp4, and MVA platform strains, yeast colonies transformed with either pGAL1-tObGES-ERG20ww or tObGES-ERG20ww-SKL were grown overnight in 5 ml SD + 400 μg/ml G418 (pH = 7). The overnight culture was inoculated at an initial OD600 of 0.1 into fresh YPD + 200 μg/ml G418 and grown at 30 °C with shaking at 200 rpm for 24 h. 1 ml of the culture was collected and pelleted at 16,000 × g for 1 min, and 50 μl of the supernatant was used to quantify geraniol using the GeDH assay.
2.5.2 Geraniol dehydrogenase assay: The CdGeDH gene from Castellaniella defragrans, encoding the geraniol dehydrogenase, was cloned into the pET-24 vector by Gibson assembly. Protein purification and the assay were performed with slight modifications from the protocol described in Lin et al. (2018). Briefly, pET-24_CdGeDH with a C-terminal His-tag was transformed into E. coli (BL21). A single colony was inoculated for an overnight seed culture, which was diluted 50-fold into a scaled-up culture and grown at 37 °C to an OD600 of 0.6. Then 0.1 mM IPTG (Goldbio, St. Louis, MO) was added, followed by growth at 16 °C for 24 h. The culture was centrifuged at 3,220 × g for 20 min, the supernatant was discarded, and the pellet was resuspended in lysis buffer (50 mM Tris pH = 7.5, 5 mM imidazole, and 1 mM phenylmethylsulfonyl fluoride) and 1 mg/ml lysozyme (Sigma Aldrich, St. Louis, MO). Cells were lysed with a sonicator (Misonix, Farmingdale, NY) for 2 min with 10 s pulses. Proteins were purified using a Ni-NTA column (Qiagen, Germantown, MD). Unbound proteins were eliminated with wash buffer (50 mM Tris pH 7.5, 40 mM imidazole), and GeDH protein was eluted with elution buffer (50 mM Tris pH 7.5, 250 mM imidazole). The purity of the resulting CdGeDH enzyme was routinely examined by protein gel electrophoresis.
For the GeDH assay, 50 μl of the spent media was mixed with 50 μl of a prepared reaction mix such that the final mixture contained: 100 mM Tris-HCl (pH 8.0), 2 mM nicotinamide adenine dinucleotide (NAD+) (Goldbio, St. Louis, MO), 2 mM resazurin sodium salt (Acros Organics, Belgium), 0.002 U purified geraniol dehydrogenase, and 1 U diaphorase (Sigma Aldrich, St. Louis, MO). To prepare the geraniol standard curve, a 10X stock of each geraniol concentration was prepared by dissolving the authentic geraniol standard (Acros Organics, Belgium) in acetone. Next, the 10X stocks were diluted and added to the reaction mix such that the final geraniol concentration was 1X. The geraniol standard curves used for Figs. 1C, 2B and 4C are shown in Fig. S3. Each reaction was incubated at room temperature for 45 min, and fluorescence was recorded at excitation and emission wavelengths of 530 nm and 590 nm, respectively, using a Tecan Spark microplate reader (Morrisville, NC). The geraniol concentrations of MVA platform + tObGES-ERG20ww were confirmed using gas chromatography coupled with mass spectrometry (GC-MS) (Fig. S4).
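Reading a geraniol concentration back from a fluorescence value via a standard curve can be sketched as follows. This is a minimal illustration that assumes a linear fluorescence response over the standard range; the function names and all numeric values are hypothetical placeholders and are not the measurements behind Fig. S3.

```python
# Hypothetical sketch: fit a linear standard curve, then invert it for a sample.
# All numbers are made-up placeholders, not data from this study.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate concentration from fluorescence."""
    return (signal - intercept) / slope


# Placeholder standards (mg/L) and their fluorescence readings (arbitrary units).
standard_mg_per_l = [0.0, 5.0, 10.0, 20.0, 40.0]
standard_signal = [100.0, 600.0, 1100.0, 2100.0, 4100.0]

slope, intercept = fit_line(standard_mg_per_l, standard_signal)
sample_mg_per_l = concentration_from_signal(1600.0, slope, intercept)
```

If the measured response were not linear, a different curve model would be needed; the linear fit here is only an assumption made for the sketch.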
2.6. Terpene quantification using GC-MS
For geraniol, citronellol, and geranyl acetate extraction, 1 ml culture was centrifuged at 16,000 × g for 1 min, and 500 μl of the supernatant was mixed with 500 μl hexane and shaken in a plate shaker at the highest speed for 10 min, followed by centrifugation at 16,000 × g for 2 min. 500 μl of the hexane layer was diluted five-fold in hexane and used for GC-MS. For α-humulene extraction, 1 ml culture was centrifuged at 16,000 × g for 1 min, and 500 μl of the supernatant was mixed with 500 μl ethyl acetate and shaken in a plate shaker at the highest speed for 10 min, followed by centrifugation at 16,000 × g for 2 min. 500 μl of the ethyl acetate layer was collected for GC-MS. For squalene extraction, 1 ml culture was centrifuged at 16,000 × g for 1 min. The supernatant was discarded, and the pellet was dissolved in 200 μl ethyl acetate, followed by homogenizing with 100 mg of 0.5 mm glass beads in a Bullet Blender® tissue homogenizer at the highest setting for 10 min at 4 °C. 300 μl ethyl acetate was then added to the sample, and the sample was further vortexed and centrifuged at 16,000 × g for 2 min. 500 μl of the hexane layer was collected for GC-MS.
Terpenes were detected using a Thermo Trace 1300 Gas Chromatograph and Thermo Q Exactive™ Orbitrap Mass Spectrometer (Waltham, MA). 5 μL of geraniol-containing samples or 2 μL of α-humulene- or squalene-containing samples were injected into a Thermo Scientific TraceGOLD TG-5SILMS column (30 m long, 0.25 mm inner diameter, 0.25 μm film thickness) using helium as the carrier gas (1 ml/min). The injector was held at 200 °C. For geraniol, citronellol, and geranyl acetate analysis, the oven was held at 40 °C for 4 min, followed by ramping up to 280 °C at a rate of 20 °C/min and then holding at 280 °C for 2 min. The mass range monitored was 39–200 m/z in the positive ion mode. Geraniol eluted at 10.24 min, citronellol at 9.93 min, and geranyl acetate at 10.99 min. For α-humulene, the oven was held at 80 °C for 3 min, followed by ramping up to 180 °C at a rate of 15 °C/min and further ramping to 240 °C at a rate of 10 °C/min, holding for 1 min. The mass range monitored was 50–250 m/z in the positive ion mode. α-Humulene eluted at 9.7 min. For squalene, the oven was held at 80 °C for 3 min, followed by ramping up to 180 °C at a rate of 15 °C/min and further ramping to 310 °C at 20 °C/min and then holding at 280 °C for 1 min. The mass range monitored was 50–450 m/z in the positive ion mode. Squalene eluted at 16.8 min. The MS transfer line was at 250 °C, and the source temperature was 200 °C. The resolution was set to 60,000. The MS was set to monitor total ion counts.
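As a reading aid for the oven programs above, the following sketch converts a hold/ramp description into a total run time and an oven temperature at a given retention time. It uses only the geraniol program stated in the text (40 °C for 4 min, 20 °C/min to 280 °C, 2 min hold); the helper names are hypothetical and the calculation assumes ideal linear ramps.

```python
# Hypothetical helpers for a GC oven program made of holds and linear ramps.

def program_segments(start_temp, initial_hold, ramps):
    """Build (duration_min, temp_start, temp_end) segments.

    ramps is a list of (rate_c_per_min, target_temp, hold_min) tuples.
    """
    segments = [(initial_hold, start_temp, start_temp)]
    temp = start_temp
    for rate, target, hold in ramps:
        segments.append(((target - temp) / rate, temp, target))
        if hold > 0:
            segments.append((hold, target, target))
        temp = target
    return segments


def total_runtime(segments):
    """Total program length in minutes."""
    return sum(duration for duration, _, _ in segments)


def temperature_at(segments, t_min):
    """Oven temperature (°C) at time t_min, assuming ideal linear ramps."""
    elapsed = 0.0
    for duration, temp_start, temp_end in segments:
        if t_min <= elapsed + duration:
            frac = (t_min - elapsed) / duration if duration > 0 else 1.0
            return temp_start + frac * (temp_end - temp_start)
        elapsed += duration
    return segments[-1][2]  # past the end of the program: final temperature


# Geraniol program from the text: 40 °C for 4 min, 20 °C/min to 280 °C, hold 2 min.
geraniol_program = program_segments(40.0, 4.0, [(20.0, 280.0, 2.0)])
geraniol_runtime_min = total_runtime(geraniol_program)
temp_at_geraniol_elution = temperature_at(geraniol_program, 10.24)
```

Under these assumptions the geraniol program runs for 18 min, and the nominal oven temperature at the reported geraniol retention time of 10.24 min is about 165 °C.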
Peak areas for geraniol, α-humulene, and squalene were quantified using the Xcalibur software (Thermo Fisher, Waltham, MA). Absolute sample concentrations were calculated from a standard curve of authentic geraniol (Acros Organics, Belgium), citronellol (Acros Organics, Belgium), geranyl acetate (Thermo Scientific, Waltham, MA), α-humulene (Millipore Sigma, Burlington, MA), and squalene (TCI America, Portland, OR) standards. To prepare standard curves, geraniol, citronellol, and geranyl acetate were diluted in hexane and squalene and α-humulene standards in ethyl acetate. Geraniol and squalene standards were diluted over a range of 1.5625 mg/L, citronellol 1.066.25 mg/L, and α-humulene 0.53112.5 mg/L. Ions of m/z values 123.1168 ± 5 ppm, 138.1403 ± 5 ppm, 136.1247 ± 5 ppm, 93.0698 ± 5 ppm, and 121.1012 ± 5 ppm were used for quantifying the peak area for geraniol, citronellol, geranyl acetate, α-humulene, and squalene, respectively.
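The ± 5 ppm tolerance on each quantifier ion corresponds to a narrow absolute m/z window. The sketch below shows how such a window can be computed from the m/z values listed above; the function and variable names are hypothetical and this is not a description of how Xcalibur implements the extraction.

```python
# Hypothetical sketch: absolute m/z window for a relative (ppm) mass tolerance.

QUANT_IONS = {  # quantifier ions as listed in the text
    "geraniol": 123.1168,
    "citronellol": 138.1403,
    "geranyl acetate": 136.1247,
    "alpha-humulene": 93.0698,
    "squalene": 121.1012,
}


def ppm_window(mz, ppm=5.0):
    """Return (low, high) m/z bounds for a symmetric ppm tolerance."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def within_tolerance(observed_mz, target_mz, ppm=5.0):
    """True if an observed m/z falls inside the ppm window of the target."""
    low, high = ppm_window(target_mz, ppm)
    return low <= observed_mz <= high


windows = {name: ppm_window(mz) for name, mz in QUANT_IONS.items()}
```

For the geraniol ion at m/z 123.1168, 5 ppm corresponds to roughly ± 0.0006 m/z.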
2.7. Statistical methods
A random forest (RF) (Breiman, 2001) was used to fit predictive models for geraniol production. Briefly, RFs construct ensembles of Classification and Regression Trees (CART) (Breiman et al., 2017) from bootstrap replications of the data. Each CART model is a decision tree that creates a prediction of geraniol, and the final prediction is based on aggregation over the ensemble. Models were fit based on out-of-bag estimation (Breiman, 1996), which guards against overfitting.
Tree-based models such as RFs are particularly useful when interactions are expected between variables, in this case, the MVA pathway enzymes, and for delineating the role and importance of the individual variables (Breiman, 1996) in the prediction of the outcome, geraniol titer. Another strength of the RF is that it implements bootstrap resampling of the data (Efron and LePage, 1992), accounting for uncertainty in the population, and is well suited to a smaller sample size of this type. The bootstrap replication datasets are generated by resampling the observations (strains) with replacement and are the same size as the original dataset. The output is an ensemble of prediction models aggregated to produce a prediction for each observation. The accuracy of the RF was estimated using a simple residual sum of squares (RSS) loss function averaged over out-of-bag (OOB) samples (Friedman JTibshirani, 2001) in the ensemble to produce a mean squared error (MSE). Using the OOB error estimate eliminates the requirement for a set-aside test set (Breiman, 2001). Notably, by nature of the resampling, not all the observations are present in each bootstrap replication. OOB error leverages this for estimation by aggregating only over the predictors in the ensemble for which an observation was not randomly selected in the bootstrap, which inherently avoids overfitting (Breiman, 2001). OOB estimation is an effective alternative for smaller datasets that may be sensitive to training and testing splits or fold assignments in cross-validation.
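The bootstrap and out-of-bag logic described above can be illustrated with a small standard-library sketch. It is not the randomForest implementation used in this study: the base learner here is a deliberately trivial stand-in (the mean of the in-bag responses) so that only the resampling and OOB aggregation are shown, and all names are hypothetical.

```python
import random

# Hypothetical sketch of bootstrap resampling and out-of-bag (OOB) error.
# The "model" is a stand-in: each ensemble member predicts the in-bag mean.


def bootstrap_indices(n, rng):
    """Sample n observation indices with replacement (same size as the data)."""
    return [rng.randrange(n) for _ in range(n)]


def oob_mse(y, n_models, rng):
    """OOB mean squared error: each observation is predicted only by
    ensemble members whose bootstrap sample did not contain it."""
    n = len(y)
    pred_sum = [0.0] * n
    pred_count = [0] * n
    for _ in range(n_models):
        bag = bootstrap_indices(n, rng)
        in_bag = set(bag)
        model_pred = sum(y[i] for i in bag) / n  # stand-in base learner
        for i in range(n):
            if i not in in_bag:
                pred_sum[i] += model_pred
                pred_count[i] += 1
    errors = [
        (y[i] - pred_sum[i] / pred_count[i]) ** 2
        for i in range(n)
        if pred_count[i] > 0
    ]
    return sum(errors) / len(errors)


# Fraction of the 243 strains left out of a single bootstrap replication.
rng = random.Random(0)
n_strains = 243
oob_fractions = []
for _ in range(500):
    in_bag = set(bootstrap_indices(n_strains, rng))
    oob_fractions.append((n_strains - len(in_bag)) / n_strains)
mean_oob_fraction = sum(oob_fractions) / len(oob_fractions)
```

With 243 observations, the expected out-of-bag fraction per replication is (1 - 1/243)^243, a little over one third, which is why every strain is typically predicted by many ensemble members that never saw it.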
Variable importance (Breiman, 2001; Friedman JTibshirani, 2001) measures were used to prioritize the enzymes according to their contribution to the predictive accuracy of the outcome. Importance is measured by increases in node purity, which serve as a surrogate for the performance of the random forest. High increases in node purity indicate that the predictive strength of the model improves markedly when the enzyme is included in the random forest, and that its elimination from the data set would considerably degrade the predictive strength (Fig. 3A).
Partial Dependence Plots (PDP) are a popular technique for visualizing the contribution of variables to an outcome and the relationships between pairs of variables and an outcome (Cutler et al., 2007; Greenwell, 2017). Using the variable importance measure as a prioritization, we examined the impact of the five MVA pathway enzymes on geraniol production and their interactions. PDP profiles were computed on grids of ten equally spaced values over the support region for each enzyme. Linear interpolation was used to estimate geraniol production in between data points.
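The grid-based partial dependence calculation can be sketched in a few lines. The example below is an assumption-laden illustration: promoter strengths are coded as 1 (weak), 2 (medium), and 3 (strong), and `toy_predict` is a made-up stand-in for a fitted model, not the random forest from this study. Only the averaging over observations, the ten-point grid, and the linear interpolation follow the description above.

```python
from itertools import product

# Hypothetical sketch of a one-variable partial dependence profile.


def linspace(lo, hi, num):
    """num equally spaced values from lo to hi inclusive."""
    step = (hi - lo) / (num - 1)
    return [lo + step * i for i in range(num)]


def partial_dependence(predict, rows, feature, grid):
    """For each grid value, fix `feature` to that value in every row
    and average the model predictions over all rows."""
    curve = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve


def interpolate(grid, curve, x):
    """Linear interpolation of the profile between neighbouring grid points."""
    for i in range(len(grid) - 1):
        if grid[i] <= x <= grid[i + 1]:
            t = (x - grid[i]) / (grid[i + 1] - grid[i])
            return curve[i] + t * (curve[i + 1] - curve[i])
    raise ValueError("x lies outside the grid")


def toy_predict(row):
    """Made-up response peaking at medium ERG12; NOT the fitted model."""
    return 10.0 - (row["ERG12"] - 2.0) ** 2 + 0.5 * row["ERG8"]


# Assumed coding: 1 = weak, 2 = medium, 3 = strong promoter.
rows = [{"ERG12": a, "ERG8": b} for a, b in product((1, 2, 3), repeat=2)]
grid = linspace(1.0, 3.0, 10)
erg12_curve = partial_dependence(toy_predict, rows, "ERG12", grid)
```

An Individual Conditional Expectation curve is the same calculation without the averaging step, that is, one curve per row rather than one per data set.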
Individual Conditional Expectation (ICE) curves (Goldstein et al., 2015) were also examined for the highest- and lowest-producing strains. ICE curves enable the visualization of the functional relationships between the predicted values of geraniol production and enzyme levels for individual strains and are useful for assessing sensitivity (Fig. S4). Analysis was performed in the R programming language with the randomForest (Breiman, 2001), pdp (Greenwell, 2017), and vivo packages.
Fig. 2. Construction and screening of the combinatorial yeast MVA library with varying promoter strengths. (A) A diploid library of 243 strains, each having tHMG1 and IDI1 under strong promoters and ERG13, ERG12, ERG19, ERG10, and ERG8 under a unique combination of strong, medium, or weak promoters integrated into the genome. The tObGES-ERG20ww fusion protein was expressed from a plasmid. Color intensity represents promoter strength. The strains were cultured in 96-deep-well plates, and the geraniol produced was quantified using a fluorescence-based assay. (B) Heat map showing relative promoter strengths and the corresponding fluorescence normalized to OD600 of the wild type and the 243 strains. The top ten strains with the highest fluorescence readings are marked with an asterisk. Data represent the average of three independent biological replicates.
3. Results
3.1. Sequential integration of the complete MVA pathway into the yeast
genome
We have focused on genomic integration instead of a plasmid-based
system because an ideal platform strain should be genetically stable and
not require selective markers during fermentation. An additional copy of
all seven MVA pathway genes was integrated sequentially into the yeast
genome under the rationale that overexpression of the complete MVA
pathway would increase IPP and DMAPP levels. The MVA pathway
genes were inserted into three genomic loci, GAL80, GAL1, and ROX1
(Fig. 1B, Table 1). GAL80 and GAL1 deletions allowed gene expression
under galactose-inducible promoters when glucose was the sole carbon
source (Westfall et al., 2012). ROX1 was disrupted to boost the MVA
pathway by alleviating transcriptional repression (Trikka et al., 2015).
Each MVA pathway gene was expressed from a unique, strong consti-
tutive promoter to minimize potential homologous recombination
(Orr-Weaver et al., 1981). The sequentially engineered MVA strains
(MVAc1-4) were transformed with a plasmid enabling the production of
geraniol, a fragrant monoterpenoid and a precursor for medicinally
important indole alkaloids (Chen and Viljoen, 2010; Brown et al., 2015).
The fusion protein tObGES-ERG20ww (Wang et al., 2021; Ignea et al., 2014) was used for geraniol biosynthesis as fusing geraniol diphosphate synthase (ERG20ww) with geraniol synthase (tObGES) resulted in higher geraniol production than when the two are separately expressed (Fig. S5).
Geraniol yield increased with the number of overexpressed MVA pathway genes (Fig. 1C). Strain MVAc1, with ERG10 and tHMG1 overexpressed, had an over 2.5-fold increase in geraniol yield after 12 h of shake-flask cultivation. Strain MVAc2 only showed a marginal increase compared with MVAc1, likely because the excessive mevalonate generated by tHMG1 overexpression was not channeled into the MVA pathway due to the lack of the mevalonate kinase ERG12 in the heterologous pathway. Strain MVAc3, overexpressing five out of the seven MVA pathway genes, further increased geraniol yield. MVAc4, with the complete MVA pathway overexpressed, had the highest geraniol yield, which was 7.5-fold that of the wild type at 12 h. Geraniol titer was maximum at 24 h (Fig. S6). Therefore, in addition to the two rate-limiting enzymes, the other five enzymes also play important roles in increasing the MVA pathway productivity.
3.2. Creating a combinatorial strain library to survey the promoter space
of MVA pathway genes
When integrating the complete MVA pathway into the genome, strong yeast promoters are usually used. However, they may not be the ideal set of promoters that maximize pathway productivity. To find the optimal promoter combination of pathway genes and to delineate the contribution of each gene to MVA pathway productivity, we created a combinatorial strain library of 243 diploid strains with varying promoter strengths. The rate-limiting genes tHMG1 and IDI1 were always expressed from a strong promoter since their essentiality to the pathway is well-documented (Han et al., 2018; Zhao et al., 2017; Jiang et al., 2017; Xie et al., 2015; Verwaal et al., 2007; Zhou et al., 2012). Each of the remaining five genes was expressed from a strong, medium, or weak promoter in a combination unique to each strain, creating 3^5 = 243 strains (Fig. 2A). The choice of promoters and their relative expression strengths were based on the extensive characterization of yeast promoters by Lee et al. (2015) (Table S10).
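The size of the design space follows directly from enumerating three promoter strengths over five genes. The short sketch below reproduces that count; the variable names are hypothetical and the strength labels are placeholders for the actual promoters listed in Table S10.

```python
from itertools import product

# Hypothetical sketch: enumerate the 3^5 promoter-strength combinations.

GENES = ("ERG13", "ERG12", "ERG19", "ERG10", "ERG8")  # the five varied genes
STRENGTHS = ("strong", "medium", "weak")               # placeholder labels

# One dict per strain design, mapping each gene to a promoter strength.
library = [
    dict(zip(GENES, combo)) for combo in product(STRENGTHS, repeat=len(GENES))
]

# Designs in which ERG12 is driven by the medium-strength promoter.
erg12_medium = [design for design in library if design["ERG12"] == "medium"]
```

Fixing one gene at one strength leaves 3^4 = 81 designs, so one third of the library carries ERG12 under the medium-strength promoter.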
The construction of the combinatorial library was streamlined by mating engineered haploids of opposite mating types. Haploid strains of mating-type MATa overexpressed ERG13, ERG12, and ERG19, each under three different promoters, in the GAL1 locus. 3^3 = 27 such
Fig. 3. Random Forests were used to assess the importance and dependence of the MVA enzymes. (A) Variable importance from a random forest predicting readout. Enzymes are ranked according to increases in node purity, a measure of performance. (B–F) Partial dependence plots show the predicted geraniol readout values as a function of enzyme expression for ERG19, ERG13, ERG12, ERG10, and ERG8. The blue tick marks represent the promoter strengths within the data, and the remaining curve was generated through interpolation. (G–L) Two-way partial dependence plots for the interactions between ERG12 and the other four pathway enzymes, as well as the interactions between ERG19 and ERG13, and ERG8 and ERG10.
Table 1
List of strains generated for creating the MVA platform strain.
Strains | Description | Source
MVAc1 | CEN-PK2-1C; rox1Δ::pHHF2-ERG10-tENO1, pTDH3-tHMG1-tTDH1, URA3 | This study
MVAc2 | MVAc1; gal80Δ::pTEF1-ERG8-tSSA1, pCCW12-IDI1-tENO2, TRP1 | This study
MVAc3 | MVAc1; gal1Δ::pPGK1-ERG13-tPGK1, pTEF2-ERG12-tADH1, pHHF1-ERG19-tCYC1, LEU2 | This study
MVAc4 | MVAc3; gal80Δ::pTEF1-ERG8-tSSA1, pCCW12-IDI1-tENO2, TRP1 | This study
MVAp1 | CEN-PK2-1D; rox1Δ::pHHF2-ERG10-SKL-tENO1, pTDH3-tHMG1-SKL-tTDH1, URA3 | This study
MVAp2 | MVAp1; gal80Δ::pTEF1-ERG8-SKL-tSSA1, pCCW12-IDI1-SKL-tENO2, pTEF1-HygR-tTEF1 | This study
MVAp3 | MVAp1; gal1Δ::pPGK1-ERG13-SKL-tPGK1, pTEF2-ERG12-SKL-tADH1, pHHF1-ERG19-SKL-tCYC1, LEU2 | This study
MVAp4 | MVAp3; gal80Δ::pTEF1-ERG8-SKL-tSSA1, pCCW12-IDI1-SKL-tENO2, pTEF1-HygR-tTEF1 | This study
MVA platform | CEN-PK2; rox1Δ::pHHF2-ERG10-tENO1, pTDH3-tHMG1-tTDH1, URA3; gal1Δ::pPGK1-ERG13-tPGK1, pTEF2-ERG12-tADH1, pHHF1-ERG19-tCYC1, LEU2; gal80Δ::pTEF1-ERG8-tSSA1, pCCW12-IDI1-tENO2, TRP1; rox1Δ::pHHF2-ERG10-SKL-tENO1, pTDH3-tHMG1-SKL-tTDH1, URA3; gal1Δ::pPGK1-ERG13-SKL-tPGK1, pTEF2-ERG12-SKL-tADH1, pHHF1-ERG19-SKL-tCYC1, LEU2, gal80Δ::pTEF2-ERG8-SKL-tSSA1, pCCW12-IDI1-SKL-tENO2, pTEF1-HygR-tTEF1 | This study
MATa strains were created (Table S10). Similarly, haploid strains with
the opposite MATα mating type overexpressed the other four MVA
pathway genes with ERG10 and ERG8 under three different promoters,
generating 3^2 = 9 strains (Table S11). These nine strains were also
transformed with a plasmid bearing the tObGES-ERG20ww fusion gene
for geraniol production. Mating the engineered haploid strains with the
opposite mating type generated 3^3 × 3^2 = 243 diploid strains, each
containing an extra copy of the seven MVA pathway genes and capable
of producing geraniol. The strain library was cultivated in 96-deep-well
plates, followed by geraniol quantification using a high-throughput
fluorescence-based assay (Lin et al., 2018). A heat map with the
promoter strengths and fluorescence readings of all strains revealed a
distinct pattern in which the strains expressing ERG12 from a
medium-strength promoter produced some of the highest amounts of
geraniol. Eight out of the top ten geraniol-producing strains had ERG12
expressed from the medium-strength promoter (Fig. 2B). Quantitative
real-time PCR verified that transcript levels of overexpressed MVA
pathway genes positively correlated with the promoter strengths
(Fig. S7, Table S9). Quantification of intracellular mevalonate, a critical
pathway intermediate, in the strains with all strong promoters (α1), all
medium promoters (β5), and all weak promoters (γ9) showed a
progressive decrease, as expected (Table S12).
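The mating design can be checked with a short sketch: crossing the 27 MATa haploids with the 9 MATα haploids should reproduce every one of the 243 promoter combinations exactly once, so that only 36 haploids need to be built. The Python below is an illustration under that reading of the text; the level labels and function name are hypothetical.

```python
from itertools import product

LEVELS = ["strong", "medium", "weak"]  # placeholder labels for promoter strengths

def haploid_set(genes):
    """All promoter assignments for the genes varied in one mating type."""
    return [dict(zip(genes, combo)) for combo in product(LEVELS, repeat=len(genes))]

mat_a = haploid_set(["ERG13", "ERG12", "ERG19"])  # 3^3 = 27 haploids
mat_alpha = haploid_set(["ERG10", "ERG8"])        # 3^2 = 9 haploids

# A diploid carries the union of both parents' expression cassettes.
diploids = [{**a, **alpha} for a, alpha in product(mat_a, mat_alpha)]
unique_genotypes = {tuple(sorted(d.items())) for d in diploids}

print(len(mat_a), len(mat_alpha), len(diploids), len(unique_genotypes))
# 27 9 243 243
```

Splitting the varied genes across the two mating types is what keeps the construction effort additive (27 + 9) while the resulting library grows multiplicatively (27 × 9).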
3.3. Applying machine learning to the combinatorial strain library
Machine learning was used to investigate the combinatorial library
with the primary objective of understanding the impact of each of the
five enzymes on the productivity of the MVA pathway. Random forest
models (Breiman, 2001) were fit to the data in the combinatorial library
with the outcome variable as geraniol production. Variable importance
measures indicate that the top three enzymes that are critical for
predicting geraniol production are Erg19p, the mevalonate pyrophosphate
decarboxylase; Erg13p, the HMG-CoA synthase; and Erg12p, the
mevalonate kinase (Fig. 3A). In addition to the ranking, we also view the
drops in importance as insightful, especially between Erg12p and
Erg10p. This large gap secured the role of the top three enzymes as
critical for the predictive accuracy of geraniol production in the 243
strains.
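To make the importance measure concrete, the sketch below computes the purity gain of a single split, assuming the common definition for regression trees in which the gain is the drop in residual sum of squares (RSS) when a node is divided. A random forest would sum such gains per variable over all splits and average over trees. The data are a hypothetical toy node, not measurements from the library, and the function names are illustrative.

```python
def rss(values):
    """Residual sum of squares around the mean of a node."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def purity_gain(rows, feature, threshold, target="readout"):
    """Decrease in RSS when a node is split at feature <= threshold."""
    parent = [r[target] for r in rows]
    left = [r[target] for r in rows if r[feature] <= threshold]
    right = [r[target] for r in rows if r[feature] > threshold]
    return rss(parent) - rss(left) - rss(right)

# Hypothetical toy node; promoter levels coded 1 = weak, 2 = medium, 3 = strong.
toy_node = [
    {"ERG12": 1, "ERG10": 1, "readout": 100.0},
    {"ERG12": 1, "ERG10": 3, "readout": 120.0},
    {"ERG12": 2, "ERG10": 1, "readout": 300.0},
    {"ERG12": 2, "ERG10": 3, "readout": 320.0},
]

gain_erg12 = purity_gain(toy_node, "ERG12", 1)
gain_erg10 = purity_gain(toy_node, "ERG10", 1)
print(gain_erg12, gain_erg10)  # 40000.0 400.0
```

In this toy node, splitting on ERG12 removes almost all of the variance in the readout, so ERG12 would rank far above ERG10; a large gap of this kind is what the ranking in Fig. 3A is read for.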
Next, we took a closer look at measures of variable importance using
Partial Dependence Plots (PDPs) (Greenwell, 2017) to visualize the
contribution of the enzyme levels to geraniol output. PDPs of the five
enzymes showed the predicted geraniol production when an enzyme
was set at a given promoter strength (Fig. 3B–F). Erg13p, Erg19p, and
Erg8p showed increased geraniol production when their promoter
strengths were increased, eventually leveling off at saturation (Fig. 3B,
C&F), as expected. However, a unique role of mevalonate kinase
(Erg12p) was apparent from the PDP of ERG12 (Fig. 3D), which showed
that a maximum geraniol production was reached within our data when
its expression level was moderately low and then decreased with higher
promoter strength. Erg10p did not show saturation in the promoter
strengths tested.
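A partial dependence curve is obtained by forcing one variable to each value of interest in every row of the data, predicting, and averaging the predictions. The sketch below shows that computation in plain Python. The `toy_predict` function is a hypothetical stand-in for a fitted model (it is not the random forest from the study), and the 1/2/3 coding of promoter levels is an assumption made for illustration.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction with `feature` forced to each grid value in turn."""
    curve = {}
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        curve[value] = sum(preds) / len(preds)
    return curve

# Hypothetical stand-in for a fitted model: the readout rises with ERG10
# and peaks at the medium ERG12 level.
def toy_predict(row):
    erg12_effect = {1: 50.0, 2: 200.0, 3: 120.0}[row["ERG12"]]
    return erg12_effect + 10.0 * row["ERG10"]

levels = [1, 2, 3]  # 1 = weak, 2 = medium, 3 = strong (illustrative coding)
rows = [{"ERG12": a, "ERG10": b} for a in levels for b in levels]

pdp_erg12 = partial_dependence(toy_predict, rows, "ERG12", levels)
pdp_erg10 = partial_dependence(toy_predict, rows, "ERG10", levels)

best_erg12_level = max(pdp_erg12, key=pdp_erg12.get)
print(best_erg12_level)  # 2
```

A curve that peaks at an intermediate level, as `pdp_erg12` does here, is the shape described for ERG12, whereas a monotonically rising curve such as `pdp_erg10` corresponds to an enzyme that has not reached saturation.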
In the two-enzyme interaction plots (Fig. 3G–L), the role of Erg12p is
even more apparent. When the value of ERG12 was in the moderate
range, the predicted geraniol output was the highest. This could be due
to several reasons, such as feedback inhibition of Erg12p by pathway
intermediates (Anthony et al., 2009; Garcia and Keasling, 2014; Primak
et al., 2011; Kazieva et al., 2017) and metabolic burden leading to
protein aggregation. Therefore, moderate expression of ERG12 most
likely strikes the right balance for higher flux through the pathway.
The two-enzyme interaction plot between ERG19 and ERG13
(Fig. 3K) showed the highest geraniol production when the expression of
ERG19 was low and ERG13 was high. In the same plot, we also see
relatively high predicted readout values when the expression of ERG19
was high and ERG13 was moderate. This reverse balance is likely
because when Erg13p is expressed highly, Erg12p might be feedback
inhibited due to the increased intermediates downstream of Erg19p
(Anthony et al., 2009; Garcia and Keasling, 2014; Primak et al., 2011;
Kazieva et al., 2017), and lower expression of ERG19 would be more
desirable. However, when Erg13p is expressed at a low level, Erg19p must
have a higher expression to maximize the pathway productivity since it
catalyzes the irreversible step, which releases CO2 to produce IPP. The rest of
the two-enzyme interaction plots are similar to the ERG10 and ERG8
interaction plot (Fig. 3L), where expression of both enzymes led to the highest
amount of product, as expected, and are included in Fig. S8.
While the global analysis, including data from the entire combinatorial
library, provides information in the prediction of geraniol output,
the local analysis focuses on the top ten producers. Through the
examination of the enzyme profiles and their variable importance of the ten
highest geraniol-producing strains, we can gain insights into the role of
the individual enzymes in the prediction of high geraniol levels. The
local importance of pathway enzymes in the top ten strains supplements
the PDPs and shows a clear pattern where Erg12p comes out as the
most important enzyme in seven out of ten strains (Table 2, Fig. S9). In
Table 2, there are two instances of ERG12's expression as high (promoter
strength = 7.77). In both cases, the expression of ERG8, ERG13, and
ERG19 is also high. This is also supported by the Individual Conditional
Expectation (ICE) curves (Goldstein et al., 2015) (Fig. S10), which show
that if ERG12's expression is high, the other pathway enzymes' expression
also has to be high to maximize geraniol production. In the top ten
geraniol-producing strains, eight have ERG12 expressed at a moderately
low range (promoter strength = 1.69), which we found to be a ‘sweet
spot.’ When ERG12 is expressed moderately, there are a variety of
scenarios that can arise to produce a high amount of geraniol. Indeed,
within the eight strains having ERG12 expressed in a moderately low
range, seven have Erg12p as the most important enzyme for determining
final productivity (Table 2, Fig. S9). In addition, Erg19p has consistently
moderately low abundance across the top ten strains when Erg12p is in the
sweet spot. Taken together, Erg12p is clearly the most critical enzyme
for maximum geraniol production out of the five non-rate-limiting
enzymes.
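The counts quoted above can be reproduced directly from Table 2. The sketch below transcribes the promoter strengths and critical enzymes from that table (geraniol readouts are omitted) and tallies them; the variable names are illustrative.

```python
# Rows transcribed from Table 2:
# (strain, ERG10, ERG13, ERG12, ERG8, ERG19, critical enzyme)
TOP_TEN = [
    ("α1", 9.01, 11.01, 7.77, 8.85, 4.81, "Erg8p"),
    ("β2", 9.01, 2.85, 1.69, 2.28, 1.53, "Erg12p"),
    ("α4", 3.00, 11.01, 7.77, 8.85, 4.81, "Erg8p"),
    ("N3", 9.01, 1.06, 1.69, 0.91, 1.53, "Erg10p"),
    ("N2", 9.01, 1.06, 1.69, 2.28, 1.53, "Erg12p"),
    ("β4", 3.00, 2.85, 1.69, 8.85, 1.53, "Erg12p"),
    ("β5", 3.00, 2.85, 1.69, 2.28, 1.53, "Erg12p"),
    ("β7", 1.06, 2.85, 1.69, 8.85, 1.53, "Erg12p"),
    ("β3", 9.01, 2.85, 1.69, 0.91, 1.53, "Erg12p"),
    ("β1", 9.01, 2.85, 1.69, 8.85, 1.53, "Erg12p"),
]

ERG12_SWEET_SPOT = 1.69  # relative promoter strength of the medium ERG12 promoter
ERG19_MEDIUM = 1.53      # ERG19 strength seen alongside the ERG12 sweet spot

sweet_spot = [row for row in TOP_TEN if row[3] == ERG12_SWEET_SPOT]
n_sweet_spot = len(sweet_spot)
n_erg12_critical = sum(1 for row in TOP_TEN if row[6] == "Erg12p")
n_both = sum(1 for row in sweet_spot if row[6] == "Erg12p")
erg19_consistent = all(row[5] == ERG19_MEDIUM for row in sweet_spot)

print(n_sweet_spot, n_erg12_critical, n_both, erg19_consistent)
# 8 7 7 True
```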
These local and global measures of variable importance provide
complementary information. While the global analysis focuses overall
on the variables that are important for predicting readouts of all ranges,
the local importance allows us to zoom in on the patterns that give rise to
high geraniol production. Not surprisingly, they tell somewhat different
stories. Although ranked third in global variable importance, Erg12p is
the control point that limits production in the entire pathway and is the
most important enzyme when it comes to maximization of geraniol
production. The prominent role of Erg12p is likely due to feedback
regulations by pathway intermediates (Hinson et al., 1997; Chen et al.,
2018; Fu et al., 2008; Ma et al., 2011), reduced protein expression, or
protein aggregation.
3.4. Dual localization of the MVA pathway to both the cytosol and
peroxisomes
To further increase geraniol production, we localized the MVA
pathway into both the cytosol and peroxisomes. Peroxisomes are an
excellent choice for metabolic compartmentalization as they are not
essential for cell survival (Sibirny, 2016). Additionally, fatty acid
β-oxidation inside peroxisomes generates a pool of acetyl-CoA, which is
the substrate for the MVA pathway (Dusseaux et al., 2020). A haploid
peroxisome strain (MVAp4) was generated by tagging all seven MVA
genes with a C-terminal -SKL tripeptide. Similar to the MVAc4 strain, the
MVAp4 strain has seven MVA genes integrated into the genome.
Next, MVAc4 and MVAp4 strains were mated to obtain a diploid
strain, creating the MVA platform strain (Fig. 4A). The growth curves of
the strains showed that the engineered strains had no growth defect and,
in fact, grew significantly faster than the wild-type strains in rich media
(Fig. 4B). When transformed with a plasmid bearing tObGES-ERG20ww,
the MVA platform strain doubled geraniol titers compared to the haploid
strains, indicating that the dual targeting of the MVA pathway
significantly increased geraniol production (Fig. 4C). We also generated
two control strains, MVA cyto*2 and MVA per*2, in which two copies of
the entire MVA pathway were targeted to either the cytosol or
peroxisomes (Fig. S13). The MVA platform strain produced a comparable
amount of geraniol to the MVA cyto*2 strain but a higher amount than the
MVA per*2 strain. This could be due to insufficient NADPH inside
peroxisomes, which limited the MVA pathway productivity. There was no
difference in geraniol titers between the strains expressing the MVA
pathway in the cytosol (MVAc4) and peroxisomes (MVAp4) (Fig. 4C and
D). Similar results were observed when the same strains were cultured in
minimal media (Fig. S12). Expressing tObGES-ERG20ww in the
peroxisome of the cytosolic strain MVAc4 showed only a small drop in
geraniol titer compared to the strain with both the fusion protein and the
additional MVA pathway localized to the cytosol. Furthermore, when
localizing tObGES-ERG20ww into the cytosol of the peroxisomal
strain MVAp4, there was no significant drop in geraniol titer compared
to the strain with the fusion protein and the additional MVA pathway
localized to the peroxisome. These data indicate that IPP/DMAPP
may diffuse somewhat freely between the cytosol and the peroxisome.
To check if the pathway intermediate, mevalonate, is diffusible, two
more strains, MVAp-c and MVAc-p were constructed. MVAp-c has the
top half of the pathway, from ERG10 to tHMG1, localized to the
peroxisome, and the bottom half of the pathway, from ERG12 to IDI1, in
the cytosol. Conversely, MVAc-p has the top half of the pathway local-
ized to the cytosol and the bottom half of the pathway in the peroxisome
(Fig. S11). There was no difference in geraniol titer among the strains
MVAc4 and MVAp-c or MVAp4 and MVAc-p; thus, mevalonate diffuses
readily between the cytosol and peroxisome.
The growth of the engineered strains showed an inverse relationship
with geraniol titer, possibly caused by geraniol toxicity to yeast at
higher concentrations (Denby et al., 2018). When normalized by OD600,
there is an over two-fold increase in geraniol production in the MVA
platform strain compared to the haploids (Fig. 4D). When extending the
culturing time from 24 to 48 h, geraniol production decreased
significantly (Fig. S14). The decrease in geraniol titer could be due to the
compound's volatility or the reduced expression of the heterologous
MVA pathway genes when glucose has been exhausted during the
stationary phase (Peng et al., 2015). We also detected a minor product,
citronellol, which is reduced from geraniol by yeast's native enzymes
(Fig. S14), whereas another common geraniol derivative, geranyl
acetate, was not detected. In an attempt to increase geraniol production,
MVAp4 and MVA platform strains were grown in a fatty-acid-based
medium (YPO) (Gerke et al., 2020). However, the geraniol production in
YPO decreased 2-fold compared to the productivity in YPD (Fig. S15).
This was likely due to the low activity of promoters for expressing MVA
genes in fatty-acid-based media since most of these promoters are from
the glycolysis pathway.
3.5. Producing diverse terpenoids from the MVA platform strain
The MVA platform strain can be conveniently leveraged to jumpstart
the production of a wide range of terpenoids since the users only need to
transform a plasmid with the desired prenyltransferase and terpene
synthase. To demonstrate the versatility of the MVA platform strain, we
next utilized it to produce a sesquiterpene, α-humulene, and a
triterpene, squalene, in addition to the monoterpene geraniol. α-Humulene
Table 2
Top ten strains with the highest level of geraniol. The numbers under each enzyme are the relative promoter strengths quantified by Lee et al. (2015).
Strains | ERG10 | ERG13 | ERG12 | ERG8 | ERG19 | Geraniol (a.u.) | Critical enzymes
α1 | 9.01 | 11.01 | 7.77 | 8.85 | 4.81 | 518.85 ± 0.54 | Erg8p
β2 | 9.01 | 2.85 | 1.69 | 2.28 | 1.53 | 517.94 ± 13.96 | Erg12p
α4 | 3.00 | 11.01 | 7.77 | 8.85 | 4.81 | 516.19 ± 87.54 | Erg8p
N3 | 9.01 | 1.06 | 1.69 | 0.91 | 1.53 | 513.53 ± 42.87 | Erg10p
N2 | 9.01 | 1.06 | 1.69 | 2.28 | 1.53 | 510.49 ± 11.46 | Erg12p
β4 | 3.00 | 2.85 | 1.69 | 8.85 | 1.53 | 509.51 ± 21.59 | Erg12p
β5 | 3.00 | 2.85 | 1.69 | 2.28 | 1.53 | 505.28 ± 10.16 | Erg12p
β7 | 1.06 | 2.85 | 1.69 | 8.85 | 1.53 | 502.44 ± 15.87 | Erg12p
β3 | 9.01 | 2.85 | 1.69 | 0.91 | 1.53 | 502.34 ± 12.10 | Erg12p
β1 | 9.01 | 2.85 | 1.69 | 8.85 | 1.53 | 501.19 ± 1.77 | Erg12p
Fig. 4. Creating the MVA platform strain by overexpressing the MVA pathway in both cytosol and
peroxisomes. (A) The diploid strain (MVA platform) was created by mating the haploid MVAc4 and
haploid MVAp4. (B) Growth (OD600) of the engineered MVAc4, MVAp4, and MVA platform strains
and their wildtype counterparts. (C) Geraniol titer and OD600 of engineered MVAc4, MVAp4, and MVA
platform strains with tObGES-ERG20ww in either the cytosol (‘C’) or peroxisomes (‘P’). (D) Geraniol yield
in the above strains. Data represent the average ± SD of three independent biological replicates.
has potential anti-inflammatory properties and acts as a precursor for
the anti-cancer drug zerumbone (Fernandes et al., 2007; Zhang et al.,
2018), while squalene is used as an emollient in personal care products
due to its skin-compatible properties (Popa et al., 2015). For α-humulene
production, the MVA platform strain transformed with a plasmid having
ERG20 encoding the FPP synthase and ZSS1 encoding an α-humulene
synthase from Zingiber zerumbet (Alemdar et al., 2016) produced
~60-fold more α-humulene than the wild type in 24 h (Fig. 5A–C).
Fusion constructs with ERG20-ZSS1 produced about half of the amount
compared with the non-fused counterpart, indicating that the fused
enzymes have unfavorable conformational properties. OD600 increased
with the increase of α-humulene, which is likely due to a parallel
increase in squalene, the precursor for ergosterol (Li et al., 2020). For
squalene production, the MVA platform strain was transformed with a
plasmid having ERG20 and ERG9 encoding a squalene synthase. The
resulting strain yielded ~35-fold more squalene than the wild type when
grown in the presence of terbinafine, an anti-fungal agent that inhibits
Erg1p, which metabolizes squalene to 2,3-oxidosqualene (Garaiova
et al., 2014) (Fig. 5A, D&E). Fusion constructs of ERG20 and ERG9
produced approximately half the amount of squalene, potentially due to
unfavorable protein conformation. The growth of these strains was
positively correlated with the amount of squalene produced since
squalene is the substrate for ergosterol biosynthesis.
4. Discussion
In this study, we investigated the contribution of individual enzymes
to the MVA pathway, which is widely utilized to improve titers of
terpenoids. Previous studies have highlighted the importance of HMG1 and
IDI1 as rate-limiting enzymes (Han et al., 2018; Zhao et al., 2017; Jiang
et al., 2017; Xie et al., 2015; Verwaal et al., 2007; Zhou et al., 2012);
however, there is a lack of consensus about the role of the other five
enzymes in the pathway (Kwak et al., 2020; Zhou et al., 2018; McClory
et al., 2019; Hu et al., 2020; Madsen et al., 2011; Yao et al., 2018;
Redding-Johanson et al., 2011; Alonso-Gutierrez et al., 2015; Anthony
et al., 2009; Garcia and Keasling, 2014; Chen et al., 2018; Ma et al.,
2011; Pojer et al., 2006). To clarify the importance of non-rate-limiting
enzymes in the MVA pathway, we created a combinatorial yeast library
for a comprehensive exploration of the promoter space of each of the
five enzymes. Machine learning-guided modeling quantitatively
revealed the contribution of each enzyme to product titer and found
Erg19p, Erg13p, and Erg12p as crucial enzymes in determining product
yield. Note that the importance of each enzyme in a given pathway
cannot be inferred from the Gibbs free energy (ΔG) of the reaction it
catalyzes since enzymes act by decreasing the activation energy
necessary for reactions to proceed but do not change the overall ΔG of the
reactions (Nelson and Cox, 2004). While monoterpene geraniol was
employed as a readout of the MVA pathway, the modeling results are
likely extendable to terpenoids with longer chain lengths because all
these terpenoids require an IPP:DMAPP ratio equal to or above one, whereas
the product ratio of IDI1 at equilibrium is IPP:DMAPP = 1:2.2 (Street
et al., 1990).
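The ratio argument can be made explicit with a short calculation. The sketch below assumes the standard head-to-tail assembly of prenyl diphosphates (one DMAPP starter unit plus successive IPP additions, with squalene formed from two FPP molecules) and compares the IPP:DMAPP ratio each product consumes with the 1:2.2 equilibrium ratio quoted above. The dictionary keys and variable names are illustrative.

```python
from fractions import Fraction

# IPP and DMAPP units consumed per product molecule, assuming standard
# prenyl chain assembly: one DMAPP starter plus successive IPP additions.
UNITS = {
    "geraniol (C10, via GPP)": {"IPP": 1, "DMAPP": 1},
    "alpha-humulene (C15, via FPP)": {"IPP": 2, "DMAPP": 1},
    "squalene (C30, via 2 FPP)": {"IPP": 4, "DMAPP": 2},
}

# Equilibrium product ratio of the isomerase quoted in the text: IPP:DMAPP = 1:2.2
equilibrium_ratio = Fraction(10, 22)

required = {name: Fraction(u["IPP"], u["DMAPP"]) for name, u in UNITS.items()}

for name, ratio in required.items():
    print(f"{name}: needs IPP:DMAPP = {ratio}, "
          f"equilibrium supplies {float(equilibrium_ratio):.2f}")
```

Under these assumptions every product needs at least one IPP per DMAPP, while the isomerase at equilibrium supplies fewer than half an IPP per DMAPP, which is consistent with extending the geraniol-based conclusions to longer-chain terpenoids.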
We identified the medium expression of Erg12p as the ‘sweet spot’
for optimal terpenoid yield. Indeed, previous research showed that
mevalonate kinase is feedback inhibited by multiple terpenoid
intermediates, including mevalonate, IPP, DMAPP, farnesyl
pyrophosphate (FPP), geranyl pyrophosphate (GPP), and geranylgeranyl
pyrophosphate (GGPP) (Hinson et al., 1997; Chen et al., 2018; Fu et al.,
2008; Ma et al., 2011). A feedback-resistant mevalonate kinase from
archaea (Primak et al., 2011; Kazieva et al., 2017) may be used instead
of the native enzyme for further enhancement of the pathway
productivity. Further, our analysis of the top ten geraniol-producing strains
(Table 2) shows that the strongest combination, α1, expressing all seven
MVA pathway genes under strong promoters, indeed maximizes
geraniol production, but several pathway genes can be expressed with
relatively weaker promoters without significantly reducing the product
titer. Seven out of the top ten producers having at least four genes
expressed from medium or weak promoters produced comparable
Fig. 5. Production of α-humulene and squalene using the MVA platform strain. (A) Pathway for α-humulene and squalene production. ZSS1 encodes an α-humulene
synthase from Zingiber zerumbet; ERG9 encodes a squalene synthase in S. cerevisiae; ERG1 encodes a squalene epoxidase in S. cerevisiae. (B) Episomal constructs
express ERG20 and ZSS1 either separately or as a fusion protein with a ‘GSG’ linker. (C) α-Humulene production and growth (OD600) of the wild type (WT) and the
engineered MVA platform expressing ERG20 and ZSS1. (D) Episomal constructs express ERG20 and ERG9 separately or as a fusion gene with a GSG linker. (E)
Squalene production and growth (OD600) of WT and the engineered MVA platform with ERG20 and ERG9. Data represent the average ± SD of three independent
biological replicates.
geraniol titers to the top strain α1. It is to be noted that these conclusions
may only apply to the MVA pathway during the exponential phase of
growth.
The dual localization of the MVA pathway to both the cytosol and
peroxisomes significantly increased geraniol titers (Fig. 4), most likely
due to the high abundance of acetyl-CoA and NADPH in the peroxisomes
and cytosol, respectively. Interestingly, targeting the MVA pathway into
the peroxisome but the prenyltransferase and terpenoid synthase into
the cytosol yielded similar amounts of geraniol. The same observations
were made when switching the localization of the overexpressed MVA
pathway and the prenyltransferase and geraniol synthase. These results
indicate that IPP/DMAPP are diffusible across the peroxisome
membrane. Similarly, we have constructed strains MVAc-p and MVAp-c to
show that mevalonate can diffuse readily across peroxisome membranes
(Fig. S11). Since the peroxisome has a single membrane, small
molecules can travel across either passively or facilitated by transporters
(Antonenkov et al., 2009). Furthermore, multiple MVA enzymes have
been reported to be localized in peroxisomes of plants and animals
(Guirimand et al., 2012; Simkin et al., 2011; Breitling and Krisans, 2002;
Sapir-Mir et al., 2008), which also supports the diffusion of MVA
intermediates between peroxisomes and cytosol. The faster growth of the
engineered strains with the MVA pathway overexpressed is likely due to
the increased demand for acetyl-CoA, ATP, and NADPH, which results in
the accelerated turnover of sugar, lipids, and amino acids in the rich
media.
We used the dual localization strategy to create a platform strain as a
starting point for the production of terpenoids. Although plasmid-based
expression for peroxisomal localized genes resulted in a much higher
monoterpene production (Dusseaux et al., 2020), we focused on
genomic integration as it is known to be more stable than the
plasmid-based system (Ryan et al., 2014). Users only need to transfer a
plasmid carrying the particular prenyltransferase and terpenoid
synthase into the platform strain for the production of target terpenoids. To
demonstrate the versatility of our platform strain, we used it to produce
geraniol, α-humulene, and squalene as representatives of the three
classes of terpenes: mono-, sesqui-, and triterpenes. The highest titers in
shaking flask culture reported so far for geraniol, α-humulene, and
squalene are 523.96 mg/L (Jiang et al., 2017), 160 mg/L (Zhang et al.,
2020), and 1.3 g/L (Liu et al., 2020), respectively. These titers were
achieved by introducing compound-specific genetic modifications and
optimizing culturing conditions. We did not introduce any additional
compound-specific genomic modifications in the platform strain since
such modifications will narrow the product scope of the platform. As a
result, the terpene titers from the off-the-shelf usage of the platform
strain were not expected to be the highest. Thus, future
compound-specific genomic modifications hold promise to increase the
titers of a particular terpenoid. For example, genes such as ATF1 and
OYE2 may be deleted to increase geraniol titer by preventing its
metabolism (Brown et al., 2015). For increasing α-humulene and squalene
production, genes encoding non-specific phosphatases such as LPP1 and
DPP1 (Faulkner et al., 1999; Albertsen et al., 2011; Scalcinati et al.,
2012) may be deleted to prevent the diversion of farnesyl
pyrophosphate (FPP) to farnesol. Expressing ERG9 from a weak promoter (Zhang
et al., 2018) or tagging it for degradation (Zhang et al., 2020) can lead to
higher α-humulene accumulation. Expressing ERG1 under a weak
promoter (Liu et al., 2020) can improve the production of squalene.
5. Conclusions
This study elucidated the detailed contribution of the five non-rate-limiting
enzymes of the MVA pathway in S. cerevisiae by creating a
combinatorial yeast library. Analysis using machine learning algorithms
revealed the critical role of Erg12p in determining MVA pathway pro-
ductivity. A platform strain with dual localization of the MVA pathway
into both the cytosol and peroxisomes was created. This strain can be
leveraged to produce diverse terpenoids. The insights gained regarding
the contribution of individual MVA pathway enzymes and the MVA
yeast platform created will guide the future design and engineering to
produce high titers of any terpenoid.
Funding sources
This project was supported by the Research Foundation for the State
University of New York [71272] to Z. Q. Wang and the National Science
Foundation [CHE-1919594] to the University at Buffalo Chemistry Instrument
Center.
Declaration of competing interest
The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influence
the work reported in this paper.
Author contributions
Minakshi Mukherjee: Investigation, Methodology, Formal analysis,
Validation, Visualization, Writing - Original Draft, Writing - Review &
Editing. Rachael Hageman Blair: Software, Methodology, Formal
analysis, Visualization, Writing - Original Draft, Writing - Review &
Editing. Zhen Q. Wang: Conceptualization, Resources, Supervision,
Project administration, Funding acquisition, Writing - Original Draft,
Writing - Review & Editing.
Data availability
Data will be made available on request.
Acknowledgment
The authors are grateful to Dr. John Dueber for providing the raw
data for the relative promoter strengths of characterized yeast pro-
moters. We thank Dr. Sarah Walker for providing access to the Tecan
Spark microplate reader. We also thank Dr. Valerie Freichs for assistance
with developing the chromatography methods.
Appendix A. Supplementary data
Supplementary data to this article can be found online at
https://doi.org/10.1016/j.ymben.2022.10.004.
References
Albertsen, L., et al., 2011. Diversion of flux toward sesquiterpene production in
Saccharomyces cerevisiae by fusion of host and heterologous enzymes. Appl. Environ.
Microbiol. 77, 1033–1040.
Alemdar, S., et al., 2016. Heterologous expression, purification, and biochemical
characterization of alpha-Humulene Synthase from Zingiber zerumbet Smith. Appl.
Biochem. Biotechnol. 178, 474–489.
Alonso-Gutierrez, J., et al., 2015. Principal component analysis of proteomics (PCAP) as
a tool to direct metabolic engineering. Metab. Eng. 28, 123–133.
Anthony, J.R., et al., 2009. Optimization of the mevalonate-based isoprenoid
biosynthetic pathway in Escherichia coli for production of the anti-malarial drug
precursor amorpha-4,11-diene. Metab. Eng. 11, 13–19.
Antonenkov, V.D., Mindthoff, S., Grunau, S., Erdmann, R., Hiltunen, J.K., 2009. An
involvement of yeast peroxisomal channels in transmembrane transfer of glyoxylate
cycle intermediates. Int. J. Biochem. Cell Biol. 41, 2546–2554.
Belcher, M.S., Mahinthakumar, J., Keasling, J.D., 2020. New frontiers: harnessing pivotal
advances in microbial engineering for the biosynthesis of plant-derived terpenoids.
Curr. Opin. Biotechnol. 65, 88–93.
Breiman, L., 1996. Out-of-bag Estimation.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 2017. Classification and
Regression Trees. Routledge.
Breitling, R., Krisans, S.K., 2002. A second gene for peroxisomal HMG-CoA reductase? A
genomic reassessment. J. Lipid Res. 43, 2031–2036.
Brown, S., Clastre, M., Courdavault, V., O'Connor, S.E., 2015. De novo production of the
plant-derived alkaloid strictosidine in yeast. Proc. Natl. Acad. Sci. U. S. A. 112,
3205–3210.
Campbell, A., et al., 2016. Engineering of a nepetalactol-producing platform strain of
Saccharomyces cerevisiae for the production of plant seco-iridoids. ACS Synth. Biol. 5,
405414.
Chen, W., Viljoen, A.M., 2010. Geraniol a review of a commercially important
fragrance material. South Afr. J. Bot. 76, 643651.
Chen, Y., Daviet, L., Schalk, M., Siewers, V., Nielsen, J., 2013. Establishing a platform
cell factory through engineering of yeast acetyl-CoA metabolism. Metab. Eng. 15,
4854.
Chen, H., et al., 2018. Directed evolution of mevalonate kinase in Escherichia coli by
random mutagenesis for improved lycopene. RSC Adv. 8, 1502115028.
Christianson, D.W., 2017. Structural and chemical biology of terpenoid cyclases. Chem.
Rev. 117, 1157011648.
Cutler, D.R., et al., 2007. Random forests for classication in ecology. Ecology 88,
27832792.
Denby, C.M., et al., 2018. Industrial brewing yeast engineered for the production of
primary avor determinants in hopped beer. Nat. Commun. 9, 965.
Dusseaux, S., Wajn, W.T., Liu, Y., Ignea, C., Kampranis, S.C., 2020. Transforming yeast
peroxisomes into microfactories for the efcient production of high-value
isoprenoids. Proc. Natl. Acad. Sci. U. S. A. 117, 3178931799.
Efron, B., LePage, R., 1992. Introduction to Bootstrap. Wiley & Sons, New York.
Engels, B., Dahm, P., Jennewein, S., 2008. Metabolic engineering of taxadiene
biosynthesis in yeast as a rst step towards taxol (paclitaxel) production. Metab. Eng.
10, 201206.
Faulkner, A., et al., 1999. The LPP1 and DPP1 gene products account for most of the
isoprenoid phosphate phosphatase activities in Saccharomyces cerevisiae. J. Biol.
Chem. 274, 1483114837.
Fernandes, E.S., et al., 2007. Anti-inammatory effects of compounds alpha-humulene
and (-)-trans-caryophyllene isolated from the essential oil of Cordia verbenacea. Eur.
J. Pharmacol. 569, 228236.
Friedman J, H.T., Tibshirani, R., 2001. The Elements of Statistical Learning. Springer
series in statistics, New York.
Fu, Z., Voynova, N.E., Herdendorf, T.J., Miziorko, H.M., Kim, J.J., 2008. Biochemical
and structural basis for feedback inhibition of mevalonate kinase and isoprenoid
metabolism. Biochemistry 47, 37153724.
Garaiova, M., Zambojova, V., Simova, Z., Griac, P., Hapala, I., 2014. Squalene epoxidase
as a target for manipulation of squalene levels in the yeast Saccharomyces cerevisiae.
FEMS Yeast Res. 14, 310323.
Garcia, D.E., Keasling, J.D., 2014. Kinetics of phosphomevalonate kinase from
Saccharomyces cerevisiae. PLoS One 9, e87112.
Gerke, J., et al., 2020. Production of the fragrance geraniol in peroxisomes of a product-
tolerant bakers yeast. Front. Bioeng. Biotechnol. 8, 582052.
Gibson, D.G., 2011. Enzymatic assembly of overlapping DNA fragments. Methods
Enzymol. 498, 349361.
Gold, N.D., et al., 2015. Metabolic engineering of a tyrosine-overproducing yeast
platform using targeted metabolomics. Microb. Cell Factories 14, 73.
Goldstein, A., Kapelner, A., Bleich, J., Pitkin, E., 2015. Peeking inside the black box:
visualizing statistical learning with plots of individual conditional expectation.
J. Comput. Graph Stat. 24, 4465.
Greenwell, B.M., 2017. pdp: an R Package for constructing partial dependence plots. Rev.
Javer. 9, 421.
Guirimand, G., et al., 2012. A single gene encodes isopentenyl diphosphate isomerase
isoforms targeted to plastids, mitochondria and peroxisomes in Catharanthus roseus.
Plant Mol. Biol. 79, 443459.
Guo, X.J., et al., 2018. Metabolic engineering of Saccharomyces cerevisiae for 7-dehy-
drocholesterol overproduction. Biotechnol. Biofuels 11, 192.
Han, J.Y., Seo, S.H., Song, J.M., Lee, H., Choi, E.S., 2018. High-level recombinant
production of squalene using selected Saccharomyces cerevisiae strains. J. Ind.
Microbiol. Biotechnol. 45, 239251.
Hinson, D.D., Chambliss, K.L., Toth, M.J., Tanaka, R.D., Gibson, K.M., 1997. Post-
translational regulation of mevalonate kinase by intermediates of the cholesterol and
nonsterol isoprene biosynthetic pathways. J. Lipid Res. 38, 22162223.
Hu, Z., et al., 2020. Improve the production of D-limonene by regulating the mevalonate
pathway of Saccharomyces cerevisiae during alcoholic beverage fermentation. J. Ind.
Microbiol. Biotechnol. 47, 10831097.
Ignea, C., Pontini, M., Maffei, M.E., Makris, A.M., Kampranis, S.C., 2014. Engineering
monoterpene production in yeast using a synthetic dominant negative geranyl
diphosphate synthase. ACS Synth. Biol. 3, 298306.
Jiang, G.Z., et al., 2017. Manipulation of GES and ERG20 for geraniol overproduction in
Saccharomyces cerevisiae. Metab. Eng. 41, 5766.
Jiang, L., et al., 2021. Improved functional expression of cytochrome P450s in
Saccharomyces cerevisiae through screening a cDNA library from Arabidopsis thaliana.
Front. Bioeng. Biotechnol. 9, 764851.
Kazieva, E., et al., 2017. Characterization of feedback-resistant mevalonate kinases from
the methanogenic archaeons Methanosaeta concilii and Methanocella paludicola.
Microbiology (Read.) 163, 1283–1291.
Kwak, S., et al., 2020. Redirection of the glycolytic flux enhances isoprenoid production
in Saccharomyces cerevisiae. Biotechnol. J. 15, e1900173.
Lee, M.E., DeLoache, W.C., Cervantes, B., Dueber, J.E., 2015. A highly characterized
yeast toolkit for modular, multipart assembly. ACS Synth. Biol. 4, 975–986.
Li, T., et al., 2020. Metabolic Engineering of Saccharomyces cerevisiae to overproduce
squalene. J. Agric. Food Chem. 68, 2132–2138.
Lin, J.-L., Ekas, H., Markham, K., Alper, H.S., 2018. An enzyme-coupled assay enables
rapid protein engineering for geraniol production in yeast. Biochem. Eng. J. 139,
95–100.
Liu, G.S., et al., 2020. The yeast peroxisome: a dynamic storage depot and subcellular
factory for squalene overproduction. Metab. Eng. 57, 151–161.
Lv, X., et al., 2016. Dual regulation of cytoplasmic and mitochondrial acetyl-CoA
utilization for improved isoprene production in Saccharomyces cerevisiae. Nat.
Commun. 7, 12851.
Ma, S.M., et al., 2011. Optimization of a heterologous mevalonate pathway through the
use of variant HMG-CoA reductases. Metab. Eng. 13, 588–597.
Madsen, K.M., et al., 2011. Linking genotype and phenotype of Saccharomyces cerevisiae
strains reveals metabolic engineering targets and leads to triterpene hyper-
producers. PLoS One 6, e14763.
McClory, J., Lin, J.T., Timson, D.J., Zhang, J., Huang, M., 2019. Catalytic mechanism of
mevalonate kinase revisited, a QM/MM study. Org. Biomol. Chem. 17, 2423–2431.
Mukherjee, M., Caroll, E., Wang, Z.Q., 2021. Rapid assembly of multi-gene constructs
using modular Golden Gate cloning. JoVE 168, e61993.
Navale, G.R., Dharne, M.S., Shinde, S.S., 2021. Metabolic engineering and synthetic
biology for isoprenoid production in Escherichia coli and Saccharomyces cerevisiae.
Appl. Microbiol. Biotechnol. 105, 457–475.
Nelson, D.L., Cox, M.M., 2004. Lehninger Principles of Biochemistry.
Nielsen, J., 2015. Bioengineering. Yeast cell factories on the horizon. Science 349,
1050–1051.
Orr-Weaver, T.L., Szostak, J.W., Rothstein, R.J., 1981. Yeast transformation: a model
system for the study of recombination. Proc. Natl. Acad. Sci. U. S. A. 78, 6354–6358.
Peng, B., Williams, T.C., Henry, M., Nielsen, L.K., Vickers, C.E., 2015. Controlling
heterologous gene expression in yeast cell factories on different carbon substrates
and across the diauxic shift: a comparison of yeast promoter activities. Microb. Cell
Factories 14, 91.
Peng, B., et al., 2017. A squalene synthase protein degradation method for improved
sesquiterpene production in Saccharomyces cerevisiae. Metab. Eng. 39, 209–219.
Pojer, F., et al., 2006. Structural basis for the design of potent and species-specific
inhibitors of 3-hydroxy-3-methylglutaryl CoA synthases. Proc. Natl. Acad. Sci. U. S.
A. 103, 11491–11496.
Popa, O., Babeanu, N.E., Popa, I., Nita, S., Dinu-Parvu, C.E., 2015. Methods for obtaining
and determination of squalene from natural sources. BioMed Res. Int. 2015, 367202.
Primak, Y.A., et al., 2011. Characterization of a feedback-resistant mevalonate kinase
from the archaeon Methanosarcina mazei. Appl. Environ. Microbiol. 77, 7772–7778.
Pyne, M.E., et al., 2020. A yeast platform for high-level synthesis of
tetrahydroisoquinoline alkaloids. Nat. Commun. 11, 3337.
Redding-Johanson, A.M., et al., 2011. Targeted proteomics for metabolic pathway
optimization: application to terpene production. Metab. Eng. 13, 194–203.
Ro, D.K., et al., 2006. Production of the antimalarial drug precursor artemisinic acid in
engineered yeast. Nature 440, 940–943.
Rodriguez, A., Kildegaard, K.R., Li, M., Borodina, I., Nielsen, J., 2015. Establishment of a
yeast platform strain for production of p-coumaric acid through metabolic
engineering of aromatic amino acid biosynthesis. Metab. Eng. 31, 181–188.
Ryan, O.W., et al., 2014. Selection of chromosomal DNA libraries using a multiplex
CRISPR system. Elife 3, e03703.
Sapir-Mir, M., et al., 2008. Peroxisomal localization of Arabidopsis isopentenyl
diphosphate isomerases suggests that part of the plant isoprenoid mevalonic acid
pathway is compartmentalized to peroxisomes. Plant Physiol. 148, 1219–1228.
Sauro, H.M., 2017. Control and regulation of pathways via negative feedback. J. R. Soc.
Interface 14.
Scalcinati, G., et al., 2012. Dynamic control of gene expression in Saccharomyces
cerevisiae engineered for the production of plant sesquitepene alpha-santalene in a
fed-batch mode. Metab. Eng. 14, 91–103.
Sibirny, A.A., 2016. Yeast peroxisomes: structure, functions and biotechnological
opportunities. FEMS Yeast Res. 16.
Simkin, A.J., et al., 2011. Peroxisomal localisation of the final steps of the mevalonic acid
pathway in planta. Planta 234, 903–914.
Street, I.P., Christensen, D.J., Poulter, C.D., 1990. Hydrogen exchange during the
enzyme-catalyzed isomerization of isopentenyl diphosphate and dimethylallyl
diphosphate. J. Am. Chem. Soc. 112, 8577–8578.
Trikka, F.A., et al., 2015. Iterative carotenogenic screens identify combinations of yeast
gene deletions that enhance sclareol production. Microb. Cell Factories 14, 60.
Verwaal, R., et al., 2007. High-level production of beta-carotene in Saccharomyces
cerevisiae by successive transformation with carotenogenic genes from
Xanthophyllomyces dendrorhous. Appl. Environ. Microbiol. 73, 4342–4350.
Vickers, C.E., Bydder, S.F., Zhou, Y., Nielsen, L.K., 2013. Dual gene expression cassette
vectors with antibiotic selection markers for engineering in Saccharomyces cerevisiae.
Microb. Cell Factories 12, 96.
Wang, X., et al., 2021. Engineering Escherichia coli for production of geraniol by
systematic synthetic biology approaches and laboratory-evolved fusion tags. Metab.
Eng. 66, 60–67.
Westfall, P.J., et al., 2012. Production of amorphadiene in yeast, and its conversion to
dihydroartemisinic acid, precursor to the antimalarial agent artemisinin. Proc. Natl.
Acad. Sci. U. S. A. 109, E111–E118.
Xie, W., Lv, X., Ye, L., Zhou, P., Yu, H., 2015. Construction of lycopene-overproducing
Saccharomyces cerevisiae by combining directed evolution and metabolic
engineering. Metab. Eng. 30, 69–78.
Yao, Z., et al., 2018. Enhanced isoprene production by reconstruction of metabolic
balance between strengthened precursor supply and improved isoprene synthase in
Saccharomyces cerevisiae. ACS Synth. Biol. 7, 2308–2316.
Yee, D.A., et al., 2019. Engineered mitochondrial production of monoterpenes in
Saccharomyces cerevisiae. Metab. Eng. 55, 76–84.
Yuan, J., Ching, C.B., 2014. Combinatorial engineering of mevalonate pathway for
improved amorpha-4,11-diene production in budding yeast. Biotechnol. Bioeng.
111, 608–617.
Zhang, C., et al., 2018. Production of sesquiterpenoid zerumbone from metabolic
engineered Saccharomyces cerevisiae. Metab. Eng. 49, 28–35.
Zhang, C., Li, M., Zhao, G.R., Lu, W., 2020. Harnessing yeast peroxisomes and cytosol
acetyl-CoA for sesquiterpene alpha-humulene production. J. Agric. Food Chem. 68,
1382–1389.
Zhao, J., et al., 2017. Dynamic control of ERG20 expression combined with minimized
endogenous downstream metabolism contributes to the improvement of geraniol
production in Saccharomyces cerevisiae. Microb. Cell Factories 16, 17.
Zhou, Y.J., et al., 2012. Modular pathway engineering of diterpenoid synthases and the
mevalonic acid pathway for miltiradiene production. J. Am. Chem. Soc. 134,
3234–3241.
Zhou, P., et al., 2018. Crystal structure of cytoplasmic acetoacetyl-CoA thiolase from
Saccharomyces cerevisiae. Acta Crystallogr F Struct Biol Commun 74, 6–13.
... In particular, it is critical to carefully select promoters for gene regulation during pathway optimization. Through evaluation of a combinatorial expression cassette library for mevalonate pathway genes, the medium-strength expression of ERG12 was found to be beneficial for product biosynthesis in S. cerevisiae [17]. Random combination of promoters for optimizing the expression of pathway genes enabled enhanced glucose consumption [18], and increased production of patchoulol [19] and carotenoid [20] in S. cerevisiae. ...
... The increased efficiency in glucose utilization under fed-batch fermentation also led to the higher yields. However, we here modulated only a single gene in the EMP pathway by using several promoters, and further comprehensive optimization of pathway genes by creating a combinatorial library [17,18] should be beneficial for improving bioproduction. ...
Precisely controlling gene expression is beneficial for optimizing biosynthetic pathways and improving production. However, promoters in nonconventional yeasts such as Ogataea polymorpha are limited, which results in incompatible gene modulation. Here, we expanded the promoter library in O. polymorpha based on transcriptional data, among which 13 constitutive promoters had strengths ranging from 0 to 55% of that of PGAP, the commonly used strong constitutive promoter, and 2 were growth phase-dependent promoters. Subsequently, 2 hybrid growth phase-dependent promoters were constructed and characterized, which had 2-fold higher activities. Finally, promoter engineering was applied to precisely regulate cellular metabolism for efficient production of β-elemene. The glyceraldehyde-3-phosphate dehydrogenase gene GAP was downregulated to drive more flux into the pentose phosphate pathway (PPP) and then to enhance the supply of acetyl-CoA by using the phosphoketolase-phosphotransacetylase (PK-PTA) pathway. Coupled with the phase-dependent expression of the synthase module (ERG20∼LsLTC2 fusion), the highest titer of 5.24 g/L with a yield of 0.037 g/(g glucose) was achieved in strain YY150U under fed-batch fermentation in shake flasks. This work characterized and engineered a series of promoters, which can be used to fine-tune genes for constructing efficient yeast cell factories.
... Overexpression of the key rate-limiting enzyme tHMGR in peroxisomes resulted in an 11.8% increase in cis-trans nepetalactol titer (NEPC6P1), and the incorporation of peroxisomal ERG19 and ERG8 (NEPC6P2) led to an additional 26.2% improvement (Fig. 6B). This demonstrated the transport of certain MVA pathway intermediates across the peroxisomal membranes, such as mevalonate-5-phosphate, mevalonate-5-diphosphate, and GPP (Mukherjee et al., 2022). Unfortunately, the NEPC4P4 strain, possessing the entire peroxisomal MVA and geraniol biosynthetic pathway, resulted in a decrease in cis-trans nepetalactol accumulation, highlighting challenges associated with geraniol excess and low conversion efficiency of the downstream pathway (Fig. 6B). ...
... Thus, the application of ML holds great promise in evaluating strain design strategies. For example, to investigate the individual contributions of five non-rate-limiting enzymes in the MVA pathway, a combinatorial library of 243 S. cerevisiae strains was created, each with an additional copy of the MVA pathway integrated into the genome and expressing the non-rate-limiting enzymes through a unique combination of promoters [85]. Through high-throughput screening combined with an ML algorithm, it was revealed that Erg12p was the key enzyme affecting the titer of the product. ...
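The library size quoted in the excerpt above is consistent with simple combinatorics: 243 = 3^5, i.e. five genes each driven by one of three promoter options. The following is a minimal sketch of that enumeration, assuming three promoter strength tiers per gene (an inference from the count, not a detail stated in the excerpt) and using placeholder gene labels rather than the gene identities used in the study.

```python
from itertools import product

# Hypothetical labels for illustration only: five gene slots and three
# promoter strength tiers. Neither the gene names nor the tier names are
# taken from the study; only the count (3**5 = 243) is.
GENES = ["gene_1", "gene_2", "gene_3", "gene_4", "gene_5"]
PROMOTER_TIERS = ["weak", "medium", "strong"]


def enumerate_library(genes, tiers):
    """Return every assignment of exactly one promoter tier to each gene."""
    return [dict(zip(genes, combo)) for combo in product(tiers, repeat=len(genes))]


library = enumerate_library(GENES, PROMOTER_TIERS)
print(len(library))  # number of distinct promoter combinations
```

Each dictionary in `library` stands for one strain design; a screening data set could attach a measured titer to each entry before fitting a model, although that step is outside this sketch.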
... Furthermore, the development of machine learning has facilitated the elucidation of complicated biosynthetic pathways. The construction of a novel model based on machine learning algorithms (Mukherjee, Blair, & Wang, 2022) by comprehensive analysis of multiple omics data from medicinal plants is probably an effective method to improve the accuracy of gene function and metabolic pathway prediction. ...
... The mevalonic acid (MVA) pathway is located in the cytoplasm, and the 1-deoxy-D-xylulose 5-phosphate (DXP)/methylerythritol phosphate (MEP) pathway is located in the plastid. In the MVA pathway, there are seven rate-limiting enzymes, including acetyl-CoA C-acetyltransferase (AACT), 3-hydroxy-3-methylglutaryl-CoA synthetase (HMGS), 3-hydroxy-3-methylglutaryl coenzyme-A reductase (HMGR), phosphomevalonate kinase (PMK), geranyl diphosphate synthase (GPPS), farnesyl diphosphate synthase (FPPS) and terpene synthases (TPS) (Mukherjee et al., 2022). Moreover, a total of seven key rate-limiting enzymes participate in the MEP pathway, namely deoxy-D-xylulose 5-phosphate synthase (DXS), DXP reducto-isomerase (DXR), 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase (MCT), 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase (HDS), geranyl geranyl pyrophosphate synthase (GGPPS), GPPS and TPS (Wan et al., 2021). ...
Volatile compounds including terpenes, aldehydes, phenols, and alcohols contribute significantly to the floral and fruity aromas of the Muscat variety. ‘Ruidu Hongyu’ grapevine is one of the newly developed grape varieties, and cultivation of this variety has been extended across China due to its unique quality traits and taste. In this study, HS-SPME/GC−MS and transcriptome sequencing analysis were performed to evaluate the impact of exogenous 2,4-epibrassinolide (EBR), jasmonic acid (JA), and their signaling inhibitors brassinazole (Brz)/sodium diethyldithiocarbamate (DIECA) on the biosynthesis of aroma substances in ‘Ruidu Hongyu’ grapevine. According to the results, exogenous BR and JA promoted the accumulation of various aroma substances, including hexenal, 2-hexenal, nerol oxide, vanillin, hotrienol, terpineol, neral, nerol, geraniol, and geranic acid. After EBR and JA treatments, most of the genes responsible for terpene, aldehyde, and alcohol biosynthesis were expressed at a higher level than in the CK group. EBR treatment could not only promote endogenous BR biosynthesis and metabolism but also elevate BR signal transduction. JA treatment contributed to endogenous JA and MeJA accumulation as well. Through transcriptome sequencing, a total of 3043, 903, 1470, and 607 DEGs were identified in JA vs. JD, JA vs. CK, BR vs. CK, and BR vs. Brz, respectively. There were more DEGs under both EBR and JA treatments at late fruit ripening stages. The findings of this study increase our understanding of aroma substance biosynthesis and endogenous BR/JA metabolism in response to exogenous EBR and JA signals.
α-humulene, a sesquiterpene found in essential oils of various plant species, has garnered interest due to its potential therapeutic applications. This scoping review aims to consolidate α-humulene's evidence base, informing clinical translation and guiding future research directions. A scoping review was conducted of EMBASE, MEDLINE and PubMed databases up to 14th July 2023. All studies describing original research on α-humulene extraction, pre-clinical and clinical research were included for review. Three hundred and forty articles were analyzed. α-humulene yields ranged from negligible to 60.90% across plant species. In vitro experiments demonstrated cytotoxicity against adenocarcinomas (such as colorectal, pulmonary, breast, prostatic, lung, and ovarian), with varying responses in other cell models. Mechanistic insights revealed its involvement in mitochondrial dysfunction, diminished intracellular glutathione levels, and the induction of oxidative stress. In rodent studies, oral administration of α-humulene at 50 mg/kg reduced inflammation markers in paw edema and ovalbumin-induced airway inflammation. Intraperitoneal administration of α-humulene (50-200 mg/kg) exhibited cannabimimetic properties through cannabinoid 1 and adenosine A2a receptors. α-humulene also exhibited a multitude of properties with potential scope for therapeutic utilization. However, there is a paucity of studies which have successfully translated this research into clinical populations with the associated disease. Potential barriers to clinical translation were identified, including yield variability, limited isolation studies, and challenges associated with terpene bioavailability. Consequently, rigorous pharmacokinetic studies and further mechanistic investigations are warranted to effectively uncover the potential of α-humulene.
Cytochrome P450 enzymes (P450s) are a superfamily of heme-thiolate proteins that exist widely in various organisms and play a key role in the metabolic network and secondary metabolism. However, low expression levels and activities have become the biggest challenge for P450 studies. To improve the functional expression of P450s in Saccharomyces cerevisiae, an Arabidopsis thaliana cDNA library was expressed in a betaxanthin-producing yeast strain, which functioned as a biosensor for high-throughput screening. Three new target genes, AtGRP7, AtMSBP1, and AtCOL4, were identified to improve the functional expression of CYP76AD1 in yeast, increasing the accumulation of betaxanthin by 1.32-, 1.86-, and 1.10-fold, respectively. In addition, these three targets worked synergistically/additively to improve the production of betaxanthin, representing a total of 2.36-fold improvement when compared with the parent strain. More importantly, these genes were also found to effectively increase the activity of another P450 enzyme (CYP736A167), which catalyzes the hydroxylation of α-santalene to produce Z-α-santalol. Simultaneous overexpression of AtGRP7, AtMSBP1, and AtCOL4 increased the α-santalene to Z-α-santalol conversion rate by more than 2.97-fold. The present study reports a novel strategy to improve the functional expression of P450s in S. cerevisiae and promises the construction of platform yeast strains for the production of natural products.
Geraniol is a valuable monoterpene extensively used in the fragrance, food, and cosmetic industries. Increasing environmental concerns and supply gaps have motivated efforts to advance the microbial production of geraniol from renewable feedstocks. In this study, we first constructed a platform geraniol Escherichia coli strain by bioprospecting the key enzymes geranyl diphosphate synthase (GPPS) and geraniol synthase (GES) and selection of a host cell background. This strategy led to a 46.4-fold increase in geraniol titer to 964.3 mg/L. We propose that the expression level of eukaryotic GES can be further optimized through fusion tag evolution engineering. To this end, we manipulated GES to maximize flux towards the targeted product geraniol from precursor geranyl diphosphate (GPP) via the utilization of fusion tags. Additionally, we developed a high-throughput screening system to monitor fusion tag variants. This common plug-and-play toolbox proved to be a robust approach for systematic modulation of protein expression and can be used to tune biosynthetic metabolic pathways. Finally, by combining a modified E1* fusion tag, we achieved 2124.1 mg/L of geraniol in shake flask cultures, which reached 27.2% of the maximum theoretical yield and was the highest titer ever reported. We propose that this strategy has set a good reference for enhancing a broader range of terpenoid production in microbial cell factories, which might open new possibilities for the bio-production of other valuable chemicals.
The Golden Gate cloning method enables the rapid assembly of multiple genes in any user-defined arrangement. It utilizes type IIS restriction enzymes that cut outside of their recognition sites and create a short overhang. This modular cloning (MoClo) system uses a hierarchical workflow in which different DNA parts, such as promoters, coding sequences (CDS), and terminators, are first cloned into an entry vector. Multiple entry vectors then assemble into transcription units. Several transcription units then connect into a multi-gene plasmid. The Golden Gate cloning strategy is of tremendous advantage because it allows scar-less, directional, and modular assembly in a one-pot reaction. The hierarchical workflow typically enables the facile cloning of a large variety of multi-gene constructs with no need for sequencing beyond entry vectors. The use of fluorescent protein dropouts enables easy visual screening. This work provides a detailed, step-by-step protocol for assembling multi-gene plasmids using the yeast modular cloning (MoClo) kit. We show optimal and suboptimal results of multi-gene plasmid assembly and provide a guide for screening for colonies. This cloning strategy is highly applicable for yeast metabolic engineering and other situations in which multi-gene plasmid cloning is required.
Isoprenoids, often called terpenoids, are the most abundant and highly diverse family of natural organic compounds. In plants, they play a distinct role in the form of photosynthetic pigments, hormones, electron carrier, structural components of membrane, and defence. Many isoprenoids have useful applications in the pharmaceutical, nutraceutical, and chemical industries. They are synthesized by various isoprenoid synthase enzymes by several consecutive steps. Recent advancement in metabolic engineering and synthetic biology has enabled the production of these isoprenoids in the heterologous host systems like Escherichia coli and Saccharomyces cerevisiae. Both heterologous systems have been engineered for large-scale production of value-added isoprenoids. This review article will provide the detailed description of various approaches used for engineering of methyl-d-erythritol-4-phosphate (MEP) and mevalonate (MVA) pathway for synthesizing isoprene units (C5) and ultimate production of diverse isoprenoids. The review particularly highlighted the efforts taken for the production of C5–C20 isoprenoids by metabolic engineering techniques in E. coli and S. cerevisiae over a decade. The challenges and strategies are also discussed in detail for scale-up and engineering of isoprenoids in the heterologous host systems.
Key points
• Isoprenoids are beneficial and valuable natural products.
• E. coli and S. cerevisiae are the promising host for isoprenoid biosynthesis.
• Emerging techniques in synthetic biology enabled the improved production.
• Need to expand the catalogue and scale-up of un-engineered isoprenoids.
Monoterpenoids, such as the plant metabolite geraniol, are of high industrial relevance since they are important fragrance materials for perfumes, cosmetics, and household products. Chemical synthesis or extraction from plant material for industry purposes are complex, environmentally harmful or expensive and depend on seasonal variations. Heterologous microbial production offers a cost-efficient and sustainable alternative but suffers from low metabolic flux of the precursors and toxicity of the monoterpenoid to the cells. In this study, we evaluated two approaches to counteract both issues by compartmentalizing the biosynthetic enzymes for geraniol to the peroxisomes of Saccharomyces cerevisiae as production sites and by improving the geraniol tolerance of the yeast cells. The combination of both approaches led to an 80% increase in the geraniol titers. In the future, the inclusion of product tolerance and peroxisomal compartmentalization into the general chassis engineering toolbox for monoterpenoids or other host-damaging, industrially relevant metabolites may lead to an efficient, low-cost, and eco-friendly microbial production for industrial purposes.
The tetrahydroisoquinoline (THIQ) moiety is a privileged substructure of many bioactive natural products and semi-synthetic analogs. Plants manufacture more than 3,000 THIQ alkaloids, including the opioids morphine and codeine. While microbial species have been engineered to synthesize a few compounds from the benzylisoquinoline alkaloid (BIA) family of THIQs, low product titers impede industrial viability and limit access to the full chemical space. Here we report a yeast THIQ platform by increasing production of the central BIA intermediate (S)-reticuline to 4.6 g L−1, a 57,000-fold improvement over our first-generation strain. We show that gains in BIA output coincide with the formation of several substituted THIQs derived from amino acid catabolism. We use these insights to repurpose the Ehrlich pathway and synthesize an array of THIQ structures. This work provides a blueprint for building diverse alkaloid scaffolds and enables the targeted overproduction of thousands of THIQ products, including natural and semi-synthetic opioids. Plants synthesize more than 3000 tetrahydroisoquinoline (THIQ) alkaloids, but only a few of them have been produced by engineered microbes and titers are very low. Here, the authors increase (S)-reticuline titer to 4.6 g/L and repurpose the yeast Ehrlich pathway to synthesize a diverse array of THIQ scaffolds.
Significance: Monoterpenoids, monoterpene indole alkaloids, and cannabinoids are highly valued for their fragrant and therapeutic properties, but sourcing them from nature or deriving them from petrochemicals is no longer sustainable. However, sustainable production of these compounds in engineered microorganisms is mostly hampered by the limited availability of the main building block in their biosynthesis, geranyl diphosphate. Here, we overcome this challenge by engineering yeast peroxisomes as geranyl diphosphate-synthesizing microfactories and unlock the potential of yeast to produce a wide range of high-value isoprenoids. Conceptually, in this work we develop peroxisomes as synthetic biology devices that can be used for the modular assembly and optimization of complex pathways, adding an extra level of hierarchical abstraction in the systematic engineering of cell factories.
d-Limonene, a cyclized monoterpene, possesses a citrus-like olfactory property and multiple physiological functions, and can be used as a bioactive compound and flavor to improve the overall quality of alcoholic beverages. In our previous study, we established an orthogonal pathway of d-limonene synthesis by introducing neryl diphosphate synthase 1 (tNDPS1) and d-limonene synthase (tLS) in Saccharomyces cerevisiae. To further increase d-limonene formation, the metabolic flux of the mevalonate (MVA) pathway was enhanced by overexpressing the key genes tHMGR1, ERG12, IDI1, and IDI1WWW individually or in combination. The results showed that strengthening the MVA pathway significantly improved d-limonene production, and the best strain yielded 62.31 mg/L d-limonene by co-expressing the tHMGR1, ERG12, and IDI1WWW genes in alcoholic beverages. Furthermore, we also studied the effect of enhancing the MVA pathway on the growth and fermentation of engineered yeasts during alcoholic beverage fermentation. In addition, to address the problem of yeast growth inhibition, we separately investigated transporter proteins of the high-yielding d-limonene yeasts and the parental strain under the stress of different d-limonene concentrations; the results suggested that the transporters Aus1p, Pdr18p, Pdr5p, Pdr3p, Pdr11p, Pdr15p, Tpo1p, and Ste6p might play a more critical role in alleviating cytotoxicity and improving the tolerance to d-limonene. Finally, we verified the functions of three transporter proteins, finding that the transporter Aus1p failed to transport d-limonene, while the others (Pdr5p and Pdr15p) could improve the tolerance of yeast to d-limonene. This study provided a valuable platform for the biosynthesis of other monoterpenes in yeast during alcoholic beverage fermentation.
Terpenoids are a vast and diverse class of molecules with industrial and medicinal importance. The majority of these molecules are produced across kingdom Plantae via specialized metabolism. Microorganisms, mainly Escherichia coli and Saccharomyces cerevisiae, have become choice platforms for the biosynthesis of terpenoids due to recent advances in synthetic biology and metabolic engineering. New techniques for gene discovery have expanded our search space for novel terpene synthesis pathways and unlocked unrealized potential for the microbial production of more complex derivatives. Additionally, numerous advances in host and pathway engineering have allowed for the production of terpenoids requiring oxidation and glycosylation, effectively expanding the potential target space. These advances will lay the foundation for the microbial biosynthesis of a seemingly infinite domain of terpenoids with varying applications.