Genome mining as a biotechnological tool for the discovery of novel biosynthetic genes in lichens

Garima Singh1,2,*, Francesco Dal Grande1,2,3, Imke Schmitt1,2,4

1 Senckenberg Biodiversity and Climate Research Centre (SBiK-F), 60325 Frankfurt am Main, Germany
2 LOEWE Center for Translational Biodiversity Genomics (TBG), 60325 Frankfurt am Main, Germany
3 Department of Biology, University of Padova, Via U. Bassi, 58/B, 35121 Padova, Italy
4 Institute of Ecology, Diversity and Evolution, Goethe University, Frankfurt am Main, Germany

Emails:
Garima Singh: garima.singh@senckenberg.de, gsingh458@gmail.com
Francesco Dal Grande: francesco.dalgrande@unipd.it
Imke Schmitt: imke.schmitt@senckenberg.de

*Corresponding author: Garima Singh

bioRxiv preprint doi: https://doi.org/10.1101/2022.05.04.490581; this version posted May 4, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.

Abstract
The ever-increasing demand for novel drugs highlights the need for bioprospecting unexplored taxa for their biosynthetic potential. Lichen-forming fungi (LFF) are a rich source of natural products (NPs), but their use in the pharmaceutical industry is limited, mostly because the genes corresponding to the majority of their natural products are unknown. Furthermore, it is not known to what extent these genes encode structurally novel molecules. Advances in next-generation sequencing technologies have expanded the range of organisms that can be exploited for their biosynthetic potential. In this study, we mine the genomes of nine lichen-forming fungal species of the genus Umbilicaria for biosynthetic gene clusters (BGCs), and categorize the BGCs as “associated product structurally known” or “associated product putatively novel”. We found that about 25-30% of the biosynthetic genes are divergent when compared to the global database of BGCs, which comprises about 1,200,000 characterized biosynthetic genes from plants, bacteria and fungi. Out of 217 total BGCs, 43 were only distantly related to known BGCs, suggesting they encode structurally and functionally unknown natural products. Clusters encoding the putatively novel metabolic diversity comprise PKSs (30), NRPSs (12) and terpenes (1). Our study emphasizes the utility of genomic data in bioprospecting microorganisms for their biosynthetic potential and in advancing the industrial application of unexplored taxa. We highlight the untapped structural metabolic diversity encoded in lichenized fungal genomes. To the best of our knowledge, this is the first investigation identifying genes coding for NPs with potentially novel therapeutic properties in LFF.

Key words
Natural products, fungi, biosynthetic genes, lichen-forming fungi, secondary metabolites, drug discovery, medicinal fungi, BiG-FAM, BiG-SLiCE

Background
Natural products (NPs) are small molecules produced by living organisms. Historically, NPs have played a key role in drug discovery due to their broad pharmacological effects, including antimicrobial, antitumor and anti-inflammatory properties, as well as activity against cardiovascular diseases [1,2]. In the past decades, about 70% of drugs were based on NPs or NP analogs [1,2]. The demand for novel drugs, however, is ever increasing due to the emergence of antibiotic-resistant pathogens, the rise of new diseases, the existence of diseases for which no efficient treatments are available yet, and the need to replace drugs because of toxicity or severe side effects [3,4]. One way to address global health threats and to accelerate NP-based drug discovery efforts is bioprospecting unexplored taxa to assess their biosynthetic potential and identify potentially novel drug leads.

Genes involved in the synthesis of an NP are often grouped together in biosynthetic gene clusters (BGCs) [5–7]. These clusters have a core gene, which codes for the backbone structure of the NP, and other genes, which may be involved in the modification of the backbone or may have a regulatory or transport-related function [5,8–10]. Depending on the core gene, BGCs can be grouped into the following major classes: non-ribosomal peptide synthetases (NRPS), polyketide synthases (PKS), NRPS-PKS (hybrid non-ribosomal peptide synthetase-polyketide synthase), terpenes, and RiPPs (ribosomally synthesized and post-translationally modified peptides). Conserved motifs, especially of the PKS genes, facilitate the bioinformatic detection of the clusters [11–14].

Traditionally, a large portion of NP-based drugs has been contributed by a few organisms, as drug discovery was mostly restricted to culturable organisms [15–17]. In the last decades, bioinformatic prediction of biosynthetic genes or biosynthetic gene clusters (groups of two or more genes that are clustered together and are involved in the production of a secondary metabolite) has revolutionized NP-based drug discovery, as this process is culture-independent and enables rapid identification of the entire biosynthetic landscape of so far unexplored NP resources, including silent or unexpressed genes. Two tools have been vital to the bioinformatic approach to drug discovery: antiSMASH [18] and MIBiG [19]. antiSMASH includes one of the largest BGC databases for BGC prediction [18], whereas MIBiG (Minimum Information about a Biosynthetic Gene Cluster) is a data repository allowing functional interpretation of target BGCs by comparison with BGCs with known functions [19]. Recently, efforts have been made to cluster homologous BGCs into gene cluster families (GCFs) and to simultaneously identify novel BGCs [20,21]. Two tools have been introduced to cluster BGCs into GCFs. BiG-FAM clusters structurally and functionally related BGCs into GCFs and identifies the structurally most diverse BGCs by comparing the query BGCs to about 1,200,000 BGCs of the BiG-FAM database [21]. BiG-SLiCE clusters homologous BGCs of a dataset into GCFs without reference to an external database, to identify the unique BGCs in it [20]. Bioinformatic prediction and clustering of BGCs allow rapid identification of potentially novel drug leads, reducing the costs and time associated with drug discovery by early elimination of unlikely candidates.

Lichens, symbiotic organisms composed of fungal and photosynthetic partners (green algae or cyanobacteria, or both), have been described as treasure chests of biosynthetic genes and NPs [22–24]. Although the number of identified NPs per LFF is typically less than 5 [25], the number of BGCs in the genomes of LFF may range from 25 to 60 [12]. It is not well known how BGCs from LFF relate in structure and function to BGCs from bacteria and non-lichenized fungi, i.e., whether a portion of the BGC landscape of LFF is distinct and might serve as a source of NPs with novel therapeutic properties. Difficulties associated with heterologous expression of LFF genes have so far restricted the application of LFF-derived NPs in industry. Recently, two biosynthetic genes from LFF have been successfully heterologously expressed [9,26]. This, combined with advances in long-read sequencing technology (higher genome quality) and the low cost of sequencing, provides a promising way forward to discover LFF-derived NPs with pharmacological potential.

Here we employ a long-read sequencing based comparative genomics and genome mining approach to estimate the BGC functional diversity of nine species of the lichenized fungal genus Umbilicaria. Specifically, we aim to answer the following questions: (1) What is the functional diversity of BGCs in Umbilicaria? and (2) What is the percentage of novel BGCs and species-specific BGCs in Umbilicaria?

Results
Overview of BGCs in the Umbilicaria genomes
Umbilicaria genomes contain 19-33 BGCs, with the highest number of BGCs detected in U. deusta and the lowest in U. phaea (Fig. 1A). We did not observe a correlation between genome size and number of BGCs (correlation coefficient = 0.10). Umbilicaria species contain an average of 13 PKS clusters and 4.2 NRPS clusters per species (Fig. 1B), giving a PKS-to-NRPS cluster ratio of about 3.1. The dominant class of BGCs in Umbilicaria is PKS clusters, which account for more than 50% of the total BGCs, followed by terpene clusters (about 19%) and NRPS clusters (about 17%) (Fig. 2A). In contrast, NRPSs are the dominant class among fungal and bacterial BGCs (Fig. 2B, C), amounting to about 42% and 30%, respectively.

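The class shares quoted in this overview can be cross-checked from the reported averages. The following is a minimal sketch, not part of the authors' pipeline; it assumes only the figures stated in the text (9 species, 217 BGCs, on average 13 PKS and 4.2 NRPS clusters per species), and the variable names are illustrative. Because the averages are rounded, the derived shares are approximate.

```python
# Figures taken from the text; the per-species averages are rounded,
# so the derived shares are approximations only.
n_species = 9
total_bgcs = 217
mean_pks = 13.0    # average PKS clusters per species
mean_nrps = 4.2    # average NRPS clusters per species

# Ratio of PKS to NRPS clusters
pks_to_nrps = mean_pks / mean_nrps

# Approximate share of each class among all BGCs, in percent
pks_share = 100 * mean_pks * n_species / total_bgcs
nrps_share = 100 * mean_nrps * n_species / total_bgcs

print(f"PKS:NRPS ratio  {pks_to_nrps:.1f}")
print(f"PKS share       {pks_share:.1f} %")
print(f"NRPS share      {nrps_share:.1f} %")
```

The PKS share derived this way is consistent with the statement that PKS clusters make up more than half of all BGCs in the genus.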
Fig 1. Genome quality metrics and diversity of biosynthetic genes in nine species of Umbilicaria. A) Genome metrics including the total number of biosynthetic gene clusters as predicted by antiSMASH, and number of genes and proteins estimated by InterProScan and SignalP as implemented in the funannotate pipeline. B) Diversity of biosynthetic gene clusters associated with major natural product categories, indicated as percentages (colored bars) and absolute numbers (numbers on bars).

Fig 2. Biosynthetic gene clusters in A) Umbilicaria, B) the full fungal BGC dataset and C) the full bacterial BGC dataset. PKSs are the dominant class of BGCs in Umbilicaria, whereas in fungi and bacteria NRPSs are the predominant BGC class. Although far fewer LFF genomes (> 50) than non-lichenized fungal genomes (about 2100) are publicly available, all the LFF genomes analyzed for their BGCs have PKSs as the most common class of BGCs (see Discussion for details), suggesting that the predominance of PKSs observed here in the Umbilicaria dataset is a common feature of LFF genomes.

Fig. 1A data:

Species           Total BGCs  Genome size (Mb)  Completeness (%)  Scaffolds  Genes   Proteins
U. deusta         33          40.9              97.6              44         8,949   8,857
U. freyi          23          47.5              95.7              107        10,156  10,065
U. grisea         20          44.43             96.9              38         8,155   8,104
U. hispanica      25          48.6              97.3              53         8,781   8,696
U. muhlenbergii   23          34.81             100               7          8,968   8,854
U. phaea          19          35.1              97.2              38         7,681   7,628
U. pustulata      27          35.7              96.8              27         8,790   8,740
U. spodochroa     27          44.3              97.6              130        8,791   8,705
U. subpolyphylla  20          31.8              97.6              39         8,556   8,410

[Fig. 1B: stacked bar chart showing, for each of the nine Umbilicaria species, the percentage (axis 20-100 %) and absolute number of BGCs per class (PKS, terpene, NRPS, other, RiPP).]

[Fig. 2: pie charts of BGC class proportions (PKS, terpene, NRPS, other, RiPP). A) Biosynthetic genes in Umbilicaria: 9 species, 217 BGCs. B) Biosynthetic genes in fungi: 2,171 species, 124,092 BGCs. C) Biosynthetic genes in bacteria: 18,661 species, 1,096,473 BGCs.]

BGC clustering: BiG-FAM
Of the total 217 BGCs found in nine Umbilicaria species, 18 BGCs (8%) obtained a BGC-to-GCF (Gene Cluster Family) pairing distance lower than 400, indicating that they potentially code for compounds structurally very similar to those known from the BGCs of their respective GCFs (Fig. 3A, B); 156 (72%) had a pairing distance of 400-900, suggesting that they share similar domain architectures with previously described BGCs in the BiG-FAM database. We identify the clusters belonging to these two groups as “associated product structurally known”. The remaining 43 BGCs (20%) had a pairing distance greater than 900 and potentially encode novel natural products (Fig. 3A). We identify these clusters as “associated product putatively novel”. These BGCs belong to the classes terpenes (1 BGC), NRPSs (12 BGCs) and PKSs (30 BGCs). The details of these BGCs and the sequences of the core genes are provided in Additional file 1.

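The grouping by pairing distance can be tallied directly from the per-species counts reported with Fig. 3. The sketch below is illustrative only: the counts are transcribed from the Fig. 3 table, the helper `distance_group` is a hypothetical name, and the handling of the exact boundaries (400 and 900) is an assumption, since the text and the figure legend word them slightly differently.

```python
# Per-species BGC counts transcribed from the Fig. 3 table:
# (d <= 400, 400 < d <= 900, d > 900)
COUNTS = {
    "U. deusta": (1, 26, 6),
    "U. freyi": (3, 17, 3),
    "U. grisea": (2, 15, 3),
    "U. hispanica": (2, 19, 4),
    "U. muhlenbergii": (2, 14, 7),
    "U. phaea": (1, 14, 4),
    "U. pustulata": (2, 20, 5),
    "U. spodochroa": (4, 19, 4),
    "U. subpolyphylla": (1, 12, 7),
}
GROUPS = ("d <= 400", "400 < d <= 900", "d > 900")


def distance_group(d: float) -> str:
    """Assign a pairing distance to a group (boundary handling is assumed)."""
    if d <= 400:
        return GROUPS[0]
    if d <= 900:
        return GROUPS[1]
    return GROUPS[2]


# Totals per distance group across all nine species
totals = [sum(row[i] for row in COUNTS.values()) for i in range(3)]
grand_total = sum(totals)

for label, n in zip(GROUPS, totals):
    print(f"{label:<15} {n:>4}  {100 * n / grand_total:5.1f} %")

# Share of putatively novel (d > 900) BGCs within each species
novel_share = {sp: 100 * row[2] / sum(row) for sp, row in COUNTS.items()}
for sp, pct in novel_share.items():
    print(f"{sp:<18} {pct:5.1f} %")
```

Summing the per-species rows reproduces the totals of the table (18, 156 and 43 of 217 BGCs), and the per-species shares show how strongly the proportion of putatively novel clusters varies within the genus.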
Within-genus comparison of BGCs: BiG-SLiCE
We identified species-specific BGCs within Umbilicaria using BiG-SLiCE. Out of 217 total BGCs, 159 (72%) grouped into 20 GCFs (threshold d = 900), suggesting that they are similar clusters shared by multiple species, while 58 (28%) had d > 900, indicating that they were only distantly related to other BGCs in Umbilicaria. Each Umbilicaria species contains four to ten unique, species-specific BGCs (6.45-16.13% of all unique BGCs; Additional file 2A). In U. deusta we detected two BGCs (both with PKSs) that were extremely divergent (d > 1800) within the genus (Additional file 2B).
Of these BGCs, 15 are both unique within Umbilicaria and divergent from the known BGCs present in the BiG-FAM database.

Fig. 3 A) Total BGCs in Umbilicaria, the GCFs identified by BiG-FAM, and the number of BGCs clustering into pre-characterized gene cluster families (GCFs) in BiG-FAM, by distance group. d<=400 suggests that the cluster codes for a structurally and functionally similar NP, d=400-900 indicates that the BGC codes for a related but structurally and functionally divergent NP, whereas d>900 suggests that the BGC codes for a novel NP. B) Bar plots representing the percentage of BGCs in each Umbilicaria species with d<=400, d=400-900 and d>900. Only a small proportion of BGCs in each species could be grouped into a pre-characterized GCF in the BiG-FAM database (21,678 species, 1,225,071 BGCs and 29,955 GCFs), whereas a large proportion of them is only distantly related to the pre-characterized BGCs. About 15-30% of BGCs could not be grouped into BiG-FAM gene cluster families and potentially code for structurally and functionally divergent NPs.

[Fig. 3B: stacked bar chart showing, for each of the nine Umbilicaria species, the percentage of BGCs (axis 20-100 %) in the distance groups T<=400, T=400-900 and T>900.]

Fig. 3A data:

Species           BGCs  GCFs  d<=400  d=400-900  d>900
U. deusta         33    26    1       26         6
U. freyi          23    15    3       17         3
U. grisea         20    13    2       15         3
U. hispanica      25    18    2       19         4
U. muhlenbergii   23    19    2       14         7
U. phaea          19    11    1       14         4
U. pustulata      27    17    2       20         5
U. spodochroa     27    19    4       19         4
U. subpolyphylla  20    13    1       12         7
total             217   135   18      156        43

Discussion
Lichens produce a large number of natural products, and they have even more BGCs [27–29]. However, it is not known whether these BGCs encode hitherto unknown metabolic diversity and chemical structures. Here we quantify, for the first time, the proportion of BGCs linked to putatively novel natural products in a group of closely related lichen-forming fungi. The identification of 43 clusters encoding putatively novel chemical structures can be useful in the search for new structures and drug leads.

In this study we mined the genomes of the Umbilicaria spp. to identify all the BGCs (Fig. 1), followed by clustering the structurally and functionally similar BGCs into gene cluster families (Fig. 3A, B) and identifying the gene clusters potentially coding for novel NPs (Fig. 4, Additional File 1). Using Umbilicaria spp. as a study system, we show that the LFF biosynthetic landscape differs from that of non-lichenized fungi and bacteria, being particularly rich in PKSs (Fig. 2), and that a substantial portion of LFF BGCs (about 28% in the case of Umbilicaria) potentially codes for novel NPs (Fig. 3A, B). To the best of our knowledge, this is the first investigation of this kind, implementing state-of-the-art computational tools to determine the proportion of metabolic diversity in LFF coding for novel drugs and to identify candidate genes as a source of drug leads, so that they can be prioritized for drug discovery efforts.

Fig. 4 Pie chart depicting the contribution of each species to the overall novel Umbilicaria BGCs (as identified by BiG-SLiCE, T>900). Each Umbilicaria species contains about 4-10 unique, species-specific BGCs. U. freyi and U. deusta contain the highest number of novel BGCs. The number of novel BGCs was positively correlated with the number of clusters (R=0.68). Out of 58 unique BGCs (T>900), 56.89% were terpene clusters and 41.37% were PKS clusters.

[Fig. 4 data: contribution per species: U. hispanica 6.45 %, U. phaea 8.06 %, U. subpolyphylla 8.06 %, U. grisea 8.06 %, U. pustulata 11.29 %, U. muhlenbergii 12.90 %, U. spodochroa 12.90 %, U. freyi 16.13 %, U. deusta 16.13 %. Summary: species 9; total BGCs 217; total novel BGCs 58; novel BGCs per species 6.5; novel terpenes 33; novel PKSs 24; novel NRPS 1.]

Biosynthetic potential and BGC diversity of Umbilicaria spp.
Although only PKS-derived NPs are reported from Umbilicaria species (gyrophoric, umbilicaric and hiascic acid, among others) [30–32], we found that the Umbilicaria BGC landscape is biosynthetically diverse and comprises three to five classes of NPs (Fig. 1A, B). This is also the case for most other LFF. For instance, only PKS-derived NPs are reported from Bacidia spp., Cladonia spp., Endocarpon spp., Evernia prunastri, Umbilicaria pustulata and Pseudevernia furfuracea, but all of them contain several PKS, NRPS and terpene gene clusters [12,29,32–34]. All of these studies show that the biosynthetic potential of LFF vastly exceeds their detectable chemical diversity. On average, LFF may contain 30-40 BGCs, but the number of identified compounds per species is usually less than 10 [12,33,35]. This could be because most of the clusters are silent and do not synthesize the NP, or simply because the NP was not detected. Bioinformatic characterization of the entire BGC landscape, followed by identification of the most distinct BGCs, provides a way to estimate the novelty of all the BGCs, including the unexpressed and silent ones.

BGC diversity of LFF as compared to bacteria and non-lichenized fungi
We identified five classes of BGCs in the Umbilicaria genomes. PKSs were the dominant class, amounting to more than 50%, followed by terpenes (19%) and NRPSs (17%) (Fig. 1, Fig. 2A). BGCs including PKSs typically make up the majority of BGCs in LFF: Evernia prunastri (60%), Pseudevernia furfuracea (61%), Cladonia spp. (65%), Endocarpon pusillum (58%), Lobaria pulmonaria (46%), and Ramalina peruviana (63%) (cite).

Although publicly accessible, good-quality LFF genomes are rather scarce (<25) compared to those of bacteria and non-lichenized fungi, the data available (9 Umbilicaria spp. genomes [36] plus 9 other publicly available lichen genomes) suggest that the predominance of PKSs is a common feature of BGCs in LFF, with PKSs contributing more than 50% of the total BGCs. In contrast, in bacteria and non-lichenized fungi, NRPSs are the most prevalent BGC class, amounting to about 30% and 42%, respectively, followed by the PKSs (Fig. 2B, C). This suggests that the biosynthetic potential of LFF is unique compared to the other organisms traditionally exploited for NPs, i.e., non-lichenized fungi and bacteria, especially with respect to PKS diversity. In this regard, a recent study suggested that although bacteria and fungi may share a few NPs, they do not have an overlapping chemical space and instead have distinct biosynthetic potential [37]. With their distinct BGC landscape, LFF present a complementary resource of NPs with promising medicinally relevant properties.

Umbilicaria BGCs: Gene Cluster Families (GCFs) and novel NPs
Gene cluster families (GCFs) are groups of BGCs that encode the same or very similar molecules. A total of 217 BGCs from nine Umbilicaria species were clustered into 135 unique GCFs (Fig. 3A). This suggests that Umbilicaria spp. are potentially capable of synthesizing many structurally and functionally different natural products, although in nature only one compound class is typically detected (depsides, linked to a BGC containing a PKS).

Only a small fraction of Umbilicaria BGCs, 8%, could be clustered with the pre-characterized BGCs (Fig. 3A, B). About 72% of the BGCs were clustered to the BiG-FAM GCFs with d=400-900, indicating that they were only distantly related in structure and function (Fig. 3A, B). These BGCs are also interesting candidates to be investigated for their biosynthetic properties, as even a minor difference in the cluster and in the chemistry of the final metabolites could cause a crucial difference in the bioactivity and pharmacological potential of the product [38].

About 20% of the BGCs were highly divergent (d>900) and are putatively novel, potentially coding for structurally and functionally unique NPs, and could be an interesting target for NP-based drug discovery (Fig. 3B). The strikingly high number of novel BGCs in a single fungal genus adds to the mounting evidence that non-model and understudied taxa are an enormous, untapped source of novel NPs.

Genome mining for large genomic regions, such as fungal BGCs, works best when the genomes under study are highly complete and contiguous, as well as reliably annotated. Many publicly available LFF genomes do not fulfill these criteria, preventing a taxonomically broad study of the biosynthetic novelty encoded in the genomes of LFF. We were surprised that even a “chemically boring” lichen taxon, such as the genus Umbilicaria, harbored 43 BGCs encoding putatively unknown natural product diversity. This leads us to suspect that chemically more diverse taxa, e.g. Lecanorales or Pertusariales, each including hundreds of species, are even richer sources of BGCs with novel functions and of compounds with potential pharmaceutical applications. Increased genome sequencing of taxonomically diverse LFF, combined with higher genome quality, will facilitate BGC discovery.

Unique BGCs within Umbilicaria spp.: BiG-SLiCE
BGCs that occur uniquely in a single species are candidates for interesting NPs [20,37,39]. On average, each Umbilicaria species contains seven species-specific BGCs. Most of the novel BGCs are present in U. deusta and U. freyi, whereas U. hispanica has the lowest number of novel BGCs (Fig. 4). This suggests that even closely related species (species within a single genus) contain diverse biosynthetic potential. Species- or strain-specific biosynthetic potential has already been demonstrated for LFF, for example in Umbilicaria pustulata [32] and Pseudevernia furfuracea [33], and it is a rather common occurrence in fungi [32,37,40]. Similarly, in the bacterial genus Streptomyces, the majority of BGCs (57%) have been shown to be strain-specific [41]. The unique BGCs within Umbilicaria belong to the BGC classes PKSs, terpenes and NRPSs, as well as to indoles (Supplementary information S2). Of these, mostly only PKS-derived NPs have been well studied in LFF and shown to have diverse pharmacological properties [42–44].

Two PKS clusters obtained a pairing distance greater than 1800. These were the most divergent BGCs within Umbilicaria (Supplementary information S2) and were “orphan BGCs”, i.e., clusters for which the corresponding metabolite cannot be predicted. Recently, several orphan clusters have been activated to synthesize compounds with useful pharmacological properties; for example, the gene cluster of the antibiotic holomycin from the marine bacterium Photobacterium galatheae was activated in culture [45–48]. The novel and orphan clusters reported in this study are potentially interesting candidates for synthesizing molecules with unique pharmacological properties and may serve as drug leads.

About 17% of fungal BGCs, 8% of bacterial BGCs and 19% of LFF BGCs are terpene clusters (Fig. 2). Terpenes are pharmaceutically extremely versatile, with antimicrobial, anti-inflammatory, neuroprotective and cytotoxic properties [49–54]. A common plant-derived example is eucalyptol, the main terpenoid constituent of Eucalyptus oil. Although several studies have reported pharmacological properties of fungal terpenes, such studies on LFF are missing despite the slightly larger proportion of terpenes in LFF genomes. In this study we report structurally and functionally unique terpene clusters as promising candidates to be investigated for their pharmaceutical potential.

Conclusion
In this study we identified the biosynthetic diversity of the lichen-forming fungal genus Umbilicaria, grouped the structurally and functionally related clusters into GCFs and identified the most divergent, potentially novel clusters. Using Umbilicaria as a model system, we show that LFF constitute a valuable source of novel NPs, suggesting that there is tremendous natural product diversity to be discovered in them. In particular, they are a rich source of novel PKSs and terpenes. Combining this observation with data from other sequenced LFF, we show that LFF are indeed a source of untapped natural product diversity.

Materials and methods
Dataset
The genomes of the following Umbilicaria species were used in this study: U. deusta, U. freyi, U. grisea, U. subpolyphylla, U. hispanica, U. phaea, U. pustulata, U. muhlenbergii and U. spodochroa. Except for U. muhlenbergii, which belongs to BioProject PRJNA239196, all genomes are part of BioProject PRJNA820300 (Table 1). Details of sample and library preparation, as well as genome sequencing, are available in Park et al. [55] for U. muhlenbergii and in Singh et al. [36] for the other eight Umbilicaria spp. Briefly, all genomes except that of U. muhlenbergii were generated via PacBio SMRT sequencing on the Sequel II System using the continuous long read (CLR) mode or the circular consensus sequencing (CCS) mode. The continuous long reads (i.e., CLR reads) were then processed into highly accurate consensus sequences (i.e., HiFi reads) and assembled into contigs using the assembler metaFlye v2.7 [56]. The contigs were then scaffolded with LRScaf v1.1.12 (github.com/shingocat/lrscaf, [57]). We used only binned Ascomycota reads for this study (extracted using blastx in DIAMOND (--more-sensitive --frameshift 15 --range-culling) against a custom database and following the MEGAN6 Community Edition pipeline [58]).

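As a rough illustration of the binning step, the DIAMOND call quoted above can be assembled programmatically. The sketch below only builds and prints the command. The file names, the helper name `build_diamond_blastx_command`, and the DAA output format (`--outfmt 100`, assumed here because MEGAN6 reads DAA files) are assumptions for illustration and are not taken from the study.

```python
import shlex


def build_diamond_blastx_command(query_fasta, database, output_path):
    """Assemble a DIAMOND blastx call using the long-read options quoted in the text.

    All three arguments are hypothetical paths chosen by the caller.
    """
    return [
        "diamond", "blastx",
        "--query", query_fasta,   # hypothetical input sequences
        "--db", database,         # hypothetical custom protein database
        "--out", output_path,     # hypothetical output file
        "--outfmt", "100",        # assumption: DAA output for downstream MEGAN6
        "--more-sensitive",       # option quoted in the text
        "--frameshift", "15",     # option quoted in the text
        "--range-culling",        # option quoted in the text
    ]


command = build_diamond_blastx_command(
    "sequences.fasta", "custom_db.dmnd", "sequences.daa"
)
# Print a shell-quoted version of the command without running it.
print(shlex.join(command))
```

Passing `command` to `subprocess.run` would execute the search on a machine where DIAMOND is installed; the sketch stops short of that so it can be inspected without the tool.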
BGC prediction and clustering: antiSMASH
BGCs were predicted using antiSMASH (antibiotics & SM Analysis Shell, v6.0) with scripts implemented in the funannotate pipeline [18,59]. We tested whether a smaller genome size was correlated with a lower number of BGCs. A correlation coefficient near 0 indicates no correlation, whereas a coefficient near 1 indicates a positive correlation.

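The text does not state which correlation coefficient was used. As a minimal sketch, assuming a Pearson coefficient, the test of genome size against BGC count could look like the following. The genome sizes and BGC counts are made-up placeholders, not the Umbilicaria values, and the function name `pearson_r` is introduced here for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Placeholder values for illustration only (NOT data from the study).
genome_size_mb = [30.0, 35.0, 40.0, 45.0, 50.0]
bgc_counts = [20, 24, 22, 27, 25]

r = pearson_r(genome_size_mb, bgc_counts)
print(f"Pearson r = {r:.2f}")
```

A value of `r` close to 0 would indicate no linear relationship between genome size and BGC number, and a value close to 1 a positive one, matching the interpretation given above.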
Table 1. Voucher information of the genomes used in the study

| Organism | Sample ID | Sequencing technology | BioProject | BioSample | Genome accession |
|---|---|---|---|---|---|
| Umbilicaria deusta | TBG_2334 | PacBio Sequel II | PRJNA820300 | SAMN26992774 | JALILR000000000 |
| Umbilicaria freyi | TBG_2329 | PacBio Sequel II | PRJNA820300 | SAMN26992773 | JALILQ000000000 |
| Umbilicaria grisea | TBG_2336 | PacBio Sequel II | PRJNA820300 | SAMN26992780 | JALILX000000000 |
| Umbilicaria hispanica | TBG_2337 | PacBio Sequel II | PRJNA820300 | SAMN26992775 | JALILS000000000 |
| Umbilicaria muhlenbergii | KoLRI No. LF000956 | Illumina HiSeq | PRJNA239196 | SAMN02650300 | GCA_000611775.1 |
| Umbilicaria phaea | TBG_1112 | PacBio Sequel II | PRJNA820300 | SAMN26992776 | JALILT000000000 |
| Umbilicaria pustulata | TBG_2345 | PacBio Sequel II | PRJNA820300 | SAMN26992777 | JALILU000000000 |
| Umbilicaria spodochroa | TBG_2434 | PacBio Sequel II | PRJNA820300 | SAMN26992778 | JALILV000000000 |
| Umbilicaria subpolyphylla | TBG_2324 | PacBio Sequel II | PRJNA820300 | SAMN26992779 | JALILW000000000 |

BGC clustering into BiG-FAM GCFs
The homologous BGCs present in the Umbilicaria genomes were grouped into Gene Cluster Families (GCFs) using BiG-FAM, which clusters structurally and functionally related BGCs into GCFs and identifies the structurally most diverse BGCs by comparing the query BGCs to the 1,225,071 BGCs of the BiG-FAM database. These 1,225,071 BGCs are clustered into 29,955 GCFs based on similar domain architectures. A GCF comprises closely related BGCs, potentially encoding the same or very similar compounds. By enabling such clustering, BiG-FAM establishes the degree of similarity of the BGCs of a query taxon to currently known (functionally pre-characterized) fungal and bacterial BGCs. The antiSMASH job ID of each Umbilicaria species was used as input for the BiG-FAM analysis.

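To make the idea of comparing query BGCs against existing GCFs concrete, the following sketch assigns each query to its nearest GCF and flags it as divergent when the distance exceeds a cutoff. It is a simplified stand-in and not BiG-FAM's implementation: the two-dimensional feature vectors, the GCF and BGC names, and the cutoff of 900 are hypothetical values chosen for illustration.

```python
import math


def nearest_gcf(query_vector, gcf_centroids):
    """Return (gcf_id, distance) for the centroid closest to the query vector."""
    return min(
        ((gcf_id, math.dist(query_vector, centroid))
         for gcf_id, centroid in gcf_centroids.items()),
        key=lambda pair: pair[1],
    )


def flag_divergent(queries, gcf_centroids, cutoff):
    """Label each query BGC as divergent if its nearest GCF is farther than cutoff."""
    report = {}
    for bgc_id, vector in queries.items():
        gcf_id, distance = nearest_gcf(vector, gcf_centroids)
        report[bgc_id] = {
            "nearest_gcf": gcf_id,
            "distance": distance,
            "divergent": distance > cutoff,
        }
    return report


# Hypothetical toy data: real feature vectors have many more dimensions.
gcf_centroids = {"GCF_A": [0.0, 0.0], "GCF_B": [1000.0, 0.0]}
queries = {"bgc_1": [100.0, 0.0], "bgc_2": [600.0, 2000.0]}

report = flag_divergent(queries, gcf_centroids, cutoff=900.0)  # cutoff is an assumption
for bgc_id, row in report.items():
    print(bgc_id, row["nearest_gcf"], round(row["distance"], 1), row["divergent"])
```

In this toy setting, a query that sits close to an existing family is treated as encoding a structurally known product, while a query far from every family is a candidate for a putatively novel one.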
Quantification of BGC diversity and species-specific BGCs in Umbilicaria: BiG-SLiCE
We used BiG-SLiCE [20] to identify the most unique or species-specific BGCs within Umbilicaria. While BiG-FAM identifies the most diverse BGCs compared to pre-characterized BGCs from other taxa deposited in public repositories, BiG-SLiCE 1.1.0 is a networking-based tool that assesses the relations among the BGCs of a dataset (i.e., the Umbilicaria BGCs in our study) and estimates their distances within the dataset to identify unique, species-specific BGCs. The resulting distance indicates how closely a given BGC is related to the other BGCs. BiG-SLiCE was run on the Umbilicaria BGC dataset (i.e., 217 BGCs from nine Umbilicaria spp.) using three different thresholds: 400, 900 and 1800.

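The effect of the distance threshold can be illustrated with a toy clustering. The sketch below uses a simple leader-style procedure in which a BGC joins the nearest existing family if it lies within the threshold of that family's first member, and otherwise starts a new family. This is a deliberate simplification and should not be read as BiG-SLiCE's own clustering procedure; the feature vectors and BGC names are invented, and only the three thresholds come from the text.

```python
import math


def cluster_by_threshold(vectors, threshold):
    """Group items so that each member lies within threshold of its family leader."""
    families = []  # each family: {"leader": vector, "members": [ids]}
    for bgc_id, vector in vectors.items():
        best = None  # (distance, family) of the closest leader within threshold
        for family in families:
            distance = math.dist(vector, family["leader"])
            if distance <= threshold and (best is None or distance < best[0]):
                best = (distance, family)
        if best is None:
            families.append({"leader": vector, "members": [bgc_id]})
        else:
            best[1]["members"].append(bgc_id)
    return [family["members"] for family in families]


# Invented toy vectors; real BGC features are high-dimensional.
toy_vectors = {
    "bgc_a": [0.0, 0.0],
    "bgc_b": [300.0, 0.0],
    "bgc_c": [800.0, 0.0],
    "bgc_d": [1700.0, 0.0],
    "bgc_e": [1750.0, 0.0],
}

family_counts = {}
for threshold in (400, 900, 1800):  # thresholds quoted in the text
    families = cluster_by_threshold(toy_vectors, threshold)
    family_counts[threshold] = len(families)
    singletons = [members[0] for members in families if len(members) == 1]
    print(threshold, families, "singletons:", singletons)
```

With these toy vectors the number of families drops from three to two to one as the threshold grows. Families that remain singletons at a strict threshold are the kind of cluster one would inspect as potentially unique or species-specific.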
Declarations
Ethics approval and consent to participate: Not applicable
Consent for publication: Not applicable
Availability of data and materials:
The datasets supporting the conclusions of this article are available in the GenBank repository, BioProject accession PRJNA820300, under the accession numbers JALILQ000000000 - JALILY000000000. The lichen samples of the corresponding Umbilicaria spp. are available as BioSamples SAMN27294873 - SAMN27294881 and the mycobiont samples as BioSamples SAMN26992773 - SAMN26992781. The antiSMASH files of Umbilicaria spp. are available at figshare (doi: 10.6084/m9.figshare.19625997).
Competing interests: None
Funding: This research was funded by the LOEWE-Centre TBG, which is funded by the Hessen State Ministry of Higher Education, Research and the Arts (HMWK).
Authors' contributions:
GS analyzed and interpreted the data, generated the figures and tables and wrote the manuscript.
FDG analyzed the data and assisted with the bioinformatic parts of the study.
IS interpreted the data, co-prepared the figures and co-wrote the manuscript.
All authors read and approved the final manuscript.
Acknowledgements
We thank Prof. Marnix Medema and Dr. Satria Kautsar for their support with the BiG-SLiCE program.

Supporting Information
S1 Most divergent BGCs in Umbilicaria as identified by BiG-FAM, along with the cluster information and sequence.
S2 Most distantly related BGCs within Umbilicaria as identified by BiG-SLiCE, along with the cluster information.

References
1. Newman DJ, Cragg GM. Natural products as sources of new drugs over the 30 years from 1981 to 2010. J Nat Prod. 2012;75:311–35.
2. Newman DJ, Cragg GM. Natural products as sources of new drugs over the nearly four decades from 01/1981 to 09/2019. J Nat Prod. American Chemical Society; 2020;83:770–803.
3. Chakraborty K, Kizhakkekalam VK, Joy M, Chakraborty RD. A leap forward towards unraveling newer anti-infective agents from an unconventional source: a draft genome sequence illuminating the future promise of marine heterotrophic Bacillus sp. against drug-resistant pathogens. Mar Biotechnol. 2021;23:790–808.
4. Demain AL. Importance of microbial natural products and the need to revitalize their discovery. J Ind Microbiol Biotechnol. 2014;41:185–201.
5. Keller NP. Fungal secondary metabolism: regulation, function and drug discovery. Nat Rev Microbiol. Nature Publishing Group; 2019;17:167–80.
6. Jensen PR. Natural products and the gene cluster revolution. Trends Microbiol. 2016. p. 968–77.
7. Calcott MJ, Ackerley DF, Knight A, Keyzers RA, Owen JG. Secondary metabolism in the lichen symbiosis. Chem Soc Rev. 2018;47:1730–60. Available from: http://xlink.rsc.org/?DOI=C7CS00431A
8. Rigali S, Anderssen S, Naômé A, van Wezel GP. Cracking the regulatory code of biosynthetic gene clusters as a strategy for natural product discovery. Biochem Pharmacol. 2018. p. 24–34.
9. Kim W, Liu R, Woo S, Kang K Bin, Park H, Yu YH, et al. Linking a gene cluster to atranorin, a major cortical substance of lichens, through genetic dereplication and heterologous expression. MBio. 2021;e0111121. Available from: https://pubmed.ncbi.nlm.nih.gov/34154413/
10. Aigle B, Lautru S, Spiteller D, Dickschat JS, Challis GL, Leblond P, et al. Genome mining of Streptomyces ambofaciens. J Ind Microbiol Biotechnol. 2014;41:251–63. Available from: https://pubmed.ncbi.nlm.nih.gov/24258629/
11. Kum E, İnce E. Genome-guided investigation of secondary metabolites produced by a potential new strain Streptomyces BA2 isolated from an endemic plant rhizosphere in Turkey. Arch Microbiol. 2021;203:2431–8.
12. Calchera A, Dal Grande F, Bode HB, Schmitt I. Biosynthetic gene content of the “perfume lichens” Evernia prunastri and Pseudevernia furfuracea. Molecules. 2019;24:203. Available from: http://www.ncbi.nlm.nih.gov/pubmed/30626017
13. Bertrand RL, Abdel-Hameed M, Sorensen JL. Lichen Biosynthetic Gene Clusters Part II: Homology Mapping Suggests a Functional Diversity. J Nat Prod. 2018;81:732–48. Available from: https://pubs.acs.org/doi/10.1021/acs.jnatprod.7b00770
14. Medema MH, Blin K, Cimermancic P, de Jager V, Zakrzewski P, Fischbach MA, et al. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Res. 2011;39:W339-46. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21672958
15. Cragg GM, Newman DJ. Natural products: A continuing source of novel drug leads. Biochim Biophys Acta - Gen Subj. 2013;1830:3670–95.
16. Yuan H, Ma Q, Ye L, Piao G. The traditional medicine and modern medicine from natural products. Molecules. 2016;21:559. Available from: https://pubmed.ncbi.nlm.nih.gov/27136524/
17. Newman DJ, Cragg GM, Snader KM. Natural products as sources of new drugs over the period 1981-2002. J Nat Prod. 2003;66:1022–37.
18. Blin K, Shaw S, Steinke K, Villebro R, Ziemert N, Lee SY, et al. antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res. 2019;47:W81–7. Available from: http://www.ncbi.nlm.nih.gov/pubmed/31032519
19. Kautsar SA, Blin K, Shaw S, Navarro-Muñoz JC, Terlouw BR, Van Der Hooft JJJ, et al. MIBiG 2.0: A repository for biosynthetic gene clusters of known function. Nucleic Acids Res. 2020;48:D454–8.
20. Kautsar SA, Van Der Hooft JJJ, De Ridder D, Medema MH. BiG-SLiCE: A highly scalable tool maps the diversity of 1.2 million biosynthetic gene clusters. Gigascience. 2021;10:1–17.
21. Kautsar SA, Blin K, Shaw S, Weber T, Medema MH. BiG-FAM: The biosynthetic gene cluster families database. Nucleic Acids Res. 2021;49:D490–7.
22. Shukla V, Joshi GP, Rawat MSM. Lichens as a potential natural source of bioactive compounds: A review. Phytochem Rev. Springer; 2010 [cited 2021 Feb 8]. p. 303–14. Available from: https://link.springer.com/article/10.1007/s11101-010-9189-6
23. Shrestha G, St. Clair LL. Lichens: A promising source of antibiotic and anticancer drugs. Phytochem Rev. 2013. p. 229–44.
24. Boustie J, Grube M. Lichens—a promising source of bioactive secondary metabolites. Plant Genet Resour. 2005;3:273–87. Available from: https://www.cambridge.org/core/product/identifier/S1479262105000328/type/journal_article
25. Lumbsch HT. Chemical Fungal Taxonomy: An Overview. In: Frisvad JC, Bridge PD, Arora DK, editors. Chem Fungal Taxon. 1st ed. CRC Press; 1998. p. 1–18. Available from: https://www.taylorfrancis.com/chapters/edit/10.1201/9781003064626-1/chemical-fungal-taxonomy-overview-jens-frisvad-paul-bridge-dilip-arora
26. Kealey JT, Craig JP, Barr PJ. Identification of a lichen depside polyketide synthase gene by heterologous expression in Saccharomyces cerevisiae. Metab Eng Commun. 2021;13:e00172. Available from: https://linkinghub.elsevier.com/retrieve/pii/S2214030121000122
27. Meiser A, Otte J, Schmitt I, Dal Grande F. Sequencing genomes from mixed DNA samples - Evaluating the metagenome skimming approach in lichenized fungi. Sci Rep. 2017;7:1–13. Available from: www.nature.com/scientificreports/
28. Bertrand RL, Sorensen JL. A comprehensive catalogue of polyketide synthase gene clusters in lichenizing fungi. J Ind Microbiol Biotechnol. Springer Verlag; 2018. p. 1067–81.
29. Gerasimova J V, Beck A, Werth S, Resl P. High diversity of type I polyketide genes in Bacidia rubella as revealed by the comparative analysis of 23 lichen genomes. J Fungi. 2022;8:449. Available from: https://doi.org/10.3390/jof8050449
30. Davydov EA, Peršoh D, Rambold G. Umbilicariaceae (lichenized Ascomycota) – Trait evolution and a new generic concept. Taxon. 2017;66:1282–303. Available from: https://onlinelibrary.wiley.com/doi/abs/10.12705/666.2
31. Posner B, Feige GB, Huneck S. Studies on the chemistry of the lichen genus Umbilicaria hoffm. Zeitschrift fur Naturforsch - Sect C J Biosci. 1992;47:1–9. Available from: https://www.degruyter.com/view/journals/znc/47/1-2/article-p1.xml
32. Singh G, Calchera A, Schulz M, Drechsler M, Bode HB, Schmitt I, et al. Climate-specific biosynthetic gene clusters in populations of a lichen-forming fungus. Environ Microbiol. 2021;00:1462-2920.15605. Available from: https://onlinelibrary.wiley.com/doi/10.1111/1462-2920.15605
33. Singh G, Armaleo D, Dal Grande F, Schmitt I. Depside and depsidone synthesis in lichenized fungi comes into focus through a genome-wide comparison of the olivetoric acid and physodic acid chemotypes of Pseudevernia furfuracea. Biomolecules. 2021;11:1445. Available from: https://www.mdpi.com/2218-273X/11/10/1445
34. Wang J, Nielsen J, Liu Z. Synthetic biology advanced natural product discovery. Metabolites. 2021;11:785. Available from: https://pubmed.ncbi.nlm.nih.gov/34822443/
35. Pizarro D, Divakar PK, Grewe F, Crespo A, Dal Grande F, Lumbsch HT. Genome-wide analysis of biosynthetic gene cluster reveals correlated gene loss with absence of usnic acid in lichen-forming fungi. Genome Biol Evol. 2020;12:1858–68. Available from: https://academic.oup.com/gbe/article/12/10/1858/5903737
36. Singh G, Calchera A, Merges D, Otte J, Schmitt I, Dal Grande F. A candidate gene cluster for the bioactive natural product gyrophoric acid in lichen-forming fungi. bioRxiv. 2022;2022.01.14.475839. Available from: https://www.biorxiv.org/content/10.1101/2022.01.14.475839v1
37. Robey MT, Caesar LK, Drott MT, Keller NP, Kelleher NL. An interpreted atlas of biosynthetic gene clusters from 1,000 fungal genomes. Proc Natl Acad Sci U S A. 2021;118. Available from: https://doi.org/10.1073/pnas.2020230118
38. Lautié E, Russo O, Ducrot P, Boutin JA. Unraveling plant natural chemical diversity for drug discovery purposes. Front Pharmacol. Frontiers Media S.A.; 2020. p. 397.
39. Navarro-Muñoz JC, Selem-Mojica N, Mullowney MW, Kautsar SA, Tryon JH, Parkinson EI, et al. A computational framework to explore large-scale biosynthetic diversity. Nat Chem Biol. 2020;16:60–8. Available from: https://pubmed.ncbi.nlm.nih.gov/31768033/
40. Alam K, Islam MM, Li C, Sultana S, Zhong L, Shen Q, et al. Genome mining of Pseudomonas species: diversity and evolution of metabolic and biosynthetic potential. Molecules. 2021 [cited 2022 Mar 8];26:7524. Available from: https://www.mdpi.com/1420-3049/26/24/7524
41. Choudoir MJ, Pepe-Ranney C, Buckley DH. Diversification of secondary metabolite biosynthetic gene clusters coincides with lineage divergence in Streptomyces. Antibiotics. 2018;7:1–15. Available from: https://pubmed.ncbi.nlm.nih.gov/29438308/
42. Ingelfinger R, Henke M, Roser L, Ulshöfer T, Calchera A, Singh G, et al. Unraveling the pharmacological potential of lichen extracts in the context of cancer and inflammation with a broad screening approach. Front Pharmacol. 2020;11:1322. Available from: https://www.frontiersin.org/article/10.3389/fphar.2020.01322/full
43. Manojlović N, Ranković B, Kosanić M, Vasiljević P, Stanojković T. Chemical composition of three Parmelia lichens and antioxidant, antimicrobial and cytotoxic activities of some their major metabolites. Phytomedicine. 2012;19:1166–72.
44. Cardile V, Graziano ACE, Avola R, Piovano M, Russo A. Potential anticancer activity of lichen secondary metabolite physodic acid. Chem Biol Interact. 2017;263:36–45.
45. Shi J, Zeng YJ, Zhang B, Shao FL, Chen YC, Xu X, et al. Comparative genome mining and heterologous expression of an orphan NRPS gene cluster direct the production of ashimides. Chem Sci. 2019;10:3042–8. Available from: https://pubmed.ncbi.nlm.nih.gov/30996885/
46. Mattern DJ, Schoeler H, Weber J, Novohradská S, Kraibooj K, Dahse HM, et al. Identification of the antiphagocytic trypacidin gene cluster in the human-pathogenic fungus Aspergillus fumigatus. Appl Microbiol Biotechnol. 2015;99:10151–61. Available from: https://pubmed.ncbi.nlm.nih.gov/26278536/
47. Buijs Y, Isbrandt T, Zhang S-D, Larsen TO, Gram L. The antibiotic andrimid produced by Vibrio coralliilyticus increases expression of biosynthetic gene clusters and antibiotic production in Photobacterium galatheae. Front Microbiol. 2020;11:3276. Available from: https://www.frontiersin.org/articles/10.3389/fmicb.2020.622055/full
48. Ziko L, Saqr AHA, Ouf A, Gimpel M, Aziz RK, Neubauer P, et al. Antibacterial and anticancer activities of orphan biosynthetic gene clusters from Atlantis II Red Sea brine pool. Microb Cell Fact. 2019;18:56. Available from: https://microbialcellfactories.biomedcentral.com/articles/10.1186/s12934-019-1103-3
49. Cox-Georgian D, Ramadoss N, Dona C, Basu C. Therapeutic and medicinal uses of terpenes. Med Plants From Farm to Pharm. 2019. p. 333–59. Available from: /pmc/articles/PMC7120914/
50. Guimarães AC, Meireles LM, Lemos MF, Guimarães MCC, Endringer DC, Fronza M, et al. Antibacterial activity of terpenes and terpenoids present in essential oils. Molecules. 2019;24:2471. Available from: /pmc/articles/PMC6651100/
51. Jiang M, Wu Z, Guo H, Liu L, Chen S. A review of terpenes from marine-derived fungi: 2015-2019. Mar Drugs. 2020;18:321. Available from: www.mdpi.com/journal/marinedrugs
52. Del Prado-Audelo ML, Cortés H, Caballero-Florán IH, González-Torres M, Escutia-Guadarrama L, Bernal-Chávez SA, et al. Therapeutic applications of terpenes on inflammatory diseases. Front Pharmacol. 2021;12:2114. Available from: https://www.frontiersin.org/articles/10.3389/fphar.2021.704197/full
53. Jaeger R, Cuny E. Terpenoids with special pharmacological significance: A review. Nat Prod Commun. 2016;11:1373–90.
54. Yang W, Chen X, Li Y, Guo S, Wang Z, Yu X. Advances in pharmacological activities of terpenoids. Nat Prod Commun. 2020. p. 1–13. Available from: https://journals.sagepub.com/doi/full/10.1177/1934578X20903555
55. Park SY, Choi J, Lee GW, Jeong MH, Kim JA, Oh SO, et al. Draft genome sequence of Umbilicaria muehlenbergii KoLRILF000956, a lichen-forming fungus amenable to genetic manipulation. Genome Announc. 2014;2:e00357. Available from: https://pubmed.ncbi.nlm.nih.gov/24762942/
56. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol. 2019;37:540–6. Available from: https://www.nature.com/articles/s41587-019-0072-8
57. Qin M, Wu S, Li A, Zhao F, Feng H, Ding L, et al. LRScaf: Improving draft genomes using long noisy reads. BMC Genomics. 2019;20:955. Available from: https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-019-6337-2
58. Huson DH, Beier S, Flade I, Górska A, El-Hadidi M, Mitra S, et al. MEGAN Community Edition - Interactive exploration and analysis of large-scale microbiome sequencing data. PLOS Comput Biol. 2016;12:e1004957. Available from: https://dx.plos.org/10.1371/journal.pcbi.1004957
59. Palmer J, Stajich J. Funannotate v1.7.4. Zenodo. 2019;
.CC-BY-ND 4.0 International licensemade available under a
(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is
The copyright holder for this preprintthis version posted May 4, 2022. ; https://doi.org/10.1101/2022.05.04.490581doi: bioRxiv preprint
... If it is the common fact that is not easily solved now about the conflict between more lichen natural products by OSMAC from LFF cultures and uncertain or not very well bioactivity compared with those isolated from lichen thallus, genome miningbased strategy will be a more explicit way to discover lichen natural products. With the development of bioinformatics and the applying next-generation sequencing data, there has indeed been more focus on natural product discovery based on genomics (Garima et al., 2022b;Luo et al., 2022). Genome mining has become a powerful tool to discover compounds, identify cryptic biosynthetic gene clusters, characterize the potential biosynthetic pathways, and predict the skeletal structure of the relative products (Liu Q. et al., 2022;Liu T. et al., 2022;Kalra et al., 2023). ...
Article
Full-text available
Lichen natural products are a tremendous source of new bioactive chemical entities for drug discovery. The ability to survive in harsh conditions can be directly correlated with the production of some unique lichen metabolites. Despite the potential applications, these unique metabolites have been underutilized by pharmaceutical and agrochemical industries due to their slow growth, low biomass availability, and technical challenges involved in their artificial cultivation. At the same time, DNA sequence data have revealed that the number of encoded biosynthetic gene clusters in a lichen is much higher than in natural products, and the majority of them are silent or poorly expressed. To meet these challenges, the one strain many compounds (OSMAC) strategy, as a comprehensive and powerful tool, has been developed to stimulate the activation of silent or cryptic biosynthetic gene clusters and exploit interesting lichen compounds for industrial applications. Furthermore, the development of molecular network techniques, modern bioinformatics, and genetic tools is opening up a new opportunity for the mining, modification, and production of lichen metabolites, rather than merely using traditional separation and purification techniques to obtain small amounts of chemical compounds. Heterologous expressed lichen-derived biosynthetic gene clusters in a cultivatable host offer a promising means for a sustainable supply of specialized metabolites. In this review, we summarized the known lichen bioactive metabolites and highlighted the application of OSMAC, molecular network, and genome mining-based strategies in lichen-forming fungi for the discovery of new cryptic lichen compounds.
Article
Lichens produce diverse secondary metabolites. A diversity of these compounds is synthesized by fungal polyketide synthases (PKSs). In this study, we catalogued the PKS genes from Xanthoparmelia taractica, a lichen with global distribution. To accomplish this, we isolated the symbionts to sequence the whole genome of the mycobiont and established an in vitro co-culture system for this lichen. We also added an endolichenic fungus, Coniochaeta fibrosae, to this co-culture to evaluate its effect on lichen symbiosis. The genome of the mycobiont X. taractica was around 43.1 Mb with 10,730 ORFs. Twenty-eight PKS genes were identified in the genome. These included 27 Type I and one Type III gene. Except for three PKS genes, XTPKS12, XTPKS18, and XTPKS22, the function of the majority of PKS genes remained unknown. We selected these genes for the expression analyses using a co-culture system. The co-culture system that included the mycobiont and the photobiont showed an early stage of lichenization because the fungi produced a hyphal network connecting and penetrating the algal cells. Also, XTPKS12 was down-regulated and XTPKS18 and XTPKS22 were modestly up-regulated. As predicted, C. fibrosae did not participate in the symbiosis. This study reconfirms that Type I is the most dominant PKS gene in lichenized fungi and the function of these genes might be influenced by symbiosis.
Article
Full-text available
Natural products of lichen-forming fungi are structurally diverse and have a variety of medicinal properties. Despite this, they have limited implementation in industry mostly because the corresponding genes are unknown for most of their natural products. Here, we implement a long-read sequencing and bioinformatic approach to identify the putative biosynthetic gene cluster of the bioactive natural product gyrophoric acid (GA). Using 15 high-quality genomes representing nine GA-producing species of the lichen-forming fungal genus Umbilicaria, we identify the most likely GA cluster and investigate the cluster gene organization and composition across the nine species. Our results show that GA clusters are promiscuous within Umbilicaria, and only three genes are conserved across species, including the polyketide synthase (PKS) gene. In addition, our results suggest that the same cluster codes for different, but structurally similar compounds, namely, GA, umbilicaric-, and hiascic acid, bringing new evidence that lichen metabolite diversity is also generated through regulatory mechanisms at the molecular level. Ours is the first study to identify the most likely GA cluster and, thus, provides essential information to open new avenues for biotechnological approaches to producing and modifying GA and similar lichen-derived compounds. GA PKS is the first tridepside PKS to be identified.
Article
Full-text available
Fungi involved in lichen symbioses produce a large array of secondary metabolites that are often diagnostic in the taxonomic delimitation of lichens. The most common lichen secondary metabolites—polyketides—are synthesized by polyketide synthases, particularly by Type I PKS (TI-PKS). Here, we present a comparative genomic analysis of the TI-PKS gene content of 23 lichen-forming fungal genomes from Ascomycota, including the de novo sequenced genome of Bacidia rubella. Firstly, we identify a putative atranorin cluster in B. rubella. Secondly, we provide an overview of TI-PKS gene diversity in lichen-forming fungi, and the most comprehensive Type I PKS phylogeny of lichen-forming fungi to date, including 624 sequences. We reveal a high number of biosynthetic gene clusters and examine their domain composition in the context of previously characterized genes, confirming that PKS genes outnumber known secondary substances. Moreover, two novel groups of reducing PKSs were identified. Although many PKSs remain without functional assignments, our findings highlight that genes from lichen-forming fungi represent an untapped source of novel polyketide compounds.
Article
Full-text available
A wide variety of bacteria, fungi and plants can produce bioactive secondary metabolites, which are often referred to as natural products. With the rapid development of DNA sequencing technology and bioinformatics, a large number of putative biosynthetic gene clusters have been reported. However, only a limited number of natural products have been discovered, as most biosynthetic gene clusters are not expressed or are expressed at extremely low levels under conventional laboratory conditions. With the rapid development of synthetic biology, advanced genome mining and engineering strategies have been reported and they provide new opportunities for discovery of natural products. This review discusses advances in recent years that can accelerate the design, build, test, and learn (DBTL) cycle of natural product discovery, and prospects trends and key challenges for future research directions.
Article
Full-text available
Viruses can play critical roles in symbioses by initiating horizontal gene transfer, affecting host phenotypes, or expanding their host's ecological niche. However, knowledge of viral diversity and distribution in symbiotic organisms remains elusive. Here we use deep‐sequenced metagenomic DNA (PacBio Sequel II; two individuals), paired with a population genomics approach (Pool‐seq; 11 populations, 550 individuals) to understand viral distributions in the lichen Umbilicaria phaea. We assess (i) viral diversity in lichen thalli, (ii) putative viral hosts (fungi, algae, bacteria) and (iii) viral distributions along two replicated elevation gradients. We identified five novel viruses, showing 28%–40% amino acid identity to known viruses. They tentatively belong to the families Caulimoviridae, Myoviridae, Podoviridae and Siphoviridae. Our analysis suggests that the Caulimovirus is associated with green algal photobionts (Trebouxia) of the lichen, and the remaining viruses with bacterial hosts. We did not detect viral sequences in the mycobiont. Caulimovirus abundance decreased with increasing elevation, a pattern reflected by a specific algal lineage hosting this virus. Bacteriophages showed population‐specific patterns. Our work provides the first comprehensive insights into viruses associated with a lichen holobiont and suggests an interplay of viral hosts and environment in structuring viral distributions.
Article
Full-text available
Primary biosynthetic enzymes involved in the synthesis of lichen polyphenolic compounds depsides and depsidones are non-reducing polyketide synthases (NR-PKSs), and cytochrome P450s. However, for most depsides and depsidones the corresponding PKSs are unknown. Additionally, in non-lichenized fungi specific fatty acid synthases (FASs) provide starters to the PKSs. Yet, the presence of such FASs in lichenized fungi remains to be investigated. Here we implement comparative genomics and metatranscriptomics to identify the most likely PKS and FASs for olivetoric acid and physodic acid biosynthesis, the primary depside and depsidone defining the two chemotypes of the lichen Pseudevernia furfuracea. We propose that the gene cluster PF33-1_006185, found in both chemotypes, is the most likely candidate for the olivetoric acid and physodic acid biosynthesis. This is the first study to identify the gene cluster and the FAS likely responsible for olivetoric acid and physodic acid biosynthesis in a lichenized fungus. Our findings suggest that gene regulation and other epigenetic factors determine whether the mycobiont produces the depside or the depsidone, providing the first direct indication that chemotype diversity in lichens can arise through regulatory and not only through genetic diversity. Combining these results and existing literature, we propose a detailed scheme for depside/depsidone synthesis.
Article
During the previous decade, genome-based research on marine heterotrophic microorganisms has revealed the chemical heterogeneity of natural product resources and the effectiveness of harnessing the genetic divergence among strains. Herein, we describe the whole genome data of heterotrophic Bacillus amyloliquefaciens MB6 (MTCC 12,716), isolated from a marine macroalga Hypnea valentiae: a 4,107,511-bp circular chromosome comprising 186 contigs, with 4154 protein-coding DNA sequences and a coding ratio of 86%. Simultaneously, bioactivity-guided purification of the bacterial extract resulted in six polyketide classes of compounds with promising antibacterial activity. The draft genome sequence of B. amyloliquefaciens MB6 unveiled biosynthetic gene clusters (BGCs) engaged in the biosynthesis of polyketide-originated macrolactones with prospective antagonistic activity (MIC ≤ 5 µg/mL) against nosocomial pathogens. Genome analysis revealed 34 putative BGCs required to synthesize biologically active polyketide-originated frameworks or their derivatives. These results provide insights into the genetic basis of heterotrophic B. amyloliquefaciens MTCC 12,716 as a prospective lead for biotechnological and pharmaceutical applications.
Article
Lichens play significant roles in ecosystem function and comprise about 20% of all known fungi. Polyketide-derived natural products accumulate in the cortical and medullary layers of lichen thalli, some of which play key roles in protection from biotic and abiotic stresses (e.g., herbivore attacks and UV irradiation).
Article
Natural products can contribute to abiotic stress tolerance in plants and fungi. We hypothesize that biosynthetic gene clusters (BGCs), the genomic elements that underlie natural product biosynthesis, display structured differences along elevation gradients. We analyzed biosynthetic gene variation in natural populations of the lichen‐forming fungus Umbilicaria pustulata. We collected a total of 600 individuals from Mediterranean and cold temperate climates. Population genomic analyses indicate that U. pustulata contains three clusters that are highly differentiated between Mediterranean and cold temperate populations. One entire cluster is exclusively present in cold temperate populations, and a second cluster is putatively dysfunctional in all cold temperate populations. In these two clusters the presence of consistent allele frequency differences among replicate populations/gradients suggests that selection rather than drift is driving the pattern. In the third cluster variation is fixed in all cold temperate populations due to hitchhiking. We propose that the landscape of fungal biosynthetic genes is shaped by both positive and hitchhiking selection. We demonstrate, for the first time, the presence of climate‐associated BGCs and BGC variations in lichen‐forming fungi. While the associated secondary metabolites of the candidate clusters are presently unknown, our study paves the way for targeted discovery of natural products with ecological significance.
Article
Lichen-forming fungi produce a variety of secondary metabolites including bioactive polyketides. Advances in DNA and RNA sequencing have led to a growing database of new lichen gene clusters encoding polyketide synthases (PKS) and associated ancillary activities. Definitive assignment of a PKS gene to a metabolic product has been challenging in the lichen field due to a lack of established gene knockout or heterologous gene expression systems. Here, we report the reconstitution of a non-reducing PKS gene from the lichen Pseudevernia furfuracea and successful heterologous expression of the synthetic lichen PKS gene in engineered Saccharomyces cerevisiae. We show that P. furfuracea PFUR17_02294 produces lecanoric acid, the depside dimer of orsellinic acid, at 360 mg/L in small-scale yeast cultures. Our results unequivocally identify PFUR17_02294 as a lecanoric acid synthase and establish that a single lichen PKS synthesizes two phenolic rings and joins them by an ester linkage to form the depside product.
Article
Significance: Fungi represent an underexploited resource for new compounds with applications in the pharmaceutical and agriscience industries. Despite the availability of >1,000 fungal genomes, our knowledge of the biosynthetic space encoded by these genomes is limited and ad hoc. We present results from systematically organizing the biosynthetic content of 1,037 fungal genomes, providing a resource for data-driven genome mining and large-scale comparison of the genetic and molecular repertoires produced in fungi, and compare them to those present in bacteria.