Genomic convergence underlying high‐altitude adaptation in alpine plants

Wiley
Journal of Integrative Plant Biology
Journal of Integrative Plant Biology (JIPB)
New Resource
https://doi.org/10.1111/jipb.13485
Genomic convergence underlying high-altitude adaptation in alpine plants
Xu Zhang¹,², Tianhui Kuang³, Wenlin Dong¹,²,⁴, Zhihao Qian¹,²,⁴, Huajie Zhang¹,², Jacob B. Landis⁵,⁶, Tao Feng¹,², Lijuan Li¹,²,⁴, Yanxia Sun¹,², Jinling Huang³,⁷,⁸, Tao Deng³*, Hengchang Wang¹,²* and Hang Sun³*
1. CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, The Chinese Academy of Sciences, Wuhan
Botanical Garden, Wuhan 430074, China
2. Center of Conservation Biology, Core Botanical Gardens, The Chinese Academy of Sciences, Wuhan 430074, China
3. Yunnan International Joint Laboratory for Biodiversity of Central Asia, Key Laboratory for Plant Diversity and Biogeography of East
Asia, Kunming Institute of Botany, The Chinese Academy of Sciences, Kunming 650201, China
4. University of Chinese Academy of Sciences, Beijing 100049, China
5. School of Integrative Plant Science, Section of Plant Biology and the L. H. Bailey Hortorium, Cornell University, Ithaca, New York
14850, USA
6. BTI Computational Biology Center, Boyce Thompson Institute, Ithaca, New York 14853, USA
7. State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng 475004, China
8. Department of Biology, East Carolina University, Greenville, North Carolina 27858, USA
*Correspondences: Tao Deng (dengtao@mail.kib.ac.cn); Hengchang Wang (hcwang@wbgcas.cn); Hang Sun (sunhang@mail.kib.ac.cn,
Dr. Sun is fully responsible for the distributions of the materials associated with this article)
ABSTRACT
Evolutionary convergence is one of the most striking examples of adaptation driven by natural selection. However, genomic evidence for convergent adaptation to extreme environments remains scarce. Here, we assembled reference genomes of two alpine plants, Saussurea obvallata (Asteraceae) and Rheum alexandrae (Polygonaceae), with 37,938 and 61,463 annotated protein-coding genes. By integrating an additional five alpine genomes, we elucidated genomic convergence underlying high-altitude adaptation in alpine plants. Our results detected convergent contractions of disease-resistance genes in alpine genomes, which might be an energy-saving strategy for surviving in hostile environments with only a few pathogens present. We identified signatures of positive selection on a set of genes involved in reproduction and respiration (e.g., MMD1, NBS1, and HPR), and revealed signatures of molecular convergence on genes involved in self-incompatibility, cell wall modification, DNA repair and stress resistance, which may underlie adaptation to extreme cold, high ultraviolet radiation and hypoxia environments. Incorporating transcriptomic data, we further demonstrated that genes associated with cuticular wax and flavonoid biosynthetic pathways exhibit higher expression levels in leafy bracts, shedding light on the genetic mechanisms of the adaptive “greenhouse” morphology. Our integrative data provide novel insights into convergent evolution at a high-taxonomic level, aiding in a deep understanding of genetic adaptation to complex environments.
Keywords: adaptation, alpine plants, evolutionary rates, genomic convergence, “greenhouse” morphology, high altitude
Zhang, X., Kuang, T., Dong, W., Qian, Z., Zhang, H., Landis, J. B., Feng, T., Li, L., Sun, Y., Huang, J., et al. (2023). Genomic convergence underlying high-altitude adaptation in alpine plants. J. Integr. Plant Biol. 00: 1–16.
© 2023 Institute of Botany, Chinese Academy of Sciences.
INTRODUCTION
Evolutionary biologists have long aimed to understand the extent to which evolutionary trajectories are predictable, that is, the extent to which convergent adaptation in distinct lineages is driven by conserved molecular changes (Zhang and Kumar, 1997; Stern and Orgogozo, 2009; Zhen et al., 2012; Storz, 2016). Evolutionary convergence in settings where different species repeatedly face common selective pressures offers a powerful opportunity to address this issue (Yeaman et al., 2016; Birkeland et al., 2020; Xu et al., 2020). An ideal system for investigating the genetic underpinnings of convergent evolution is the independent adaptation of divergent lineages to high-altitude environments.
The Himalaya–Hengduan Mountains (HHMs) represent the world's most species-rich temperate alpine biota, providing an ideal natural laboratory for studying convergent adaptation to high altitudes (Spicer et al., 2020). In the subnival summits of the HHMs (elevation above 4,500 m), usually characterized by freezing temperatures, high ultraviolet (UV) radiation, and hypoxia, plants typically possess suites of similar morphological and physiological adaptations to allow them to survive and reproduce in these hostile environments (Tsukaya and Tsuge, 2001). In comparison with plants of lowland areas, plants living in the subnival summits of the HHMs (and other high-altitude areas) have dwarf stems, smaller leaves and higher densities of branches, and often exhibit a specialized morphology such as leafy bracts, woolly coverings and cushion forms (Nagy and Grabherr, 2009; Sun et al., 2014). Under similar stressful environmental conditions of high altitudes, one would predict similar genetic components underpinning adaptive evolution in alpine plants.
Genome-wide studies have documented some genomic footprints of high-altitude adaptation by testing for positive selection and mining expanded gene families, often involving functional pathways such as DNA repair, abiotic stress response, reproductive processes, as well as secondary metabolite biosynthesis (Zeng et al., 2015; Zhang et al., 2019; Wang et al., 2021). However, the limited availability of reference genomes for alpine plants restricts further understanding of the genomic evolution of high-altitude adaptation; in particular, the underlying genomic convergence of alpine adaptation has not been examined.
In this study, we newly assembled and annotated a reference-quality genome of Saussurea obvallata (Asteraceae) and a draft genome of Rheum alexandrae (Polygonaceae). These two species are primarily found on mountain slopes and in alpine meadows of the HHMs and are renowned for their “glasshouse” morphology: their upper leaves have developed into large semi-translucent leafy bracts that cover the inflorescences, which has been shown to bring significant ecological benefits to the plant (Song et al., 2015). We integrated an additional five available genomes of alpine plants [Crucihimalaya himalaica (Brassicaceae) (Zhang et al., 2019), Eutrema heterophyllum (Brassicaceae) (Guo et al., 2018), Hordeum vulgare var. nudum (Poaceae) (Zeng et al., 2015), Prunus mira (Rosaceae) (Wang et al., 2021) and Salix brachista (Salicaceae) (Chen et al., 2019)] as well as their lowland relatives for comparative genomic analyses (Table S1). These seven alpine species represent major clades of angiosperms that independently colonized high-altitude environments. Unlike previous case studies of plant genomes, we take advantage of a comprehensive genomic data set of alpine plants to characterize genome-wide signatures of convergent evolution. Specifically, we intend to address three main questions: (i) Do expanded or contracted gene families show convergent patterns in alpine plants and affect high-altitude adaptation? (ii) Which genes have undergone convergent molecular evolution and are involved in adaptation to freezing, high UV radiation and hypoxia environments? (iii) What are the genomic bases underlying the adaptive “greenhouse” morphology? In addressing these questions, our study allows for comprehensive genomic insights into convergent adaptation to high-altitude environments.
RESULTS AND DISCUSSION
Assembly and annotation of two reference genomes of alpine plants
Using a k-mer analysis method, we first estimated the genome size of S. obvallata and R. alexandrae to be ~2,251 and ~2,137 Mb, respectively (Figures S1, S2; Table S2). We then generated a chromosome-level genome of S. obvallata and a contig-level genome of R. alexandrae using Illumina, Oxford Nanopore, and high-throughput chromatin conformation capture (Hi-C) sequencing technologies (Table S3). For the S. obvallata genome, in total ~95 Gb Illumina short reads, ~143 Gb Nanopore long reads and ~206 Gb Hi-C data were obtained. For the R. alexandrae genome, in total ~104 Gb Illumina short reads and ~144 Gb Nanopore long reads were generated. Additionally, transcriptomic data for R. alexandrae (~25 Gb) and S. obvallata (~157 Gb) were obtained for transcript-based gene annotation (Table S3). Tissue-specific transcriptomic data for S. obvallata were also used for further gene expression analysis (Table S4).
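The text does not say which k-mer tool or model produced the genome size estimates. As a rough illustration only, the textbook relationship is that genome size is approximately the total number of k-mer occurrences divided by the depth of the main peak in the k-mer histogram. The function name, the `min_depth` error cutoff and the toy histogram below are assumptions made for this sketch, not details from the study.

```python
def estimate_genome_size(kmer_histogram, min_depth=5):
    """Estimate genome size (bp) from a k-mer depth histogram.

    kmer_histogram maps depth -> number of distinct k-mers observed at that depth.
    K-mers below min_depth are treated as sequencing errors (an assumed cutoff).
    """
    solid = {d: n for d, n in kmer_histogram.items() if d >= min_depth}
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # Depth at which the largest number of distinct k-mers occurs (main peak).
    peak_depth = max(solid, key=solid.get)
    # Total k-mer occurrences among the retained k-mers.
    total_kmers = sum(d * n for d, n in solid.items())
    return total_kmers / peak_depth


# Toy histogram: an error spike at depth 1 and a main peak at depth 20.
toy_histogram = {1: 1000, 19: 50, 20: 100, 21: 50}
toy_size = estimate_genome_size(toy_histogram)
```

Dedicated tools additionally model heterozygosity and repeats, which this sketch ignores.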
A de novo assembly pipeline allowed us to achieve initial genome assemblies that captured 2,044 and 2,040 Mb in 145 and 129 contigs for the S. obvallata and R. alexandrae genomes, with contig N50 of 36.96 and 36.32 Mb, respectively (Tables 1, S5). Using a Hi-C-assisted assembly pipeline, 1,952 Mb, accounting for 95.5% of the assembled S. obvallata genome, were anchored on 16 chromosomes (Tables 1, S6; Figure S3), in line with previous cytological evidence (Fujikawa et al., 2004). We further evaluated the completeness of the assembled genomes and found high completeness rates (94.6% for S. obvallata and 96.6% for R. alexandrae) for both assemblies, as supported by BUSCO (Benchmarking Universal Single-Copy Orthologs) assessments using the Embryophyta_odb10 database (Tables 1, S7, S8) (Manni et al., 2021). The long terminal repeat (LTR) assembly index (LAI), which evaluates the contiguity of intergenic and repetitive regions of genome assemblies based on the intactness of LTR retrotransposons (Ou, Chen, and Jiang, 2018), was 19.68 for S. obvallata and 4.97 for R. alexandrae.
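For readers unfamiliar with the contig N50 statistic reported above, the following minimal sketch shows the usual definition: the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The function name and the toy lengths are illustrative assumptions; this is not the authors' assembly-evaluation code.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # First contig at which the cumulative length covers half the assembly.
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


toy_n50 = n50([10, 8, 6, 4, 2])
```

The anchoring rate quoted in the text follows from Table 1 by similar arithmetic: 1,951,503,694 / 2,044,030,733 is approximately 95.5%.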
Transposable elements (TEs) and other repeat sequences
accounted for 81.88% and 81.65% of the S. obvallata
and R. alexandrae assemblies, respectively (Table 1).
In S. obvallata, LTR retrotransposons (43.95%), followed by
DNA TEs (2.03%) and LINEs (0.93%), were most abundant, with LTR-Gypsy and LTR-Copia retrotransposons accounting for 18.19% and 25.76% of the assembly, respectively (Table S9). LTR retrotransposons (44.31%), LINEs (2.73%), and DNA TEs (4.08%) accounted for most of the R. alexandrae repeats, with LTR-Gypsy (37.98%) and LTR-Copia (6.33%) retrotransposons predominant among the LTRs (Table S10). A combination of transcript-based, de novo and homology-based prediction methods yielded 37,938 and 61,463 high-confidence protein-coding gene models for S. obvallata and R. alexandrae, respectively (Table 1; Figures S4, S5). By comparing with public protein databases, in total 36,542 (96.32%) and 47,535 (77.34%) predicted genes of S. obvallata and R. alexandrae were functionally annotated (Table S11). For the annotated genes, 93.5% and 96.1% of the complete BUSCO genes in the Embryophyta_odb10 database could be identified in S. obvallata and R. alexandrae (Tables S7, S8), respectively. Overall, our newly assembled and annotated genomes of S. obvallata and R. alexandrae are of high quality, providing valuable genomic resources for understanding the convergent adaptation of alpine plants to high-altitude environments.
Convergent changes in gene family number
We downloaded annotated protein sequences from genomes of five additional alpine species as well as 13 representative sister species living at low elevations (Table S1). The included species were phylogenetically placed in seven families of angiosperms, comprising 17 eudicots and three monocots. In total, 6,711 gene orthologs were identified in all 20 species (Table S12), of which 195 single-copy orthogroups were used for phylogeny reconstruction. The obtained phylogeny was consistent with the known phylogenetic relationships within angiosperms, in which the alpine taxa included in our study occurred in seven independent lineages placed in six families, and the time tree inferred from MCMCtree showed a wide range of divergence times between alpine species and their sampled sisters, from 2 million years ago (Mya) to 31.81 Mya (Figure 1A). We then defined a convergent change in gene family number as a gene family showing significant expansion or contraction in more than three alpine species. In total, 56 convergently expanded (CoEx) gene families were identified. Biological Process (BP) Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses of CoEx families found 68 significantly enriched GO terms and seven KEGG pathways (P-adjust < 0.05) (Tables S13, S14). Enriched pathways of CoEx gene families were mainly related to abiotic resistance, such as response to hypoxia, regulation of hormone levels and hormone transport (Figure 1C, D; Tables S13, S14). Intriguingly, we found a greater number of convergently contracted (CoCo) families than CoEx families, with 1,193 gene families convergently contracted, involving 390 significantly enriched GO terms and 27 KEGG pathways (P-adjust < 0.05) (Tables S15, S16). Most of these pathways included genes involved in the response to biotic stresses, such as defense against pathogens and toxicants (Tables S15, S16).
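The convergence criterion used above (significant expansion or contraction in more than three alpine species) can be sketched as a simple tally. The data layout, in which each species maps gene family IDs to "+" or "-" for a significant expansion or contraction, as well as all names below, are assumptions made for illustration; the upstream significance calls are taken as given.

```python
from collections import Counter


def convergent_families(calls, alpine_species, min_species=4):
    """Return (CoEx, CoCo) family lists from per-species significance calls.

    calls: dict species -> dict family_id -> "+" (expanded) or "-" (contracted).
    min_species=4 encodes "more than three alpine species".
    """
    expanded, contracted = Counter(), Counter()
    for species in alpine_species:
        for family, direction in calls.get(species, {}).items():
            if direction == "+":
                expanded[family] += 1
            elif direction == "-":
                contracted[family] += 1
    coex = sorted(f for f, n in expanded.items() if n >= min_species)
    coco = sorted(f for f, n in contracted.items() if n >= min_species)
    return coex, coco


# Hypothetical calls for four alpine species and one lowland species.
toy_calls = {
    "alpine_a": {"F1": "+", "F2": "-"},
    "alpine_b": {"F1": "+", "F2": "-"},
    "alpine_c": {"F1": "+", "F2": "-"},
    "alpine_d": {"F1": "+", "F3": "-"},
    "lowland_x": {"F2": "-"},
}
toy_alpine = ["alpine_a", "alpine_b", "alpine_c", "alpine_d"]
toy_coex, toy_coco = convergent_families(toy_calls, toy_alpine)
```

In this toy input F1 is expanded in four alpine species and is reported as CoEx, whereas F2 is contracted in only three and is not reported as CoCo; lowland species are ignored by construction.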
A unique stress at high altitudes is hypobaric hypoxia (Beall, 2014). In plants, an oxygen deficiency dramatically reduces the efficiency of cellular ATP production, which has diverse ramifications for cellular metabolism and developmental processes (Fukao and Bailey-Serres, 2004). Various oxygen-sensing mechanisms have been described that are thought to trigger plant responses to low-level oxygen and thus adaptation to high altitudes (Abbas et al., 2022). In our analysis, we found that many significantly enriched GO terms of CoEx families were functionally related to the response to oxygen levels, such as response to hypoxia, response to decreased oxygen level and response to hydrogen peroxide (Figure 1C). These pathways included genes encoding alcohol dehydrogenase (ADH) (Peng et al., 2001), and the HSP20-like chaperones superfamily proteins that are functionally enriched in the GO term “Cellular response to hypoxia.” In A. thaliana, the oxygen-sensing system is mediated by the plant cysteine oxidase (PCO) N-degron pathway substrates group VII ethylene response factors (ERF-VIIs) (Licausi et al., 2011), which are involved in modulating the ethylene response and activating the expression of ADH1 (Yang et al., 2011). While we did not discover the convergent expansion of ERF-VIIs in alpine plants, we found genes significantly enriched in GO terms related to response to ethylene, such as ADH1, ERF1 (ETHYLENE RESPONSE FACTOR 1) and EER1 (ENHANCED ETHYLENE RESPONSE 1). Nonetheless, the expansion of genes involved in the response to hypoxia is necessary for the adaptation of alpine plants to low-level oxygen in high-altitude environments.
In addition, multiple CoEx families were found to be significantly enriched in plant hormone pathways. Examples include genes encoding the probable indole-3-pyruvate monooxygenases (YUC) involved in auxin biosynthesis (Cao et al., 2019), and the small auxin upregulated RNAs (SAUR) and the D6 protein kinase (D6PK) involved in auxin polar transport (van Berkel et al., 2013). Auxin regulates a series of developmental processes such as apical dominance, plant organogenesis, and reproductive development by affecting cell growth, differentiation, and patterning (Mockaitis and Estelle, 2008). Other developmental regulation pathways including leaf senescence, phototropism and plant organ senescence were also detected to be significantly enriched in CoEx families. This result, coupled with the commonly observed morphological divergence between alpine plants and lowland relatives (Sun et al., 2014), indicates that morphological regulation can be a pivotal path for plant adaptation to high-altitude environments.

Table 1. Statistics of genome assembly and annotation of Saussurea obvallata and Rheum alexandrae
Statistic                 S. obvallata      R. alexandrae
Total length (bp)         2,044,030,733     2,039,881,226
Number of contigs         145               129
Largest contig (bp)       126,457,859       160,815,976
Anchored length (bp)      1,951,503,694     –
GC (%)                    37.94             41.41
Contig N50 (bp)           36,958,263        36,323,674
Complete BUSCOs (%)       94.6              96.6
Repeat content (%)        81.88             81.65
Number of genes           37,938            61,463
BUSCO, Benchmarking Universal Single-Copy Orthologs. GC (%), the percent of guanine and cytosine.
Convergent contractions of disease-resistance genes
Several CoCo families in alpine plants were found to be functionally related to response to pathogens or toxicants, involving GO terms of response to toxic substances and oomycetes, xenobiotic transport, and detoxification (Tables S15, S16). Related gene families include the cysteine-rich receptor-like kinases (CRKs), a large subfamily of receptor-like protein kinases (RLKs) that play vital roles in defense responses and programmed cell death in plants (Chen et al., 2004), the malectin-like receptor kinases (MLRs) orchestrating the plant immune responses and the accommodation of fungal and bacterial symbionts (Ortiz-Morea et al., 2022), the multidrug and toxic compound extrusion (MATE) proteins involved in xenobiotic detoxification and multidrug resistance (Diener et al., 2001), as well as the ubiquitin-conjugating (E2) enzymes emerging in recent years as an important regulatory factor underlying plant innate immunity (Zhou et al., 2017). Moreover, the results also showed that genes encoding receptors for tyrosine-sulfated glycopeptide (PSY1Rs) and phytosulfokine (PSKRs), belonging to the leucine-rich repeat receptor kinases (LRR-RKs), have been undergoing contraction in all alpine species. PSKR1 and PSY1R have been shown to be involved in plant immunity, with antagonistic effects on bacterial and fungal resistances (Mosher et al., 2013).

Figure 1. Evolutionary history of alpine plants
(A) Chronogram showing divergence times among alpine plants (cyan background) with their lowland relatives (orange background) with node age and 95% confidence intervals (blue bars). The red and blue numbers above the branches represent significant expansion and contraction events, respectively. (B) Bar plot showing gene numbers identified by OrthoFinder, including unassigned genes which were not put into an orthogroup with any other genes, genes in orthogroups, and genes in species-specific orthogroups which consist entirely of genes from one species. (C) Significantly enriched Gene Ontology (GO) terms for convergently expanded (CoEx) gene families in alpine plant genomes. Each bubble represents a summarized GO term from the full GO list by reducing functional redundancies, and their closeness on the plot reflects their closeness in the GO graph, that is, the semantic similarity. (D) Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of CoEx gene families in alpine plant genomes. P-adjust refers to the adjusted P-value using the Benjamini–Hochberg method.
The largest class of disease-resistance genes comprises genes encoding nucleotide-binding site and leucine-rich-repeat domain receptors (NBS-LRRs). Three NBS-LRR gene subclasses, TIR-NBS-LRR (TNL), CC-NBS-LRR (CNL), and RPW8-NBS-LRR (RNL), have been characterized based on the N-terminal domains (McHale et al., 2006). We manually annotated NBS-LRR genes in the sampled genomes using the HMMER search with the Pfam database (Wheeler and Eddy, 2013). In total 5,655 NBS-LRR genes, including 4,058 CNLs, 1,024 TNLs and 173 RNLs were identified among all analyzed genomes (Figure S6). Additionally, we inferred the NBS-LRR gene tree to examine the gains and losses of NBS-LRR genes in alpine species. The results showed that, compared with lowland relatives, most alpine species tend to lose more NBS-LRR genes and exhibit a reduced copy number, while E. heterophyllum, S. obvallata and R. alexandrae have similar numbers compared with their closest relatives (Figure S7). In addition, genes functioning in cellular transport pathways, including cytoskeleton organization, actin filament-based process, export across the plasma membrane and export from the cell, were shown to be contracted in alpine plants (Figure 1D; Table S15). These processes may be a component of the plant immune system and possibly have undergone simplification due to the pathogen-depauperate environments of high altitudes.
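The subclass assignment described above depends on which N-terminal domain accompanies the nucleotide-binding domain. The sketch below illustrates that decision logic only. It assumes that domain hits have already been collected per protein (for example from an HMMER search against Pfam), that the relevant Pfam family names are NB-ARC, TIR and RPW8, and that coiled-coil evidence is supplied separately as a boolean. The exact profiles, thresholds and manual curation steps used in the study are not stated in this section, so every name here is a hypothetical stand-in.

```python
def classify_nbs_lrr(domains, has_coiled_coil=False):
    """Assign an NBS-LRR subclass from a protein's domain names.

    domains: iterable of domain names found in one protein (assumed Pfam names).
    has_coiled_coil: separate coiled-coil evidence (assumption of this sketch).
    Returns "TNL", "RNL", "CNL", "NBS-other", or None when no NBS domain is found.
    """
    names = set(domains)
    if "NB-ARC" not in names:
        return None  # not an NBS-containing gene
    if "TIR" in names:
        return "TNL"
    if "RPW8" in names:
        return "RNL"
    if has_coiled_coil:
        return "CNL"
    return "NBS-other"


toy_classes = [
    classify_nbs_lrr(["TIR", "NB-ARC", "LRR_8"]),
    classify_nbs_lrr(["RPW8", "NB-ARC"]),
    classify_nbs_lrr(["NB-ARC", "LRR_1"], has_coiled_coil=True),
    classify_nbs_lrr(["LRR_1"]),
]
```

This sketch does not require an LRR domain for a subclass call, which is a simplification; a stricter pipeline might treat truncated genes separately.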
A similar phenomenon was also described in the case study of the C. himalaica genome, in which the most significantly contracted gene families were functionally enriched for disease and immune response pathways (Zhang et al., 2019). Given the harsh environments characterized by freezing temperatures, aridity, and high UV radiation, a reasonable hypothesis is that gene families involved in pathogen or toxicant defense have undergone contraction in alpine plants because fewer microorganisms exist there. These results suggest that the contraction of immune system genes might be an energy-saving strategy to mitigate genetic loads for surviving in hostile environments with few pathogens present. Conversely, the contraction of disease-resistance genes also implies that alpine plants may not adapt to the more benign environments of lowland areas, where the usual range of pathogens is present. Therefore, with a faster pace of global warming leading to the loss or destruction of alpine habitats, in situ conservation of alpine biodiversity is necessary and preferable to ex situ preservation.
Tests for convergent positive selection
In harsh environments, positive selection is expected to be common in genes controlling early life history stages, such as genes involved in reproduction and development (Cui et al., 2019). Our branch-site tests identified 36 convergently selected genes that show signatures of positive selection in more than three alpine species (Table S17). These genes were functionally related to basic life processes involving reproduction and respiration, such as carpel, gynoecium, ovule and endosperm development and photorespiration pathways (Tables S18, S19). Examples include the SEPALLATA (SEP) MADS-box genes required in floral organ and meristem identity (Pelaz et al., 2000), the MALE MEIOCYTE DEATH 1 (MMD1) gene regulating cell-cycle transitions during male meiosis (Yang et al., 2003), the NIJMEGEN BREAKAGE SYNDROME 1 (NBS1) gene involved in double-stranded break repair, DNA recombination and maintenance of telomere integrity in the early stages of meiosis (Zhang et al., 2006), and the HYDROXYPYRUVATE REDUCTASE (HPR) gene localized in leaf peroxisomes functioning in the glycolate pathway of photorespiration (Mano et al., 1999). In addition, glyoxylate and dicarboxylate metabolism, the most significantly enriched KEGG pathway (Table S19), is a fundamental biochemical process that ensures a constant supply of energy to living cells. These convergently selected genes detected in our analyses are likely to contribute to the primary adaptation of alpine plants to similar extreme environments.
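Branch-site tests are conventionally evaluated with a likelihood ratio test (LRT) that compares a null model with an alternative model allowing positive selection on the foreground branch. The sketch below shows that generic calculation together with the "more than three alpine species" tally. It assumes one degree of freedom and a chi-square reference distribution, which is a common convention rather than a detail given in this section; the function names and input layout are hypothetical, and no multiple-testing correction is shown.

```python
import math


def branch_site_lrt(lnl_null, lnl_alt):
    """Return (LRT statistic, P-value) for nested models with df = 1.

    For one degree of freedom the chi-square survival function equals
    erfc(sqrt(x / 2)), so only the standard library is needed.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def convergently_selected(p_values, alpha=0.05, min_species=4):
    """Genes significant in at least min_species alpine species.

    p_values: dict gene -> dict alpine species -> P-value.
    """
    return sorted(
        gene
        for gene, per_species in p_values.items()
        if sum(p < alpha for p in per_species.values()) >= min_species
    )


toy_stat, toy_p = branch_site_lrt(-1000.0, -998.0792705)
toy_genes = convergently_selected({
    "geneA": {"sp1": 0.01, "sp2": 0.02, "sp3": 0.03, "sp4": 0.04, "sp5": 0.50},
    "geneB": {"sp1": 0.01, "sp2": 0.02, "sp3": 0.03, "sp4": 0.20, "sp5": 0.50},
})
```

In the toy call the log-likelihood difference is chosen so that the statistic sits at the familiar 3.84 threshold, giving a P-value close to 0.05.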
Detection of molecular convergence
We investigated signatures of genes undergoing molecular convergence among alpine species using a combination of approaches for the detection of convergent evolutionary rate shifts and site-based estimation of convergent amino acid (AA) evolution. These approaches have been commonly used to investigate the genomic signatures of convergent evolution, including convergent adaptation to seasonal habitat desiccation in African killifishes (Cui et al., 2019), convergent regulatory evolution and loss of flight in paleognathous birds (Sackton et al., 2019), and convergent evolution of extreme lifespan in Pacific Ocean rockfishes (Kolora et al., 2021). Convergent shifts in evolutionary rates were detected using the RERconverge method (Kowalczyk et al., 2019), which estimates the correlation between relative evolutionary rates (RERs) of protein sequences and the evolution of a convergent binary or continuous trait across a phylogeny. Our analysis focused on positive correlations, representing genes
with faster evolutionary rates in alpine species relative to species living in low elevations. An increased RER could arise due to the relaxation of a constraint or positive selection, which could be adaptive to habitat-related changes (Kowalczyk et al., 2020). We identified 69 gene families undergoing convergently accelerated RERs in alpine species, significantly enriched in 93 GO terms and 12 KEGG pathways (P-adjust < 0.05) (Tables S20, S21), involving pathways for self-incompatibility, cell wall modification, DNA repair and stress resistance (Figure 2). Among them, 26 gene families were found to have convergent AA shifts by a PCOC analysis (Figure S8). The PCOC method considers shifts in AA preference instead of identical substitutions (Rey et al., 2018). Given the relatively large phylogenetic distances among our analyzed species, selecting only sites that converged to the exact same AA in all species is quite strict and is bound to capture only a subset of the substitutions associated with the convergent trait change. Below, we describe three main aspects of BPs that have undergone molecular convergence contributing to the adaptation to high altitudes.
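To make the idea of a relative evolutionary rate concrete, the following simplified sketch computes, for one gene, the residual of each branch length after regressing on genome-wide average branch lengths, and then asks whether alpine branches tend to have larger residuals than lowland branches. It is a conceptual illustration under strong simplifying assumptions (unweighted regression through the origin, no transformation, no P-value) and is not the RERconverge implementation; all function names and the toy numbers are hypothetical.

```python
def relative_rates(gene_branches, mean_branches):
    """Residuals of gene branch lengths regressed on average branch lengths."""
    sxx = sum(x * x for x in mean_branches)
    sxy = sum(x * y for x, y in zip(mean_branches, gene_branches))
    slope = sxy / sxx  # least-squares slope for a line through the origin
    return [y - slope * x for x, y in zip(mean_branches, gene_branches)]


def mann_whitney_u(foreground, background):
    """U statistic: number of (foreground, background) pairs won by foreground."""
    u = 0.0
    for f in foreground:
        for b in background:
            if f > b:
                u += 1.0
            elif f == b:
                u += 0.5  # ties count as half a win
    return u


def alpine_acceleration(gene_branches, mean_branches, is_alpine):
    """Fraction of alpine/lowland branch pairs in which the alpine RER is larger."""
    rers = relative_rates(gene_branches, mean_branches)
    fg = [r for r, alpine in zip(rers, is_alpine) if alpine]
    bg = [r for r, alpine in zip(rers, is_alpine) if not alpine]
    return mann_whitney_u(fg, bg) / (len(fg) * len(bg))


# Toy gene: two alpine branches evolve twice as fast as two lowland branches.
toy_score = alpine_acceleration([2, 2, 1, 1], [1, 1, 1, 1], [True, True, False, False])
```

A score near 1 indicates consistently faster evolution on alpine branches, while 0.5 indicates no difference; a rank-based test on the same residuals would then supply significance.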
Self-incompatibility (SI) system
Self-incompatibility in many flowering plants is controlled by the S (sterility) locus (Takayama and Isogai, 2005). The loss of SI genes in A. thaliana is responsible for the evolutionary transition to the self-fertile mating system (Sherman-Broyles et al., 2007). Although self-fertilization is often thought to lead to the decreased fitness of homozygous offspring, this mode ensures reproduction in the absence of pollinators or suitable mates, and therefore can be advantageous for plants to occupy niches in harsh environments (Goodwillie et al., 2005). Our results revealed that the largest orthogroup (OG0000000), which includes the Arabidopsis S-receptor kinase (SRK) genes and the S-locus flanking gene ARK3 (RECEPTOR KINASE 3), has undergone evolutionary rate acceleration and convergent AA shifts in alpine species (Figures 3, S9). Gene Ontology analysis showed that these genes were functionally enriched for the process of reproduction (recognition of pollen and pollination) and the immune system (immune response). Furthermore, we identified the functional domain using NCBI's conserved domain database (CDD) (Lu et al., 2020). The result showed that these proteins contained the S-locus glycoprotein domain (Pfam00954), confirming their functions in the SI system. Two sites located on the S-locus glycoprotein domain were found to have undergone a convergent AA shift (Figure S9).
Figure 2. Function enrichment of gene families undergoing convergently evolutionary rate acceleration in alpine plant genomes
(A) Significantly enriched Gene Ontology (GO) terms. Each bubble represents a summarized GO term from the full GO list by reducing functional redundancies, and their closeness on the plot reflects their closeness in the GO graph, that is, the semantic similarity. (B) Top 20 enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. P-adjust refers to the adjusted P-value using the Benjamini–Hochberg method.

Loss of function at the S-locus in alpine Crucihimalaya genomes, possibly due to relaxed selection, has been reported (Zhang et al., 2019; Feng et al., 2022), with a similar phenomenon found in the high-altitude Andes maca (Lepidium meyenii) genome (Zhang et al., 2016). Our branch-site tests did not detect any signatures of positive selection on these genes, suggesting that the acceleration of evolutionary rates may be the result of relaxed constraints. Relaxation of selection on the S-locus genes was tested using the RELAX model (Wertheim et al., 2015), which estimates the relaxation intensity parameter k, with k > 1 indicating intensified selection (i.e., positive or purifying selection) and k < 1 indicating relaxed selection. The results showed that five (C. himalaica, E. heterophyllum, H. vulgare var. nudum, S. brachista and S. obvallata) of the seven alpine plants exhibited significantly relaxed selection on S-locus genes (P-value < 0.05; Table S22). Bingham and Ort (1998) reported that low levels of insect diversity, abundance and activity often occurred in alpine ecosystems and hypothesized that these factors might limit the pollination of alpine plants. Species living in isolated habitats like alpine environments or ocean islands are thus less likely to be SI, in line with “Baker's Law,” which assumes that pollen limitation may be an important force driving the transition of mating systems (Cheptou, 2012). Pollination biology studies have shown several cases of autonomous selfing in various taxonomically distant species within HHM communities, although the proportion of self-pollinated species has not yet been calculated to test the hypothesis (Sun et al., 2014). The convergent acceleration of evolutionary rates of S-locus genes due to
Figure 3. Four representative cases of alpine-accelerated genes
Signatures of convergent evolutionary rate shifts were detected using the RERconverge method (P < 0.05, Wilcoxon rank-sum test). Each panel represents
the estimated relative evolutionary rates (RERs) for a gene in alpine species (points in cyan labeled with species names) as well as in their lowland relatives
(points in orange labeled with species names) and ancestral branches (plotted in a single row at the base of the y-axis). A gene's RER for a given branch
represents how quickly or slowly the gene is evolving on that branch relative to its overall rate of evolution throughout the tree.
relaxed selection provides convincing genetic evidence of the
evolutionary transition from SI to the self-compatibility mating
system of alpine plants, which is potentially a convergently
adaptive process for alpine plants to facilitate their
reproduction and the occupation of alpine niches.
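The way the RELAX results above are read (k relative to 1, together with a P-value threshold of 0.05) can be sketched as a small decision rule. This is only an illustration of the interpretation, not the RELAX implementation; the function name `classify_relax` and the example values are hypothetical and are not taken from Table S22.

```python
def classify_relax(k: float, p_value: float, alpha: float = 0.05) -> str:
    """Interpret a RELAX result for one foreground species.

    k       : estimated selection intensity parameter
    p_value : P-value of the test comparing k = 1 with a freely estimated k
    """
    if p_value >= alpha or k == 1:
        return "no significant change"
    return "intensified" if k > 1 else "relaxed"


# Hypothetical values for illustration only.
examples = {
    "species_A": (0.42, 0.003),
    "species_B": (1.80, 0.010),
    "species_C": (0.90, 0.300),
}
calls = {name: classify_relax(k, p) for name, (k, p) in examples.items()}
```

With these made-up inputs, only the first species would be reported as being under relaxed selection.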
Cell wall modification
The cuticle membrane lies over and merges into the outer
wall of epidermal cells (Martin and Juniper, 1970). The primary
role of the cuticle, composed of cutin and cuticular
waxes, is to mitigate water loss and excessive UV radiation
by functioning as a physical barrier between the plant surface
and its external environment (Kerstiens, 1996). Cuticular
waxes are composed of a variety of organic solvent-soluble
lipids, consisting of very-long-chain (VLC) fatty acids and
their derivatives, as well as secondary metabolites like
flavonoids (Pollard et al., 2008). The cell wall modification
pathway was found to be significantly enriched in genes
undergoing convergent positive selection (Table S18).
Additionally, in the examination of convergent changes in gene
family number, we found the most significantly enriched
KEGG pathway of co-expanded genes to be cutin, suberine
and wax biosynthesis (Table S14), corresponding to the
enriched suberin biosynthetic process in the GO analysis (Table
S13), including genes encoding fatty acyl-CoA reductases
(FARs). Fatty acyl-CoA reductases catalyze the formation of
fatty alcohols, which are common components of plant surface
lipids (i.e., cutin, suberin, and associated waxes). We did
not detect convergent acceleration of the evolutionary rate of
FARs, suggesting possible modifications of FARs through the
increase in gene copy number. The results show that an
aldehyde decarbonylase enzyme, CER1, underwent a convergent
acceleration of the evolutionary rate in alpine plants.
Overexpression of the A. thaliana CER1 gene was reported to
promote wax VLC alkane biosynthesis and to influence plant
responses to biotic and abiotic stresses (Bourdenx et al.,
2011). The convergent evolution of genes involved in
cutin, suberine and wax biosynthesis implies that cuticular
waxes may function as protective screens against UV radiation
to protect anatomical structures and dissipate excess
light energy for alpine plants.
In addition, we detected positive selection and molecular
convergence of three MYB transcription factors, including
MYB27, MYB48, and MYB59 (Table S17). Among them,
MYB27 was reported to play a role in regulating the accumulation
of anthocyanins (Albert et al., 2014), a class of flavonoids.
In the STRING database (Szklarczyk et al., 2021), MYB48 was
predicted to interact with proteins that are involved in flavonoid
biosynthesis, including F3H (naringenin,2-oxoglutarate
3-dioxygenase), catalyzing the 3-β-hydroxylation of 2S-flavanones
to 2R,3R-dihydroflavonols, which are intermediates
in the biosynthesis of flavonols and anthocyanidins; FLS1
(flavonol synthase/flavanone 3-hydroxylase), catalyzing the
formation of flavonols from dihydroflavonols; DFR
(dihydroflavonol reductase), catalyzing the conversion of
dihydroquercetin to leucocyanidin; and TT5, a member of the
chalcone-flavanone isomerase protein family (Figure S10;
Table S23). Flavonoids function as antioxidants that reduce
DNA damage induced by abiotic stresses such as extreme
temperatures, UV radiation and drought, and thus play critical
roles in species adapting to high-altitude environments (Agati
et al., 2012). Many whole-genome studies of alpine
plants have shown that expansion and/or positive selection of
genes involved in flavonoid biosynthesis constitute an important
part of the genomic footprint of alpine adaptation (Zeng
et al., 2015; Chen et al., 2019; Wang et al., 2021). Taken together,
modifications of the cell wall in alpine plants through
evolutionary expansion and adaptive convergence of genes
involved in the biosynthesis of cuticular waxes and flavonoids
might be vital strategies for adaptation to dramatic weather
changes and intense UV radiation at high altitudes.
With the newly generated transcriptomic data of S. obvallata,
we were able to investigate expression patterns of
genes related to the biosynthesis of cuticular waxes and
flavonoids. Five tissues with three biological replicates,
including three from leaves (basal leaves [JL], middle leaves
[ML], and bract leaves [BL]) and two from flowers and stems,
were sampled (Figure 4A; Table S3). After mapping the RNA-seq
data to the assembled genome of S. obvallata, 25,096
genes had expression profiles and were retained for differential
gene expression (DEG) analysis. The results show that,
compared with JL and ML, BL exhibit 1,071 significantly
upregulated genes and 798 significantly downregulated genes
(Figure S11). Upregulated genes were significantly enriched
in the cytochrome P450; cutin, suberine and wax biosynthesis;
and isoflavonoid biosynthesis KEGG pathways (Figure 4B),
and downregulated genes were functionally related to
photosynthesis, energy metabolism as well as plant–pathogen
interaction pathways (Figure S12). Furthermore, we analyzed
expression profiles of genes involved in the biosynthesis of
cuticular waxes and flavonoids. The results showed that
many genes had higher expression levels in BL than in other
leaf tissues, for example, CER1, CER3, CER4, and MAH1 in
the cuticular wax biosynthetic pathway (Figure 4C), and 4CL,
CHS, CHI, F3H, TT7 and OMT in the flavonoid biosynthetic
pathway (Figure 4D). These results suggest that the accumulation
of cuticular waxes and flavonoids is an important
genetic pathway in the transition from normal leaves to leafy
bracts. Our findings provide new insights into the genetic basis of the
specialized "glasshouse" morphology for a better understanding
of plant morphological adaptation.
DNA repair and stress resistance pathways
The hypoxia and intense UV radiation in alpine environments
exert high abiotic stress that can cause DNA, RNA, and protein
damage. DNA repair processes thus play an important role in
the high-altitude adaptation of plants, similar to evidence in
alpine animals (Li et al., 2018). The NBS1 gene, involved in DNA
repair, cellular response to DNA damage stimulus and
double-stranded break repair, was found to be under convergent
selection in alpine genomes (Table S17). Moreover, several
significantly enriched GO terms related to DNA repair and protein
ubiquitination were detected undergoing convergent evolution
in alpine plants (Figure 2A; Table S17). Similar KEGG pathways
were also identified, including mismatch repair, homologous
recombination and nucleotide excision repair (Figure 2B; Table
S18). Examples include the UEV1 genes, enriched in the protein
K63-linked ubiquitination pathway, which reportedly play a role in
DNA damage responses and error-free post-replicative DNA
repair by participating in lysine-63-based polyubiquitination
reactions (Wen et al., 2008), and genes encoding OB-fold
proteins, which were found to be important for genomic stability
including DNA replication, recombination, repair, and telomere
homeostasis (Flynn and Zou, 2010).
In addition to low oxygen levels and excessive UV radiation,
high-altitude environments pose various threats to living
organisms from an unpredictable climate. We found that
many genes with convergent evolutionary rate shifts were
significantly enriched in stress resistance pathways, such as
genes involved in hormone signal transduction (cell recognition,
intracellular signal transduction, and cytokinin-activated
signaling pathway), the cell rhythm system (circadian
rhythm, cell-cycle phase transition, and programmed
cell death), and the regulation of enzyme activity pathways
(regulation of kinase activity, regulation of GTPase activity,
and regulation of transferase activity) (Figure 2A; Table S17).
Also, some KEGG pathways were found to be significantly
enriched in metabolic pathways that may contribute to stress
resistance, such as plant hormone signal transduction,
phenylpropanoid biosynthesis, and glycine, serine, and threonine
Figure 4. Highly expressed genes in bract leaves of Saussurea obvallata
(A) Sampling information of S. obvallata transcriptome data. Five tissues, including three from leaves (basal leaves [JL], middle leaves [ML] and bract leaves
[BL]) as well as two from flowers (F) and stems (S), were sampled. (B) Top 20 KEGG pathways of significantly upregulated genes in bract leaves. "padj"
refers to the adjusted P-value using the Benjamini–Hochberg method. Expression profiles of genes involved in cuticular wax (C) and flavonoid biosynthetic
pathways (D) in different tissues of S. obvallata. Highly expressed genes in BL are shown in red in the simplified pathway models. The bar represents the
gene expression level of each gene (z-score) (4CL, 4-coumarate: CoA ligase; CER, protein eceriferum; CHI, chalcone isomerase; CHS, chalcone synthase;
DFR, dihydroflavonol 4-reductase; F3H, flavanone 3-hydroxylase; F3′H, flavonoid 3′-hydroxylase; FA-CoA, fatty acyl-coenzyme A; FAR, fatty acyl-CoA
reductase; FLS, flavonol synthase; LDOX/ANS, leucoanthocyanidin dioxygenase/anthocyanidin synthase; MAH1, mid-chain alkane hydroxylase 1; OMT,
O-methyltransferase; TT7, transparent testa 7; VLC, very-long-chain; WSD1, diacylglycerol acyltransferase 1).
metabolism (Figure 2B; Table S18). Plant hormones like
cytokinin and ethylene play pivotal regulatory roles in plant
growth and development, including cell division, shoot
initiation, light responses, and leaf senescence. We found that
the type-A Arabidopsis response regulator (ARR) protein
family, involved in response to cytokinin and cytokinin signal
transduction, undergoes molecular convergence in alpine
plants. An experimental study revealed that arr mutants show
altered red-light sensitivity, indicating an important role of
type-A ARRs in light adaptation (To et al., 2004). The circadian
clock was selected as a mechanism in the control of
cell-cycle progression to avoid sunlight-induced DNA
damage in ancient unicellular organisms (Hut and Beersma,
2011). Circadian rhythms in plants regulate multiple processes
such as photosynthesis, flowering, seed germination,
and senescence (Srivastava et al., 2019). Genes involved in
the regulation of the circadian rhythm were found to have
convergently accelerated evolutionary rates in alpine plants,
such as the transcription factor MYB59, which participates in
the regulation of the cell cycle, mitosis and root growth by
controlling the duration of metaphase. The myb59 mutant
was found to have longer roots, smaller leaves and smaller
cells than wild-type plants (Fasani et al., 2019), traits that are
commonly observed in alpine plants. Taken together, these
results suggest that adaptation to high altitude requires the
participation of multiple biological processes (BPs).
Summary and perspectives
Taking advantage of an integrative genomic data set of alpine
plants, our study unraveled the convergent genetic changes
that confer high-altitude adaptation (Figure 5). Both molecular
convergence in changes of gene copy number and accelerated
evolutionary rates are consequences of selective
pressures posed by the surrounding environment. Thus,
genomic signatures of convergent evolution detected here are
direct evidence of alpine plants independently colonizing,
evolving in and adapting to the extreme
cold, high UV radiation and hypoxia environments of the
HHMs. The alpine plants included in this study belong to
taxonomically distant plant families, hence standing genetic
variation and localized introgression of regions of the genome
can be ruled out as probable causes of genomic
Figure 5. Summary of convergent adaptation to high-altitude environments for seven alpine species
The outer circle shows examples of enriched Gene Ontology (GO) terms (biological pathways). The middle circle shows examples of candidate genes
named after the A. thaliana orthologs. The inner gray circle shows environmental stresses from high altitudes. Species names are provided below the
picture. *Genes having undergone convergent contraction.
convergence. Identical de novo mutations must therefore
have occurred independently in each taxon during evolution
from a low-altitude ancestor. In addition to molecular
convergence, some results that do not cover all alpine plants
may imply the existence of species-specific adaptation
strategies, which provide the foundation for further
investigations into high-altitude adaptation. Notably, our study is
limited by a lack of functional validation of the identified
molecular changes, since transgenic experiments are not yet
realistic for most alpine plants. Nonetheless, verifying the role
of MYB transcription factors in the development of the
specialized "glasshouse" morphology by quantitative real-time
PCR is feasible in future studies. Collectively, our results
generated novel insights into genomic bases of adaptation to
extreme environments at high taxonomic levels of
angiosperms, while providing valuable genomic resources for
plants living in specialized habitats. Further genomic study of
high-altitude adaptation should provide detailed evidence
related to morphological and physiological specializations
using updated data, such as proteome and metabolome,
while also referring to discoveries from phylogenetically
distant taxa to evaluate the evolutionary convergence in similar
environments.
MATERIALS AND METHODS
Plant material, DNA extraction, and sequencing
Fresh leaves of wild S. obvallata and R. alexandrae individuals
were collected from Songpan county, Sichuan Province, China
(102°45′E, 30°23′N) and Linzhi county, Tibet Province, China
(102°45′E, 30°23′N), respectively. All samples were sent to
Wuhan Benagen Technology Company Limited (Wuhan,
China) for genome sequencing. Total genomic DNA was
extracted using the Qiagen DNeasy Plant Mini Kit. For the
Illumina short reads, DNA libraries with 500-bp insert sizes were
constructed and sequenced using an Illumina HiSeq 4000
platform (Illumina Inc., USA). For Oxford Nanopore sequencing,
libraries were prepared using the SQK-LSK109 ligation kit using
the standard protocol, and the purified library was loaded onto
primed R9.4 Spot-On Flow Cells and sequenced using a
PromethION sequencer (Oxford Nanopore Technologies,
Oxford, UK). The Hi-C sequencing was performed as follows:
extracted DNA was first cross-linked by 40 mL of 2%
formaldehyde solution to capture interacting DNA segments, the
chromatin DNA was then digested with the DpnII restriction
enzyme, and libraries were constructed and sequenced using an
Illumina HiSeq 4000 instrument with 2 × 150 bp reads.
For RNA sequencing, fresh tissue samples were collected
and immediately frozen in liquid nitrogen. Three biological
replicates of five tissues of S. obvallata were sampled. Total
RNA was extracted using the TRIzol® Reagent (Invitrogen,
Shanghai, China). Paired-end cDNA libraries were constructed
using the TruSeq Stranded mRNA Library Prep Kit
(Illumina) and were sequenced using the Illumina HiSeq
4000 platform.
Genome assembly and quality control
Prior to genome assembly, Kmerfreq
(https://github.com/fanagislab/kmerfreq) was used for counting k-mer frequency
with the k-mer size set to 19, and GCE v1.0 was used for
estimating the genome size (Liu et al., 2013). The ONT long reads
were corrected and assembled using NextDenovo v2.3.1
(https://github.com/Nextomics/NextDenovo) with default
parameters. Assembled contigs were polished using NextPolish
v1.4.1 (Hu et al., 2020) with the long reads and Pilon v1.23
(Walker et al., 2014) with Illumina short reads for three rounds.
Hi-C data were used to infer chromosome conformation using
the 3D-DNA pipeline with default parameters. The accuracy of
the Hi-C-based chromosomal assembly was improved using
Juicebox's chromatin contact matrix (Dudchenko et al., 2017).
The completeness and continuity of the assemblies were
evaluated by the statistics of BUSCO and LAI, respectively.
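The idea behind the k-mer-based genome size estimate can be illustrated with a simplified sketch: count k-mers in the reads, take the most common k-mer depth as the main peak, and divide the total number of k-mers by that depth. This is the textbook approximation only, not the GCE algorithm (which also models sequencing errors and heterozygosity); the function names, the toy reads and the choice of k = 4 are assumptions made so that the example runs instantly, whereas the study used k = 19.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (forward strand only) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(kmer_counts, min_depth=2):
    """Estimate genome size as (total k-mers) / (peak k-mer depth).

    Depths below min_depth are ignored because, in real data,
    they are dominated by sequencing errors.
    """
    # Histogram: depth -> number of distinct k-mers observed at that depth.
    histogram = Counter(kmer_counts.values())
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Toy data: a 12-bp "genome" sequenced three times without errors.
toy_genome = "ACGTTGCAAGCT"
toy_reads = [toy_genome] * 3
toy_estimate = estimate_genome_size(count_kmers(toy_reads, k=4))
```

For the toy data the estimate equals the number of k-mer positions in the sequence (length minus k plus 1), which for a real genome is practically the genome length.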
Genome annotation
Repeat annotation
TEs were identified based on de novo and homology-based
strategies. RepeatMasker v4.0.7 was used to run a homology
search for known repeat sequences against the Repbase
database v22.11 (Tarailo-Graovac and Chen, 2009).
RepeatModeler v2.0.10 was used to predict the TEs based on
the de novo method (Flynn et al., 2020). LTRharvest v1.5.10,
LTR_FINDER v1.05 and LTR_retriever v1.8.0 were used to
build an LTR library with default parameters (Xu and Wang,
2007; Ellinghaus et al., 2008; Ou and Jiang, 2018). Finally,
RepeatMasker was used to merge the library files of the two
methods and to identify the repeat contents.
Gene prediction
A combination of de novo, homology, and transcript-based
methods was used for gene prediction in both genomes. RNA-seq
reads were assembled using Trinity v2.1.1 using the de
novo-based and genome-guided modes, respectively. Coding
DNA sequences (CDS) and protein sequences were predicted
with TransDecoder (http://transdecoder.github.io). Homologues
were predicted by mapping protein sequences using GeMoMa
v1.6.1 (Keilwagen et al., 2016). Sequences of Arabidopsis
thaliana, Oryza sativa and Solanum tuberosum were mapped to
both genomes. Additionally, sequences of Helianthus annuus,
Lactuca sativa, Cynara cardunculus and Mikania micrantha
were mapped to S. obvallata, and sequences of Fagopyrum
tataricum and Rumex hastatus were mapped to R. alexandrae.
A de novo gene prediction was performed with
Braker2 v2.1.5 and GlimmerHMM v3.0.4 (Majoros et al.,
2004; Bruna et al., 2021). Assembled transcripts were used for
training gene models in Braker2. Gene models from the three
main sources were merged to produce consensus models
using EVidenceModeler v1.1.1 (Haas et al., 2008).
Gene functional annotation
The annotated protein-coding genes were used for a BLAST
search against the UniProt and NCBI non-redundant protein
databases to predict gene functions. The functional domains
of protein sequences were identified by InterProScan
v5.51-85.0 using data from Pfam (Jones et al., 2014). Kyoto
Encyclopedia of Genes and Genomes and GO terms of the
gene models were obtained using eggNOG-mapper v2.0.1
(Cantalapiedra et al., 2021).
Orthogroup inference and alignment
Protein sequences from the seven alpine genomes, as well as
13 relatives living at low altitudes, were used for subsequent
comparative analyses. OrthoFinder v2.5.4 was used to construct
orthogroups for all species with default settings (Emms and
Kelly, 2019). Because the included species are phylogenetically
distant (widely dispersed across the eudicots and monocots),
OrthoFinder recovered an extremely low number of strictly
single-copy orthogroups. We therefore reduced orthogroups
that contained multiple gene copies per species to one copy per
species based on the smallest genetic distance following the
study of Birkeland et al. (2020). Briefly, all orthogroups were
aligned based on protein sequence using MAFFT v7.3 (Katoh
and Standley, 2013), and genetic distances between all pairs of
genes were calculated as Kimura protein distances (Kimura,
1980). One gene copy per species was retained based on the
smallest genetic distance to the longest protein sequence of A.
thaliana in each orthogroup. The rationale is that the annotation
of the A. thaliana genome is complete and reliable, and our
subsequent functional analyses of candidate genes were based
on the A. thaliana orthologs. The resulting orthogroup
sequences that contained only one protein sequence per species
were realigned using PRANK v170427 (Löytynoja, 2014). Coding
sequences of genes in orthogroups were extracted based on
the same gene identifier of protein sequences and were aligned
using PRANK with the codon mode.
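The copy-reduction step described above can be sketched as follows. The sketch assumes that "Kimura protein distance" refers to the commonly used empirical formula d = -ln(1 - p - 0.2p²), where p is the fraction of differing residues between two aligned sequences; the function names, the gap handling and the toy alignment are hypothetical and are not taken from the authors' scripts.

```python
import math


def kimura_protein_distance(seq_a, seq_b):
    """Kimura's empirical protein distance between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    d = -ln(1 - p - 0.2 * p**2), with p the fraction of differing residues.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return math.inf
    p = sum(a != b for a, b in pairs) / len(pairs)
    inner = 1.0 - p - 0.2 * p * p
    return -math.log(inner) if inner > 0 else math.inf


def pick_one_copy_per_species(aligned, reference_id):
    """Keep, for each species, the copy closest to the reference sequence.

    aligned maps gene_id -> (species, aligned_sequence).
    """
    ref_seq = aligned[reference_id][1]
    best = {}
    for gene_id, (species, seq) in aligned.items():
        d = kimura_protein_distance(ref_seq, seq)
        if species not in best or d < best[species][1]:
            best[species] = (gene_id, d)
    return {species: gene_id for species, (gene_id, _) in best.items()}


# Toy orthogroup: "At1" stands in for the longest A. thaliana protein.
toy_orthogroup = {
    "At1": ("A_thaliana", "MKTLLVAG"),
    "X1a": ("species_X", "MKTLLVAG"),
    "X1b": ("species_X", "MKSLIVAG"),
    "Y1a": ("species_Y", "MRTLLV-G"),
}
chosen = pick_one_copy_per_species(toy_orthogroup, "At1")
```

In the toy orthogroup, species_X has two copies and the one identical to the reference is retained.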
Phylogeny and estimation of divergence time
Protein sequences of 195 single-copy orthogroups were
used for phylogenetic inference with RAxML v8.2.12 using
the PROTGAMMAAUTO substitution model with 500 bootstrap
replicates (Stamatakis, 2014). Divergence time was
estimated in the MCMCtree program from PAML v4.9j (Yang,
2007), using the approximate likelihood method with the tree
topology inferred with RAxML, an independent substitution
rate (clock = 2), and the HKY85 + GAMMA model. Three
calibration points were assigned based on the TimeTree
database (Kumar et al., 2017) (http://www.timetree.org/,
accessed on May 1, 2022): the MRCA (most recent common
ancestor) of rosids (95% HPD 105–115 Mya), the MRCA of
Pentapetalae (95% HPD 110–124 Mya), and the MRCA of
Mesangiospermae (95% HPD 148–173 Mya). Samples were
drawn every 1,000 MCMC steps from a total of 10⁶ steps,
with a burn-in of 10⁵ steps. Convergence was assessed by
comparing parameter estimates from two independent runs,
with all effective sample sizes > 200.
Gene family evolution
Changes in gene family number were examined using CAFÉ5
(Mendes et al., 2020). In addition to the base model, the
number of gamma categories (k) was set to estimate separate
lambda values for different lineages in the tree (Gamma
model). The highest likelihood was found using k = 2 rate
categories (lnL = 286114), with λ = 0.00553 and α = 1.58.
Gene family expansions or contractions were identified only
when the change in gene count was significant with a
P-value < 0.05. Genome-wide NBS-LRR genes were manually
identified using HMMER v3.2 with an e-value of 1e-05 (Wheeler
and Eddy, 2013). The NBS-LRR protein domains (NB-ARC:
PF00931; RPW8: PF05659; TIR: PF01582; LRR: PF00560,
PF07723, PF07725 and PF12799) were retrieved from Pfam
(http://pfam.xfam.org) and were used to identify conserved
motifs of NBS-LRR genes in sampled genomes. The ML
phylogenetic tree of NBS-LRR was constructed using
RAxML. Then, Notung v2.9 was used to recover gains and
losses of NBS-LRR genes by reconciling the NBS-LRR gene
tree (Chen et al., 2000). The concatenated tree reconstructed
by RAxML was used as input topology.
Functional enrichment analysis
Gene Ontology and KEGG over-representation tests were
performed using clusterProfiler v4.3.4 implemented in R to
identify significantly enriched pathways (Wu et al., 2021).
Gene Ontology and KEGG terms were assigned according to
the orthologous genes of the A. thaliana genome. In the
"enrichGO" function, we set ont = "BP" to only search for
enriched BPs. The resulting P-values were corrected for
multiple comparisons using a Benjamini–Hochberg FDR
correction. A criterion of P-adjust < 0.05 was used to assess
the significance of enrichment analyses.
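Over-representation tests of this kind are commonly computed as a hypergeometric upper-tail probability, followed by a multiple-testing correction. The sketch below illustrates both steps in plain Python under that assumption; it is not the clusterProfiler code, and the function names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term.

    k : genes in the candidate list annotated to the term
    n : size of the candidate list
    K : genes in the background annotated to the term
    N : size of the background
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A term would then be called enriched when its adjusted P-value is below 0.05, matching the criterion stated above.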
Tests for positive selection and relaxation of selection
The branch-site model implemented in CodeML (PAML
package) was used to test for positive selection. In this
test, an alternative model allowing sites to be under positive
selection on the foreground branch was contrasted to a null
model limiting sites to evolve neutrally or under purifying
selection using a likelihood ratio test (LRT). LRT P-values
were computed based on a chi-squared distribution (df = 2)
and were corrected for multiple tests at a P-adjust < 0.05
threshold using a Benjamini–Hochberg FDR correction. A
gene showing a signature of positive selection in more than
three alpine species was identified as a convergently
selected gene. Additionally, RELAX was used to test for the
relaxation of selection on S-locus genes using the LRT by
comparing the model fixing k = 1 and the model allowing k to
be estimated (Wertheim et al., 2015). In both analyses, seven
tests were conducted separately by setting each alpine
species as the foreground branch.
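The likelihood ratio test and the "more than three alpine species" rule can be sketched in a few lines. For a chi-squared distribution with df = 2, the upper-tail probability has the closed form exp(-x/2), so no statistics library is needed. The function names and the log-likelihood values below are hypothetical, and the P-values passed to the second function are assumed to be already FDR-adjusted.

```python
import math


def lrt_p_value_df2(lnl_null, lnl_alt):
    """LRT P-value against a chi-squared distribution with df = 2.

    The test statistic is 2 * (lnL_alt - lnL_null); for df = 2 the
    chi-squared survival function is exp(-statistic / 2).
    """
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.exp(-statistic / 2.0)


def is_convergently_selected(adjusted_p_by_species, alpha=0.05, min_species=3):
    """True if the gene is significant in more than min_species species."""
    hits = [sp for sp, p in adjusted_p_by_species.items() if p < alpha]
    return len(hits) > min_species


# Hypothetical log-likelihoods for one gene and one foreground species.
p_example = lrt_p_value_df2(lnl_null=-1005.0, lnl_alt=-1000.0)
```

Here the test statistic is 10, giving a P-value of exp(-5), which is below 0.05 before any correction.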
Convergent evolutionary rate shifts
To perform the RERconverge analysis, we first used PAML to
estimate maximum-likelihood gene trees whose branch
lengths represent evolutionary rates using the number of AA
substitutions. RERs were calculated using the "getAllResiduals"
function with weight = T, scale = T and cutoff = 0.001
(Kowalczyk et al., 2019). We set alpine species as foreground
branches (branches of the tree with the trait of interest) and ran the
"foreground2Paths" function to estimate RERs for all genes
and to correlate them with trait evolution (i.e., alpine habitats in
our study). We then used the "correlateWithBinaryPhenotype"
function to test for a significant association between RERs and
traits across all branches of the tree with a P-value < 0.05.
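The association test for a single gene can be illustrated with a two-sided Wilcoxon rank-sum (Mann-Whitney U) test comparing RERs on foreground (alpine) branches with those on background branches, as named in the Figure 3 legend. This is a simplified stand-in written in plain Python, not the RERconverge code: it uses the normal approximation without tie or continuity correction, and the RER values are made up for illustration.

```python
import math


def rank_sum_test(foreground, background):
    """Two-sided Wilcoxon rank-sum test using the normal approximation.

    Ties receive average ranks; no tie or continuity correction is applied,
    so the P-value is approximate, especially for small samples.
    Returns (U statistic of the foreground group, P-value).
    """
    values = [(v, 0) for v in foreground] + [(v, 1) for v in background]
    values.sort(key=lambda item: item[0])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[j + 1][0] == values[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for pos in range(i, j + 1):
            ranks[pos] = average_rank
        i = j + 1
    n1, n2 = len(foreground), len(background)
    rank_sum_fg = sum(r for r, (_, group) in zip(ranks, values) if group == 0)
    u = rank_sum_fg - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_two_sided


# Hypothetical RERs for one gene: alpine branches vs. lowland branches.
alpine_rers = [0.8, 1.1, 0.9, 1.3]
lowland_rers = [-0.2, 0.1, -0.5, 0.0, -0.1, 0.3]
u_stat, p_val = rank_sum_test(alpine_rers, lowland_rers)
```

Because every alpine RER in the toy data exceeds every lowland RER, U takes its maximum value (the product of the two group sizes) and the approximate P-value falls below 0.05.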
Convergent AA evolution
Two models were compared in PCOC analysis: the convergent
model in which a site on convergent branches evolves
under a profile different from that of the non-convergent
branches, and the non-convergent (null) model in which a site
evolves under a single AA profile throughout the phylogeny
(Rey et al., 2018). PCOC then detected convergent sites by
identifying the better fit between the two models. To filter for
only sites with strong evidence for convergent profile shifts, we
set a posterior probability threshold of > 0.9 in the analysis.
Differential gene expression analysis
RNA-seq data for S. obvallata were mapped to the assembled
genome using HISAT2 v2.2.1 (Kim et al., 2019). Only
uniquely mapped paired-end reads were retained for read
counting by featureCounts v2.0.3 (Liao et al., 2014) to generate
the count and transcripts per kilobase million (TPM)
tables. DEG analyses among the five tissues were performed
in DESeq2 v1.36.0 (Love et al., 2014), with a P-value < 0.05
as a cutoff and a log2 fold change cutoff of 1.
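The TPM normalization and the cutoffs above can be sketched as follows. The TPM function uses the standard definition (reads per kilobase, rescaled so that values sum to one million). The classification function assumes that the fold-change cutoff applies to the absolute log2 fold change and is inclusive (|log2FC| ≥ 1), which the text does not state explicitly; all names and toy numbers are hypothetical, and the statistics themselves would come from DESeq2.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to TPM given gene lengths in base pairs."""
    # Reads per kilobase of transcript for each gene.
    rpk = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    scale = sum(rpk.values()) / 1e6
    return {g: (value / scale if scale > 0 else 0.0) for g, value in rpk.items()}


def classify_deg(log2_fold_change, p_value, p_cutoff=0.05, lfc_cutoff=1.0):
    """Label a gene as up, down or not significant using the stated cutoffs."""
    if p_value >= p_cutoff or abs(log2_fold_change) < lfc_cutoff:
        return "not significant"
    return "up" if log2_fold_change > 0 else "down"


# Toy example: two genes with equal coverage per kilobase.
toy_tpm = counts_to_tpm({"g1": 100, "g2": 300}, {"g1": 1000, "g2": 3000})
```

In the toy example both genes receive the same TPM because the longer gene's higher read count is offset by its length.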
Data availability statement
The genomic data generated and analyzed in this study, including
the raw sequencing data of Oxford Nanopore, Illumina, Hi-C and
RNA-seq, as well as the genome assemblies, have been deposited in
China National GeneBank DataBase (CNGBdb, https://db.cngb.org/)
under accession number CNP0003451. All the custom
scripts and specific command lines have been deposited at
GitHub (https://github.com/ZhangXu-CAS/Alpine_genome).
ACKNOWLEDGEMENTS
This work was supported by the Second Tibetan Plateau Scientific
Expedition and Research program (2019QZKK0502), the
Strategic Priority Research Program of the Chinese Academy of
Sciences (XDA20050203), the Key Projects of the Joint Fund of
the National Natural Science Foundation of China (U1802232),
the Youth Innovation Promotion Association of Chinese
Academy of Sciences (2019382), the Yunnan Young & Elite
Talents Project (YNWRQNBJ2019033) and the Ten Thousand
Talents Program of Yunnan Province (202005AB160005).
CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest
associated with this work.
AUTHOR CONTRIBUTIONS
T.D., H.W. and H.S. conceived and led the project. X.Z., T.K.,
W.D. and Z.Q. processed data, performed analyses, and
drew figures. X.Z., H.Z., T.F., L.L. and T.D. collected materials.
X.Z. wrote the manuscript. J.B.L., Y.S., J.H., T.D., H.W.
and H.S. revised the manuscript. All authors read and
approved the manuscript.
Edited by: Hongzhi Kong, Institute of Botany, CAS, China
Received Nov. 18, 2022; Accepted Mar. 21, 2023; Published Mar.
24, 2023
REFERENCES
Abbas, M., Sharma, G., Dambire, C., Marquez, J., AlonsoBlanco, C.,
Proaño, K., and Holdsworth, M.J. (2022). An oxygensensing mech-
anism for angiosperm adaptation to altitude. Nature 606: 565569.
Agati, G., Azzarello, E., Pollastri, S., and Tattini, M. (2012). Flavonoids
as antioxidants in plants: Location and functional signicance. Plant
Sci. 196: 6776.
Albert, N.W., Davies, K.M., Lewis, D.H., Zhang, H., Monteori, M.,
Brendolise, C., Boase, M.R., Ngo, H., Jameson, P.E., and Schwinn,
K.E. (2014). A conserved network of transcriptional activators and re-
pressors regulates anthocyanin pigmentation in eudicots. Plant Cell 26:
962980.
Beall, C.M. (2014). Adaptation to high altitude: Phenotypes and geno-
types. Annu. Rev. Anthropol. 43: 251272.
Bingham, R.A., and Ort, A.R. (1998). Efcient pollination of alpine plants.
Nature 391: 238239.
Birkeland, S., Gustafsson, A.L.S., Brysting, A.K., Brochmann, C., and
Nowak, M.D. (2020). Multiple genetic trajectories to extreme abiotic
stress adaptation in arctic brassicaceae. Mol. Biol. Evol. 37: 20522068.
Bourdenx, B., Bernard, A., Domergue, F., Pascal, S., Leger, A., Roby,
D., Pervent, M., Vile, D., Haslam, R.P., Napier, J.A., Lessire, R., and
Joubès, J. (2011). Overexpression of Arabidopsis ECERIFERUM1
promotes wax very-long-chain alkane biosynthesis and influences
plant response to biotic and abiotic stresses. Plant Physiol. 156: 29–45.
Bruna, T., Hoff, K.J., Lomsadze, A., Stanke, M., and Borodovsky, M.
(2021). BRAKER2: Automatic eukaryotic genome annotation with
GeneMark-EP+ and AUGUSTUS supported by a protein database.
NAR Genom. Bioinform. 3: lqaa108.
Cantalapiedra, C.P., Hernandez-Plaza, A., Letunic, I., Bork, P., and
Huerta-Cepas, J. (2021). eggNOG-mapper v2: Functional annotation,
orthology assignments, and domain prediction at the metagenomic
scale. Mol. Biol. Evol. 38: 5825–5829.
Cao, X., Yang, H., Shang, C., Ma, S., Liu, L., and Cheng, J. (2019). The
roles of auxin biosynthesis YUCCA gene family in plants. Int. J. Mol.
Sci. 20: 6343.
Chen, J.H., Huang, Y., Brachi, B., Yun, Q.Z., Zhang, W., Lu, W., Li, H.N.,
Li, W.Q., Sun, X.D., Wang, G.Y. et al. (2019). Genome-wide analysis of
Cushion willow provides insights into alpine plant divergence in a
biodiversity hotspot. Nat. Commun. 10: 5230.
Chen, K., Durand, D., and Farach-Colton, M. (2000). NOTUNG: A program
for dating gene duplications and optimizing gene family trees. J.
Comput. Biol. 7: 429–447.
Chen, K., Fan, B., Du, L., and Chen, Z. (2004). Activation of
hypersensitive cell death by pathogen-induced receptor-like protein kinases
from Arabidopsis. Plant Mol. Biol. 56: 271–283.
Cheptou, P.O. (2012). Clarifying Baker's law. Ann. Bot. 109: 633–641.
Cui, R., Medeiros, T., Willemsen, D., Iasi, L.N.M., Collier, G.E., Graef,
M., Reichard, M., and Valenzano, D.R. (2019). Relaxed selection
limits lifespan by increasing mutation load. Cell 178: 385–399.
Diener, A.C., Gaxiola, R.A., and Fink, G.R. (2001). Arabidopsis ALF5, a
multidrug efflux transporter gene family member, confers resistance to
toxins. Plant Cell 13: 1625–1638.
Dudchenko, O., Batra, S.S., Omer, A.D., Nyquist, S.K., Hoeger, M.,
Durand, N.C., Shamim, M.S., Machol, I., Lander, E.S., Aiden, A.P.
et al. (2017). De novo assembly of the Aedes aegypti genome using
Hi-C yields chromosome-length scaffolds. Science 356: 92–95.
Ellinghaus, D., Kurtz, S., and Willhoeft, U. (2008). LTRharvest, an
efficient and flexible software for de novo detection of LTR
retrotransposons. BMC Bioinformatics 9: 18.
Emms, D.M., and Kelly, S. (2019). OrthoFinder: Phylogenetic orthology
inference for comparative genomics. Genome Biol. 20: 238.
Fasani, E., DalCorso, G., Costa, A., Zenoni, S., and Furini, A. (2019). The
Arabidopsis thaliana transcription factor MYB59 regulates calcium
signalling during plant growth and stress response. Plant Mol. Biol. 99:
517–534.
Feng, L., Lin, H., Kang, M., Ren, Y., Yu, X., Xu, Z., Wang, S., Li, T.,
Yang, W., and Hu, Q. (2022). A chromosome-level genome assembly
of an alpine plant Crucihimalaya lasiocarpa provides insights into
high-altitude adaptation. DNA Res. 29: dsac004.
Flynn, J.M., Hubley, R., Goubert, C., Rosen, J., Clark, A.G., Feschotte,
C., and Smit, A.F. (2020). RepeatModeler2 for automated genomic
discovery of transposable element families. Proc. Natl. Acad. Sci. U.S.A.
117: 9451–9457.
Flynn, R.L., and Zou, L. (2010). Oligonucleotide/oligosaccharide-binding
fold proteins: A growing family of genome guardians. Crit. Rev.
Biochem. Mol. Biol. 45: 266–275.
Fujikawa, K., Ikeda, H., Murata, K., Kobayashi, T., Nakano, T.,
Ohba, H., and Wu, S.G. (2004). Chromosome numbers of fifteen
species of the genus Saussurea DC. (Asteraceae) in the Himalayas and
the adjacent regions. J. Japan Botany 79: 271–280.
Fukao, T., and Bailey-Serres, J. (2004). Plant responses to hypoxia – Is
survival a balancing act? Trends Plant Sci. 9: 449–456.
Goodwillie, C., Kalisz, S., and Eckert, C.G. (2005). The evolutionary
enigma of mixed mating systems in plants: Occurrence, theoretical
explanations, and empirical evidence. Annu. Rev. Ecol. Evol. Syst. 36:
47–79.
Guo, X., Hu, Q., Hao, G., Wang, X., Zhang, D., Ma, T., and Liu, J. (2018).
The genomes of two Eutrema species provide insight into plant
adaptation to high altitudes. DNA Res. 25: 307–315.
Haas, B.J., Salzberg, S.L., Zhu, W., Pertea, M., Allen, J.E., Orvis, J.,
White, O., Buell, C.R., and Wortman, J.R. (2008). Automated
eukaryotic gene structure annotation using EVidenceModeler and the
program to assemble spliced alignments. Genome Biol. 9: R7.
Hu, J., Fan, J., Sun, Z., and Liu, S. (2020). NextPolish: A fast and efficient
genome polishing tool for long-read assembly. Bioinformatics 36:
2253–2255.
Hut, R.A., and Beersma, D.G. (2011). Evolution of time-keeping
mechanisms: Early emergence and adaptation to photoperiod. Philos. Trans.
R. Soc. Lond. B. Biol. Sci. 366: 2141–2154.
Jones, P., Binns, D., Chang, H.Y., Fraser, M., Li, W., McAnulla, C.,
McWilliam, H., Maslen, J., Mitchell, A., Nuka, G. et al. (2014).
InterProScan 5: Genome-scale protein function classification.
Bioinformatics 30: 1236–1240.
Katoh, K., and Standley, D.M. (2013). MAFFT multiple sequence
alignment software version 7: Improvements in performance and usability.
Mol. Biol. Evol. 30: 772–780.
Keilwagen, J., Wenk, M., Erickson, J.L., Schattat, M.H., Grau, J., and
Hartung, F. (2016). Using intron position conservation for
homology-based gene prediction. Nucleic Acids Res. 44: e89.
Kerstiens, G. (1996). Cuticular water permeability and its physiological
significance. J. Exp. Bot. 47: 1813–1832.
Kim, D., Paggi, J.M., Park, C., Bennett, C., and Salzberg, S.L. (2019).
Graph-based genome alignment and genotyping with HISAT2 and
HISAT-genotype. Nat. Biotechnol. 37: 907–915.
Kimura, M. (1980). A simple method for estimating evolutionary rates of
base substitutions through comparative studies of nucleotide
sequences. J. Mol. Evol. 16: 111–120.
Kolora, S.R.R., Owens, G.L., Vazquez, J.M., Stubbs, A., Chatla, K.,
Jainese, C., Seeto, K., McCrea, M., Sandel, M.W., Vianna, J.A. et al.
(2021). Origins and evolution of extreme life span in Pacific Ocean
rockfishes. Science 374: 842–847.
Kowalczyk, A., Meyer, W.K., Partha, R., Mao, W., Clark, N.L., and
Chikina, M. (2019). RERconverge: an R package for associating
evolutionary rates with convergent traits. Bioinformatics 35: 4815–4817.
Kowalczyk, A., Partha, R., Clark, N.L., and Chikina, M. (2020).
Pan-mammalian analysis of molecular constraints underlying extended
lifespan. eLife 9: e51089.
Kumar, S., Stecher, G., Suleski, M., and Hedges, S.B. (2017). TimeTree:
A resource for timelines, timetrees, and divergence times. Mol. Biol.
Evol. 34: 1812–1819.
Li, J.T., Gao, Y.D., Xie, L., Deng, C., Shi, P., Guan, M.L., Huang, S.,
Ren, J.L., Wu, D.D., Ding, L. et al. (2018). Comparative genomic
investigation of high-elevation adaptation in ectothermic snakes. Proc.
Natl. Acad. Sci. U.S.A. 115: 8406–8411.
Liao, Y., Smyth, G.K., and Shi, W. (2014). featureCounts: an efficient
general purpose program for assigning sequence reads to genomic
features. Bioinformatics 30: 923–930.
Licausi, F., Kosmacz, M., Weits, D.A., Giuntoli, B., Giorgi, F.M.,
Voesenek, L.A., Perata, P., and van Dongen, J.T. (2011). Oxygen
sensing in plants is mediated by an N-end rule pathway for protein
destabilization. Nature 479: 419–422.
Liu, B., Shi, Y., Yuan, J., Hu, X., Zhang, H., Li, N., Li, Z., Chen, Y., Mu, D.,
and Fan, W. (2013). Estimation of genomic characteristics by analyzing
k-mer frequency in de novo genome projects. arXiv: Genomics.
Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of
fold change and dispersion for RNA-seq data with DESeq2. Genome
Biol. 15: 550.
Löytynoja, A. (2014). Phylogeny-aware alignment with PRANK. In Multiple
sequence alignment methods, D.J. Russell, ed. Totowa, NJ: Humana
Press, pp. 155–170.
Lu, S., Wang, J., Chitsaz, F., Derbyshire, M.K., Geer, R.C., Gonzales,
N.R., Gwadz, M., Hurwitz, D.I., Marchler, G.H., Song, J.S. et al. (2020).
CDD/SPARCLE: The conserved domain database in 2020. Nucleic
Acids Res. 48: D265–D268.
Majoros, W.H., Pertea, M., and Salzberg, S.L. (2004). TigrScan and
GlimmerHMM: two open source ab initio eukaryotic gene-finders.
Bioinformatics 20: 2878–2879.
Manni, M., Berkeley, M.R., Seppey, M., Simao, F.A., and Zdobnov, E.M.
(2021). BUSCO Update: Novel and Streamlined Workflows along with
Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic,
Prokaryotic, and Viral Genomes. Mol. Biol. Evol. 38: 4647–4654.
Mano, S., Hayashi, M., and Nishimura, M. (1999). Light regulates
alternative splicing of hydroxypyruvate reductase in pumpkin. Plant J. 17:
309–320.
Martin, J.T., and Juniper, B.E. (1970). The cuticles of plants. New York:
St. Martin's Press.
McHale, L., Tan, X., Koehl, P., and Michelmore, R.W. (2006). Plant
NBS-LRR proteins: Adaptable guards. Genome Biol. 7: 212.
Mendes, F.K., Vanderpool, D., Fulton, B., and Hahn, M.W. (2020). CAFE
5 models variation in evolutionary rates among gene families.
Bioinformatics 36: 5516–5518.
Mockaitis, K., and Estelle, M. (2008). Auxin receptors and plant
development: A new signaling paradigm. Annu. Rev. Cell Dev. Biol. 24:
55–80.
Mosher, S., Seybold, H., Rodriguez, P., Stahl, M., Davies, K.A.,
Dayaratne, S., Morillo, S.A., Wierzba, M., Favery, B., Keller, H. et al.
(2013). The tyrosine-sulfated peptide receptors PSKR1 and PSY1R
modify the immunity of Arabidopsis to biotrophic and necrotrophic
pathogens in an antagonistic manner. Plant J. 73: 469–482.
Nagy, L., and Grabherr, G. (2009). The Biology of Alpine Habitats. Oxford:
Oxford University Press.
Ortiz-Morea, F.A., Liu, J., Shan, L., and He, P. (2022). Malectin-like
receptor kinases as protector deities in plant immunity. Nat. Plants 8:
27–37.
Ou, S., Chen, J., and Jiang, N. (2018). Assessing genome assembly
quality using the LTR Assembly Index (LAI). Nucleic Acids Res.
46: e126.
Ou, S., and Jiang, N. (2018). LTR_retriever: A highly accurate and
sensitive program for identification of long terminal repeat retrotransposons.
Plant Physiol. 176: 1410–1422.
Pelaz, S., Ditta, G.S., Baumann, E., Wisman, E., and Yanofsky, M.F.
(2000). B and C floral organ identity functions require SEPALLATA
MADS-box genes. Nature 405: 200–203.
Peng, H.P., Chan, C.S., Shih, M.C., and Yang, S.F. (2001). Signaling
events in the hypoxic induction of alcohol dehydrogenase gene in
Arabidopsis. Plant Physiol. 126: 742–749.
Pollard, M., Beisson, F., Li, Y., and Ohlrogge, J.B. (2008). Building lipid
barriers: Biosynthesis of cutin and suberin. Trends Plant Sci. 13:
236–246.
Rey, C., Gueguen, L., Semon, M., and Boussau, B. (2018). Accurate
detection of convergent amino-acid evolution with PCOC. Mol. Biol.
Evol. 35: 2296–2306.
Sackton, T.B., Grayson, P., Cloutier, A., Hu, Z., Liu, J.S., Wheeler, N.E.,
Gardner, P.P., Clarke, J.A., Baker, A.J., Clamp, M. et al. (2019).
Convergent regulatory evolution and loss of flight in paleognathous
birds. Science 364: 74–78.
Sherman-Broyles, S., Boggs, N., Farkas, A., Liu, P., Vrebalov, J.,
Nasrallah, M.E., and Nasrallah, J.B. (2007). S locus genes and the
evolution of self-fertility in Arabidopsis thaliana. Plant Cell 19:
94–106.
Song, B., Stocklin, J., Peng, D.L., Gao, Y.Q., and Sun, H. (2015). The
bracts of the alpine glasshouse-plant Rheum alexandrae
(Polygonaceae) enhance reproductive fitness of its pollinating seed-consuming
mutualist. Bot. J. Linn. Soc. 179: 349–359.
Spicer, R.A., Farnsworth, A., and Su, T. (2020). Cenozoic topography,
monsoons and biodiversity conservation within the Tibetan Region: An
evolving story. Plant Divers. 42: 229–254.
Srivastava, D., Shamim, M., Kumar, M., Mishra, A., Maurya, R.,
Sharma, D., Pandey, P., and Singh, K.N. (2019). Role of circadian
rhythm in plant system: An update from development to stress
response. Environ. Exp. Bot. 162: 256–271.
Stamatakis, A. (2014). RAxML version 8: A tool for phylogenetic analysis
and post-analysis of large phylogenies. Bioinformatics 30: 1312–1313.
Stern, D.L., and Orgogozo, V. (2009). Is genetic evolution predictable?
Science 323: 746–751.
Storz, J.F. (2016). Causes of molecular convergence and parallelism in
protein evolution. Nat. Rev. Genet. 17: 239–250.
Sun, H., Niu, Y., Chen, Y.S., Song, B., Liu, C.Q., Peng, D.L., Chen, J.G.,
and Yang, Y. (2014). Survival and reproduction of plant species in the
Qinghai-Tibet Plateau. J. Sys. Evol. 52: 378–396.
Szklarczyk, D., Gable, A.L., Nastou, K.C., Lyon, D., Kirsch, R., Pyysalo,
S., Doncheva, N.T., Legeay, M., Fang, T., Bork, P. et al. (2021). The
STRING database in 2021: Customizable protein-protein networks, and
functional characterization of user-uploaded gene/measurement sets.
Nucleic Acids Res. 49: D605–D612.
Takayama, S., and Isogai, A. (2005). Self-incompatibility in plants. Annu.
Rev. Plant Biol. 56: 467–489.
Tarailo-Graovac, M., and Chen, N. (2009). Using RepeatMasker to
identify repetitive elements in genomic sequences. Curr. Protoc.
Bioinformatics Chapter 4: Unit 4.10.
To, J.P., Haberer, G., Ferreira, F.J., Deruere, J., Mason, M.G., Schaller,
G.E., Alonso, J.M., Ecker, J.R., and Kieber, J.J. (2004). Type-A
Arabidopsis response regulators are partially redundant negative
regulators of cytokinin signaling. Plant Cell 16: 658–671.
Tsukaya, H., and Tsuge, T. (2001). Morphological adaptation of
inflorescences in plants that develop at low temperatures in early spring:
The convergent evolution of downy plants. Plant Biol. 3: 536–543.
van Berkel, K., de Boer, R.J., Scheres, B., and ten Tusscher, K. (2013).
Polar auxin transport: Models and mechanisms. Development 140:
2253–2268.
Walker, B.J., Abeel, T., Shea, T., Priest, M., Abouelliel, A., Sakthikumar,
S., Cuomo, C.A., Zeng, Q., Wortman, J., Young, S.K. et al. (2014).
Pilon: an integrated tool for comprehensive microbial variant detection
and genome assembly improvement. PLoS One 9: e112963.
Wang, X., Liu, S., Zuo, H., Zheng, W., Zhang, S., Huang, Y., Pingcuo,
G., Ying, H., Zhao, F., Li, Y. et al. (2021). Genomic basis of
high-altitude adaptation in Tibetan Prunus fruit trees. Curr. Biol. 31:
3848–3860.
Wen, R., Torres-Acosta, J.A., Pastushok, L., Lai, X., Pelzer, L.,
Wang, H., and Xiao, W. (2008). Arabidopsis UEV1D promotes Lysine
63-linked polyubiquitination and is involved in DNA damage response.
Plant Cell 20: 213–227.
Wertheim, J.O., Murrell, B., Smith, M.D., Kosakovsky Pond, S.L., and
Scheffler, K. (2015). RELAX: Detecting relaxed selection in a
phylogenetic framework. Mol. Biol. Evol. 32: 820–832.
Wheeler, T.J., and Eddy, S.R. (2013). nhmmer: DNA homology search
with profile HMMs. Bioinformatics 29: 2487–2489.
Wu, T., Hu, E., Xu, S., Chen, M., Guo, P., Dai, Z., Feng, T., Zhou, L.,
Tang, W., Zhan, L. et al. (2021). clusterProfiler 4.0: A universal
enrichment tool for interpreting omics data. Innovation (Camb) 2: 100141.
Xu, S., Wang, J., Guo, Z., He, Z., and Shi, S. (2020). Genomic
Convergence in the Adaptation to Extreme Environments. Plant Commun.
1: 100117.
Xu, Z., and Wang, H. (2007). LTR_FINDER: An efficient tool for the
prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35:
W265–W268.
Yang, C.Y., Hsu, F.C., Li, J.P., Wang, N.N., and Shih, M.C. (2011). The
AP2/ERF transcription factor AtERF73/HRE1 modulates ethylene
responses during hypoxia in Arabidopsis. Plant Physiol. 156: 202–212.
Yang, X.H., Makaroff, C.A., and Ma, H. (2003). The Arabidopsis MALE
MEIOCYTE DEATH1 gene encodes a PHD-finger protein that is
required for male meiosis. Plant Cell 15: 1281–1295.
Yang, Z. (2007). PAML 4: Phylogenetic analysis by maximum likelihood.
Mol. Biol. Evol. 24: 1586–1591.
Yeaman, S., Hodgins, K.A., Lotterhos, K.E., Suren, H., Nadeau, S.,
Degner, J.C., Nurkowski, K.A., Smets, P., Wang, T., Gray, L.K. et al.
(2016). Convergent local adaptation to climate in distantly related
conifers. Science 353: 1431–1433.
Zeng, X., Long, H., Wang, Z., Zhao, S., Tang, Y., Huang, Z., Wang, Y.,
Xu, Q., Mao, L., Deng, G. et al. (2015). The draft genome of Tibetan
hulless barley reveals adaptive patterns to the high stressful Tibetan
Plateau. Proc. Natl. Acad. Sci. U.S.A. 112: 1095–1100.
Zhang, J., and Kumar, S. (1997). Detection of convergent and parallel
evolution at the amino acid sequence level. Mol. Biol. Evol. 14:
527–536.
Zhang, J., Tian, Y., Yan, L., Zhang, G., Wang, X., Zeng, Y., Zhang, J.,
Ma, X., Tan, Y., Long, N. et al. (2016). Genome of plant maca
(Lepidium meyenii) illuminates genomic basis for high-altitude adaptation in
the central Andes. Mol. Plant 9: 1066–1077.
Zhang, T., Qiao, Q., Novikova, P.Y., Wang, Q., Yue, J., Guan, Y., Ming,
S., Liu, T., De, J., Liu, Y. et al. (2019). Genome of Crucihimalaya
himalaica, a close relative of Arabidopsis, shows ecological adaptation to
high altitude. Proc. Natl. Acad. Sci. U.S.A. 116: 7137–7146.
Zhang, Y., Zhou, J., and Lim, C.U. (2006). The role of NBS1 in DNA
double strand break repair, telomere stability, and cell cycle checkpoint
control. Cell Res. 16: 45–54.
Zhen, Y., Aardema, M.L., Medina, E.M., Schumer, M., and Andolfatto,
P. (2012). Parallel molecular evolution in an herbivore community.
Science 337: 1634–1637.
Zhou, B., Mural, R.V., Chen, X., Oates, M.E., Connor, R.A., Martin, G.B.,
Gough, J., and Zeng, L. (2017). A Subset of Ubiquitin-Conjugating
Enzymes Is Essential for Plant Immunity. Plant Physiol. 173:
1371–1390.
SUPPORTING INFORMATION
Additional Supporting Information may be found online in the supporting
information tab for this article:
http://onlinelibrary.wiley.com/doi/10.1111/jipb.13485/suppinfo
Figure S1. K-mer-based analysis to estimate the genome size of Saussurea obvallata
Figure S2. K-mer-based analysis to estimate the genome size of Rheum alexandrae
Figure S3. The Hi-C assisted assembly of Saussurea obvallata pseudomolecules
Figure S4. The distribution of sequence length of predicted protein-coding
genes in Saussurea obvallata genome
Figure S5. The distribution of sequence length of predicted protein-coding
genes in Rheum alexandrae genome
Figure S6. Gene tree of the nucleotide-binding site and leucine-rich-repeat
domain receptor (NBS-LRR) genes
Figure S7. Loss and gain events of NBS-LRR genes in sampled species
Figure S8. Venn plot showing the number of genes undergoing molecular
convergence detected by RERconverge and PCOC analysis
Figure S9. Site-based estimation of convergent amino acid evolution on
S-locus genes using PCOC method
Figure S10. The predicted STRING network of Arabidopsis MYB48
Figure S11. Volcano plot of differentially expressed genes between bract
leaves and normal leaves
Figure S12. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways
of significantly downregulated genes in bract leaves
Table S1. Genome sequences of 20 species used for comparative analysis
Table S2. Estimation of genome sizes for Saussurea obvallata and Rheum
alexandrae based on K-mer statistics
Table S3. The total sequencing data for genome assembly
Table S4. The information of transcriptomic data of Saussurea obvallata
used in the study
Table S5. Statistics of initial genome assembly of Saussurea obvallata and
Rheum alexandrae
Table S6. Summary of the chromosome lengths of the Saussurea obvallata
genome
Table S7. Genome completeness of Saussurea obvallata genome and
proteome measured by Benchmarking Universal Single-Copy Orthologs
(BUSCO)
Table S8. Genome completeness of Rheum alexandrae genome and
proteome measured by Benchmarking Universal Single-Copy Orthologs
(BUSCO)
Table S9. Prediction of repetitive elements in the assembled Saussurea
obvallata genome
Table S10. Prediction of repetitive elements in the assembled Rheum
alexandrae genome
Table S11. Functional annotation of predicted genes of Saussurea obvallata
and Rheum alexandrae genomes
Table S12. Summary of gene family clustering among the 20 species used
Table S13. Significantly enriched Gene Ontology (GO) terms of
convergently expanded gene families in alpine plant genomes based on the
hypergeometric test
Table S14. Significantly enriched Kyoto Encyclopedia of Genes and Genomes
(KEGG) pathways of convergently expanded gene families in alpine
plant genomes based on the hypergeometric test
Table S15. Significantly enriched Gene Ontology (GO) terms of
convergently contracted gene families in alpine plant genomes based on the
hypergeometric test
Table S16. Significantly enriched Kyoto Encyclopedia of Genes and Genomes
(KEGG) pathways of convergently contracted gene families in alpine
plant genomes based on the hypergeometric test
Table S17. Annotation information of 36 convergently selected genes
Table S18. Significantly enriched Gene Ontology (GO) terms of genes
undergoing convergent positive selection in alpine plant genomes
Table S19. Significantly enriched Kyoto Encyclopedia of Genes and Genomes
(KEGG) pathways of genes undergoing convergent positive selection
in alpine plant genomes
Table S20. Significantly enriched Gene Ontology (GO) terms of gene
families undergoing convergently accelerated evolutionary rate in alpine
plant genomes
Table S21. Significantly enriched Kyoto Encyclopedia of Genes and Genomes
(KEGG) pathways of gene families undergoing convergently accelerated
evolutionary rate in alpine plant genomes
Table S22. Tests for relaxed selection on S-locus genes of alpine plants
using RELAX model
Table S23. Functional annotation of genes predicted to interact with
MYB48
... In response to these environmental challenges, alpine plants generally develop distinctive morphological traits, such as a dwarf stature, a cushion form, and a reduction in leaf size, which could result from convergent adaptive evolution [3,4]. Recently, various omics technologies, such as genome sequencing, transcriptomics, and proteomics, have advanced research on the adaptation of alpine plants [5][6][7][8]. The molecular mechanisms of alpine adaptation have been gradually revealed via the detection of positively selected genes (PSGs), fast-evolving genes (FEGs), expanded gene families, and differentially expressed genes (DEGs) [5][6][7][8]. ...
... Recently, various omics technologies, such as genome sequencing, transcriptomics, and proteomics, have advanced research on the adaptation of alpine plants [5][6][7][8]. The molecular mechanisms of alpine adaptation have been gradually revealed via the detection of positively selected genes (PSGs), fast-evolving genes (FEGs), expanded gene families, and differentially expressed genes (DEGs) [5][6][7][8]. ...
... Candidate genes discovered to date for high-elevation adaptation are involved in different functional pathways, especially DNA repair, UV-B tolerance, and cold tolerance [5,9]. Approximately 100 f lowering plant families have alpine representatives [3], and recent research into their adaptations has only revealed the tip of the iceberg. ...
Article
Full-text available
How plants find a way to thrive in alpine habitats remains largely unknown. Here we present a chromosome-level genome assembly for an alpine medicinal herb, Triplostegia glandulifera (Caprifoliaceae), and 13 transcriptomes from other species of Dipsacales. We detected a whole-genome duplication event in T. glandulifera that occurred prior to the diversification of Dipsacales. Preferential gene retention after whole-genome duplication was found to contribute to increasing cold-related genes in T. glandulifera. A series of genes putatively associated with alpine adaptation (e.g. CBFs, ERF-VIIs, and RAD51C) exhibited higher expression levels in T. glandulifera than in its low-elevation relative, Lonicera japonica. Comparative genomic analysis among five pairs of high- vs low-elevation species, including a comparison of T. glandulifera and L. japonica, indicated that the gene families related to disease resistance experienced a significantly convergent contraction in alpine plants compared with their lowland relatives. The reduction in gene repertory size was largely concentrated in clades of genes for pathogen recognition (e.g. CNLs, prRLPs, and XII RLKs), while the clades for signal transduction and development remained nearly unchanged. This finding reflects an energy-saving strategy for survival in hostile alpine areas, where there is a tradeoff with less challenge from pathogens and limited resources for growth. We also identified candidate genes for alpine adaptation (e.g. RAD1, DMC1, and MSH3) that were under convergent positive selection or that exhibited a convergent acceleration in evolutionary rate in the investigated alpine plants. Overall, our study provides novel insights into the high-elevation adaptation strategies of this and other alpine plants.
... The development of high-throughput sequencing technology has greatly accelerated genomic research and identification of key genes and promoted the adaptive evolution and ecological research of non-model organisms [19,20]. This technology is also beneficial for further exploring the adaptation of non-model plants to high elevation [8,9,21,22]. The Himalayas, located on the southern margin of the Tibetan Plateau, have an elevation gradient of over 8000 m from south to north within a narrow latitude range [23,24] and are a biodiversity hotspot. ...
... To survive and be able to inhabit such harsh environments, local species have evolved effective strategies for the adaptation of genes to specific morphological and physiological traits [3][4][5][6][7]. Plant populations across altitude gradients exhibit genetic differentiation and local adaptation to specific environmental conditions [8,9]. High-altitude plants often exhibit genetic adaptations for cold tolerance to withstand freezing temperatures and frost [10]. ...
... The AtLPP1 gene can repair the DNA damage caused by high UV and solar radiation. Notably, among these key genes, E2, RLK, and FAR1 have been proposed to facilitate the adaptation of alpine plants to high-altitude environments through convergent evolution [9,84]. These genes could have facilitated R. alpina adaptation to higher elevation, with extensive distribution along the Himalayas. ...
Article
Full-text available
Environmental stress at high altitudes drives the development of distinct adaptive mechanisms in plants. However, studies exploring the genetic adaptive mechanisms of high-altitude plant species are scarce. In the present study, we explored the high-altitude adaptive mechanisms of plants in the Himalayas through whole-genome resequencing. We studied two widespread members of the Himalayan endemic alpine genus Roscoea (Zingiberaceae): R. alpina (a selfing species) and R. purpurea (an outcrossing species). These species are distributed widely in the Himalayas with distinct non-overlapping altitude distributions; R. alpina is distributed at higher elevations, and R. purpurea occurs at lower elevations. Compared to R. purpurea, R. alpina exhibited higher levels of linkage disequilibrium, Tajima’s D, and inbreeding coefficient, as well as lower recombination rates and genetic diversity. Approximately 96.3% of the genes in the reference genome underwent significant genetic divergence (FST ≥ 0.25). We reported 58 completely divergent genes (FST = 1), of which only 17 genes were annotated with specific functions. The functions of these genes were primarily related to adapting to the specific characteristics of high-altitude environments. Our findings provide novel insights into how evolutionary innovations promote the adaptation of mountain alpine species to high altitudes and harsh habitats.
... In Polygonaceae, at least fifteen whole genome sequencing studies have been conducted, with five focusing on the genus Rheum, i.e. Rheum alexandrae [32], R. nobile [33,34], and two recently published genomes, R. tanguticum [35] and R. officinale [36], uncovering large genome sizes of 2.76 Gb and 7.68 Gb, respectively. Analyses of these genomes have identified that three chalcone synthases (CHSs), four CYP, and two β-glucosidases (BGLs) with strong correlations to anthraquinone accumulation in R. tanguticum, alongside 666 candidate genes potentially involved in anthraquinone biosynthesis within the genome of R. officinale. ...
... In addition, we also detected expansions of Copia/ Ale, Gypsy/CRM, and Copia/SIRE in O. digyna and R. nobile Feng2049 (Table S8), respectively, which aligned with the unexpected expansion of Gypsy/Tekay, Gypsy/ CRM, and Copia/SIRE in R. officinale. The high repeat content is consistently observed in other Rheum species, with percentages ranging from 77 to 87% [32,33,35,36]. However, in other closely related species, such as F. tataricum (249.3 ...
Article
Full-text available
Background Rhubarb is one of common traditional Chinese medicine with a diverse array of therapeutic efficacies. Despite its widespread use, molecular research into rhubarb remains limited, constraining our comprehension of the geoherbalism. Results We assembled the genome of Rheum palmatum L., one of the source plants of rhubarb, to elucidate its genome evolution and unpack the biosynthetic pathways of its bioactive compounds using a combination of PacBio HiFi, Oxford Nanopore, Illumina, and Hi-C scaffolding approaches. Around 2.8 Gb genome was obtained after assembly with more than 99.9% sequences anchored to 11 pseudochromosomes (scaffold N50 = 259.19 Mb). Transposable elements (TE) with a continuous expansion of long terminal repeat retrotransposons (LTRs) is predominant in genome size, contributing to the genome expansion of R. palmatum. Totally 30,480 genes were predicted to be protein-coding genes with 473 significantly expanded gene families enriched in diverse pathways associated with high-altitude adaptation for this species. Two successive rounds of whole genome duplication event (WGD) shared by Fagopyrum tataricum and R. palmatum were confirmed. We also identified 54 genes involved in anthraquinone biosynthesis and other 97 genes entangled in flavonoid biosynthesis. Notably, RpALS emerged as a compelling candidate gene for the octaketide biosynthesis after the key residual screening. Conclusion Overall, our findings offer not only an enhanced understanding of this remarkable medicinal plant but also pave the way for future innovations in its genetic breeding, molecular design, and functional genomic studies.
... This type of convergent evolution has been observed in alpine plants, and has been used to explain their widespread adaptations to the stressful conditions at high elevations (e.g. dwarf stature, smaller leaves, high branch density and specialized morphology such as leafy bracts, wooly coverings and cushion forms; Trewavas 2014, Zhang et al. 2023). This conjecture is further supported by the fact that our highest summit is dominated by species with a wide elevational distribution range (e.g. ...
Preprint
Questions: Accounting for multiple facets of biodiversity can help to shed light on community assembly of mountaintop flora across space and time, but this approach has rarely been applied. Here we addressed the following questions: (a) Is the filtering effect of elevation on taxonomic diversity of mountaintop plant communities also mirrored in their functional and phylogenetic structure? (b) Can environmental changes over time interact with, and thus change, elevational patterns in mountaintop plant diversity? Location: Dovrefjell, central Norway. Methods: The floristic composition of four mountaintops, spread across an elevational gradient from tree line to the uppermost margins of vascular plant life, was surveyed every seven years between 2001 and 2022. Six metrics of taxonomic, functional and phylogenetic richness and diversity were calculated for each mountaintop and survey. With these data, we assessed how richness and diversity metrics varied over space (across the elevational gradient) and over time (between surveys). Results: All richness and diversity metrics decreased towards higher elevations, except phylogenetic diversity, which showed a marked increase with elevation. Taxonomic richness did not change significantly over time, while functional and phylogenetic richness increased between 2001 and 2022. No significant temporal trend in taxonomic, functional and phylogenetic diversity was detected. Conclusions: Different metrics of taxonomic, functional and phylogenetic diversity can show divergent spatial and temporal trends. Future environmental changes may give rise to functionally or phylogenetically novel communities that cannot be predicted from trends in species richness alone. We therefore encourage researchers to look beyond species richness and consider multiple facets of biodiversity when analysing the impact of environmental change on mountaintop flora.
Chapter
High-altitude regions, characterized by unique climatic conditions and diverse ecosystems, are home to an array of crops vital to the economic prosperity of local communities. The cultivation of high-altitude crops plays a significant role in ensuring food security, conserving biodiversity, and boosting the tourism sector. While traditional breeding methods have historically improved crop yield and resilience, recent years have witnessed a surge of interest in adopting multi-omics approaches, particularly genomics. High-throughput sequencing technologies have unveiled critical genes and pathways associated with key traits, including stress resistance and tolerance, guiding innovative breeding strategies. Genomics has ushered in a new era for high-altitude crop cultivation, facilitating the development of crops adapted to extreme conditions, resistant to pests and diseases, and enriched with essential nutrients. In this chapter, we delve into the wealth of genomics resources available for high-altitude crops. We also discuss the current status and future prospects of various genomics techniques, including QTL mapping, genome-wide association mapping, marker-assisted selection, genomic selection, and genome editing, aiming to enhance crop yield, bolster climate resilience and stress tolerance, and elevate the nutritional value of high-altitude crops. This chapter offers a comprehensive overview of the latest advancements in high-altitude crop enhancement and the challenges associated with genomics, and sheds light on the future of agriculture in challenging landscapes.
Preprint
Rhododendron nivale subsp. boreale Philipson et M. N. Philipson is an ornamental alpine woody flowering plant found in mountaintop scrub at an altitude of approximately 4000 meters. Despite its ecological significance, the lack of genomic resources has hindered a comprehensive understanding of its evolutionary and adaptive characteristics in high-altitude environments. In this work, we sequenced and assembled the genome of R. nivale subsp. boreale, the first assembly for subgenus Rhododendron and the first for a high-altitude woody flowering autotetraploid. The assembly includes 52 pseudochromosomes, which belong to 4 haplotypes and harbor 127,810 predicted protein-coding genes. Comparative genomic analysis revealed that R. nivale subsp. boreale originated as a neopolyploid derived from R. nivale and experienced two rounds of ancient polyploidy. Expression analysis showed that expression differences between alleles were common and randomly distributed across the genome. We identified signatures of positive selection on genes involved not only in adaptation to the mountaintop ecosystem (response to UV radiation and developmental regulation) but also in autotetraploid reproduction (meiotic stabilization). Notably, highly expressed ERF VIIs aid survival in hypoxic mountaintop environments. Meanwhile, the expanded gene families were enriched in brassinosteroid (BR) biosynthesis, which enhances adaptability to the dramatic changes in alpine weather; this enrichment is likely mediated by the increased number of cytochrome P450 (CYP) genes. This valuable genome of a mountaintop woody flowering autotetraploid not only provides genetic resources for studying high-altitude polyploid formation but also offers new insights into the evolution and adaptation mechanisms of high-altitude plants.
Article
Flowering plants (angiosperms) can grow at extreme altitudes, and have been observed growing as high as 6,400 metres above sea level [1,2]; however, the molecular mechanisms that enable plant adaptation specifically to altitude are unknown. One distinguishing feature of increasing altitude is a reduction in the partial pressure of oxygen (pO2). Here we investigated the relationship between altitude and oxygen sensing in relation to chlorophyll biosynthesis, which requires molecular oxygen [3], and hypoxia-related gene expression. We show that in etiolated seedlings of angiosperm species, steady-state levels of the phototoxic chlorophyll precursor protochlorophyllide are influenced by sensing of atmospheric oxygen concentration. In Arabidopsis thaliana, this is mediated by the PLANT CYSTEINE OXIDASE (PCO) N-degron pathway substrates GROUP VII ETHYLENE RESPONSE FACTOR transcription factors (ERFVIIs). ERFVIIs positively regulate expression of FLUORESCENT IN BLUE LIGHT (FLU), which represses the first committed step of chlorophyll biosynthesis, forming an inactivation complex with tetrapyrrole synthesis enzymes that are negatively regulated by ERFVIIs, thereby suppressing protochlorophyllide. In natural populations representing diverse angiosperm clades, we find oxygen-dependent altitudinal clines for steady-state levels of protochlorophyllide, expression of inactivation complex components and hypoxia-related genes. Finally, A. thaliana accessions from contrasting altitudes display altitude-dependent ERFVII activity and accumulation. We thus identify a mechanism for genetic adaptation to absolute altitude through alteration of the sensitivity of the oxygen-sensing system. Plants have adapted to grow at specific altitudes by regulating chlorophyll synthesis in response to ambient oxygen concentration, calibrated by altitude-dependent activity of GROUP VII ETHYLENE RESPONSE FACTOR.
Article
It remains largely unknown how plants adapt to high-altitude habitats. Crucihimalaya (Brassicaceae) is an alpine genus occurring on the Qinghai-Tibet Plateau, which is characterized by cold temperatures and strong UV radiation. Here we generated a chromosome-level genome for C. lasiocarpa with a total size of 255.8 Mb and a scaffold N50 size of 31.9 Mb. We first examined the karyotype origin of this species and found that the karyotype of five chromosomes resembled the ancestral karyotype of the Brassicaceae family, while the other three showed strong chromosomal structural variations. In combination with the rough genome sequence of a congener (C. himalaica), we found that the significantly expanded gene families and positively selected genes involved in alpine adaptation have arisen since the origin of this genus. Our new findings provide valuable information on chromosomal karyotype evolution in Brassicaceae and on high-altitude adaptation of the genus.
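Several of the abstracts listed here report a scaffold N50 as their contiguity metric. As a reminder of what that number means, the following is a minimal sketch of the standard N50 definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions, not taken from any of the cited studies.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; N50 is the length of
    the scaffold at which the running total first reaches half of the
    total assembly size.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Toy assembly (hypothetical lengths, arbitrary units).
toy_lengths = [2, 3, 4, 5, 6, 10]
toy_n50 = n50(toy_lengths)
```

For the toy input the total is 30, and the two longest scaffolds (10 and 6) already cover more than half of it, so the N50 is 6.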
Article
Plant malectin-like receptor kinases (MLRs), also known as Catharanthus roseus receptor-like kinase-1-like proteins, are well known for their functions in pollen tube reception and tip growth, cell wall integrity sensing, and hormonal responses. Recently, mounting evidence has indicated a critical role for MLRs in plant immunity. Here we focus on the emerging functions of MLRs in modulating the two-tiered immune system mediated by cell-surface-resident pattern recognition receptors (PRRs) and intracellular nucleotide-binding leucine-rich repeat receptors (NLRs). MLRs complex with PRRs and NLRs and regulate immune receptor complex formation and stability. Rapid alkalinization factor peptide ligands, LORELEI-like glycosylphosphatidylinositol-anchored proteins and cell-wall-associated leucine-rich repeat extensins coordinate with MLRs to orchestrate PRR- and NLR-mediated immunity. We discuss the common theme and unique features of MLR complexes concatenating different branches of plant immune signalling.
Article
A fishy tale of long and short life span. Fish have wide variations in life span even within closely related species. One such example is the rockfish species found along North Pacific coasts, which have life spans ranging from 11 to more than 200 years. Kolora et al. sequenced and performed a genomic analysis of 88 rockfish species, including long-read sequencing of the genomes of six species (see the Perspective by Lu et al.). From this analysis, the authors unmasked the genetic drivers of longevity evolution, including immunity and DNA repair–related pathways. Copy number expansion in the butyrophilin gene family was shown to be positively associated with life span, and population historical dynamics and life histories correlated differently between long- and short-lived species. These results support the idea that inflammation may modulate the aging process in these fish. (LMZ)
Article
Even though automated functional annotation of genes represents a fundamental step in most genomic and metagenomic workflows, it remains challenging at large scales. Here, we describe a major upgrade to eggNOG-mapper, a tool for functional annotation based on precomputed orthology assignments, now optimized for vast (meta)genomic data sets. Improvements in version 2 include a full update of both the genomes and functional databases to those from eggNOG v5, as well as several efficiency enhancements and new features. Most notably, eggNOG-mapper v2 now allows for: (i) de novo gene prediction from raw contigs, (ii) built-in pairwise orthology prediction, (iii) fast protein domain discovery, and (iv) automated GFF decoration. eggNOG-mapper v2 is available as a standalone tool or as an online service at http://eggnog-mapper.embl.de.
Article
Methods for evaluating the quality of genomic and metagenomic data are essential to aid genome assembly and to correctly interpret the results of subsequent analyses. BUSCO estimates the completeness and redundancy of processed genomic data based on universal single-copy orthologs. Here we present new functionalities and major improvements of the BUSCO software, as well as the renewal and expansion of the underlying datasets in sync with the OrthoDB v10 release. Among the major novelties, BUSCO now enables phylogenetic placement of the input sequence to automatically select the most appropriate dataset for the assessment, allowing the analysis of metagenome-assembled genomes of unknown origin. A newly-introduced genome workflow increases the efficiency and runtimes especially on large eukaryotic genomes. BUSCO is the only tool capable of assessing both eukaryotic and prokaryotic species, and can be applied to various data types, from genome assemblies and metagenomic bins, to transcriptomes and gene sets.
Article
The Great Himalayan Mountains and their foothills are believed to be the place of origin and development of many plant species. The genetic basis of adaptation to high plateaus is a fascinating topic that is poorly understood at the population level. We comprehensively collected and sequenced 377 accessions of Prunus germplasm along altitude gradients ranging from 2,067 to 4,492 m in the Himalayas. We de novo assembled three high-quality genomes of Tibetan Prunus species. A comparative analysis of Prunus genomes indicated that a remarkable expansion of SINE retrotransposons occurred in the genomes of Tibetan species. We observed genetic differentiation between Tibetan peaches from high and low altitudes, and genes associated with light stress signaling, especially UV stress signaling, were enriched in the differentiated regions. By profiling the metabolomes of Tibetan peach fruit, we determined that 379 metabolites had significant genetic correlations with altitude and that phenylpropanoids in particular were positively correlated with altitude. We identified 62 Tibetan peach-specific SINEs that colocalized with metabolites differentially accumulated in Tibetan relative to cultivated peach. We demonstrated that two SINEs were inserted in a locus controlling the accumulation of 3-O-feruloyl quinic acid. SINE1 was specific to Tibetan peach. SINE2 was predominant at high altitudes and associated with the accumulation of 3-O-feruloyl quinic acid. These genomic and metabolic data for Prunus populations native to the Himalayan region indicate that the expansion of SINE retrotransposons helped Tibetan Prunus species adapt to the harsh environment of the Himalayan plateau by promoting the accumulation of beneficial metabolites.
Article
Functional enrichment analysis is pivotal for interpreting high-throughput omics data in life science. It is crucial for this type of tool to use the latest annotation databases for as many organisms as possible. To meet these requirements, we present here an updated version of our popular Bioconductor package, clusterProfiler 4.0. This package has been enhanced considerably compared to its original version published nine years ago. The new version provides a universal interface for functional enrichment analysis in thousands of organisms based on internally supported ontologies and pathways as well as annotation data provided by users or derived from online databases. It also extends the dplyr and ggplot2 packages to offer tidy interfaces for data operation and visualization. Other new features include gene set enrichment analysis (GSEA) and comparison of enrichment results from multiple gene lists. We anticipate that clusterProfiler 4.0 will be applied in a wide range of scenarios across diverse organisms.
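The core statistic behind over-representation style enrichment analysis can be illustrated with a one-sided hypergeometric test. The sketch below is written in plain Python rather than R and is not the clusterProfiler API; the function name `enrichment_p_value`, its parameter names, and the toy counts are assumptions made for illustration only.

```python
from math import comb


def enrichment_p_value(n_background, n_annotated, n_selected, n_overlap):
    """One-sided hypergeometric p-value for over-representation.

    n_background: genes in the background universe
    n_annotated:  background genes annotated to the term of interest
    n_selected:   genes in the list being tested (e.g. an expanded family set)
    n_overlap:    selected genes that carry the annotation

    Returns P(X >= n_overlap), where X is the number of annotated genes
    obtained when n_selected genes are drawn without replacement.
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy example (hypothetical counts): 10 background genes, 5 annotated,
# 5 selected, and all 5 selected genes carry the annotation.
p_toy = enrichment_p_value(10, 5, 5, 5)
```

In the toy example only one of the 252 possible draws contains all five annotated genes, so the p-value is 1/252. Real analyses additionally correct for testing many terms at once, which this sketch deliberately leaves out.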
Article
The task of eukaryotic genome annotation remains challenging. Only a few genomes could serve as standards of annotation achieved through a tremendous investment of human curation efforts. Still, the correctness of all alternative isoforms, even in the best-annotated genomes, could be a good subject for further investigation. The new BRAKER2 pipeline generates and integrates external protein support into the iterative process of training and gene prediction by GeneMark-EP+ and AUGUSTUS. BRAKER2 continues the line started by BRAKER1, where self-training GeneMark-ET and AUGUSTUS made gene predictions supported by transcriptomic data. Among the challenges addressed by the new pipeline was the generation of reliable hints to protein-coding exon boundaries from likely homologous but evolutionarily distant proteins. In comparison with other pipelines for eukaryotic genome annotation, BRAKER2 is fully automatic. It compares favorably under equal conditions with other pipelines, e.g. MAKER2, in terms of accuracy and performance. Development of BRAKER2 should facilitate solving the task of harmonization of annotation of protein-coding genes in genomes of different eukaryotic species. However, we fully understand that several more innovations are needed in transcriptomic and proteomic technologies as well as in algorithmic development to reach the goal of highly accurate annotation of eukaryotic genomes.
Article
Motivation: Genome sequencing projects have revealed frequent gains and losses of genes between species. Previous versions of our software, CAFE (Computational Analysis of gene Family Evolution), have allowed researchers to estimate parameters of gene gain and loss across a phylogenetic tree. However, the underlying model assumed that all gene families had the same rate of evolution, despite evidence suggesting a large amount of variation in rates among families. Results: Here we present CAFE 5, a completely re-written software package with numerous performance and user-interface enhancements over previous versions. These include improved support for multithreading, the explicit modelling of rate variation among families using gamma-distributed rate categories, and command-line arguments that preclude the use of accessory scripts. Availability: CAFE 5 source code, documentation, test data, and a detailed manual with examples are freely available at https://github.com/hahnlab/CAFE5/releases.