Content uploaded by Xu Zhang
Author content
All content in this area was uploaded by Xu Zhang on Apr 24, 2023
Content may be subject to copyright.
JIPB Journal of Integrative Plant Biology
New Resource
https://doi.org/10.1111/jipb.13485
Genomic convergence underlying high‐altitude
adaptation in alpine plants
Xu Zhang(1,2), Tianhui Kuang(3), Wenlin Dong(1,2,4), Zhihao Qian(1,2,4), Huajie Zhang(1,2), Jacob B. Landis(5,6), Tao Feng(1,2), Lijuan Li(1,2,4), Yanxia Sun(1,2), Jinling Huang(3,7,8), Tao Deng(3)*, Hengchang Wang(1,2)* and Hang Sun(3)*
1. CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, The Chinese Academy of Sciences, Wuhan
Botanical Garden, Wuhan 430074, China
2. Center of Conservation Biology, Core Botanical Gardens, The Chinese Academy of Sciences, Wuhan 430074, China
3. Yunnan International Joint Laboratory for Biodiversity of Central Asia, Key Laboratory for Plant Diversity and Biogeography of East
Asia, Kunming Institute of Botany, The Chinese Academy of Sciences, Kunming 650201, China
4. University of Chinese Academy of Sciences, Beijing 100049, China
5. School of Integrative Plant Science, Section of Plant Biology and the L. H. Bailey Hortorium, Cornell University, Ithaca, New York
14850, USA
6. BTI Computational Biology Center, Boyce Thompson Institute, Ithaca, New York 14853, USA
7. State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng 475004, China
8. Department of Biology, East Carolina University, Greenville, North Carolina 27858, USA
*Correspondences: Tao Deng (dengtao@mail.kib.ac.cn); Hengchang Wang (hcwang@wbgcas.cn); Hang Sun (sunhang@mail.kib.ac.cn,
Dr. Sun is fully responsible for the distributions of the materials associated with this article)
ABSTRACT
Evolutionary convergence is one of the most striking
examples of adaptation driven by natural selection.
However, genomic evidence for convergent adap-
tation to extreme environments remains scarce.
Here, we assembled reference genomes of two
alpine plants, Saussurea obvallata (Asteraceae)
and Rheum alexandrae (Polygonaceae), with 37,938
and 61,463 annotated protein‐coding genes. By
integrating an additional five alpine genomes,
we elucidated genomic convergence underlying
high‐altitude adaptation in alpine plants. Our results
detected convergent contractions of disease‐
resistance genes in alpine genomes, which might be
an energy‐saving strategy for surviving in hostile
environments with only a few pathogens present.
We identified signatures of positive selection on a
set of genes involved in reproduction and respiration
(e.g., MMD1, NBS1, and HPR), and revealed
signatures of molecular convergence on genes in-
volved in self‐incompatibility, cell wall modification,
DNA repair and stress resistance, which may un-
derlie adaptation to extreme cold, high ultraviolet
radiation and hypoxia environments. Incorporating
transcriptomic data, we further demonstrated that
genes associated with cuticular wax and flavonoid
biosynthetic pathways exhibit higher expression
levels in leafy bracts, shedding light on the genetic
mechanisms of the adaptive “greenhouse” morphology.
Our integrative data provide novel insights
into convergent evolution at a high‐taxonomic level,
aiding in a deep understanding of genetic adapta-
tion to complex environments.
Keywords: adaptation, alpine plants, evolutionary rates, genomic
convergence, “greenhouse” morphology, high altitude
Zhang, X., Kuang, T., Dong, W., Qian, Z., Zhang, H., Landis,
J. B., Feng, T., Li, L., Sun, Y., Huang, J., et al. (2023). Genomic
convergence underlying high‐altitude adaptation in alpine plants.
J. Integr. Plant Biol. 00: 1–16.
© 2023 Institute of Botany, Chinese Academy of Sciences.
INTRODUCTION
Evolutionary biologists have long aimed to understand the
extent to which evolutionary trajectories are predictable, that
is, the extent to which convergent adaptation in distinct lineages
is driven by conserved molecular changes (Zhang and
Kumar, 1997; Stern and Orgogozo, 2009; Zhen et al.,
2012; Storz, 2016). Evolutionary convergence in settings
where different species repeatedly face common selective
pressures offers a powerful opportunity to address this issue
(Yeaman et al., 2016; Birkeland et al., 2020; Xu et al., 2020).
An ideal system for investigating the genetic underpinnings of
convergent evolution is the independent adaptation of
divergent lineages to high‐altitude environments.
The Himalaya‐Hengduan Mountains (HHMs) represent the
world's most species‐rich temperate alpine biota, providing
an ideal natural laboratory for studying convergent adapta-
tion to high altitudes (Spicer et al., 2020). In the subnival
summits of the HHMs (elevation above 4,500 m), usually
characterized by freezing temperatures, high ultraviolet (UV)
radiation, and hypoxia, plants typically possess suites of
similar morphological and physiological adaptations to allow
them to survive and reproduce in these hostile environments
(Tsukaya and Tsuge, 2001). In comparison with plants of
lowland areas, plants living in the subnival summits of the
HHMs (and other high‐altitude areas) have dwarf stems,
smaller leaves and higher densities of branches, and often
exhibit a specialized morphology such as leafy bracts, woolly
coverings and cushion forms (Nagy and Grabherr, 2009; Sun
et al., 2014). Under similar stressful environmental conditions
of high altitudes, one would predict similar genetic compo-
nents underpinning adaptive evolution in alpine plants.
Genomewide studies have documented some genomic
footprints of high‐altitude adaptation by testing for positive
selection and mining expanded gene families, often involving
functional pathways such as DNA repair, abiotic stress response,
reproductive processes, as well as secondary metabolite
biosynthesis (Zeng et al., 2015; Zhang et al.,
2019; Wang et al., 2021). However, the limited availability of
reference genomes for alpine plants restricts further under-
standing of the genomic evolution of high‐altitude adaptation;
in particular, the underlying genomic convergence of alpine
adaptation has not been examined.
In this study, we newly assembled and annotated a
reference‐quality genome of Saussurea obvallata (Asteraceae)
and a draft genome of Rheum alexandrae (Polygonaceae).
These two species are primarily found on mountain
slopes and in alpine meadows of the HHMs and are renowned
for their “glasshouse” morphology, in which the upper leaves
have developed into large semitranslucent leafy
bracts that cover the inflorescences and have been shown
to confer significant ecological benefits on the plant (Song
et al., 2015). We integrated an additional five available
genomes of alpine plants [Crucihimalaya himalaica (Brassi-
caceae) (Zhang et al., 2019), Eutrema heterophyllum (Bras-
sicaceae) (Guo et al., 2018), Hordeum vulgare var. nudum
(Poaceae) (Zeng et al., 2015), Prunus mira (Rosaceae) (Wang
et al., 2021) and Salix brachista (Salicaceae) (Chen et al.,
2019)] as well as their lowland relatives for comparative
genomic analyses (Table S1). These seven alpine species
represent major clades of angiosperms that independently
colonized high‐altitude environments. Different from previous
case studies of plant genomes, we take advantage of a
comprehensive genomic data set of alpine plants to
characterize genomewide signatures of convergent evolution.
Specifically, we address three main questions:
(i) Do expanded or contracted gene families show convergent
patterns in alpine plants, and do they affect high‐altitude adaptation?
(ii) Which genes have undergone convergent molecular
evolution and are involved in adaptation to freezing,
high UV radiation and hypoxic environments? (iii) What
are the genomic bases underlying the adaptive “greenhouse”
morphology? In addressing these questions, our study provides
comprehensive genomic insights into convergent adaptation
to high‐altitude environments.
RESULTS AND DISCUSSION
Assembly and annotation of two reference genomes
of alpine plants
Using a k‐mer analysis method, we first estimated the ge-
nome size of S. obvallata and R. alexandrae to be ~2,251 and
~2,137 Mb, respectively (Figures S1, S2; Table S2). We then
generated a chromosome‐level genome of S. obvallata and a
contig‐level genome of R. alexandrae using Illumina, Oxford
Nanopore, and high‐throughput chromatin conformation
contact (Hi‐C) sequencing technologies (Table S3). For the
S. obvallata genome, in total ~95 Gb Illumina short reads,
~143 Gb Nanopore long reads and ~206 Gb Hi‐C data were
obtained. For the R. alexandrae genome, in total ~104 Gb
Illumina short reads and ~144 Gb Nanopore long reads were
generated. Additionally, transcriptomic data for R. alexandrae
(~25 Gb) and S. obvallata (~157 Gb) were obtained for
transcript‐based gene annotation (Table S3). Tissue‐specific
transcriptomic data for S. obvallata were also used for further
gene expression analysis (Table S4).
A de novo assembly pipeline allowed us to achieve initial
genome assemblies that captured 2,044 and 2,040 Mb in 145
and 129 contigs for S. obvallata and R. alexandrae genomes,
with contig N50 of 36.96 and 36.32 Mb, respectively
(Tables 1, S5). Using a Hi‐C assisted assembly pipeline, 1,952
Mb, which accounted for 95.5% of the assembled S. obvallata
genome, was anchored on 16 chromosomes (Tables 1,
S6; Figure S3), in line with previous cytological evidence
(Fujikawa et al., 2004). We further evaluated the completeness
of the assembled genomes and found high completeness rates
(94.6% of S. obvallata and 96.6% of R. alexandrae) of both
assemblies as supported by BUSCO (Benchmarking Universal
Single‐Copy Orthologs) assessments using the Embryophyta_odb10
database (Tables 1, S7, S8) (Manni et al., 2021).
The long terminal repeat (LTR) assembly index (LAI), which
evaluates the contiguity of intergenic and repetitive regions of
genome assemblies based on the intactness of LTR retrotransposons
(Ou et al., 2018), was 19.68 for S. obvallata
and 4.97 for R. alexandrae.
Transposable elements (TEs) and other repeat sequences
accounted for 81.88% and 81.65% of the S. obvallata
and R. alexandrae assemblies, respectively (Table 1).
In S. obvallata, LTR retrotransposons (43.95%), followed by
DNA TEs (2.03%) and LINEs (0.93%), were most abundant,
with LTR‐Gypsy and LTR‐Copia retrotransposons accounting
for 18.19% and 25.76% of the assembly, respectively (Table S9).
LTR retrotransposons (44.31%), LINEs (2.73%), and DNA
TEs (4.08%) accounted for most of the R. alexandrae repeats,
with LTR‐Gypsy (37.98%) and LTR‐Copia (6.33%) retro-
transposons predominant among the LTRs (Table S10). A
combination of transcript‐based, de novo and homology‐based
prediction methods yielded 37,938 and 61,463 high‐confidence
protein‐coding gene models for S. obvallata and R. alexandrae, respectively (Table 1; Figures
S4, S5). By comparing with public protein databases, in total
36,542 (96.32%) and 47,535 (77.34%) predicted genes of
S. obvallata and R. alexandrae were functionally annotated
(Table S11). For the annotated genes, 93.5% and 96.1% of
the complete BUSCO genes in the Embryophyta_odb10 da-
tabase could be identified in S. obvallata and R. alexandrae
(Tables S7, S8), respectively. Overall, our newly assembled
and annotated genomes of S. obvallata and R. alexandrae are
high quality, providing valuable genomic resources for un-
derstanding the convergent adaptation of alpine plants to
high‐altitude environments.
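The assembly statistics quoted in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `n50` and the toy contig lengths are assumptions introduced here, while the base-pair totals and repeat percentages are the values given in the text and Table 1.

```python
def n50(lengths):
    """Return the N50: the contig length L such that contigs of
    length >= L together cover at least half of the assembly."""
    if not lengths:
        raise ValueError("no contig lengths given")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length

# Toy contig lengths (hypothetical, not taken from either assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Anchoring rate for S. obvallata from the Table 1 values.
total_bp = 2_044_030_733
anchored_bp = 1_951_503_694
anchored_pct = round(100 * anchored_bp / total_bp, 1)

# LTR-Gypsy plus LTR-Copia percentages compared with the LTR totals.
ltr_sum_saussurea = round(18.19 + 25.76, 2)
ltr_sum_rheum = round(37.98 + 6.33, 2)
```

With the reported values, the anchored fraction rounds to the 95.5% quoted above, and the Gypsy and Copia percentages sum to the LTR retrotransposon totals reported for each species (43.95% and 44.31%).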
Convergent changes in gene family number
We downloaded annotated protein sequences from genomes
of five additional alpine species as well as 13 representative
sister species living in low elevations (Table S1). Included
species were phylogenetically placed in seven families of an-
giosperms, including 17 eudicots and three monocots. In total
6,711 gene orthologs were identified in all 20 species (Table
S12), of which 195 single‐copy orthogroups were used for
phylogeny reconstruction. The obtained phylogeny was con-
sistent with the known phylogenetic relationships within an-
giosperms, in which the alpine taxa included in our study oc-
curred in seven independent lineages placed in six families, and
the time tree inferred from MCMCtree showed a wide di-
vergence history between alpine species and their sampled
sisters, ranging from 2 million years ago (Mya) to 31.81 Mya
(Figure 1A). We then determined convergent changes in gene
family number when a gene family showed significant ex-
pansion or contraction in more than three alpine species. In
total 56 convergently expanded (CoEx) gene families were
identified. Biological Process (BP) of Gene Ontology (GO) and
Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses
of CoEx families found 68 significantly enriched GO terms and
seven KEGG pathways (P‐adjust <0.05) (Tables S13, S14).
Enriched pathways of CoEx gene families were mainly related
to abiotic resistance, such as response to hypoxia, regulation of
hormone levels and hormone transport (Figure 1C, D; Tables
S13, S14). Intriguingly, we found a greater number of convergently
contracted (CoCo) families than CoEx families, with
1,193 gene families convergently contracted, involving 390
significantly enriched GO terms and 27 KEGG pathways
(P‐adjust <0.05) (Tables S15, S16). Most of these pathways
included genes involved in the response to biotic stresses, such
as defense against pathogens and toxicants (Tables S15, S16).
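The criterion used above (a family counts as convergent when it is significantly expanded or contracted in more than three alpine species) amounts to a simple tally. The following is a minimal sketch under stated assumptions: the function name, the orthogroup identifiers and the per-species sets are hypothetical, and the significance calls are assumed to come from an upstream gene family analysis that is not reproduced here.

```python
from collections import Counter

def convergent_families(significant_by_species, min_species=4):
    """Return families flagged as significant in at least `min_species`
    species ("more than three" corresponds to min_species=4)."""
    counts = Counter()
    for families in significant_by_species.values():
        counts.update(set(families))  # count each family once per species
    return sorted(f for f, n in counts.items() if n >= min_species)

# Hypothetical significantly expanded families per alpine species.
expanded = {
    "S. obvallata": {"OG1", "OG2"},
    "R. alexandrae": {"OG1", "OG3"},
    "C. himalaica": {"OG1", "OG2"},
    "E. heterophyllum": {"OG1"},
    "H. vulgare var. nudum": {"OG2"},
    "P. mira": {"OG3"},
    "S. brachista": {"OG2", "OG3"},
}
coex = convergent_families(expanded)
```

In this toy input, OG1 and OG2 are each expanded in four species and are retained, whereas OG3 is expanded in only three and is excluded. The same function applied to sets of contracted families would give the CoCo list.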
A unique stress at high altitudes is hypobaric hypoxia
(Beall, 2014). In plants, an oxygen deficiency dramatically
reduces the efficiency of cellular ATP production, which has
diverse ramifications for cellular metabolism and devel-
opmental processes (Fukao and Bailey‐Serres, 2004). Var-
ious oxygen‐sensing mechanisms have been described that
are thought to trigger plant responses to low‐level oxygen
and thus adaptation to high altitudes (Abbas et al., 2022). In
our analysis, we found that many significantly enriched GO
terms of CoEx families were functionally related to the re-
sponse to oxygen levels, such as response to hypoxia, re-
sponse to decreased oxygen level and response to hydrogen
peroxide (Figure 1C). These pathways included genes en-
coding alcohol dehydrogenase (ADH) (Peng et al., 2001), and
the HSP20‐like chaperones superfamily proteins that are
functionally enriched in the GO term Cellular response to
hypoxia. In A. thaliana, the oxygen‐sensing system is mediated
by the plant cysteine oxidase (PCO) N‐degron pathway
and its substrates, the group VII ethylene response factors (ERF‐VIIs)
(Licausi et al., 2011), which modulate the ethylene
response and activate the expression of ADH1 (Yang
et al., 2011). Although we did not detect convergent expansion
of ERF‐VIIs in alpine plants, we found genes that are
significantly enriched in GO terms related to the response to ethylene,
such as ADH1, ERF1 (ETHYLENE RESPONSE
FACTOR 1) and EER1 (ENHANCED ETHYLENE RESPONSE
1). The expansion of genes involved in the response
to hypoxia is therefore likely to contribute to the adaptation of alpine
plants to low‐level oxygen in high‐altitude environments.
In addition, multiple CoEx families were found to be sig-
nificantly enriched in plant hormone pathways. Examples in-
clude genes encoding the probable indole‐3‐pyruvate mono-
oxygenases (YUC) involved in auxin biosynthesis (Cao et al.,
2019), and the small auxin upregulated RNAs (SAUR) and the
D6 protein kinase (D6PK) involved in auxin polar transport (van
Berkel et al., 2013). Auxin regulates a series of developmental
processes such as apical dominance, plant organogenesis,
and reproductive development by affecting cell growth, dif-
ferentiation, and patterning (Mockaitis and Estelle, 2008).
Other developmental regulation pathways including leaf senescence,
phototropism and plant organ senescence were also
detected to be significantly enriched in CoEx families. This
result, coupled with the commonly observed morphological
divergence between alpine plants and lowland relatives (Sun
et al., 2014), indicates that morphological regulation can be a
pivotal path for plant adaptation to high‐altitude environments.
Table 1. Statistics of genome assembly and annotation of Saussurea obvallata and Rheum alexandrae
Statistic                S. obvallata      R. alexandrae
Total length (bp)        2,044,030,733     2,039,881,226
Number of contigs        145               129
Largest contig (bp)      126,457,859       160,815,976
Anchored length (bp)     1,951,503,694     ‐
GC (%)                   37.94             41.41
Contig N50 (bp)          36,958,263        36,323,674
Complete BUSCOs (%)      94.6              96.6
Repeat content (%)       81.88             81.65
Number of genes          37,938            61,463
BUSCO, Benchmarking Universal Single‐Copy Orthologs. GC (%),
the percent of Guanine and Cytosine.
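The enrichment results in this section are filtered at P‐adjust <0.05, and the Figure 1 legend names the Benjamini–Hochberg method for the adjustment. As an illustration of what that adjustment does, here is a minimal pure-Python sketch; the function name and the example P-values are assumptions made for this example, not output from the enrichment analysis.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four enrichment terms.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [a < 0.05 for a in adj]
```

Each raw P-value is scaled by the number of tests divided by its rank, and the running minimum keeps a smaller raw P-value from receiving a larger adjusted value than a bigger one.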
Figure 1. Evolutionary history of alpine plants
(A) Chronogram showing divergence times among alpine plants (cyan background) and their lowland relatives (orange background), with node ages and
95% confidence intervals (blue bars). The red and blue numbers above the branches represent significant expansion and contraction events, respectively.
(B) Bar plot showing gene numbers identified by OrthoFinder, including unassigned genes that were not placed into an orthogroup with any other genes,
genes in orthogroups, and genes in species‐specific orthogroups, which consist entirely of genes from one species. (C) Significantly enriched Gene Ontology
(GO) terms for convergently expanded (CoEx) gene families in alpine plant genomes. Each bubble represents a summarized GO term from the full GO list by
reducing functional redundancies, and their closeness on the plot reflects their closeness in the GO graph, that is, the semantic similarity. (D) Kyoto
Encyclopedia of Genes and Genomes (KEGG) pathways of CoEx gene families in alpine plant genomes. “p‐adj” refers to the adjusted P‐value using the
Benjamini–Hochberg method.
Convergent contractions of disease‐resistance genes
Several CoCo families in alpine plants were found to be
functionally related to response to pathogens or toxicants,
involving GO terms of response to toxic substances
and oomycetes, xenobiotic transport, and detoxification
(Tables S15, S16). Related gene families include the cysteine‐
rich receptor‐like kinases (CRKs), a large subfamily of
receptor‐like protein kinases (RLKs) that play vital roles in
defense responses and programmed cell death in plants
(Chen et al., 2004), the malectin‐like receptor kinases (MLRs)
orchestrating the plant immune responses and the accom-
modation of fungal and bacterial symbionts (Ortiz‐Morea
et al., 2022), the multidrug and toxic compound extrusion
(MATE) proteins involved in xenobiotic detoxification and
multidrug resistance (Diener et al., 2001), as well as the
ubiquitin‐conjugating (E2) enzymes emerging in recent years
as an important regulatory factor underlying plant innate im-
munity (Zhou et al., 2017). Moreover, the results also showed
that genes encoding receptors for tyrosine‐sulfated glyco-
peptide (PSY1Rs) and phytosulfokine (PSKRs), belonging to
the leucine‐rich repeat receptor kinases (LRR‐RKs), have
been undergoing contraction in all alpine species. PSKR1
and PSY1R have been shown to be involved in plant im-
munity, with antagonistic effects on bacterial and fungal re-
sistances (Mosher et al., 2013).
The largest class of disease‐resistance genes comprises genes
encoding nucleotide‐binding site and leucine‐rich‐repeat
domain receptors (NBS‐LRRs). Three NBS‐LRR gene sub-
classes, TIR‐NBS‐LRR (TNL), CC‐NBS‐LRR (CNL), and
RPW8‐NBS‐LRR (RNL), have been characterized based on
the N‐terminal domains (McHale et al., 2006). We manually
annotated NBS‐LRR genes in the sampled genomes using
the HMMER search with the Pfam database (Wheeler and
Eddy, 2013). In total 5,655 NBS‐LRR genes, including 4,058
CNLs, 1,024 TNLs and 173 RNLs were identified among all
analyzed genomes (Figure S6). Additionally, we inferred the
NBS‐LRR gene tree to examine the gains and losses of NBS‐
LRR genes in alpine species. The results showed that,
compared with lowland relatives, most alpine species tend to
lose more NBS‐LRR genes and exhibit a reduced copy
number, while E. heterophyllum, S. obvallata and R. alexandrae
have similar numbers compared with their closest relatives
(Figure S7). In addition, genes functioning in cellular
transport pathways, including cytoskeleton organization,
actin filament‐based process, export across the plasma
membrane and export from the cell, were shown to be contracted
in alpine plants (Figure 1D; Table S15). These processes
may be a component of the plant immune system and
possibly have undergone simplification due to the pathogen‐
depauperate environments of high altitudes.
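The subclass assignment described above depends on which domain precedes the nucleotide-binding site. A simplified sketch of that rule is given below. It assumes that a domain search has already produced an ordered list of domain labels for each protein; the labels ("TIR", "CC", "RPW8", "NB-ARC", "LRR"), the function name and the example proteins are hypothetical placeholders, and the sketch neither runs HMMER nor checks for the LRR region.

```python
def classify_nbs_lrr(domains):
    """Assign an NBS-LRR subclass from domain labels ordered from the
    N-terminus to the C-terminus. Returns None when no NBS (NB-ARC)
    domain is present."""
    if "NB-ARC" not in domains:
        return None
    n_terminal = domains[:domains.index("NB-ARC")]
    if "TIR" in n_terminal:
        return "TNL"
    if "RPW8" in n_terminal:
        return "RNL"
    if "CC" in n_terminal:
        return "CNL"
    return "unclassified"

# Hypothetical proteins with their ordered domain labels.
proteins = {
    "geneA": ["TIR", "NB-ARC", "LRR"],
    "geneB": ["CC", "NB-ARC", "LRR"],
    "geneC": ["RPW8", "NB-ARC", "LRR"],
    "geneD": ["NB-ARC", "LRR"],
    "geneE": ["LRR"],
}
subclasses = {name: classify_nbs_lrr(d) for name, d in proteins.items()}
```

Counting the resulting labels per genome would give per-species TNL, CNL and RNL totals comparable between alpine species and their lowland relatives.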
A similar phenomenon was also described in the case
study of the C. himalaica genome, in which the most sig-
nificantly contracted gene families were functionally enriched
for disease and immune response pathways (Zhang et al.,
2019). Due to the harsh environments characterized by
freezing temperatures, aridity, and high UV radiation, a rea-
sonable hypothesis is that gene families involved in pathogen
or toxicant defense have undergone contraction in alpine
plants, as fewer microorganisms exist. These results suggest
that the contraction of immune system genes might be an
energy‐saving strategy to mitigate genetic loads for surviving
in hostile environments with few pathogens present. Conversely,
the contraction of disease‐resistance genes also implies
that alpine plants may not adapt to the “comfortable”
environments of lowland areas where normal pathogens are
present. Therefore, with a faster pace of global warming
leading to the loss or destruction of alpine habitats, in situ
conservation for alpine biodiversity is necessary and pref-
erable to ex‐situ preservation.
Tests for convergent positive selection
In harsh environments, positive selection is expected to be
common in genes controlling early life history stages, such as
genes involved in reproduction and development (Cui et al.,
2019). Our branch‐site tests identified 36 convergently se-
lected genes that show signatures of positive selection in
more than three alpine species (Table S17). These genes
were functionally related to basic life processes involving
reproduction and respiration, such as carpel, gynoecium,
ovule and endosperm development and photorespiration
pathways (Tables S18, S19). Examples include the SEPALLATA
(SEP) MADS‐box genes required for floral organ and
meristem identity (Pelaz et al., 2000), the MALE MEIOCYTE
DEATH 1 (MMD1) gene regulating cell‐cycle transitions
during male meiosis (Yang et al., 2003), the NIJMEGEN
BREAKAGE SYNDROME 1 (NBS1) gene involved in double‐
stranded break repair, DNA recombination and maintenance
of telomere integrity in the early stages of meiosis (Zhang
et al., 2006), and the HYDROXYPYRUVATE REDUCTASE
(HPR) gene localized in leaf peroxisomes functioning in the
glycolate pathway of photorespiration (Mano et al., 1999). In
addition, glyoxylate and dicarboxylate metabolism, the most
significantly enriched KEGG pathways (Table S19), are fun-
damental biochemical processes that ensure a constant
supply of energy to living cells. These convergently selected
genes detected in our analyses are likely to contribute to the
primary adaptation of alpine plants to similar extreme envi-
ronments.
Detection of molecular convergence
We investigated signatures of genes undergoing molecular
convergence among alpine species using a combination of
approaches for the detection of convergent evolutionary rate
shifts and site‐based estimation of convergent amino acid
(AA) evolution. These approaches have been commonly used
to investigate the genomic signatures of convergent evolu-
tion, including convergent adaptation to seasonal habitat
desiccation in African killifishes (Cui et al., 2019), convergent
regulatory evolution and loss of flight in paleognathous birds
(Sackton et al., 2019), and convergent evolution of extreme
lifespan in Pacific Ocean rockfishes (Kolora et al., 2021).
Convergent shifts in evolutionary rates were detected using
the RERconverge method (Kowalczyk et al., 2019), which
estimates the correlation between relative evolutionary rates
(RERs) of protein sequences and the evolution of a con-
vergent binary or continuous trait across a phylogeny. Our
analysis focused on positive correlations, representing genes
with faster evolutionary rates in alpine species relative to
species living in low elevations. An increased RER could arise
due to the relaxation of a constraint or positive selection,
which could be adaptive to habitat‐related changes
(Kowalczyk et al., 2020). We identified 69 gene families un-
dergoing convergently accelerated RERs in alpine species,
significantly enriched in 93 GO terms and 12 KEGG pathways
(P‐adjust <0.05) (Tables S20,S21), involving pathways for
self‐incompatibility, cell wall modification, DNA repair and
stress resistance (Figure 2). Among them, 26 gene families
were found to have convergent AA shifts by a PCOC analysis
(Figure S8). The PCOC method considers shifts in AA pref-
erence instead of identical substitutions (Rey et al., 2018).
Given the relatively large phylogenetic distances among our
analyzed species, selecting only sites that converged to the
exact same AA in all species is quite strict and is bound to
capture only a subset of the substitutions associated with the
convergent trait change. Below, we describe three main
biological processes that have undergone molecular convergence
and may contribute to adaptation to high altitudes.
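The idea behind the rate-shift analysis can be illustrated with a deliberately simplified sketch. It is not the RERconverge implementation, which applies branch-length transformation and weighting that are omitted here. The sketch assumes that a relative evolutionary rate can be approximated as the residual of a gene's branch lengths regressed on genome-wide average branch lengths, and it compares alpine and lowland branches with the Mann–Whitney U statistic that underlies the Wilcoxon rank-sum test cited in the Figure 3 legend. All function names and numbers are hypothetical.

```python
def relative_rates(gene_lengths, average_lengths):
    """Residuals of an ordinary least-squares fit of gene branch
    lengths on average branch lengths (a simplified RER)."""
    n = len(gene_lengths)
    mean_x = sum(average_lengths) / n
    mean_y = sum(gene_lengths) / n
    sxx = sum((x - mean_x) ** 2 for x in average_lengths)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(average_lengths, gene_lengths))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x)
            for x, y in zip(average_lengths, gene_lengths)]

def mann_whitney_u(foreground, background):
    """U statistic: number of (foreground, background) pairs in which
    the foreground value is larger, with ties counted as one half."""
    u = 0.0
    for f in foreground:
        for b in background:
            if f > b:
                u += 1.0
            elif f == b:
                u += 0.5
    return u

# Hypothetical branch lengths: first three branches alpine, last three lowland.
average = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
gene = [1.5, 2.5, 3.5, 0.5, 1.5, 2.5]
rer = relative_rates(gene, average)
u_alpine = mann_whitney_u(rer[:3], rer[3:])
```

In this toy case every alpine branch has a positive residual and every lowland branch a negative one, so U reaches its maximum of 3 × 3 = 9, which is the pattern expected for an alpine-accelerated gene.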
Self‐incompatibility (SI) system
Self‐incompatibility in many flowering plants is controlled by
the S (sterility) locus (Takayama and Isogai, 2005). The loss of
SI genes in A. thaliana is responsible for the evolutionary
transition to the self‐fertile mating system (Sherman‐Broyles
et al., 2007). Although self‐fertilization is often thought to lead
to the decreased fitness of homozygous offspring, this mode
ensures reproduction in the absence of pollinators or suitable
mates, and therefore can be advantageous for plants to oc-
cupy niches in harsh environments (Goodwillie et al., 2005).
Our results revealed that the largest orthogroup (OG0000000),
which includes the Arabidopsis S‐receptor kinase (SRK)
genes and the S‐locus flanking gene ARK3 (RECEPTOR
KINASE 3), has undergone evolutionary rate acceleration and
convergent AA shifts in alpine species (Figures 3, S9). Gene
Ontology analysis showed that these genes were functionally
enriched for the process of reproduction (recognition of
pollen and pollination) and the immune
system (immune response). Furthermore, we identified the
functional domain using NCBI's conserved domain database
(CDD) (Lu et al., 2020). The result showed that these proteins
contained the S‐locus glycoprotein domain (Pfam00954),
confirming their functions in the SI system. Two sites located
on the S‐locus glycoprotein domain were found to have un-
dergone a convergent AA shift (Figure S9).
Figure 2. Functional enrichment of gene families undergoing convergent evolutionary rate acceleration in alpine plant genomes
(A) Significantly enriched Gene Ontology (GO) terms. Each bubble represents a summarized GO term from the full GO list by reducing functional
redundancies, and their closeness on the plot reflects their closeness in the GO graph, that is, the semantic similarity. (B) Top 20 enriched Kyoto
Encyclopedia of Genes and Genomes (KEGG) pathways. “p‐adj” refers to the adjusted P‐value using the Benjamini–Hochberg method.
Loss of function at the S‐locus in alpine Crucihimalaya
genomes, possibly due to relaxed selection, has been reported
(Zhang et al., 2019; Feng et al., 2022), with a similar
phenomenon found in the genome of the high‐altitude Andean maca
(Lepidium meyenii) (Zhang et al., 2016). Our branch‐site
tests did not detect any signatures of positive selection on
these genes, suggesting that the acceleration of evolutionary
rates may be the result of relaxed constraints. Relaxation of
selection on the S‐locus genes was tested using the RELAX
model (Wertheim et al., 2015), which estimates the relaxation
intensity parameter k, with k > 1 indicating intensified
selection (i.e., positive or purifying selection) and k < 1 indicating
relaxed selection. The results showed that five
(C. himalaica, E. heterophyllum, H. vulgare var. nudum,
S. brachista and S. obvallata) of the seven alpine plants exhibited
significantly relaxed selection on S‐locus genes
(P‐value <0.05; Table S22). Bingham and Ort (1998) reported
that low levels of insect diversity, abundance and activity
often occurred in alpine ecosystems and hypothesized that
these factors might limit the pollination of alpine plants.
Species living in isolated habitats like alpine environments or
ocean islands are thus less likely to be SI, in line with “Baker's
Law,” which assumes that pollen limitation may be an important
force driving the transition of mating systems
(Cheptou, 2012). Pollination biology studies have shown
several cases of autonomous selfing in various taxonomically
distant species within HHM communities, although the pro-
portion of self‐pollinated species has not yet been calculated
to test the hypothesis (Sun et al., 2014). The convergent
acceleration of evolutionary rates of S‐locus genes due to
Figure 3. Four representative cases of alpine‐accelerated genes
Signatures of convergent evolutionary rate shifts were detected using the RERconverge method (P < 0.05, Wilcoxon rank‐sum test). Each panel represents
the estimated relative evolutionary rates (RERs) for a gene in alpine species (points in cyan labeled with species names) as well as in their lowland relatives
(points in orange labeled with species names) and ancestral branches (plotted in a single row at the base of the y‐axis). A gene's RER for a given branch
represents how quickly or slowly the gene is evolving on that branch relative to its overall rate of evolution throughout the tree.
relaxed selection provides convincing genetic evidence of the
evolutionary transition from SI to the self‐compatibility mating
system of alpine plants, which is potentially a convergently
adaptive process for alpine plants to facilitate their re-
production and the occupation of alpine niches.
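To make the role of the intensity parameter k concrete: in the RELAX framework (Wertheim et al., 2015), each ω (dN/dS) rate class on the test branches is modelled as the corresponding reference‐branch ω raised to the power k. The short Python sketch below uses made‐up ω values (they are not estimates from this study, and the function name is hypothetical) to show how k < 1 moves both purifying (ω < 1) and positive (ω > 1) classes toward neutrality (ω = 1).

```python
def foreground_omega(reference_omega: float, k: float) -> float:
    """Test-branch omega implied by RELAX: omega_test = omega_reference ** k."""
    if reference_omega < 0 or k < 0:
        raise ValueError("omega and k must be non-negative")
    return reference_omega ** k


# Hypothetical reference-branch omega classes: strong purifying,
# weak purifying, and positive selection.
reference_classes = [0.05, 0.5, 4.0]

for k in (0.5, 1.0, 2.0):
    shifted = [round(foreground_omega(w, k), 3) for w in reference_classes]
    # k < 1 shrinks the spread around omega = 1 (relaxation);
    # k > 1 widens it (intensification).
    print(f"k = {k}: {shifted}")
```

With k = 0.5, for instance, a reference class of ω = 4 becomes ω = 2 and a class of ω = 0.25 becomes ω = 0.5, so selection of either sign is weakened.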
Cell wall modification
The cuticle membrane lies over and merges into the outer
wall of epidermal cells (Martin and Juniper, 1970). The pri-
mary role of the cuticle, composed of cutin and cuticular
waxes, is to mitigate water loss and excessive UV radiation
by functioning as a physical barrier between the plant surface
and its external environment (Kerstiens, 1996). Cuticular
waxes are composed of a variety of organic solvent‐soluble
lipids, consisting of very‐long‐chain (VLC) fatty acids and
their derivatives, as well as secondary metabolites like fla-
vonoids (Pollard et al., 2008). The cell wall modification
pathway was found to be significantly enriched in genes
undergoing convergent positive selection (Table S18). Addi-
tionally, in the examination of convergent changes in gene
family number, we found the most significantly enriched
KEGG pathway of co‐expanded genes to be cutin, suberine
and wax biosynthesis (Table S14), corresponding to the en-
riched suberin biosynthetic process in the GO analysis (Table
S13), including genes encoding fatty acyl‐CoA reductases
(FARs). Fatty acyl‐CoA reductases catalyze the formation of
fatty alcohols, which are common components of plant sur-
face lipids (i.e., cutin, suberin, and associated waxes). We did
not detect convergent acceleration of the evolutionary rate of
FARs, suggesting possible modifications of FARs through the
increase in gene copy number. The results show that an al-
dehyde decarbonylase enzyme CER1 underwent a con-
vergent acceleration of the evolutionary rate in alpine plants.
Overexpression of the A. thaliana CER1 gene was reported to
promote wax VLC alkane biosynthesis and influence plant
response to biotic and abiotic stresses (Bourdenx et al.,
2011). The convergent evolution of genes involved in the
cutin, suberine and wax biosynthesis implies that cuticular
waxes may function as protective screens against UV radia-
tion to protect anatomical structures and dissipate excess
light energy for alpine plants.
In addition, we detected positive selection and molecular
convergence of three MYB transcription factors, including
MYB27, MYB48, and MYB59 (Table S17). Among them,
MYB27 was reported to play a role in regulating the accumu-
lation of anthocyanins (Albert et al., 2014), a class of flavonoids.
In the STRING database (Szklarczyk et al., 2021), MYB48 was
predicted to interact with proteins that are involved in flavonoid
biosynthesis, including F3H (naringenin, 2‐oxoglutarate 3‐
dioxygenase), catalyzing the 3‐β‐hydroxylation of 2S‐
flavanones to 2R,3R‐dihydroflavonols which are intermediates
in the biosynthesis of flavonols and anthocyanidins, FLS1
(flavonol synthase/flavanone 3‐hydroxylase), catalyzing the
formation of flavonols from dihydroflavonols, DFR
(dihydroflavonol reductase), catalyzing the conversion of
dihydroquercetin to leucocyanidin, and TT5, a member of
chalcone‐flavanone isomerase family protein (Figure S10;
Table S23). Flavonoids function as antioxidants that reduce
DNA damage induced by abiotic stresses such as extreme
temperatures, UV radiation and drought, and thus play critical
roles in species adapting to high‐altitude environments (Agati
et al., 2012). Many examples of whole genome studies of alpine
plants have shown that expansion and/or positive selection of
genes involved in flavonoid biosynthesis constitute an im-
portant part of the genomic footprint of alpine adaptation (Zeng
et al., 2015; Chen et al., 2019; Wang et al., 2021). Taken together,
modifications of the cell wall in alpine plants through
evolutionary expansion and adaptive convergence of genes
involved in the biosynthesis of cuticular waxes and flavonoids
might be vital strategies for adaptation to dramatic weather
changes and extensive UV radiation in high altitudes.
With the newly generated transcriptomic data of S. ob-
vallata, we were able to investigate expression patterns of
genes related to the biosynthesis of cuticular waxes and
flavonoids. Five tissues with three biological replicates, in-
cluding three from leaves (basal leaves [JL], middle leaves
[ML], and bract leaves [BL]) and two from flowers and stems,
were sampled (Figure 4A; Table S3). After mapping the RNA‐seq
data to the assembled genome of S. obvallata, 25,096
genes had expression profiles and were retained for differential
gene expression (DEG) analysis. The results show that,
compared with basal and middle leaves, BL exhibit 1,071 significantly
upregulated genes and 798 significantly downregulated genes
(Figure S11). Upregulated genes were significantly enriched
in cytochrome P450, cutin, suberine and wax biosynthesis
and isoflavonoid biosynthesis KEGG pathways (Figure 4B),
and downregulated genes were functionally related to pho-
tosynthesis, energy metabolism as well as plant−pathogen
interaction pathways (Figure S12). Furthermore, we analyzed
expression profiles of genes involved in the biosynthesis of
cuticular waxes and flavonoids. The results showed that
many genes had higher expression levels in BL than in other
leaf tissues, for example, CER1, CER3, CER4, and MAH1 in
the cuticular wax biosynthetic pathway (Figure 4C), and 4CL,
CHS, CHI, F3″H, TT7 and OMT in the flavonoid biosynthetic
pathway (Figure 4D). These results suggest that the accumulation
of cuticular waxes and flavonoids is an important
component of the transition from normal leaves to leafy bracts. Our
findings provide new insights into the genetic basis of the
specialized “glasshouse” morphology for a better understanding
of plant morphological adaptation.
DNA repair and stress resistance pathways
The hypoxia and intense UV radiation in alpine environments
exert high abiotic stress that can cause DNA, RNA, and protein
damage. DNA repair processes thus play an important role in
the high‐altitude adaptation of plants, similar to evidence in
alpine animals (Li et al., 2018). The NBS1 gene, involved in DNA
repair, cellular response to DNA damage stimulus and double‐
stranded break repair, was found to be under convergent se-
lection in alpine genomes (Table S17). Moreover, several sig-
nificantly enriched GO terms related to DNA repair and protein
ubiquitination were found to be undergoing convergent evolution
in alpine plants (Figure 2A, Table S17). Similar KEGG pathways
were also identified, including mismatch repair, homologous
recombination and nucleotide excision repair (Figure 2B, Table
S18). Examples include the UEV1 genes, enriched in the protein
K63‐linked ubiquitination pathway, which reportedly play a role in
DNA damage responses and error‐free post‐replicative DNA
repair by participating in lysine‐63‐based polyubiquitination
reactions (Wen et al., 2008), and genes encoding OB‐fold
proteins, which were found to be important for genomic stability,
including DNA replication, recombination, repair, and telomere
homeostasis (Flynn and Zou, 2010).
In addition to low oxygen levels and excessive UV radia-
tion, high‐altitude environments pose various threats to living
organisms from an unpredictable climate. We found that
many genes with convergent evolutionary rate shifts were
significantly enriched in stress resistance pathways, such as
genes involved in the hormone signal transduction (cell rec-
ognition, intracellular signal transduction, and cytokinin‐
activated signaling pathway), the cell rhythm system (circa-
dian rhythm, cell‐cycle phase transition, and programmed
cell death), and the regulation of enzyme activity pathways
(regulation of kinase activity, regulation of GTPase activity,
and regulation of transferase activity) (Figure 2A, Table S17).
Also, some KEGG pathways were found to be significantly
enriched in metabolic pathways that may contribute to stress
resistance, such as plant hormone signal transduction, phe-
nylpropanoid biosynthesis, and glycine, serine, and threonine
Figure 4. Highly expressed genes in bract leaves of Saussurea obvallata
(A) Sampling information of S. obvallata transcriptome data. Five tissues, including three from leaves (basal leaves [JL], middle leaves [ML] and bract leaves [BL]) as well as flowers (F) and stems (S), were sampled. (B) Top 20 KEGG pathways of significantly upregulated genes in bract leaves. “p‐adj”
refers to the adjusted P‐value using the Benjamini–Hochberg method. (C, D) Expression profiles of genes involved in cuticular wax (C) and flavonoid (D) biosynthetic
pathways in different tissues of S. obvallata. Highly expressed genes in BL are shown in red in the simplified pathway models. The bar represents the
expression level of each gene (z‐score). Abbreviations: 4CL, 4‐coumarate: CoA ligase; CER, protein eceriferum; CHI, chalcone isomerase; CHS, chalcone synthase;
DFR, dihydroflavonol 4‐reductase; F3H, flavanone 3‐hydroxylase; F3″H, flavonoid 3″‐hydroxylase; FA‐CoA, fatty acyl‐coenzyme A; FAR, fatty acyl‐CoA
reductase; FLS, flavonol synthase; LDOX/ANS, leucoanthocyanidin dioxygenase/anthocyanidin synthase; MAH1, midchain alkane hydroxylase 1; OMT,
O‐methyltransferase; TT7, transparent testa 7; VLC, very‐long‐chain; WSD1, diacylglycerol acyltransferase 1.
metabolism (Figure 2B; Table S18). Plant hormones like cytokinin
and ethylene play pivotal regulatory roles in plant
growth and development, including cell division, shoot initiation,
light responses, and leaf senescence. We found that
the type‐A Arabidopsis response regulator (ARR) protein
family, involved in response to cytokinin and cytokinin signal
transduction, undergoes molecular convergence in alpine
plants. An experimental study revealed that arr mutants show
altered red‐light sensitivity, indicating an important role of
type‐A ARRs in light adaptation (To et al., 2004). The circa-
dian clock was selected as a mechanism in the control of
cell‐cycle progression to avoid sunlight‐induced DNA
damage in ancient unicellular organisms (Hut and Beersma,
2011). Circadian rhythms in plants regulate multiple proc-
esses such as photosynthesis, flowering, seed germination,
and senescence (Srivastava et al., 2019). Genes involved in
the regulation of the circadian rhythm were found to have
convergently accelerated evolutionary rates in alpine plants,
such as the transcription factor MYB59, which participates in
the regulation of the cell cycle, mitosis and root growth by
controlling the duration of metaphase. The myb59 mutant
was found to have longer roots, smaller leaves and smaller
cells than wild‐type plants (Fasani et al., 2019), traits that are
commonly observed in alpine plants. Taken together, these
results suggest that adaptation to high altitude requires the
participation of multiple biological processes (BPs).
Summary and perspectives
Taking advantage of an integrative genomic data set of alpine
plants, our study unraveled the convergent genetic changes
that confer high‐altitude adaptation (Figure 5). Both molecular
convergence in changes of gene copy number and accel-
erated evolutionary rates are consequences of selective
pressures posed by the surrounding environment. Thus, the genomic
signatures of convergent evolution detected here are
direct evidence that alpine plants independently colonized,
evolved in and adapted to the extremely cold, high‐UV and
hypoxic environments of the
HHMs. The alpine plants included in this study belong to
taxonomically distant plant families, hence standing genetic
variation and localized introgression of regions of the genome
can be ruled out as probable causes of genomic
Figure 5. Summary of convergent adaptation to high‐altitude environments for seven alpine species
The outer circle shows examples of enriched Gene Ontology (GO)‐terms (biological pathways). The middle circle shows examples of candidate genes
named after the A. thaliana orthologs. The inner gray circle shows environmental stresses from high altitudes. Species names are provided below the
picture. *Genes having undergone convergent contraction.
convergence. Identical de novo mutations must therefore
have occurred independently in each taxon during evolution
from a low‐altitude ancestor. In addition to molecular con-
vergence, some results that do not cover all alpine plants
may imply the existence of species‐specific adaptation
strategies, which provide the foundation for further investigations
into high‐altitude adaptation. Notably, our study is
limited by a lack of functional validation of the identified
molecular changes, since transgenic experiments are not yet
realistic for most alpine plants. Nonetheless, verifying the role
of MYB transcription factors in the development of the specialized
“glasshouse” morphology by quantitative real‐time
PCR is feasible in future studies. Collectively, our results
generated novel insights into the genomic bases of adaptation to
extreme environments at high taxonomic levels of angiosperms,
while providing valuable genomic resources for
plants living in specialized habitats. Further genomic studies of
high‐altitude adaptation should provide detailed evidence
related to morphological and physiological specializations
using additional data, such as proteomes and metabolomes,
while also referring to discoveries from phylogenetically distant
taxa to evaluate evolutionary convergence in similar
environments.
MATERIALS AND METHODS
Plant material, DNA extraction, and sequencing
Fresh leaves of wild S. obvallata and R. alexandrae individuals
were collected from Songpan county, Sichuan Province, China
(102°45′E, 30°23′N) and Linzhi county, Tibet Province, China
(102°45′E, 30°23′N), respectively. All samples were sent to
Wuhan Benagen Technology Company Limited (Wuhan,
China) for genome sequencing. Total genomic DNA was ex-
tracted using the Qiagen DNeasy Plant Mini Kit. For the Illu-
mina short reads, DNA libraries with 500‐bp insert sizes were
constructed and sequenced using an Illumina HiSeq 4000
platform (Illumina Inc., USA). For Oxford Nanopore sequencing,
libraries were prepared using the SQK‐LSK109 ligation kit following
the standard protocol, and the purified library was loaded onto
primed R9.4 Spot‐On Flow Cells and sequenced using a
PromethION sequencer (Oxford Nanopore Technologies,
Oxford, UK). The Hi‐C sequencing was performed as follows:
extracted DNA was first crosslinked by 40 mL of 2% form-
aldehyde solution to capture interacting DNA segments, the
chromatin DNA was then digested with the DpnII restriction
enzyme, and libraries were constructed and sequenced using
an Illumina HiSeq 4000 instrument with 2 × 150 bp reads.
For RNA sequencing, fresh tissue samples were collected
and immediately frozen in liquid nitrogen. Three biological
replicates of five tissues of S. obvallata were sampled. Total
RNA was extracted using the TRIzol® Reagent (Invitrogen,
Shanghai, China). Paired‐end cDNA libraries were con-
structed using TruSeq Stranded mRNA Library Prep Kit
(Illumina) and were sequenced using the Illumina HiSeq
4000 platform.
Genome assembly and quality control
Prior to genome assembly, Kmerfreq (https://github.com/
fanagislab/kmerfreq) was used for counting k‐mer frequency
with the k‐mer set to 19, and GCE v1.0 was used for esti-
mating the genome size (Liu et al., 2013). The ONT long reads
were corrected and assembled using NextDenovo v2.3.1
(https://github.com/Nextomics/NextDenovo) with default pa-
rameters. Assembled contigs were polished using NextPolish
v1.4.1 (Hu et al., 2020) with the long reads and Pilon v1.23
(Walker et al., 2014) with Illumina short reads for three rounds.
Hi‐C data were used to infer chromosome conformation using
the 3D‐DNA pipeline with default parameters. The accuracy of
the Hi‐C‐based chromosomal assembly was improved using
Juicebox's chromatin contact matrix (Dudchenko et al., 2017).
The completeness and continuity of the assemblies were
evaluated by the statistics of BUSCO and LAI, respectively.
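The k‐mer‐based genome size estimate mentioned above rests on a simple relationship: genome size is approximately the total number of k‐mers divided by the depth at the main peak of the k‐mer frequency histogram. The following Python sketch illustrates only that core idea on toy input. It is not the Kmerfreq/GCE implementation, which additionally models sequencing errors, heterozygosity and repeats; the function names and the `min_depth` filter are assumptions made for illustration, and reverse‐complement (canonical) k‐mers are ignored for brevity.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_histogram(kmer_counts):
    """Map depth -> number of distinct k-mers observed at that depth."""
    return Counter(kmer_counts.values())


def estimate_genome_size(histogram, min_depth=2):
    """Total k-mers divided by the depth of the main histogram peak.

    Depths below `min_depth` are dropped as likely sequencing errors
    (an assumption of this sketch).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=lambda d: usable[d])
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram: an error peak at depth 1 and a main peak at depth 20.
toy_histogram = Counter({1: 500, 19: 200, 20: 1000, 21: 200})
print(estimate_genome_size(toy_histogram))
```

In practice the histogram would come from all short reads with k = 19, as stated in the text, rather than from a hand‐written dictionary.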
Genome annotation
Repeat annotation
TEs were identified based on de novo and homology‐based
strategies. RepeatMasker v4.0.7 was used to run a homology
search for known repeat sequences against the Repbase
database v22.11 (Tarailo‐Graovac and Chen, 2009). Re-
peatModeler v2.0.10 was used to predict the TEs based on
the de novo method (Flynn et al., 2020). LTRharvest v1.5.10,
LTR_FINDER v1.05 and LTR_retriever v1.8.0 were used to
build an LTR library with default parameters (Xu and Wang,
2007; Ellinghaus et al., 2008; Ou and Jiang, 2018). Finally,
RepeatMasker was used to merge the library files of the two
methods and to identify the repeat contents.
Gene prediction
A combination of de novo‐, homology‐, and transcript‐based
methods was used for gene prediction in both genomes. RNA‐
seq reads were assembled using Trinity v2.1.1 using the de
novo‐based and genome‐guided modes, respectively. Coding
DNA sequences (CDS) and protein sequences were predicted
with TransDecoder (http://transdecoder.github.io). Homologues
were predicted by mapping protein sequences using GeMoMa
v1.6.1 (Keilwagen et al., 2016). Sequences of Arabidopsis
thaliana, Oryza sativa and Solanum tuberosum were mapped to
both genomes. Additionally, sequences of Helianthus annuus,
Lactuca sativa, Cynara cardunculus and Mikania micrantha
were mapped to S. obvallata, and sequences of Fagopyrum
tataricum and Rumex hastatus were mapped to R. alexandrae,
respectively. A de novo gene prediction was performed with
Braker2 v2.1.5 and GlimmerHMM v3.0.4 (Majoros et al.,
2004; Bruna et al., 2021). Assembled transcripts were used for
training gene models in Braker2. Gene models from the three
main sources were merged to produce consensus models
using EVidenceModeler v1.1.1 (Haas et al., 2008).
Gene functional annotation
The annotated protein‐coding genes were used for a BLAST
search against the UniProt and NCBI nonredundant protein
databases to predict gene functions. The functional domains
of protein sequences were identified by InterProScan v5.51‐
85.0 using data from Pfam (Jones et al., 2014). Kyoto Ency-
clopedia of Genes and Genomes and GO terms of the
gene models were obtained using eggNOG‐mapper v2.0.1
(Cantalapiedra et al., 2021).
Orthogroup inference and alignment
Protein sequences from the seven alpine genomes, as well as
13 relatives living at low altitudes, were used for subsequent
comparative analyses. OrthoFinder v2.5.4 was used to construct
orthogroups for all species with default settings (Emms and
Kelly, 2019). Because the included species are phylogenetically
distant (widely dispersed across the eudicots and monocots),
OrthoFinder recovered an extremely low number of strictly
single‐copy orthogroups. We therefore reduced orthogroups
that contained multiple gene copies per species to one copy per
species based on the smallest genetic distance following the
study of Birkeland et al. (2020). Briefly, all orthogroups were
aligned based on protein sequence using MAFFT v7.3 (Katoh
and Standley, 2013), and genetic distances between all pairs of
genes were calculated as Kimura protein distances (Kimura,
1980). One gene copy per species was retained based on the
smallest genetic distance to the longest protein sequence of A.
thaliana in each orthogroup. The rationale is that the annotation
of the A. thaliana genome is complete and reliable, and our
subsequent functional analyses of candidate genes were based
on the A. thaliana orthologs. The resulting orthogroup se-
quences that contained only one protein sequence per species
were realigned using PRANK v170427 (Löytynoja, 2014). Coding
sequences of genes in orthogroups were extracted based on
the same gene identifier of protein sequences and were aligned
using PRANK with the codon mode.
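The copy‐selection step described above can be sketched as follows. The sketch assumes that the Kimura protein distance is computed with the commonly used approximation D = −ln(1 − p − 0.2p²), where p is the fraction of differing residues at ungapped alignment columns; whether the original scripts used exactly this form is an assumption. Function names and sequences are hypothetical.

```python
import math


def kimura_protein_distance(seq_a, seq_b):
    """Kimura protein distance between two aligned sequences.

    Columns with a gap in either sequence are skipped. Returns infinity
    when the sequences share no columns or are too divergent for the
    formula to be defined.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        return math.inf
    p = differences / compared
    inner = 1.0 - p - 0.2 * p * p
    return -math.log(inner) if inner > 0 else math.inf


def pick_closest_copy(reference, copies):
    """Return the gene ID whose aligned sequence is closest to the reference.

    `reference` stands for the longest A. thaliana protein in the
    orthogroup; `copies` maps gene IDs of one species to aligned sequences.
    """
    return min(copies, key=lambda gid: kimura_protein_distance(reference, copies[gid]))


# Hypothetical aligned sequences for one orthogroup and one species.
reference = "MKTAYLQ-RS"
copies = {"sp1_g1": "MKTAYLQARS", "sp1_g2": "MRTGYVQ-RS"}
print(pick_closest_copy(reference, copies))
```

Applying `pick_closest_copy` once per species and orthogroup yields the one‐copy‐per‐species sets that are then realigned.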
Phylogeny and estimation of divergence time
Protein sequences of 195 single‐copy orthogroups were
used for phylogenetic inference with RAxML v8.2.12 using
the PROTGAMMAAUTO substitution model with 500 boot-
strap replicates (Stamatakis, 2014). Divergence time was
estimated in the MCMCtree program from PAML v4.9j (Yang,
2007), using the approximate likelihood method with the tree
topology inferred with RAxML, an independent substitution
rate (clock = 2), and the HKY85 + GAMMA model. Three
calibration points were assigned based on the TimeTree
database (Kumar et al., 2017) (http://www.timetree.org/, accessed
on May 1, 2022): the MRCA (most recent common
ancestor) of rosids (95% HPD 105–115 Mya), the MRCA of
Pentapetalae (95% HPD 110–124 Mya), and the MRCA of
Mesangiospermae (95% HPD 148–173 Mya). Samples were
drawn every 1,000 MCMC steps from a total of 10^6 steps,
with a burn‐in of 10^5 steps. Convergence was assessed by
comparing parameter estimates from two independent runs,
with all effective sample sizes >200.
Gene family evolution
Changes in gene family number were examined using CAFE5
(Mendes et al., 2020). In addition to the base model, the
number of gamma categories (‐k) was set to estimate sepa-
rate lambda values for different lineages in the tree (Gamma
model). The highest likelihood was found using k = 2 rate
categories (−lnL = 286114), with λ = 0.00553 and α = 1.58.
Gene family expansions or contractions were identified only
when the change in gene count was significant with a
P‐value < 0.05. Genome‐wide NBS‐LRR genes were manually
identified using HMMER v3.2 with an e‐value cut‐off of 1e−05 (Wheeler
and Eddy, 2013). The NBS‐LRR protein domains (NB‐ARC:
PF00931; RPW8: PF05659; TIR: PF01582; LRR: PF00560,
PF07723, PF07725 and PF12799) were retrieved from Pfam
(http://pfam.xfam.org) and were used to identify conserved
motifs of NBS‐LRR genes in the sampled genomes. The ML
phylogenetic tree of NBS‐LRR was constructed using
RAxML. Then, Notung v2.9 was used to recover gains and
losses of NBS‐LRR genes by reconciling the NBS‐LRR gene
tree (Chen et al., 2000). The concatenated tree reconstructed
by RAxML was used as input topology.
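As an illustration of the domain‐based identification of NBS‐LRR genes, the sketch below filters per‐gene Pfam hits at the stated e‐value cut‐off and keeps genes that carry the NB‐ARC domain. The input format (gene ID, Pfam accession, e‐value), the function name and the class labels such as "TIR-NBS-LRR" are assumptions made for this example; they do not reproduce the authors' scripts or the HMMER output format.

```python
from collections import defaultdict

NB_ARC = "PF00931"
# Pfam accessions listed in the text, grouped by domain type.
DOMAIN_LABELS = {
    "PF01582": "TIR",
    "PF05659": "RPW8",
    "PF00560": "LRR",
    "PF07723": "LRR",
    "PF07725": "LRR",
    "PF12799": "LRR",
}
EVALUE_CUTOFF = 1e-5


def classify_nbs_genes(hits):
    """Label genes carrying an NB-ARC domain by their accessory domains.

    `hits` is an iterable of (gene_id, pfam_accession, evalue) tuples.
    Hits weaker than the e-value cut-off are ignored.
    """
    domains_by_gene = defaultdict(set)
    for gene_id, pfam_id, evalue in hits:
        if evalue <= EVALUE_CUTOFF:
            # Drop a version suffix such as ".25" if present.
            domains_by_gene[gene_id].add(pfam_id.split(".")[0])

    classes = {}
    for gene_id, pfams in domains_by_gene.items():
        if NB_ARC not in pfams:
            continue  # not an NBS gene
        extras = {DOMAIN_LABELS[p] for p in pfams if p in DOMAIN_LABELS}
        parts = [name for name in ("TIR", "RPW8") if name in extras]
        parts.append("NBS")
        if "LRR" in extras:
            parts.append("LRR")
        classes[gene_id] = "-".join(parts)
    return classes


# Hypothetical hits for four genes.
example_hits = [
    ("g1", "PF00931.25", 1e-30), ("g1", "PF01582.23", 1e-10), ("g1", "PF00560.36", 1e-6),
    ("g2", "PF00931", 1e-3),   # NB-ARC hit too weak, gene dropped
    ("g3", "PF00560", 1e-20),  # LRR only, no NB-ARC
    ("g4", "PF00931", 1e-8),
]
print(classify_nbs_genes(example_hits))
```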
Functional enrichment analysis
Gene Ontology and KEGG over‐representation tests were
performed using clusterProfiler v4.3.4 implemented in R to
identify significantly enriched pathways (Wu et al., 2021).
Gene Ontology and KEGG terms were assigned according to
the orthologous genes of the A. thaliana genome. In the
“enrichGO” function, we set “ont = BP” to only search for
enriched BPs. The resulting P‐values were corrected for
multiple comparisons using a Benjamini–Hochberg FDR
correction. A criterion of P‐adjust <0.05 was used to assess
the significance of enrichment analyses.
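An over‐representation test of this kind is, at its core, a one‐sided hypergeometric test per term. The Python sketch below shows that calculation with the standard library only; it is a conceptual illustration rather than a re‐implementation of clusterProfiler, the gene and term names are invented, and the multiple‐testing correction is omitted for brevity.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k) for X ~ Hypergeometric(population, annotated, drawn)."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def enrichment_pvalues(candidates, background, term_to_genes):
    """Raw over-representation P-value for every term hit by a candidate."""
    background = set(background)
    candidates = set(candidates) & background
    pvalues = {}
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        overlap = len(annotated & candidates)
        if overlap == 0:
            continue  # nothing to test for this term
        pvalues[term] = hypergeom_upper_tail(
            overlap, len(background), len(annotated), len(candidates)
        )
    return pvalues


# Hypothetical example: 10 background genes, 3 candidates, 2 terms.
background = [f"g{i}" for i in range(10)]
terms = {"DNA repair": ["g0", "g1", "g2", "g3"], "wax biosynthesis": ["g7", "g8"]}
print(enrichment_pvalues(["g0", "g1", "g9"], background, terms))
```

The choice of background matters: here it is the set of all genes that could have been selected, which mirrors the usual practice of testing candidates against the annotated gene universe.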
Tests for positive selection and relaxation of selection
The branch‐site model implemented in CodeML (PAML
package) was performed to test for positive selection. In this
test, an alternative model allowing sites to be under positive
selection on the foreground branch was contrasted to a null
model limiting sites to evolve neutrally or under purifying
selection using a likelihood ratio test (LRT). LRT P‐values
were computed based on a chi‐squared distribution (df = 2)
and were corrected for multiple tests at a P‐adjust < 0.05
threshold using a Benjamini–Hochberg FDR correction. A
gene showing a signature of positive selection in more than
three alpine species was identified as a convergently se-
lected gene. Additionally, RELAX was used to test for the
relaxation of selection on S‐locus genes using the LRT by
comparing the model fixing k = 1 and the model allowing k to
be estimated (Wertheim et al., 2015). In both analyses, seven
tests were conducted separately by setting each alpine
species as the foreground branch.
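The statistical steps of this procedure can be sketched in a few lines of Python. The sketch takes log‐likelihoods or P‐values as given (it does not run CodeML), uses the fact that the chi‐squared survival function with df = 2 is exp(−x/2), applies a Benjamini–Hochberg adjustment within each foreground species, and then applies the "more than three species" criterion. Function names and the toy P‐values are hypothetical, and adjusting P‐values separately per species is an assumption of this sketch.

```python
import math


def lrt_pvalue_df2(lnl_null, lnl_alt):
    """LRT P-value against a chi-squared distribution with df = 2."""
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.exp(-statistic / 2.0)  # exact survival function for df = 2


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [1.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def convergent_genes(pvalues_by_species, alpha=0.05, more_than=3):
    """Genes significant (adjusted P < alpha) in more than `more_than` species."""
    hits = {}
    for gene_pvalues in pvalues_by_species.values():
        genes = list(gene_pvalues)
        adjusted = benjamini_hochberg([gene_pvalues[g] for g in genes])
        for gene, padj in zip(genes, adjusted):
            if padj < alpha:
                hits[gene] = hits.get(gene, 0) + 1
    return sorted(g for g, n in hits.items() if n > more_than)


# Toy input: raw LRT P-values for two genes in five foreground species.
toy = {
    "sp1": {"A": 0.001, "B": 0.001},
    "sp2": {"A": 0.001, "B": 0.001},
    "sp3": {"A": 0.001, "B": 0.9},
    "sp4": {"A": 0.001, "B": 0.9},
    "sp5": {"A": 0.9, "B": 0.9},
}
print(convergent_genes(toy))
```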
Convergent evolutionary rate shifts
To perform the RERconverge analysis, we first used PAML to
estimate maximum‐likelihood gene trees whose branch
lengths represent evolutionary rates using the number of AA
substitutions. RERs were calculated using the “getAllResiduals”
function with weight = T, scale = T and cut‐off = 0.001
(Kowalczyk et al., 2019). We set alpine species as foreground
branches (branches of the tree with the trait of interest) and ran
the “foreground2Paths” function to estimate RERs for all genes
and to correlate them with trait evolution (i.e., alpine habitats in
our study). We then used the “correlateWithBinaryPhenotype”
function to test for a significant association between RERs and
traits across all branches of the tree with a P‐value < 0.05.
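Conceptually, a relative evolutionary rate is the residual of a gene's branch length after regressing it on the genome‐wide average length of the same branch. The Python sketch below illustrates this with an unweighted least‐squares fit through the origin, followed by a Mann–Whitney U statistic comparing foreground and background branches. Both simplifications are assumptions of this sketch: RERconverge itself transforms and weights branch lengths and reports a P‐value, and the branch names and lengths here are invented.

```python
def relative_rates(gene_lengths, average_lengths):
    """Residuals of gene branch lengths regressed on average branch lengths.

    Both arguments map branch names to lengths. A positive residual means
    the gene evolved faster than expected on that branch.
    """
    branches = sorted(gene_lengths)
    sxx = sum(average_lengths[b] ** 2 for b in branches)
    sxy = sum(average_lengths[b] * gene_lengths[b] for b in branches)
    slope = sxy / sxx  # least-squares slope for a line through the origin
    return {b: gene_lengths[b] - slope * average_lengths[b] for b in branches}


def foreground_u_statistic(rers, foreground):
    """Mann-Whitney U: number of (foreground, background) branch pairs in
    which the foreground RER is larger (ties count 0.5)."""
    fg = [v for b, v in rers.items() if b in foreground]
    bg = [v for b, v in rers.items() if b not in foreground]
    u = 0.0
    for f in fg:
        for g in bg:
            if f > g:
                u += 1.0
            elif f == g:
                u += 0.5
    return u


# Hypothetical branch lengths: the gene is accelerated on the alpine branch.
average = {"lowland_1": 1.0, "lowland_2": 2.0, "lowland_3": 3.0, "alpine_1": 4.0}
gene = {"lowland_1": 1.0, "lowland_2": 2.0, "lowland_3": 3.0, "alpine_1": 8.0}
rers = relative_rates(gene, average)
print(rers, foreground_u_statistic(rers, {"alpine_1"}))
```

A U statistic equal to the number of foreground–background pairs indicates that every foreground branch has a higher RER than every background branch.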
Convergent AA evolution
Two models were compared in PCOC analysis: the con-
vergent model in which a site on convergent branches evolves
under a profile different from that of the nonconvergent
branches, and the nonconvergent (null) model in which a site
evolves under a single AA profile throughout the phylogeny
(Rey et al., 2018). PCOC then detected convergent sites by
identifying the better fit between the two models. To filter for
only sites with strong evidence for convergent profile shifts, we
set a posterior probability threshold of >0.9 in the analysis.
Differential gene expression analysis
RNA‐seq data for S. obvallata were mapped to the as-
sembled genome using HISAT2 v2.2.1 (Kim et al., 2019). Only
uniquely mapped paired‐end reads were retained for read
counting by featureCounts v2.0.3 (Liao et al., 2014) to generate
the count and transcripts per million (TPM)
tables. DEG analyses among the five tissues were performed
in DESeq2 v1.36.0 (Love et al., 2014), with a P‐value < 0.05
as a cut‐off and a log2 fold change cut‐off of 1.
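Two small calculations underlie this step: converting read counts to TPM, and applying the significance and fold‐change cut‐offs. The Python sketch below shows both on invented numbers. It does not reproduce the DESeq2 model, which supplies the P‐values and log2 fold changes; the function names are hypothetical, and treating the fold‐change cut‐off as inclusive (|log2FC| ≥ 1) is an assumption.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    `counts` and `lengths_bp` map gene IDs to read counts and gene
    lengths in base pairs.
    """
    # Reads per kilobase of gene length.
    rate = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    scale = sum(rate.values()) / 1e6
    return {g: (r / scale if scale > 0 else 0.0) for g, r in rate.items()}


def classify_deg(log2_fold_change, pvalue, lfc_cutoff=1.0, alpha=0.05):
    """Label a gene 'up', 'down' or 'ns' (not significant)."""
    if pvalue < alpha and log2_fold_change >= lfc_cutoff:
        return "up"
    if pvalue < alpha and log2_fold_change <= -lfc_cutoff:
        return "down"
    return "ns"


# Hypothetical counts and gene lengths.
tpm = counts_to_tpm({"g1": 100, "g2": 300}, {"g1": 1000, "g2": 3000})
print(tpm)

# Hypothetical DESeq2-style results: (log2 fold change, P-value).
for gene, (lfc, p) in {"g1": (1.5, 0.01), "g2": (-2.0, 0.01), "g3": (3.0, 0.2)}.items():
    print(gene, classify_deg(lfc, p))
```

Because TPM normalizes by gene length before scaling, the two toy genes above receive the same TPM even though their raw counts differ threefold.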
Data availability statement
The genomic data generated and analyzed in this study including
the raw sequencing data of Oxford Nanopore, Illumina, Hi‐C and
RNA‐seq, as well as the genome assemblies, have been deposited in
the China National GeneBank DataBase (CNGBdb, https://db.cngb.org/)
under accession number CNP0003451. All the custom
scripts and specific command lines have been deposited at
GitHub (https://github.com/ZhangXu-CAS/Alpine_genome).
ACKNOWLEDGEMENTS
This work was supported by the Second Tibetan Plateau Sci-
entific Expedition and Research program (2019QZKK0502), the
Strategic Priority Research Program of the Chinese Academy of
Sciences (XDA20050203), the Key Projects of the Joint Fund of
the National Natural Science Foundation of China (U1802232),
the Youth Innovation Promotion Association of Chinese
Academy of Sciences (2019382), the Yunnan Young & Elite
Talents Project (YNWR‐QNBJ‐2019‐033) and the Ten Thousand
Talents Program of Yunnan Province (202005AB160005).
CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest
associated with this work.
AUTHOR CONTRIBUTIONS
T.D., H.W. and H.S. conceived and led the project. X.Z., T.K.,
W.D. and Z.Q. processed data, performed analyses, and
drew figures. X.Z., H.Z., T.F., L.L. and T.D. collected mate-
rials. X.Z. wrote the manuscript. J.B.L., Y.S., J.H., T.D., H.W.
and H.S. revised the manuscript. All authors read and approved
the manuscript.
Edited by: Hongzhi Kong, Institute of Botany, CAS, China
Received Nov. 18, 2022; Accepted Mar. 21, 2023; Published Mar.
24, 2023
REFERENCES
Abbas, M., Sharma, G., Dambire, C., Marquez, J., Alonso‐Blanco, C.,
Proaño, K., and Holdsworth, M.J. (2022). An oxygen‐sensing mech-
anism for angiosperm adaptation to altitude. Nature 606: 565–569.
Agati, G., Azzarello, E., Pollastri, S., and Tattini, M. (2012). Flavonoids
as antioxidants in plants: Location and functional significance. Plant
Sci. 196: 67–76.
Albert, N.W., Davies, K.M., Lewis, D.H., Zhang, H., Montefiori, M.,
Brendolise, C., Boase, M.R., Ngo, H., Jameson, P.E., and Schwinn,
K.E. (2014). A conserved network of transcriptional activators and re-
pressors regulates anthocyanin pigmentation in eudicots. Plant Cell 26:
962–980.
Beall, C.M. (2014). Adaptation to high altitude: Phenotypes and geno-
types. Annu. Rev. Anthropol. 43: 251–272.
Bingham, R.A., and Ort, A.R. (1998). Efficient pollination of alpine plants.
Nature 391: 238–239.
Birkeland, S., Gustafsson, A.L.S., Brysting, A.K., Brochmann, C., and
Nowak, M.D. (2020). Multiple genetic trajectories to extreme abiotic
stress adaptation in Arctic Brassicaceae. Mol. Biol. Evol. 37: 2052–2068.
Bourdenx, B., Bernard, A., Domergue, F., Pascal, S., Leger, A., Roby,
D., Pervent, M., Vile, D., Haslam, R.P., Napier, J.A., Lessire, R., and
Joubès, J. (2011). Overexpression of Arabidopsis ECERIFERUM1
promotes wax very‐long‐chain alkane biosynthesis and influences
plant response to biotic and abiotic stresses. Plant Physiol. 156: 29–45.
Bruna, T., Hoff, K.J., Lomsadze, A., Stanke, M., and Borodovsky, M.
(2021). BRAKER2: Automatic eukaryotic genome annotation with
GeneMark‐EP+ and AUGUSTUS supported by a protein database.
NAR Genom. Bioinform. 3: lqaa108.
Cantalapiedra, C.P., Hernandez‐Plaza, A., Letunic, I., Bork, P., and
Huerta‐Cepas, J. (2021). eggNOG‐mapper v2: Functional annotation,
orthology assignments, and domain prediction at the metagenomic
scale. Mol. Biol. Evol. 38: 5825–5829.
Cao, X., Yang, H., Shang, C., Ma, S., Liu, L., and Cheng, J. (2019). The
roles of auxin biosynthesis YUCCA gene family in plants. Int. J. Mol.
Sci. 20: 6343.
Chen, J.H., Huang, Y., Brachi, B., Yun, Q.Z., Zhang, W., Lu, W., Li, H.N.,
Li, W.Q., Sun, X.D., Wang, G.Y. et al. (2019). Genome‐wide analysis of
Cushion willow provides insights into alpine plant divergence in a bi-
odiversity hotspot. Nat. Commun. 10: 5230.
Chen, K., Durand, D., and Farach‐Colton, M. (2000). NOTUNG: A pro-
gram for dating gene duplications and optimizing gene family trees. J.
Comput. Biol. 7: 429–447.
Chen, K., Fan, B., Du, L., and Chen, Z. (2004). Activation of hyper-
sensitive cell death by pathogen‐induced receptor‐like protein kinases
from Arabidopsis. Plant Mol. Biol. 56: 271–283.
Cheptou, P.O. (2012). Clarifying Baker's law. Ann. Bot. 109: 633–641.
Cui, R., Medeiros, T., Willemsen, D., Iasi, L.N.M., Collier, G.E., Graef,
M., Reichard, M., and Valenzano, D.R. (2019). Relaxed selection
limits lifespan by increasing mutation load. Cell 178: 385–399.
Diener, A.C., Gaxiola, R.A., and Fink, G.R. (2001). Arabidopsis ALF5, a
multidrug efflux transporter gene family member, confers resistance to
toxins. Plant Cell 13: 1625–1638.
Dudchenko, O., Batra, S.S., Omer, A.D., Nyquist, S.K., Hoeger, M.,
Durand, N.C., Shamim, M.S., Machol, I., Lander, E.S., Aiden, A.P.
et al. (2017). De novo assembly of the Aedes aegypti genome using
Hi‐C yields chromosome‐length scaffolds. Science 356: 92–95.
Ellinghaus, D., Kurtz, S., and Willhoeft, U. (2008). LTRharvest, an effi-
cient and flexible software for de novo detection of LTR retro-
transposons. BMC Bioinformatics 9: 18.
Emms, D.M., and Kelly, S. (2019). OrthoFinder: Phylogenetic orthology
inference for comparative genomics. Genome Biol. 20: 238.
Fasani, E., DalCorso, G., Costa, A., Zenoni, S., and Furini, A. (2019). The
Arabidopsis thaliana transcription factor MYB59 regulates calcium
signalling during plant growth and stress response. Plant Mol. Biol. 99:
517–534.
Feng, L., Lin, H., Kang, M., Ren, Y., Yu, X., Xu, Z., Wang, S., Li, T.,
Yang, W., and Hu, Q. (2022). A chromosome‐level genome assembly
of an alpine plant Crucihimalaya lasiocarpa provides insights into high‐
altitude adaptation. DNA Res. 29: dsac004.
Flynn, J.M., Hubley, R., Goubert, C., Rosen, J., Clark, A.G., Feschotte,
C., and Smit, A.F. (2020). RepeatModeler2 for automated genomic
discovery of transposable element families. Proc. Natl. Acad. Sci. U.S.A.
117: 9451–9457.
Flynn, R.L., and Zou, L. (2010). Oligonucleotide/oligosaccharide‐binding
fold proteins: A growing family of genome guardians. Crit. Rev. Bio-
chem. Mol. Biol. 45: 266–275.
Fujikawa, K., Ikeda, H., Murata, K., Kobayashi, T., Nakano, T.,
Ohba, H., and Wu, S.G. (2004). Chromosome numbers of fifteen
species of the genus Saussurea DC. (Asteraceae) in the Himalayas and
the adjacent regions. J. Jpn. Bot. 79: 271–280.
Fukao, T., and Bailey‐Serres, J. (2004). Plant responses to hypoxia—Is
survival a balancing act? Trends Plant Sci. 9: 449–456.
Goodwillie, C., Kalisz, S., and Eckert, C.G. (2005). The evolutionary
enigma of mixed mating systems in plants: Occurrence, theoretical
explanations, and empirical evidence. Annu. Rev. Ecol. Evol. Syst. 36:
47–79.
Guo, X., Hu, Q., Hao, G., Wang, X., Zhang, D., Ma, T., and Liu, J. (2018).
The genomes of two Eutrema species provide insight into plant
adaptation to high altitudes. DNA Res. 25: 307–315.
Haas, B.J., Salzberg, S.L., Zhu, W., Pertea, M., Allen, J.E., Orvis, J.,
White, O., Buell, C.R., and Wortman, J.R. (2008). Automated eu-
karyotic gene structure annotation using EVidenceModeler and the
program to assemble spliced alignments. Genome Biol. 9: R7.
Hu, J., Fan, J., Sun, Z., and Liu, S. (2020). NextPolish: A fast and efficient
genome polishing tool for long‐read assembly. Bioinformatics 36:
2253–2255.
Hut, R.A., and Beersma, D.G. (2011). Evolution of time‐keeping mecha-
nisms: Early emergence and adaptation to photoperiod. Philos. Trans.
R. Soc. Lond. B. Biol. Sci. 366: 2141–2154.
Jones, P., Binns, D., Chang, H.Y., Fraser, M., Li, W., McAnulla, C.,
McWilliam, H., Maslen, J., Mitchell, A., Nuka, G. et al. (2014). Inter-
ProScan 5: Genome‐scale protein function classification. Bio-
informatics 30: 1236–1240.
Katoh, K., and Standley, D.M. (2013). MAFFT multiple sequence align-
ment software version 7: Improvements in performance and usability.
Mol. Biol. Evol. 30: 772–780.
Keilwagen, J., Wenk, M., Erickson, J.L., Schattat, M.H., Grau, J., and
Hartung, F. (2016). Using intron position conservation for homology‐
based gene prediction. Nucleic Acids Res. 44: e89.
Kerstiens, G. (1996). Cuticular water permeability and its physiological
significance. J. Exp. Bot. 47: 1813–1832.
Kim, D., Paggi, J.M., Park, C., Bennett, C., and Salzberg, S.L. (2019).
Graph‐based genome alignment and genotyping with HISAT2 and
HISAT‐genotype. Nat. Biotechnol. 37: 907–915.
Kimura, M. (1980). A simple method for estimating evolutionary rates of
base substitutions through comparative studies of nucleotide se-
quences. J. Mol. Evol. 16: 111–120.
Kolora, S.R.R., Owens, G.L., Vazquez, J.M., Stubbs, A., Chatla, K.,
Jainese, C., Seeto, K., McCrea, M., Sandel, M.W., Vianna, J.A. et al.
(2021). Origins and evolution of extreme life span in Pacific Ocean
rockfishes. Science 374: 842–847.
Kowalczyk, A., Meyer, W.K., Partha, R., Mao, W., Clark, N.L., and
Chikina, M. (2019). RERconverge: An R package for associating evolutionary
rates with convergent traits. Bioinformatics 35: 4815–4817.
Kowalczyk, A., Partha, R., Clark, N.L., and Chikina, M. (2020). Pan‐
mammalian analysis of molecular constraints underlying extended
lifespan. eLife 9: e51089.
Kumar, S., Stecher, G., Suleski, M., and Hedges, S.B. (2017). TimeTree:
A resource for timelines, timetrees, and divergence times. Mol. Biol.
Evol. 34: 1812–1819.
Li, J.T., Gao, Y.D., Xie, L., Deng, C., Shi, P., Guan, M.L., Huang, S.,
Ren, J.L., Wu, D.D., Ding, L. et al. (2018). Comparative genomic in-
vestigation of high‐elevation adaptation in ectothermic snakes. Proc.
Natl. Acad. Sci. U.S.A. 115: 8406–8411.
Liao, Y., Smyth, G.K., and Shi, W. (2014). featureCounts: An efficient
general purpose program for assigning sequence reads to genomic
features. Bioinformatics 30: 923–930.
Licausi, F., Kosmacz, M., Weits, D.A., Giuntoli, B., Giorgi, F.M.,
Voesenek, L.A., Perata, P., and van Dongen, J.T. (2011). Oxygen
sensing in plants is mediated by an N‐end rule pathway for protein
destabilization. Nature 479: 419–422.
Liu, B., Shi, Y., Yuan, J., Hu, X., Zhang, H., Li, N., Li, Z., Chen, Y., Mu, D.,
and Fan, W. (2013). Estimation of genomic characteristics by analyzing
k‐mer frequency in de novo genome projects. arXiv: Genomics.
Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of
fold change and dispersion for RNA‐seq data with DESeq2. Genome
Biol. 15: 550.
Löytynoja, A. (2014). Phylogeny‐aware alignment with PRANK. In Multiple
Sequence Alignment Methods, D.J. Russell, ed. Totowa, NJ: Humana
Press, pp. 155–170.
Lu, S., Wang, J., Chitsaz, F., Derbyshire, M.K., Geer, R.C., Gonzales, N.
R., Gwadz, M., Hurwitz, D.I., Marchler, G.H., Song, J.S. et al. (2020).
CDD/SPARCLE: The conserved domain database in 2020. Nucleic
Acids Res. 48: D265–D268.
Majoros, W.H., Pertea, M., and Salzberg, S.L. (2004). TigrScan and
GlimmerHMM: Two open source ab initio eukaryotic gene‐finders.
Bioinformatics 20: 2878–2879.
Manni, M., Berkeley, M.R., Seppey, M., Simao, F.A., and Zdobnov, E.M.
(2021). BUSCO update: Novel and streamlined workflows along with
broader and deeper phylogenetic coverage for scoring of eukaryotic,
prokaryotic, and viral genomes. Mol. Biol. Evol. 38: 4647–4654.
Mano, S., Hayashi, M., and Nishimura, M. (1999). Light regulates alter-
native splicing of hydroxypyruvate reductase in pumpkin. Plant J. 17:
309–320.
Martin, J.T., and Juniper, B.E. (1970). The Cuticles of Plants. New York:
St. Martin's Press.
McHale, L., Tan, X., Koehl, P., and Michelmore, R.W. (2006). Plant NBS‐
LRR proteins: Adaptable guards. Genome Biol. 7: 212.
Mendes, F.K., Vanderpool, D., Fulton, B., and Hahn, M.W. (2020). CAFE
5 models variation in evolutionary rates among gene families. Bio-
informatics 36: 5516–5518.
Mockaitis, K., and Estelle, M. (2008). Auxin receptors and plant devel-
opment: A new signaling paradigm. Annu. Rev. Cell Dev. Biol. 24:
55–80.
Mosher, S., Seybold, H., Rodriguez, P., Stahl, M., Davies, K.A.,
Dayaratne, S., Morillo, S.A., Wierzba, M., Favery, B., Keller, H. et al.
(2013). The tyrosine‐sulfated peptide receptors PSKR1 and PSY1R
modify the immunity of Arabidopsis to biotrophic and necrotrophic
pathogens in an antagonistic manner. Plant J. 73: 469–482.
Nagy, L., and Grabherr, G. (2009). The Biology of Alpine Habitats. Oxford:
Oxford University Press.
Ortiz‐Morea, F.A., Liu, J., Shan, L., and He, P. (2022). Malectin‐like re-
ceptor kinases as protector deities in plant immunity. Nat. Plants 8:
27–37.
Ou, S., Chen, J., and Jiang, N. (2018). Assessing genome assembly
quality using the LTR Assembly Index (LAI). Nucleic Acids Res.
46: e126.
Ou, S., and Jiang, N. (2018). LTR_retriever: A highly accurate and sensi-
tive program for identification of long terminal repeat retrotransposons.
Plant Physiol. 176: 1410–1422.
Pelaz, S., Ditta, G.S., Baumann, E., Wisman, E., and Yanofsky, M.F.
(2000). B and C floral organ identity functions require SEPALLATA
MADS‐box genes. Nature 405: 200–203.
Peng, H.P., Chan, C.S., Shih, M.C., and Yang, S.F. (2001). Signaling
events in the hypoxic induction of alcohol dehydrogenase gene in
Arabidopsis. Plant Physiol. 126: 742–749.
Pollard, M., Beisson, F., Li, Y., and Ohlrogge, J.B. (2008). Building lipid
barriers: Biosynthesis of cutin and suberin. Trends Plant Sci. 13:
236–246.
Rey, C., Gueguen, L., Semon, M., and Boussau, B. (2018). Accurate
detection of convergent amino‐acid evolution with PCOC. Mol. Biol.
Evol. 35: 2296–2306.
Sackton, T.B., Grayson, P., Cloutier, A., Hu, Z., Liu, J.S., Wheeler, N.E.,
Gardner, P.P., Clarke, J.A., Baker, A.J., Clamp, M. et al. (2019).
Convergent regulatory evolution and loss of flight in paleognathous
birds. Science 364: 74–78.
Sherman‐Broyles, S., Boggs, N., Farkas, A., Liu, P., Vrebalov, J.,
Nasrallah, M.E., and Nasrallah, J.B. (2007). S locus genes and the
evolution of self‐fertility in Arabidopsis thaliana. Plant Cell 19:
94–106.
Song, B., Stocklin, J., Peng, D.L., Gao, Y.Q., and Sun, H. (2015). The
bracts of the alpine ‘glasshouse’ plant Rheum alexandrae (Polygonaceae)
enhance reproductive fitness of its pollinating seed‐consuming
mutualist. Bot. J. Linn. Soc. 179: 349–359.
Spicer, R.A., Farnsworth, A., and Su, T. (2020). Cenozoic topography,
monsoons and biodiversity conservation within the Tibetan Region: An
evolving story. Plant Divers. 42: 229–254.
Srivastava, D., Shamim, M., Kumar, M., Mishra, A., Maurya, R.,
Sharma, D., Pandey, P., and Singh, K.N. (2019). Role of circadian
rhythm in plant system: An update from development to stress re-
sponse. Environ. Exp. Bot. 162: 256–271.
Stamatakis, A. (2014). RAxML version 8: A tool for phylogenetic analysis
and post‐analysis of large phylogenies. Bioinformatics 30: 1312–1313.
Stern, D.L., and Orgogozo, V. (2009). Is genetic evolution predictable?
Science 323: 746–751.
Storz, J.F. (2016). Causes of molecular convergence and parallelism in
protein evolution. Nat. Rev. Genet. 17: 239–250.
Sun, H., Niu, Y., Chen, Y.S., Song, B., Liu, C.Q., Peng, D.L., Chen, J.G.,
and Yang, Y. (2014). Survival and reproduction of plant species in the
Qinghai‐Tibet Plateau. J. Syst. Evol. 52: 378–396.
Szklarczyk, D., Gable, A.L., Nastou, K.C., Lyon, D., Kirsch, R., Pyysalo,
S., Doncheva, N.T., Legeay, M., Fang, T., Bork, P. et al. (2021). The
STRING database in 2021: Customizable protein‐protein networks, and
functional characterization of user‐uploaded gene/measurement sets.
Nucleic Acids Res. 49: D605–D612.
Takayama, S., and Isogai, A. (2005). Self‐incompatibility in plants. Annu.
Rev. Plant Biol. 56: 467–489.
Tarailo‐Graovac, M., and Chen, N. (2009). Using RepeatMasker to
identify repetitive elements in genomic sequences. Curr. Protoc.
Bioinformatics Chapter 4: Unit 4.10.
To, J.P., Haberer, G., Ferreira, F.J., Deruere, J., Mason, M.G., Schaller,
G.E., Alonso, J.M., Ecker, J.R., and Kieber, J.J. (2004). Type‐A
Arabidopsis response regulators are partially redundant negative reg-
ulators of cytokinin signaling. Plant Cell 16: 658–671.
Tsukaya, H., and Tsuge, T. (2001). Morphological adaptation of in-
florescences in plants that develop at low temperatures in early spring:
The convergent evolution of “downy plants”. Plant Biol. 3: 536–543.
van Berkel, K., de Boer, R.J., Scheres, B., and ten Tusscher, K. (2013).
Polar auxin transport: Models and mechanisms. Development 140:
2253–2268.
Walker, B.J., Abeel, T., Shea, T., Priest, M., Abouelliel, A., Sakthikumar,
S., Cuomo, C.A., Zeng, Q., Wortman, J., Young, S.K. et al. (2014).
Pilon: An integrated tool for comprehensive microbial variant detection
and genome assembly improvement. PLoS One 9: e112963.
Wang, X., Liu, S., Zuo, H., Zheng, W., Zhang, S., Huang, Y., Pingcuo,
G., Ying, H., Zhao, F., Li, Y. et al. (2021). Genomic basis of high‐
altitude adaptation in Tibetan Prunus fruit trees. Curr. Biol. 31:
3848–3860.
Wen, R., Torres‐Acosta, J.A., Pastushok, L., Lai, X., Pelzer, L.,
Wang, H., and Xiao, W. (2008). Arabidopsis UEV1D promotes Lysine‐
63‐linked polyubiquitination and is involved in DNA damage response.
Plant Cell 20: 213–227.
Wertheim, J.O., Murrell, B., Smith, M.D., Kosakovsky Pond, S.L., and
Scheffler, K. (2015). RELAX: Detecting relaxed selection in a phylo-
genetic framework. Mol. Biol. Evol. 32: 820–832.
Wheeler, T.J., and Eddy, S.R. (2013). nhmmer: DNA homology search
with profile HMMs. Bioinformatics 29: 2487–2489.
Wu, T., Hu, E., Xu, S., Chen, M., Guo, P., Dai, Z., Feng, T., Zhou, L.,
Tang, W., Zhan, L. et al. (2021). clusterProfiler 4.0: A universal en-
richment tool for interpreting omics data. Innovation (Camb) 2: 100141.
Xu, S., Wang, J., Guo, Z., He, Z., and Shi, S. (2020). Genomic convergence
in the adaptation to extreme environments. Plant Commun.
1: 100117.
Xu, Z., and Wang, H. (2007). LTR_FINDER: An efficient tool for the pre-
diction of full‐length LTR retrotransposons. Nucleic Acids Res. 35:
W265–W268.
Yang, C.Y., Hsu, F.C., Li, J.P., Wang, N.N., and Shih, M.C. (2011). The
AP2/ERF transcription factor AtERF73/HRE1 modulates ethylene re-
sponses during hypoxia in Arabidopsis. Plant Physiol. 156: 202–212.
Yang, X.H., Makaroff, C.A., and Ma, H. (2003). The Arabidopsis MALE
MEIOCYTE DEATH1 gene encodes a PHD‐finger protein that is re-
quired for male meiosis. Plant Cell 15: 1281–1295.
Yang, Z. (2007). PAML 4: Phylogenetic analysis by maximum likelihood.
Mol. Biol. Evol. 24: 1586–1591.
Yeaman, S., Hodgins, K.A., Lotterhos, K.E., Suren, H., Nadeau, S.,
Degner, J.C., Nurkowski, K.A., Smets, P., Wang, T., Gray, L.K. et al.
(2016). Convergent local adaptation to climate in distantly related
conifers. Science 353: 1431–1433.
Zeng, X., Long, H., Wang, Z., Zhao, S., Tang, Y., Huang, Z., Wang, Y.,
Xu, Q., Mao, L., Deng, G. et al. (2015). The draft genome of Tibetan
hulless barley reveals adaptive patterns to the high stressful Tibetan
Plateau. Proc. Natl. Acad. Sci. U.S.A. 112: 1095–1100.
Zhang, J., and Kumar, S. (1997). Detection of convergent and parallel
evolution at the amino acid sequence level. Mol. Biol. Evol. 14:
527–536.
Zhang, J., Tian, Y., Yan, L., Zhang, G., Wang, X., Zeng, Y., Zhang, J.,
Ma, X., Tan, Y., Long, N. et al. (2016). Genome of plant maca (Lepi-
dium meyenii) illuminates genomic basis for high‐altitude adaptation in
the central Andes. Mol. Plant 9: 1066–1077.
Zhang, T., Qiao, Q., Novikova, P.Y., Wang, Q., Yue, J., Guan, Y., Ming,
S., Liu, T., De, J., Liu, Y. et al. (2019). Genome of Crucihimalaya hi-
malaica, a close relative of Arabidopsis, shows ecological adaptation to
high altitude. Proc. Natl. Acad. Sci. U.S.A. 116: 7137–7146.
Zhang, Y., Zhou, J., and Lim, C.U. (2006). The role of NBS1 in DNA
double strand break repair, telomere stability, and cell cycle checkpoint
control. Cell Res. 16: 45–54.
Zhen, Y., Aardema, M.L., Medina, E.M., Schumer, M., and Andolfatto,
P. (2012). Parallel molecular evolution in an herbivore community.
Science 337: 1634–1637.
Zhou, B., Mural, R.V., Chen, X., Oates, M.E., Connor, R.A., Martin, G.B.,
Gough, J., and Zeng, L. (2017). A subset of ubiquitin‐conjugating
enzymes is essential for plant immunity. Plant Physiol. 173:
1371–1390.
SUPPORTING INFORMATION
Additional Supporting Information may be found online in the supporting
information tab for this article:
http://onlinelibrary.wiley.com/doi/10.1111/jipb.13485/suppinfo
Figure S1. K‐mer‐based analysis to estimate the genome size of Saussurea
obvallata
Figure S2. K‐mer‐based analysis to estimate the genome size of Rheum
alexandrae
Figure S3. The Hi‐C assisted assembly of Saussurea obvallata pseudo-
molecules
Figure S4. The distribution of sequence length of predicted protein‐coding
genes in Saussurea obvallata genome
Figure S5. The distribution of sequence length of predicted protein‐coding
genes in Rheum alexandrae genome
Figure S6. Gene tree of the nucleotide‐binding site and leucine‐rich‐repeat
domain receptor (NBS‐LRR) genes
Figure S7. Loss and gain events of NBS‐LRR genes in sampled species
Figure S8. Venn diagram showing the number of genes undergoing molecular
convergence detected by RERconverge and PCOC analysis
Figure S9. Site‐based estimation of convergent amino acid evolution on
S‐locus genes using PCOC method
Figure S10. The predicted STRING network of Arabidopsis MYB48
Figure S11. Volcano plot of differentially expressed genes between bract
leaves and normal leaves
Figure S12. Kyoto Encyclopedia of Genes and Genomes (KEGG) path-
ways of significantly downregulated genes in bract leaves
Table S1. Genome sequences of 20 species used for comparative analysis
Table S2. Estimation of genome sizes for Saussurea obvallata and Rheum
alexandrae based on K‐mer statistics
Table S3. The total sequencing data for genome assembly
Table S4. The information of transcriptomic data of Saussurea obvallata
used in the study
Table S5. Statistics of initial genome assembly of Saussurea obvallata and
Rheum alexandrae
Table S6. Summary of the chromosome lengths of the Saussurea ob-
vallata genome
Table S7. Genome completeness of Saussurea obvallata genome and
proteome measured by Benchmarking Universal Single‐Copy Orthologs
(BUSCO)
Table S8. Genome completeness of Rheum alexandrae genome and
proteome measured by Benchmarking Universal Single‐Copy Orthologs
(BUSCO)
Table S9. Prediction of repetitive elements in the assembled Saussurea
obvallata genome
Table S10. Prediction of repetitive elements in the assembled Rheum
alexandrae genome
Table S11. Functional annotation of predicted genes of Saussurea ob-
vallata and Rheum alexandrae genomes
Table S12. Summary of gene family clustering among the 20 species used
Table S13. Significantly enriched Gene Ontology (GO) terms of con-
vergently expanded gene families in alpine plant genomes based on the
hypergeometric test
Table S14. Significantly enriched Kyoto Encyclopedia of Genes and Ge-
nomes (KEGG) pathways of convergently expanded gene families in alpine
plant genomes based on the hypergeometric test
Table S15. Significantly enriched Gene Ontology (GO) terms of con-
vergently contracted gene families in alpine plant genomes based on the
hypergeometric test
Table S16. Significantly enriched Kyoto Encyclopedia of Genes and Ge-
nomes (KEGG) pathways of convergently contracted gene families in al-
pine plant genomes based on the hypergeometric test
Table S17. Annotation information of 36 convergently selected genes
Table S18. Significantly enriched Gene Ontology (GO) terms of genes
undergoing convergent positive selection in alpine plant genomes
Table S19. Significantly enriched Kyoto Encyclopedia of Genes and Ge-
nomes (KEGG) pathways of genes undergoing convergent positive se-
lection in alpine plant genomes
Table S20. Significantly enriched Gene Ontology (GO) terms of gene
families undergoing convergently accelerated evolutionary rate in alpine
plant genomes
Table S21. Significantly enriched Kyoto Encyclopedia of Genes and Ge-
nomes (KEGG) pathways of gene families undergoing convergently ac-
celerated evolutionary rate in alpine plant genomes
Table S22. Tests for relaxed selection on S‐locus genes of alpine plants
using RELAX model
Table S23. Functional annotation of genes predicted to interact with
MYB48