
The complete chloroplast genome of the endangered tree Parashorea chinensis (Dipterocarpaceae)

Taylor & Francis
Mitochondrial DNA Part B

Abstract

The complete chloroplast genome sequence of Parashorea chinensis, an endangered large tree species at the northern edge of tropical Asia, was determined in this study. The genome is 152,002 bp long and contains a large single-copy (LSC) region (84,094 bp) and a small single-copy (SSC) region (20,003 bp), separated by two inverted repeat (IR) regions (23,954 bp each). The overall GC content of the plastid genome is 37.1%. In total, 116 unique genes were annotated: 81 protein-coding genes, 31 tRNA genes, and four rRNA genes. Phylogenetic analysis based on 20 chloroplast genomes indicates that P. chinensis is closely related to P. macrophylla.
MITOGENOME ANNOUNCEMENT
Xing-Fu Zhu and Yongshuai Sun
CAS Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, China
ARTICLE HISTORY
Received 25 January 2019
Accepted 2 March 2019
KEYWORDS
Chloroplast; Parashorea chinensis; phylogenetic analysis
Parashorea chinensis is a large tree species (up to 80 m tall) in the family Dipterocarpaceae. It is found in southern China and northern Vietnam. It has been overexploited and is now threatened by habitat loss (Li et al. 2007). The plant is classified as endangered in the IUCN Red List of Threatened Species (Qin et al. 2017). Genetic and genomic information is therefore urgently needed to support systematic research on P. chinensis and to inform its conservation. Here, we report the first complete plastome of P. chinensis (GenBank accession number: MK424049).
Leaves were collected from Ningming (Guangxi, China; E107.0981, N22.2456), and voucher herbarium specimens were deposited at the Herbarium of Xishuangbanna Tropical Botanical Garden. Total genomic DNA was extracted from the leaves using a modified CTAB method (Doyle and Doyle 1987) and used for shotgun library construction. After cluster generation, the libraries were sequenced on an Illumina HiSeq 4000 platform, generating 150 bp paired-end reads. The filtered reads were assembled with GetOrganelle v1.5 (Jin et al. 2018), using the chloroplast genome of P. macrophylla (GenBank accession number: MH791330.1) as the reference. The assembled genome was annotated with Dual Organellar GenoMe Annotator (DOGMA; Wyman et al. 2004) and GeSeq (Tillich et al. 2017).
The complete chloroplast genome of P. chinensis is 152,002 base pairs (bp) in length and contains two inverted repeat regions (IRa and IRb) of 23,954 bp each, which are separated by a large single-copy (LSC) region of 84,094 bp and a small single-copy (SSC) region of 20,003 bp. The overall GC content of the plastid genome is 37.1%. The new sequence contains 116 genes in total, including four ribosomal RNA genes, 31 tRNA genes, and 81 protein-coding genes. Among these, six tRNA genes (trnE-UUC, trnA-UGC, trnL-UAA, trnS-CGA, trnV-UAC, and trnY-AUA) and seven protein-coding genes (clpP, ndhA, ndhB, rpl2, rps16, rpoC1, and ycf1) contain one intron each, and the ycf3 gene has two introns. Most of the genes occur as a single copy, whereas four rRNA genes (4.5S, 5S, 16S, and 23S rRNA), seven tRNA genes (trnA-UGC, trnE-UUC, trnL-CAA, trnM-CAU, trnN-GUU, trnR-ACG, and trnV-GAC), and six protein-coding genes (rpl2, rpl23, ycf2, ndhB, rps7, and rps12) are duplicated.
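The structural figures above can be cross-checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not part of the original analysis; the function and variable names are hypothetical. It assumes the usual quadripartite layout, in which total length equals LSC + SSC + two IR copies, and computes GC content as the percentage of G and C bases.

```python
# Values as printed in the text; all names here are illustrative assumptions.
REPORTED_TOTAL_BP = 152_002
LSC_BP = 84_094
SSC_BP = 20_003
IR_BP = 23_954  # length of one inverted repeat copy

GENE_COUNTS = {"protein_coding": 81, "tRNA": 31, "rRNA": 4}


def quadripartite_length(lsc, ssc, ir):
    """Length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence (case-insensitive)."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


region_sum = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
region_difference = region_sum - REPORTED_TOTAL_BP
gene_total = sum(GENE_COUNTS.values())

print(f"sum of regions: {region_sum} bp (reported total: {REPORTED_TOTAL_BP} bp)")
print(f"difference: {region_difference} bp")
print(f"gene total: {gene_total}")
```

With the figures as printed, the gene categories add up to the stated 116, while the four regions sum to 152,005 bp, 3 bp more than the stated total of 152,002 bp. A check of this kind is one way to spot such differences before comparing region boundaries against the GenBank record.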
To understand the phylogenetic position of Parashorea within the order Malvales, we downloaded the complete chloroplast genomes of 19 species from the NCBI GenBank database, including 17 species in Malvales and two species in Brassicales. The sequences were aligned using MAFFT v7.307 (Katoh and Standley 2013), and RAxML (Stamatakis 2014) was used to construct a maximum likelihood tree with Arabidopsis thaliana and Carica papaya as outgroups. All nodes in the complete plastome tree were strongly supported. The phylogenetic tree showed that P. chinensis is closely related to P. macrophylla (Figure 1). The P. chinensis chloroplast genome reported here will provide useful information for phylogenetic and evolutionary studies in Dipterocarpaceae and Malvales.
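The alignment and tree-building steps described above could be scripted roughly as follows. This is a hedged sketch rather than the authors' actual commands: the text does not state the substitution model, random seeds, or executable name, so GTRGAMMA, the seed value, the `raxmlHPC` binary name, the file names, and the taxon labels are all assumptions. The code only builds the command lines and does not run the programs.

```python
import shlex


def mafft_command(input_fasta):
    """Command for a MAFFT alignment; MAFFT writes the alignment to stdout."""
    return ["mafft", "--auto", input_fasta]


def raxml_command(alignment, run_name, outgroups, bootstraps, seed=12345):
    """Command for a RAxML v8 rapid-bootstrap run plus best-scoring ML tree search.

    The model (GTRGAMMA) and the seed are assumptions, not taken from the paper.
    Outgroup labels must match the sequence names in the alignment file.
    """
    return [
        "raxmlHPC",               # executable name varies between installations
        "-f", "a",                # rapid bootstrapping and ML search in one run
        "-m", "GTRGAMMA",         # assumed substitution model
        "-p", str(seed),          # seed for the parsimony starting trees
        "-x", str(seed),          # seed for rapid bootstrapping
        "-#", str(bootstraps),    # number of bootstrap replicates
        "-s", alignment,          # aligned sequences
        "-n", run_name,           # suffix for the output files
        "-o", ",".join(outgroups),
    ]


# Hypothetical file names and taxon labels, for illustration only.
align_cmd = mafft_command("plastomes.fasta")
tree_cmd = raxml_command(
    "plastomes_aligned.fasta",
    "malvales_cp",
    ["Arabidopsis_thaliana", "Carica_papaya"],
    bootstraps=10_000,            # replicate count given in the Figure 1 caption
)

print(shlex.join(align_cmd) + " > plastomes_aligned.fasta")
print(shlex.join(tree_cmd))
```

Building the commands as lists keeps the arguments unambiguous if they are later passed to `subprocess.run`, with the MAFFT output redirected to the alignment file.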
CONTACT Yongshuai Sun sunyongshuai@xtbg.ac.cn CAS Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese
Academy of Sciences, Mengla 666303, China
© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits
unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
MITOCHONDRIAL DNA PART B
2019, VOL. 4, NO. 1, 1163-1164
https://doi.org/10.1080/23802359.2019.1591236
Disclosure statement
No potential conflict of interest was reported by the authors.
Funding
This research was supported by the National Natural Science Foundation
of China (no. 31500194).
References
Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11-15.
Jin J-J, Yu W-B, Yang J-B, Song Y, Yi T-S, Li D-Z. 2018. GetOrganelle: a simple and fast pipeline for de novo assembly of a complete circular chloroplast genome using genome skimming data. bioRxiv. 256479.
Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 30:772-780.
Li X, Li H-w, Li J, Ashton PS. 2007. Flora of China. Beijing: Science Press. p. 52-54.
Qin H, Yang Y, Dong S, He Q, Jia Y, Zhao L, Yu S, Liu H, Liu B, Yan Y, A, et al. 2017. List of threatened species of higher plants in China. Biodiversity. 25:696-744.
Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 30:1312-1313.
Tillich M, Lehwark P, Pellizzer T, Ulbricht-Jones ES, Fischer A, Bock R, Greiner S. 2017. GeSeq - versatile and accurate annotation of organelle genomes. Nucleic Acids Res. 45:W6-W11.
Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics. 20:3252-3255.
Figure 1. ML phylogenetic tree of the 18 Malvales based on the available chloroplast genome sequences in GenBank, and the chloroplast sequence of Parashorea
chinensis. The tree is rooted with the Brassicales (Arabidopsis thaliana and Carica papaya). Bootstraps (10,000 replicates) are shown at the nodes.