PLASTOME ANNOUNCEMENT
The complete chloroplast genome of Medicago arabica (Fabaceae)
Yingxue Jiao^a, Xiaofan He^a, Yuhua Shen^b, Yuehui Chao^a and Tiejun Zhang^a
^a School of Grassland Science, Beijing Forestry University, Beijing, China; ^b College of Chemistry and Life Sciences, Chifeng University, Chifeng, China
ABSTRACT
Medicago arabica (Linnaeus, 1762) Huds. is an important annual forage legume that grows in a wide range of climates, from subtropical to temperate. This study aimed to sequence the chloroplast genome of M. arabica and compare it with those of other legumes. The complete chloroplast genome is 125,056 base pairs long, with a total GC content of 34.4%. Of the 110 unique genes annotated in the circular genome, 76 are protein-coding genes, 30 are tRNA genes, and four are rRNA genes. A maximum likelihood (ML) tree was constructed using the model species and 17 species of the Medicago genus; it showed that M. arabica is phylogenetically closely related to M. polymorpha. The nucleotide diversity of the chloroplast genome may provide valuable molecular markers for studies of the chloroplast, genetic breeding, and plant molecular evolution. These findings provide a foundation for future research on the molecular biology of the chloroplast.
ARTICLE HISTORY
Received 31 October 2021
Accepted 11 April 2022
KEYWORDS
Chloroplast genome; Medicago arabica; Fabaceae
Medicago arabica (Linnaeus, 1762) Huds., also known as spotted medic, is a flowering plant in the family Fabaceae. It is native to the Mediterranean basin and has since spread throughout the world, where it grows on cliff tops and in various types of grassland. M. arabica is an important forage legume worldwide, especially in subtropical and temperate climates (Nair et al. 2006).
According to the USDA, M. arabica has greater adaptability than other annual legumes and is used to improve soil properties and grazing productivity (Bialy et al. 2004). It forms a symbiotic relationship with Sinorhizobium medicae, a soil bacterium capable of fixing nitrogen. It is considered essential for pasture improvement because of its short vegetative period, flat or sub-flat stem type, sclerotized seeds, and ability to adapt to a wide range of environmental conditions (Tava et al. 2009). The aerial parts of M. arabica contain high concentrations of saponins that show strong fungicidal activity against several pathogenic fungi and could be developed as a natural source of fungicides (Saniewska et al. 2005).
The chloroplast carries out photosynthesis, synthesizes key phytohormones, and takes part in defence responses and inter-organelle signaling (Bhattacharyya and Chakraborty 2018). This organelle is also involved in starch storage, sugar synthesis, and the production of critical cellular components, including amino acids, vitamins, pigments, and lipids, and it hosts metabolic pathways for sulfur and nitrogen (Martin et al. 2013).
The chloroplast is a vital plant organelle that carries its own genes, and the arrangement of the chloroplast genome is remarkably conserved over evolutionary time. Complete chloroplast genome sequences can provide essential information for plant breeding, chloroplast genetic engineering, the development of molecular markers, and phylogenetic analysis (Tao et al. 2017). The chloroplast genome of M. arabica will be a valuable source of genetic markers for determining evolutionary relationships, as well as a platform for studying the evolution and genetic breeding of this crop.
The chloroplast genome sequence of M. arabica has not yet been reported, so characterizing it is a priority. This study aimed to sequence and annotate the chloroplast genome of M. arabica and compare it with those of other legumes. Here, the genome was sequenced and structurally characterized, providing a resource for future studies in the Fabaceae family, especially on the evolution and genetic improvement of forage crops and other plant species.
CONTACT Tiejun Zhang, tiejunzhang@126.com, School of Grassland Science, Beijing Forestry University, No. 35 Qinghua East Road, Haidian District, Beijing 100083, China
© 2022 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
MITOCHONDRIAL DNA PART B
2022, VOL. 7, NO. 4, 689–691
https://doi.org/10.1080/23802359.2022.2067498
Samples of M. arabica were collected from the Bajia Botanical Garden in Beijing, China (E116290, N40030). Seeds were deposited in the forage germplasm bank of the School of Grassland Science, Beijing Forestry University (Beijing, China; E116290, N40030), and one specimen was deposited at the Herbarium of the School of Grassland Science, Beijing Forestry University (http://cxy.bjfu.edu.cn/, Tiejun Zhang, tiejunzhang@126.com) under voucher number PI495212. Genomic DNA was extracted from post-emergence shoots using a DNA extraction kit from Shanghai Limin Industries Co., Ltd. (Shanghai, China). Sequencing was performed on the Illumina NovaSeq PE150 platform (Illumina Inc., San Diego, USA), which generated 150 bp paired-end reads. The complete chloroplast genome was assembled from the cleaned reads with GetOrganelle v1.5 (Jin et al. 2020), using the chloroplast genome of Medicago truncatula (GenBank accession number NC_003119) as a reference. The genome was annotated with CPGAVAS2 (Shi et al. 2019) and GeSeq (Tillich et al. 2017), and the annotation was then adjusted manually. The annotated sequence was deposited in GenBank under accession number MZ905469. The study of M. arabica, including the collection of plant material, followed the standards established by the School of Grassland Science, Beijing Forestry University, as well as Chinese and international regulations. Field research complied with Beijing legislation and followed all research protocols.
The complete chloroplast genome of M. arabica is 125,056 base pairs long, with an overall GC content of 34.4%. It contains 110 unique genes: 76 protein-coding genes, 30 tRNA genes, and four rRNA genes. By function, the annotated genes include the 30 transfer RNA genes, 15 genes encoding photosystem II (PSII) proteins, 11 genes encoding NADH dehydrogenase subunits, and 11 genes encoding small-subunit ribosomal proteins.
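The summary statistics reported here (genome length, GC content, and the gene tally) are simple to recompute from a sequence record. The following is a minimal Python sketch under stated assumptions, not the authors' pipeline: the names `gc_content`, `total_genes`, and `GENE_COUNTS` are hypothetical, and the toy sequence is invented for illustration. Only the three gene counts are taken from the text.

```python
"""Sketch: recompute basic plastome summary statistics."""

# Gene counts by type, as reported in the text for M. arabica.
GENE_COUNTS = {"protein-coding": 76, "tRNA": 30, "rRNA": 4}


def gc_content(seq: str) -> float:
    """Return the GC content of a nucleotide sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted, so N and other
    IUPAC ambiguity codes do not affect the result.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / total


def total_genes(counts: dict) -> int:
    """Sum the per-type gene counts into a single tally."""
    return sum(counts.values())


if __name__ == "__main__":
    toy = "ATGCATTAAT"  # hypothetical 10-base example, not real plastome data
    print(f"length = {len(toy)} bp, GC = {gc_content(toy):.1f}%")
    print(f"unique genes = {total_genes(GENE_COUNTS)}")
```

Applied to the sequence deposited under MZ905469, the same two functions would be expected to reproduce the reported length and GC content, while the tally confirms that the three gene classes add up to the 110 unique genes stated above.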
To determine the phylogenetic position of M. arabica, the chloroplast genomes of 17 species of the genus Medicago, together with those of Melilotus albus and Trifolium repens (members of sister groups of Medicago within Fabaceae, used as outgroups), were downloaded from the GenBank database of the National Center for Biotechnology Information (NCBI). The sequences were aligned with MAFFT v7 (Katoh et al. 2019). A maximum likelihood (ML) tree was then generated with raxmlGUI 1.5b (RAxML v8.2.10; Silvestro and Michalak 2012) from the protein-coding genes shared by the 19 species; the nucleotide sequences of 69 common genes were used to construct the tree. The analysis showed that M. arabica is closely related to M. polymorpha (Figure 1). This study provides valuable information for species identification and for resolving phylogenetic relationships within Fabaceae, particularly among forage legumes. It also provides a foundation for future research on chloroplast molecular biology, genetic breeding, and the molecular evolution of M. arabica.
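The tree was built from protein-coding genes common to all sampled species. One way to picture that bookkeeping step is to intersect the per-species gene sets and then concatenate the per-gene alignments in a fixed order. The Python sketch below illustrates this under assumptions: the function names, the data layout (`alignments[gene][species]` holding an already-aligned sequence), and the toy data are all hypothetical. The study itself used MAFFT and raxmlGUI; this sketch does not replace either tool.

```python
"""Sketch: select genes shared by all taxa and build a concatenated matrix."""


def shared_genes(genes_by_species: dict) -> list:
    """Return the sorted list of genes present in every species."""
    gene_sets = [set(genes) for genes in genes_by_species.values()]
    if not gene_sets:
        return []
    return sorted(set.intersection(*gene_sets))


def concatenate(alignments: dict, genes: list, species: list) -> dict:
    """Join per-gene aligned sequences into one supermatrix row per species.

    `alignments[gene][sp]` must hold the already-aligned sequence, and all
    rows of one gene alignment must have equal length.
    """
    matrix = {sp: [] for sp in species}
    for gene in genes:
        rows = alignments[gene]
        lengths = {len(rows[sp]) for sp in species}
        if len(lengths) != 1:
            raise ValueError(f"alignment for {gene} has ragged rows")
        for sp in species:
            matrix[sp].append(rows[sp])
    return {sp: "".join(parts) for sp, parts in matrix.items()}


if __name__ == "__main__":
    # Hypothetical toy input: three taxa with partly overlapping gene lists.
    genes_by_species = {
        "M_arabica": ["rbcL", "matK", "psbA"],
        "M_polymorpha": ["rbcL", "psbA", "ndhF"],
        "T_repens": ["psbA", "rbcL"],
    }
    alignments = {
        "psbA": {"M_arabica": "ATG-", "M_polymorpha": "ATGA", "T_repens": "ATGC"},
        "rbcL": {"M_arabica": "CC", "M_polymorpha": "CT", "T_repens": "C-"},
    }
    common = shared_genes(genes_by_species)
    supermatrix = concatenate(alignments, common, list(genes_by_species))
    for name, row in supermatrix.items():
        print(name, row)
```

Sorting the shared gene names keeps the concatenation order deterministic, so every species row has its genes in the same columns.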
Figure 1. Maximum likelihood (ML) phylogenetic tree reconstructed from the shared protein-coding genes of 17 species of the genus Medicago. Melilotus albus and Trifolium repens, both members of sister groups of Medicago in Fabaceae, served as outgroups. Numbers above the branches are ML bootstrap values (>70%).
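The caption reports only bootstrap values above 70%. A bootstrap value for a clade is the percentage of replicate trees in which that clade appears. The Python sketch below illustrates the counting and the threshold filter under simplifying assumptions: each replicate tree is represented only as a set of clades (each clade a `frozenset` of taxon names), parsing RAxML output is out of scope, and the function names and toy replicates are hypothetical.

```python
"""Sketch: bootstrap support as clade frequency across replicate trees."""


def clade_support(clade: frozenset, replicate_clades: list) -> float:
    """Percentage of bootstrap replicates whose tree contains `clade`."""
    if not replicate_clades:
        raise ValueError("no bootstrap replicates supplied")
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)


def well_supported(clades, replicate_clades, threshold=70.0):
    """Map each clade to its support, keeping only values above threshold."""
    result = {}
    for clade in clades:
        support = clade_support(clade, replicate_clades)
        if support > threshold:
            result[clade] = support
    return result


if __name__ == "__main__":
    # Hypothetical clades and four invented replicates, for illustration only.
    ap = frozenset({"M_arabica", "M_polymorpha"})
    at = frozenset({"M_arabica", "M_truncatula"})
    replicates = [{ap}, {ap}, {ap, at}, {at}]
    for clade, support in well_supported([ap, at], replicates).items():
        print(sorted(clade), f"{support:.0f}%")
```

In this toy input the first clade occurs in three of four replicates and passes the filter, while the second occurs in two of four and is dropped.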
Author contributions
YJ and XH analyzed and interpreted the data, and YJ drafted the paper. YS and YC collected samples and critically reviewed the paper for intellectual content. TZ was involved in the conception, design, and final approval of the published version of the article. All authors agree to be accountable for all aspects of the work.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Funding
This work was supported by the Fundamental Research Funds for the Central Universities [No. 2021ZY81] and the National Natural Science Foundation of China [Nos. 31772656 and 31402123].
Data availability statement
The genomic sequence data supporting the findings of this study are publicly available in NCBI GenBank (https://www.ncbi.nlm.nih.gov/) under the accession number MZ905469. The associated BioProject, SRA, and BioSample numbers are PRJNA750257, SRR15275748, and SAMN20447317, respectively.
References
Bhattacharyya D, Chakraborty S. 2018. Chloroplast: the Trojan horse in plant-virus interaction. Mol Plant Pathol. 19(2):504–518.
Bialy Z, Jurzysta M, Mella M, Tava A. 2004. Triterpene saponins from aerial parts of Medicago arabica L. J Agric Food Chem. 52(5):1095–1099.
Jin JJ, Yu WB, Yang JB, Song Y, dePamphilis CW, Yi TS, Li DZ. 2020. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 21(1):241.
Katoh K, Rozewicki J, Yamada KD. 2019. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. 20(4):1160–1166.
Martin G, Baurens FC, Cardi C, Aury JM, D'Hont A. 2013. The complete chloroplast genome of banana (Musa acuminata, Zingiberales): insight into plastid monocotyledon evolution. PLoS One. 8(6):e67350.
Nair RM, Hughes SJ, Peck DM, Crocker G, Ellwood S, Hill JR, Hunt CH, Auricht GC. 2006. Progress in development of spotted medics (Medicago arabica L. Huds.) for Mediterranean farming systems. Aust J Agric Res. 57(4):447–455.
Saniewska A, Jarecka A, Bialy Z, Jurzysta M. 2005. Antifungal activity of saponins from Medicago arabica L. shoots against some pathogens. Allelopathy J. 16:105–112.
Shi L, Chen H, Jiang M, Wang L, Wu X, Huang L, Liu C. 2019. CPGAVAS2, an integrated plastome sequence annotator and analyser. Nucleic Acids Res. 47(W1):W65–W73.
Silvestro D, Michalak I. 2012. raxmlGUI: a graphical front-end for RAxML. Org Divers Evol. 12(4):335–337.
Tao X, Ma L, Zhang Z, Liu W, Liu Z. 2017. Characterization of the complete chloroplast genome of alfalfa (Medicago sativa) (Leguminosae). Gene Reports. 6:67–73.
Tava A, Mella M, Avato P, Biazzi E, Pecetti L, Bialy Z, Jurzysta M. 2009. New triterpenic saponins from the aerial parts of Medicago arabica (L.) Huds. J Agric Food Chem. 57(7):2826–2835.
Tillich M, Lehwark P, Pellizzer T, Ulbricht-Jones ES, Fischer A, Bock R, Greiner S. 2017. GeSeq - versatile and accurate annotation of organelle genomes. Nucleic Acids Res. 45(W1):W6–W11.