
Identification of Pueraria spp. through DNA barcoding and comparative transcriptomics

January 2022
BMC Plant Biology 22(1)


DOI:10.1186/s12870-021-03383-x

License
CC BY 4.0

Authors:

Richard Dixon

University of North Texas



Adolfoetal. BMC Plant Biology (2022) 22:10

https://doi.org/10.1186/s12870-021-03383-x

RESEARCH ARTICLE

Identification of Pueraria spp. through DNA barcoding and comparative transcriptomics

Laci M. Adolfo1†, Xiaolan Rao2† and Richard A. Dixon1*

Abstract

Background: Kudzu is a term used generically to describe members of the genus Pueraria. Kudzu roots have been used for centuries in traditional Chinese medicine in view of their high levels of beneficial isoflavones including the unique 8-C-glycoside of daidzein, puerarin. In the US, kudzu is seen as a noxious weed causing ecological and economic damage. However, not all kudzu species make puerarin or are equally invasive. Kudzu remains difficult to identify due to its diverse morphology and inconsistent nomenclature.

Results: We have generated sequences for the internal transcribed spacer 2 (ITS2) and maturase K (matK) regions of Pueraria montana lobata, P. montana montana, and P. phaseoloides, and identified two accessions previously used for differential analysis of puerarin biosynthesis as P. lobata and P. phaseoloides. Additionally, we have generated root transcriptomes for the puerarin-producing P. m. lobata and the non-puerarin-producing P. phaseoloides. Within the transcriptomes, microsatellites were identified to aid in species identification as well as population diversity.

Conclusions: The barcode sequences generated will aid in fast and efficient identification of the three kudzu species. Additionally, the microsatellites identified from the transcriptomes will aid in genetic analysis. The root transcriptomes also provide a molecular toolkit for comparative gene expression analysis towards elucidation of the biosynthesis of kudzu phytochemicals.

Keywords: Kudzu, DNA barcoding, Microsatellites, Comparative transcriptomics

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Summary

Various kudzu accessions were analyzed through barcoding and comparative transcriptomics, generating tools for identification and molecular pathway analysis.

Background

Kudzu has been used in traditional Chinese medicine, with the roots being considered the most valuable part of the plant [1]. The high levels of isoflavones in the roots are believed to be important for the medicinal properties of kudzu [2]. Kudzu contains the same major isoflavones that are found in other legumes, including the aglycones daidzein, genistein, and formononetin as well as their O-glycosides daidzin, genistin, and ononin. However, kudzu also contains puerarin, the 8-C-glycoside of daidzein [3]. Many of the health benefits of kudzu are believed to come from puerarin, because the carbon-carbon glycosidic bond in puerarin makes it resistant to hydrolysis when ingested [2]. However, health benefits have also been linked to daidzin and genistin, as well as the methylated isoflavone formononetin and its glycoside, ononin. A Chinese pharmacopeia dating back to 200 B.C. mentions the roots of kudzu and their use in various treatments. Kudzu was administered to help with a range of ailments including inflammation, diarrhea, and even alcoholism [4]. In its native habitat, Asia, kudzu grows well, with growth being controlled by pests and climate. In the US, kudzu is an invasive weed, especially in the southeast [5].

Open Access

*Correspondence: Richard.Dixon@unt.edu

†Laci M. Adolfo and Xiaolan Rao contributed equally to this work.

1 BioDiscovery Institute and Department of Biological Sciences, University of North Texas, 1155 Union Circle #305220, Denton, TX 76203-5017, USA

Full list of author information is available at the end of the article

Content courtesy of Springer Nature, terms of use apply. Rights reserved.

Mass planting of kudzu allowed it to spread rapidly throughout the Southeast US, where the climate is ideal for it, with high temperatures and plenty of rainfall, and natural predators are absent. Kudzu vines can grow up to 12 inches a day. Kudzu out-competed native flora and caused an economic burden as the vines crept up utility poles and disrupted power [5]. The removal of kudzu is a difficult process as simply removing the top foliage does not stop the spread of the plant; kudzu’s extensive root system includes a large tap root from which many roots and vines sprout [6, 7]. The US federal government declared kudzu a federal noxious weed in the mid to late 1990s. It was eventually removed from the federal noxious weed list; however, it is still on the noxious weed lists of several states, including Texas [7].

The taxonomy of kudzu is unclear, with multiple synonyms and multiple varieties within species, such as Pueraria montana, P. thomsonii, and P. lobata, which can also be referred to as P. montana var. montana, P. montana var. chinensis, and P. montana var. lobata, respectively. The classification as different species and different variants has been confusing, especially as the morphological characteristics of these individual varieties are highly variable [8, 9].

The availability of established DNA barcodes that can differentiate between different species/varieties would allow for positive identification of kudzu in the wild, and could aid ecological studies; for example, fecal samples are often examined to determine the dietary behavior of animals and insects [10–12]. Furthermore, DNA barcoding could facilitate quality control and assurance for herbal supplements [13–15].

A previous study used kudzu accessions collected in the field (Ardmore, OK) and obtained commercially (Kudzu Kingdom, Kodak, TN) to interrogate puerarin biosynthesis through differential expression analysis following EST sequencing [16]. To aid the identification of these and other kudzu accessions, we have generated barcodes for the ITS2 and matK regions of three kudzu species/varieties. We have also generated transcriptomic data of the roots of the puerarin-producing P. m. lobata and the non-puerarin-producing P. phaseoloides. The transcriptomic data generated allow for differential gene expression analysis and also identify simple sequence repeat (SSR) markers between the two kudzu species. These genomic resources will serve as references for identifying kudzu species for eradication, harvesting of phytochemicals, validation of supplements, and ecological research. Additionally, the comparative transcriptomics provides a molecular resource for exploring genes active in the synthesis of valuable phytochemicals.

Results

Seed morphology

The origins of the kudzu accessions analyzed in the present work are provided in the Methods. Wild kudzu collected from Oklahoma and Texas, and USDA PI 434246 and PI 9227, all had kidney-shaped seeds. Most of the seeds were dark brown with a few being lighter brown to reddish. The seeds also had lighter colored striations. They measured approximately 3.2 mm in length (Fig. 1A-D). The Kudzu Kingdom, BRSEEDS, USDA PI 308576, and USDA DLEG 890244 seeds were rectangular to oblong. The seeds ranged in color from maroon to orange to golden yellow and were also approximately 3.2 mm in length (Fig. 1F-I). The USDA PI 298615 seeds were rectangular to oblong, and dark to medium brown in color. They were smaller than the other seeds, measuring approximately 2.1 mm in length (Fig. 1E).

Plant morphology

All plants grew as vines with trifoliate leaves and trichomes present on the leaves and stems/vines (Supplemental Fig. 1). DLEG 890244 (P. phaseoloides) did not germinate, so analysis of the whole plant, plant parts, and roots was not possible. The wild kudzu accessions as well as the P. m. lobata accessions all had prominent trichomes, as did the commercial and P. phaseoloides accessions; however, the trichomes present on the P. m. montana accession were less pronounced. The P. m. montana plants also had smaller, almond-shaped leaves and thinner vines as compared to the other plants (Fig. 2). The thinner vines on P. m. montana made the vines more malleable. The leaves of the commercial and the P. phaseoloides accessions were rounder than those of the P. m. montana accession. Interestingly, the leaves of the wild and P. m. lobata accessions tended to vary even within the same accession (Supplemental Fig. 2). While some of the P. m. lobata leaves were rounder, similar to those of the commercial and P. phaseoloides accessions, others were lobed. The lobing on the P. m. lobata leaves also varied from slight to deep. However, irrespective of their overall shape, the leaves of the wild and P. m. lobata accessions tended to come to a sharp point.

Isoavone content

An examination of the roots of all eight accessions revealed that the Oklahoma and Texas collected material and the P. m. lobata accessions all contained puerarin. In contrast, the commercial, P. phaseoloides, and P. m. montana accessions did not contain puerarin (Fig. 3). In addition to puerarin, roots of the wild and P. m. lobata accessions contained daidzin and daidzein. Other isoflavones, including genistein, genistin, and ononin, were present in reduced amounts in the wild and P. m. lobata accessions. The commercial and P. phaseoloides roots contained a higher proportion of genistein, ononin, and genistin than the Oklahoma and Texas material, P. m. lobata, and P. m. montana roots. In fact, those three isoflavones were found in the highest proportion in roots of the commercial and P. phaseoloides accessions. The P. m. montana roots contained the least amount of isoflavones based on HPLC peak areas, and these were mainly daidzin and daidzein (Fig. 3C). While not containing puerarin, the commercial kudzu and P. phaseoloides had higher percentages of daidzein and genistein aglycones among their isoflavone complement (Supplemental Fig. 3).

Internal transcribed spacer 2 sequencing

The internal transcribed spacer 2 (ITS2) region is generally between 200 and 250 bp. Given its small size, the entire region could be captured using primers from the 5.8S rRNA and 26S rRNA regions that flank the ITS2, resulting in amplicons of 425–475 bp. An Illumina MiSeq with 2 × 300 paired-end reads was used, allowing for an overlap in the middle of the sequence. Following trimming and alignment, the whole sequenced amplicon was 468 bp for P. m. lobata, 449 bp for P. phaseoloides, and 436 bp for P. m. montana. The ITS2 region within the whole amplicon sequence was 242 bp for P. m. lobata, 224 bp for P. phaseoloides, and 211 bp for P. m. montana.

There were 80 nucleotide differences observed in comparisons between the P. m. lobata and the P. phaseoloides groups in the ITS2 region (Supplemental Table 1). Additional differences in the ITS2 regions were 18 nucleotide insertions/deletions (indels) in the P. phaseoloides group including one stretch of eight deleted nucleotides and one stretch of ten nucleotides (Supplemental Table 2). Comparisons between the P. m. lobata and the P. m. montana groups revealed 55 nucleotide differences (Supplemental Table 3) and 31 indels including one stretch of 19 deleted nucleotides in the P. m. montana group (Supplemental Table 4). The comparisons between P. phaseoloides and the P. m. montana groups had 51 nucleotide differences (Supplemental Table 5) and 17 indels (Supplemental Table 6).
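The pairwise ITS2 comparison above (nucleotide differences plus indels, including stretches of consecutive gaps) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned and padded with "-" for gaps; the function name `compare_aligned` and the toy sequences are hypothetical and are not taken from the paper or its supplemental tables.

```python
def compare_aligned(seq_a, seq_b, gap="-"):
    """Count substitutions, gapped columns, and gap-stretch lengths
    between two sequences taken from the same alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    substitutions = 0
    gap_columns = 0
    gap_runs = []   # lengths of consecutive gap stretches (indel blocks)
    run = 0         # length of the gap stretch currently being read
    side = None     # which sequence carries the current gap: "a", "b", or None
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # column is uninformative for this pair
        current = "a" if a == gap else ("b" if b == gap else None)
        if current is None and a != b:
            substitutions += 1
        if current != side and run:
            gap_runs.append(run)  # a gap stretch just ended
            run = 0
        if current is not None:
            gap_columns += 1
            run += 1
        side = current
    if run:
        gap_runs.append(run)
    return substitutions, gap_columns, gap_runs


# Toy alignment (not real ITS2 data): one substitution and two 2-nt indels.
print(compare_aligned("ACGTTTGA--CC", "ACCT--GATTCC"))
```

Reporting the gap stretches separately mirrors how the text distinguishes isolated indels from longer deleted runs.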

Maturase K (matK) sequencing

Of the ~1500 bp matK chloroplast gene, approximately 776 bp were amplified from the kudzu accessions using primers suggested by Yu et al. (2011) [17] for having high fidelity with angiosperms given the low nucleotide diversity found in these regions. Given the length of the amplicon to be sequenced, Sanger sequencing was used.

Following trimming and alignment of the matK sequences, there were 17 single nucleotide polymorphisms (SNPs) identified between the P. phaseoloides and P. m. lobata groups, 20 SNPs identified between the P. m. lobata and P. m. montana groups, and 26 SNPs identified between the P. phaseoloides and P. m. montana groups (Table 1). Given that matK is a coding region, the amino acid substitutions that resulted from the SNPs were also examined. There were eight amino acid substitutions between the P. phaseoloides and P. m. lobata groups, 12 between the P. m. lobata and P. m. montana groups, and 15 between the P. phaseoloides and P. m. montana groups (Table 2).

Fig. 1 Morphology of seeds from each kudzu accession. A Oklahoma (wild); B Texas (wild); C PI 9227 (P. m. lobata); D PI 434246 (P. m. lobata); E PI 298615 (P. m. montana); F Kudzu Kingdom (commercial); G BRSEEDS (commercial); H PI 308576 (P. phaseoloides); I DLEG 890244 (P. phaseoloides)

Phylogenetic analysis

Neighbor-joining phylogenetic trees were generated using the ITS2 and matK sequences. For the ITS2 phylogenetic tree, the generated sequences were combined with sequences published in NCBI for kudzu species as well as other legumes. The results in Fig. 4 show that the P. phaseoloides and commercial accessions clustered together with a previously published P. phaseoloides ITS2 sequence from NCBI. Additionally, the P. m. lobata and Texas and Oklahoma ITS2 sequences clustered with P. m. lobata and P. montana sequences published at NCBI, along with a single P. m. thomsonii sequence. The P. m. montana sequences clustered separately.

The phylogenetic tree for the matK sequences revealed clustering similar to that of the ITS2 phylogenetic tree. The P. phaseoloides and commercial kudzu matK sequences clustered with published matK sequences for P. phaseoloides and N. phaseoloides (formerly P. phaseoloides). The matK sequences of the P. m. lobata and Oklahoma and Texas accessions clustered with a few P. m. lobata and P. montana sequences plus single P. m. thomsonii and P. pseudohirsuta sequences available at NCBI. However, the P. m. lobata, Oklahoma, and Texas kudzu matK sequences did not cluster as closely with many of the P. m. lobata and P. montana matK sequences analyzed from NCBI as they did in the ITS2 neighbor-joining tree. The P. m. montana matK sequences again clustered separately, but this time they were grouped closer to other species, showing more similarity to the matK sequences of Glycine spp. (Fig. 5).

Transcriptome sequencing and assembly

To obtain Pueraria root transcriptomes, RNA was extracted and cDNA prepared from roots of Kudzu Kingdom (P. phaseoloides) and Oklahoma (P. m. lobata) accessions, and sequenced on the Illumina HiSeq 2000 platform. The 100 bp paired-end Illumina reads were trimmed based on quality scores. Clean sequence reads from P. phaseoloides and P. m. lobata were assembled separately using a combination of the programs Velvet [18] and Oases [19]. To optimize the assembly, Velvet/Oases were run with different k-mer sizes (31, 43, 55, 67, 79 and 91 nt).

Fig. 2 Images of vines, leaves, and trichomes for each plant accession. A-B Oklahoma (wild); C-D, Texas (wild); E-F PI 9227 (P. m. lobata); G-H PI 434246 (P. m. lobata); I-J, PI 298615 (P. m. montana); K-L Kudzu Kingdom (commercial); M-N BRSEEDS (commercial); O-P PI 308576 (P. phaseoloides)

Several assembly-quality parameters were assessed, including the ratio of reads used, median coverage depth, the number of contigs, the number of transcripts, the number of loci, average transcript length, and the N50 values of contigs and transcripts (Supplemental Table 7, Supplemental Fig. 4). N50 represents the sequence length L for which half of the bases in the assembly are in sequences of length ≥ L [20–22]. Of the six k-mer sizes tested in Velvet/Oases, a good balance of the above parameters was found with the k-mer 55 assembly, resulting in 47,011 and 49,277 transcripts for P. phaseoloides and P. m. lobata, respectively. The full comparison of the transcriptome data for P. phaseoloides and P. m. lobata is given in Table 3.
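To make the N50 definition concrete, the following is a minimal sketch of how the statistic can be computed from a list of sequence lengths. The function name `n50` and the example lengths are illustrative assumptions; this is not the code used by Velvet/Oases or by the authors.

```python
def n50(lengths):
    """Return the length L such that sequences of length >= L
    contain at least half of all bases in the assembly."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example: total is 54 bases; 10 + 9 + 8 = 27 reaches half, so N50 is 8.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because N50 is weighted by bases rather than by sequence count, a few long transcripts can raise it well above the mean length, which is one reason it is reported alongside the mean in Table 3.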

To further demonstrate the quality of the assembled transcripts, the length distribution of the contigs in the two transcriptomes is shown in Supplemental Fig. 5. The N50 values of the P. phaseoloides and P. m. lobata transcriptomes were 1988 and 1881 bp, respectively. For further quality control, we mapped the assembled transcriptomes to kudzu ESTs available from GenBank (6365 ESTs) and observed that 81% (5183 ESTs) and 96% (6110 ESTs) of known EST sequences were represented in our transcriptome sets for P. phaseoloides and P. m. lobata, respectively. The kudzu ESTs were derived from a subtractive library with the P. phaseoloides root cDNA as the driver and P. m. lobata root cDNA as the target [16]. It is therefore reasonable that more kudzu ESTs are represented in the P. m. lobata root transcriptome set than in the P. phaseoloides set.

Fig. 3 Isoflavone profiles of the roots of the eight accessions examined. A HPLC chromatogram showing the isoflavone profiles of the wild and P. m. lobata roots (a. PI 9227, b. PI 434246, c. Oklahoma, d. Texas); B The isoflavone profiles of the commercial and P. phaseoloides roots (a. Kudzu Kingdom, b. BRSEEDS, c. PI 308576); C The isoflavone profile of the P. m. montana roots (PI 298615); D Isoflavone standards. mAU is milli-absorbance units. 1. Puerarin, 2. Daidzin, 3. Genistin, 4. Ononin, 5. Daidzein, 6. Genistein, 7. Formononetin

Simple sequence repeats (SSRs) in the Pueraria root transcriptomes

Simple sequence repeats (SSRs) or microsatellites have been broadly used as molecular markers in marker-assisted selection for DNA fingerprinting [23, 24]. To supply SSR markers for distinguishing between P. phaseoloides and P. m. lobata, we used the MISA program [25] to scan the Pueraria root transcriptomes to identify gene-derived SSR markers. In total, we detected 9220 and 6665 SSRs within 6729 and 5370 different transcripts from the P. phaseoloides and P. m. lobata de novo assembled transcriptomes, respectively. The putative SSRs are summarized in Supplemental Dataset 1. Excluding mono-repeats (3246 and 2625), 5974 and 4040 SSRs (dinucleotide to hexanucleotide repeats) were identified within 4516 (13.6%) and 3373 (9.7%) transcripts of P. phaseoloides and P. m. lobata, respectively. The average frequency of SSRs was one per 5.93 kb and 8.53 kb of the transcriptome sequence in P. phaseoloides and P. m. lobata, respectively. Among dinucleotide to hexanucleotide repeats, the distribution of SSRs was as follows: di- (2143, 35.9% and 1138, 28.2%); tri- (3255, 54.5% and 2606, 64.5%); tetra- (204, 3.4% and 116, 2.9%); penta- (138, 2.3% and 80, 2.0%) and hexa- (234, 3.9% and 100, 2.5%) in P. phaseoloides and P. m. lobata transcripts, respectively.
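As an illustration of what an SSR scan of this kind looks for, the sketch below finds perfect di- to hexanucleotide repeats with a regular expression. It is not MISA and does not reproduce its behavior; the minimum repeat counts in `MIN_REPEATS` are assumed values chosen for the example (the thresholds used in the study are not stated in this section), and `find_ssrs` is a hypothetical helper name.

```python
import re

# Assumed minimum number of repeat units per motif length (illustrative only).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit
    (e.g. 'ATAT' is 'AT' twice and is therefore rejected)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # One motif of `unit` bases followed by at least min_rep - 1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)


# Toy transcript with an (AT)6 and a (CAG)5 repeat.
toy = "GG" + "AT" * 6 + "CC" + "CAG" * 5 + "T"
print(find_ssrs(toy))
```

Dividing the total transcriptome length by the number of hits would give a per-kb frequency comparable in form to the "one per 5.93 kb" figure quoted above.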

Table 1 Maturase K (matK) SNP analysis

Position   P. phaseoloides   P. m. lobata   P. m. montana   Type
562        C                 C              A               Transversion
569        C                 C              G               Transversion
581        G                 T              C               Variable
606        G                 T              T               Transversion
706        T                 T              G               Transversion
713–714    TT                GC             GC              Transversion/Transition
780        T                 T              C               Transition
807        T                 T              G               Transversion
810        G                 T              T               Transversion
828        T                 T              C               Transition
846        A                 A              C               Transversion
891        G                 A              A               Transition
894        T                 C              C               Transition
905        C                 A              A               Transversion
917        A                 A              G               Transition
942        T                 A              A               Transversion
948        A                 G              G               Transition
954        T                 G              T               Transversion
966        A                 C              A               Transversion
990        G                 A              G               Transition
1012       C                 C              T               Transition
1014       A                 C              A               Transversion
1022       C                 C              T               Transition
1023       C                 A              A               Transversion
1044       G                 A              G               Transition
1045       C                 C              A               Transversion
1073       T                 T              G               Transversion
1090       A                 A              C               Transversion
1098       T                 A              A               Transversion
1118       C                 C              T               Transition
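The "Type" column in Table 1 follows the usual rule that a purine-to-purine or pyrimidine-to-pyrimidine change is a transition and any purine-to-pyrimidine change is a transversion. A minimal sketch of that rule, with three rows copied from Table 1 as sample data, is shown here; the helper names `classify_snp` and `count_differences` are hypothetical.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_snp(base_a, base_b):
    """Classify a single-base difference as a transition or transversion."""
    a, b = base_a.upper(), base_b.upper()
    for base in (a, b):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"unexpected base: {base!r}")
    if a == b:
        return "identical"
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Three rows from Table 1: position -> (P. phaseoloides, P. m. lobata, P. m. montana)
rows = {
    562: ("C", "C", "A"),
    891: ("G", "A", "A"),
    954: ("T", "G", "T"),
}


def count_differences(table, i, j):
    """Count table rows at which columns i and j carry different alleles."""
    return sum(1 for alleles in table.values() if alleles[i] != alleles[j])


print(classify_snp("C", "A"))           # position 562
print(classify_snp("G", "A"))           # position 891
print(count_differences(rows, 0, 1))    # P. phaseoloides vs P. m. lobata, sample rows only
```

Note that `count_differences` counts rows, so a two-base entry such as 713–714 would need to be split into two positions before the totals could be compared with the SNP counts given in the text.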


Annotation, functional classification, mapping and quantitation of assembled transcripts

The transcriptome assemblies from roots of P. phaseoloides and P. m. lobata contain 47,011 and 49,277 transcript isoforms, which represent a total of 33,221 and 34,677 distinct assembled loci, respectively. Each locus may include several highly similar transcript isoforms, such as splice variants, homologs and paralogs, and sequencing errors [16, 22]. To reduce the degree of gene redundancy, we chose the longest transcript as the representative of each locus for annotation.

A homology search against the NR database resulted in 24,850 and 27,244 annotated genes in P. phaseoloides and P. m. lobata, respectively. Among annotated genes, the most abundant genes are involved in metabolic processes according to their Gene Ontology (GO) categories using Plant GOslim ancestor terms [26–28] (Fig. 6A). Based on top hits in the NR database, Pueraria transcripts have strong homology to transcripts from soybean (Glycine max), followed by green bean (Phaseolus vulgaris), consistent with the close phylogenetic relationship between kudzu and soybean [29] (Fig. 6B).

To illustrate the coverage distribution of assembled transcripts on Glycine max as the reference genome, we aligned the transcripts to the 20 chromosomes in 500 kb intervals (Fig. 7). Both P. phaseoloides and P. m. lobata assembled transcripts covered all 20 soybean chromosomes without any large gap. The correlation between P. phaseoloides and P. m. lobata transcriptome density was 0.74, indicating genetic divergence between these two species. To pinpoint the location of polymorphisms, the SSR-bearing transcripts were uniquely anchored to the single best hit in the Glycine max genome. The inconsistency in the SSR locations between P. phaseoloides and P. m. lobata further indicates the genetic divergence of the two accessions.
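One way to picture the density comparison described above is to count transcript hits per 500 kb window along each chromosome and then correlate the two resulting vectors. The sketch below does this with a Pearson coefficient; the text does not say which correlation measure was used, so Pearson is an assumption, as are the function names, the 0-based hit coordinates, and the toy data.

```python
from collections import Counter
from math import sqrt

BIN_SIZE = 500_000  # 500 kb windows, as in the text


def bin_counts(hits, chrom_lengths, bin_size=BIN_SIZE):
    """hits: iterable of (chromosome, 0-based position) for aligned transcripts.
    Returns hit counts per window, chromosomes in sorted order."""
    counts = Counter((chrom, pos // bin_size) for chrom, pos in hits)
    density = []
    for chrom in sorted(chrom_lengths):
        n_bins = (chrom_lengths[chrom] + bin_size - 1) // bin_size  # ceiling
        density.extend(counts.get((chrom, i), 0) for i in range(n_bins))
    return density


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy data: one 1.5 Mb chromosome, i.e. three windows.
lengths = {"chr1": 1_500_000}
hits_a = [("chr1", 10), ("chr1", 600_000), ("chr1", 700_000), ("chr1", 1_200_000)]
hits_b = [("chr1", 5), ("chr1", 650_000), ("chr1", 1_100_000), ("chr1", 1_400_000)]
print(pearson(bin_counts(hits_a, lengths), bin_counts(hits_b, lengths)))
```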

Cross-species transcriptomic comparisons have been shown to be feasible [30, 31]. Therefore, to obtain a comparative gene expression pattern between the two Pueraria accessions, we aligned the sequencing reads to Glycine max as the reference genome [32]. Overall, 65% and 66% of the cleaned reads from P. phaseoloides and P. m. lobata were mapped to the Glycine max protein database, respectively, and 84% of Glycine max proteins were covered with at least one mapped read (Supplemental Table 8). For each Glycine max protein, the number of matching reads was counted and the hit count was then transformed to RPKM (reads per kilobase of transcript per million mapped reads) to normalize for the number of reads available for each line [30]. The coverage of the functional classes was similar between P. phaseoloides and P. m. lobata (Supplemental Fig. 6A). The majority of gene categories were well represented, with more than 70% of genes in each class detected for both mappings. Among them, 87% and 91% of genes classified in secondary metabolism were detected in P. phaseoloides and P. m. lobata, respectively. The average RPKM values for each accession were 19.9 and 20.3, respectively. To define “differentially expressed genes”, we used the criterion of a 2-fold difference in RPKM value with the filter of RPKM value above 20 between the two RNA samples. By these criteria, 1631 and 1675 genes were considered as differentially expressed in P. phaseoloides and P. m. lobata, respectively. Overall, genes classified in photosynthesis (PS), oxidative pentose phosphate pathway (OPP), major and minor carbohydrate (CHO) metabolism, and secondary metabolism were enriched in P. m. lobata, whereas genes classified in C1-metabolism, S-assimilation, and DNA and RNA metabolism were more represented in P. phaseoloides (Supplemental Fig. 6B). A detailed comparison for genes enriched in secondary metabolism is shown in Supplemental Fig. 6C. It is clear that the transcriptome of P. m. lobata is enriched in genes encoding proteins involved in flavonoid biosynthesis.
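The RPKM normalization and the 2-fold filter described above can be sketched as follows. The function names are hypothetical, and one detail is an assumption: the text does not say which sample must exceed an RPKM of 20, so this sketch applies the threshold to the higher of the two values.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def is_differential(rpkm_a, rpkm_b, min_fold=2.0, min_rpkm=20.0):
    """Flag a gene whose RPKM differs at least min_fold between two samples.
    Assumption: the higher of the two values must exceed min_rpkm."""
    high, low = max(rpkm_a, rpkm_b), min(rpkm_a, rpkm_b)
    if high <= min_rpkm:
        return False
    return low == 0 or high / low >= min_fold


# Toy numbers: 500 reads on a 2 kb coding sequence, 25 million mapped reads.
print(rpkm(500, 2000, 25_000_000))
print(is_differential(45.0, 20.0), is_differential(30.0, 20.0))
```

Dividing by length and by library size is what lets counts from the two accessions, which had different numbers of clean reads (Table 3), be compared on one scale.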

Discussion

Identification of kudzu species using barcoding

With the verified samples provided by GRIN-Global, the wild collected and commercial kudzu accessions compared previously for puerarin production [16] were identified as P. montana lobata and P. phaseoloides, respectively. The ITS2 and matK sequences for the P. m. lobata and Oklahoma kudzu accessions matched one another and had clear differences from the P. phaseoloides and commercial kudzus, which also matched one another, and had clear differences from the P. m. montana kudzu. The seed morphology of the P. m. montana and P. phaseoloides was most similar in shape while the seeds of P. m. lobata and P. phaseoloides were most similar in size. The plant morphology of the P. m. lobata and the P. phaseoloides was most similar, with thicker vines and larger leaves. The P. m. lobata and wild kudzu accessions were the only plants analyzed that contained puerarin. The puerarin content for these accessions is consistent with previous reports [33].

Table 2 Maturase K (matK) amino acid substitutions

Position   P. phaseoloides   P. m. lobata   P. m. montana
188        L                 L              I
190        T                 T              S
194        W                 L              S
202        R                 S              S
236        Y                 Y              D
238        L                 R              R
269        N                 K              K
270        E                 D              D
302        S                 Y              Y
306        Y                 Y              C
318        H                 Q              H
322        L                 F              L
341        S                 S              L
349        Q                 Q              K
358        M                 M              R
364        I                 I              L
373        S                 S              L

Fig. 4 Phylogenetic tree of ITS2 sequences from the Pueraria accessions in the present work (colored in blue (wild and P. m. lobata), maroon (P. m. montana), and green (commercial and P. phaseoloides)) and those published in NCBI. The scale bar indicates the length of 0.1 substitutions. The pipeline was created using phylogeny.fr and visualized in Mega 11. (Details for pipeline in Methods)

Fig. 5 Phylogenetic tree of matK sequences from the Pueraria accessions in the present work (colored in blue (wild and P. m. lobata), maroon (P. m. montana), and green (commercial and P. phaseoloides)) and those published in NCBI. The scale bar indicates the length of 0.06 substitutions. The pipeline was created using phylogeny.fr and visualized in Mega 11. (Details for pipeline in Methods)

e use of ITS2 and matK combined proved ben-

eﬁcial in strengthening the identiﬁcation of the diﬀerent

Pueraria species. Although the ITS2 region analyzed was

smaller than the matK region analyzed, there were more

nucleotide diﬀerences found in the ITS2 region, presum-

ably because it is a non-coding region. e ITS2 region

varied in size for all three kudzu species analyzed, from

211 bp to 242 bp. e primers used included a plant-

speciﬁc forward primer located in the 5.8S RNA and a

universal reverse primer located in the 26S RNA. e

plant-speciﬁc forward primer oﬀers beneﬁts by reduc-

ing the unintended ampliﬁcation of other organisms such

as fungi. Using the primers in the 5.8S and 26S regions

resulted in an amplicon size between 450 and 500 bp.

is amplicon size was perfect for using next generation

sequencing (NGS). e use of NGS helps reduce noise

that can be generated from ampliﬁcation and sequenc-

ing bias by allowing for greater depth of coverage. e

greater coverage depth also allows for any incorrect

sequences to be muﬄed by the true sequence. is noise

was further reduced by using low cycle numbers in the

ampliﬁcation prior to sequencing. e diﬀerence in size

can make alignment diﬃcult; however, using primers in

the relatively conserved 5.8S and 26S regions helps over-

come alignment and ampliﬁcation problems [34]. In con-

trast, despite the reduced number of nucleotide changes,

the matK region aligned perfectly across all three species

analyzed. e ease of alignment for matK is common

given that it is a coding region of the chloroplast [17].
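The point about coverage depth can be illustrated with a minimal sketch. It assumes the reads are already aligned to equal length and takes the most common base per column. This is an illustration of the idea only, not the Geneious consensus procedure used in this work, and the reads shown are hypothetical.

```python
from collections import Counter

def majority_consensus(aligned_reads):
    """Return the most common base at each column of equal-length aligned reads."""
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)

# Hypothetical reads: two of the five carry a single erroneous base each.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAC", "ACGTTC"]
print(majority_consensus(reads))
```

With five reads per column, each isolated error is outvoted by the four correct bases; at lower depth the same errors would carry proportionally more weight.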

Table 3 Statistics of the transcriptome data

Data                     P. phaseoloides    P. m. lobata
Raw reads                38,381,722         33,214,058
Clean reads              38,014,210         32,891,280
Assembled transcripts    47,011             49,277
Percent assembled (%)    87.8               82.9
Assembled depth          11.9               10.3
Mean length (bp)         1320               1239

Fig. 6 Gene ontology classification and homology characteristics of Pueraria root transcript sequences. A Gene ontology analysis of the assembled transcripts. B Species distribution of homology search of Pueraria transcriptomes against the NR database

The use of published ITS2 and matK sequences from other plants and Pueraria species in a neighbor-joining tree with the sequences generated showed clear clustering of the P. phaseoloides and commercial kudzus with P. phaseoloides plants published in NCBI. The P. m. lobata and wild-collected Oklahoma and Texas kudzu clustered with the other P. m. lobata and P. montana sequences, while the P. m. montana sequences clustered separately. The neighbor-joining trees for both genes resulted in similar clades encompassing the different accessions analyzed along with the published sequences in NCBI. Although a single concatenated tree that included both genes could have provided additional resolving power to show the relatedness of all the accessions analyzed, there was a lack of ITS2 and matK sequences in NCBI from the same samples of kudzu and other legumes, making such an analysis impossible.

Unlike in animals, where the cytochrome oxidase I (COI) gene of the mitochondria is considered the gold standard for species differentiation, plants do not currently have a specific region that is accepted as having good discriminatory value. However, several regions have been proposed, as well as the use of two regions together [35, 36]. The ITS2 region has been shown to have high discriminatory power in both Fabaceae genera and angiosperms [37–43]. In Vigna species, coupling matK and ITS2 increased the resolving power of the barcodes compared to using them individually [40].

The use of the ITS2 and matK regions can successfully differentiate species of the genus Pueraria as well as variants of the same species. The ITS2 and matK sequences for P. m. lobata and P. phaseoloides were generated from four different populations of the respective species. The sequences for the populations of each species matched one another, as did those from samples within the populations. This shows that, for the kudzu species analyzed, ITS2 and matK have enough nucleotide variation to differentiate the different species but do not segregate different populations of the same species. That these two regions do not set apart different populations of the same species is extremely important, because it allows clear identification of kudzu species regardless of where the plant originated. The ITS2 and matK sequences generated have been uploaded to BOLD (Barcode of Life Database) [44] to be available to other researchers attempting to identify plants, whether directly or through the examination of plant material present in supplements or even in the feces of organisms to understand their diet, as done by Yamamoto and Uchida (2018) [12].

Fig. 7 Distribution of the assembled Pueraria transcripts mapped to the soybean genome. The external track shows the density of P. m. lobata transcripts aligned to the Gmax genome, on both + (outside) and – (inside) strands, in purple. The middle track shows the density of P. phaseoloides transcripts aligned to the Gmax genome, on both + (outside) and – (inside) strands, in blue. The inner track shows the SSR-bearing transcripts aligned to the Gmax genome sequence, with P. m. lobata strands in orange (outside) and P. phaseoloides strands in green (inside)

Interestingly, the seed and plant morphology, barcoding sequence differences, and phylogenetic separation between P. montana montana and P. montana lobata would suggest that these plants are more than mere varieties of the same species as suggested by van der Maesen [8]. The differences present at both a phenotypic and genotypic level for these plants align with their being separate species, as previously suggested by Ohashi et al. [45]. Ohashi et al. suggest the presence of two species, P. montana and P. lobata, where P. lobata has the subspecies P. l. lobata and P. l. thomsonii. A comprehensive analysis of P. l. thomsonii (also known as P. thomsonii and P. m. chinensis), as done here for P. m. lobata, P. m. montana, and P. phaseoloides, could discern whether P. l. thomsonii is best categorized as a subspecies of Pueraria lobata or as its own species.

Summary of the transcriptome dataset

The rapid development of next-generation sequencing (NGS) technologies has enabled discovery of novel genes by using the RNA-seq approach [46, 47]. To provide a basis for a better understanding of the bioactive natural products in kudzu, we have performed a comparative whole root transcriptome analysis. Three other reports have generated transcriptomes for different tissue types of P. m. lobata [48–50], and more recently, for different tissues of P. thomsonii and P. candollei var. mirifica [51, 52]. However, none of these analyses examine two different kudzu species for comparative gene expression. A previous phylogenetic study showed that 80% of US kudzu analyzed had matching genotypes with one or more samples from the same population [53]. This suggests that the transcriptome generated from kudzu from Oklahoma (P. m. lobata) could be a representative genomic resource for this noxious weed that dominates throughout the Southeastern US. In Oklahoma alone, a report suggests a loss of almost $168 million in the lumber industry over 5 years [54]. Knowledge of its transcriptome can lead to development of methods of biological eradication.

It is challenging to perform de novo assembly of transcriptomes in non-model organisms lacking a reference genome. Early studies demonstrated that optimization of the transcriptome assembly using various k-mer lengths is highly desirable for de novo assemblies [22, 55, 56]. In the present study, various parameters were analyzed with a combination of Velvet and Oases. Velvet/Oases start by constructing de Bruijn graphs directly from sequencing reads, remove errors, and then resolve each de Bruijn graph to extract transcripts for each connected component (called "loci") in the graph [18, 19, 22]. Velvet/Oases allow a range of k-mer sizes to accommodate variation in read coverages among genes. Longer k-mers lead to more specificity, with lower coverage and sensitivity. Assembly quality decreases towards both lower and higher k values [18, 19, 22]. Assembly quality tests were performed to determine the most suitable parameter, based on the usage ratio of reads, depth, length, and number of assembled transcripts [22, 55, 56]. The Velvet/Oases k-mer 55 assembly was selected as the representative for the Pueraria root transcriptomes, resulting in 47,011 and 49,277 transcripts with 33,221 and 34,677 loci for P. phaseoloides and P. m. lobata, respectively. This is consistent with the gene number for the majority of sequenced plant genomes of between 20,000 and 40,000 [21].
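The trade-off between short and long k-mers described above can be seen in a toy de Bruijn graph. The sketch below is a simplified illustration, not the Velvet/Oases implementation: it ignores reverse complements, coverage, and error removal, and the two reads are hypothetical. Each k-mer contributes one edge from its (k-1)-mer prefix to its (k-1)-mer suffix, and a node with more than one successor marks an ambiguous extension.

```python
from collections import defaultdict

def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def count_branch_nodes(graph):
    """Count nodes with more than one outgoing edge (ambiguous extensions)."""
    return sum(1 for successors in graph.values() if len(successors) > 1)

# Two hypothetical reads from different transcripts that share the short motif "ACG".
reads = ["TTACGGA", "CCACGTT"]
for k in (3, 5):
    graph = de_bruijn_edges(reads, k)
    print(k, len(graph), count_branch_nodes(graph))
```

With k = 3 the shared motif joins the two reads at a branching node, whereas with k = 5 the reads stay separate. The cost of the longer k is that each read yields fewer k-mers, so weakly covered transcripts are more likely to fragment.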

Differentiation of Pueraria species

Simple sequence repeat (SSR) markers have been widely used in plant genetic studies because of their tendency toward being multiallelic, expression of both parental alleles, quantity, and vast coverage in genomes [57]. Genic SSRs (derived from genes, ESTs, or cDNA clones) have some advantages over genomic SSRs, including being easily generated and characterized and being transferable between different species [58].

Previous markers identified to distinguish kudzus included 13 allozyme loci, 11–49 randomly amplified polymorphic DNAs (RAPDs), and 13–15 microsatellite loci [9, 53, 59–62]. Most recently, genic SSRs were identified from P. m. montana and P. phaseoloides [63]. Some of these reports used other kudzu species or varieties; however, the goal of all of them went beyond identification and focused more on population/genetic diversity and the origin of kudzu's introduction. Here we identified 9220 and 6665 genic SSRs from the assembled transcripts of P. phaseoloides and P. m. lobata, respectively. Excluding mono-SSRs, 5974 and 4040 genic SSRs were detected in 13.6 and 9.7% of the transcripts, with a frequency of one SSR per 5.93 kb and 8.53 kb in the P. phaseoloides and P. m. lobata transcriptomes, respectively. Frequencies of genic SSRs were reported as 1 per 3.92 kb or 8.63 kb from de novo assembled transcriptomes in the legume species lentil and chickpea, respectively [56, 64]. Additionally, the genic SSR frequency in Chinese sweetgum was 1 per 5.12 kb [65].

Factors affecting the frequency and types of SSRs include the taxon, the genomic make-up, and the SSR mining length used for analysis [66]. Here we applied the same parameters for mining microsatellites in the P. phaseoloides and P. m. lobata transcriptomes, so the differences in SSR frequency likely indicate differences in genomic composition. Except for mono-repeats, the most abundant SSRs were tri-nucleotide repeats (54.5 and 64.5%), then di-nucleotide repeats (35.95 and 28.2%) in P. phaseoloides and P. m. lobata transcripts, respectively. This is consistent with the observation that tri-SSRs are generally the most frequently occurring SSRs found in genic SSRs, followed by di-SSRs [58, 67]; however, there are exceptions, as with Camellia japonica [68]. Among all the tri-nucleotides, AAG/CTT was found to be the most frequent motif, consistent with recent studies [69–71]. Our results suggest that the SSRs identified here are reliable and can be useful tools for assaying genetic variation in Pueraria populations.

Cross-species mapping in protein space is a viable strategy to compare different species when an equidistant reference is available [30]. By aligning reads to soybean protein sequences, we quantified transcript abundance in P. phaseoloides and P. m. lobata. Transcripts catalogued in photosynthesis, major CHO metabolism and minor CHO metabolism were enriched in the wild-collected, invasive P. m. lobata compared with the commercial species P. phaseoloides. This is consistent with the competitive ability of P. m. lobata for fixing carbon [72]. Transcripts classified in secondary metabolism were also enriched in P. m. lobata, particularly genes involved in flavonoid biosynthesis.

Conclusions

Puerarin is found in some but not all species of Pueraria. Here we have identified the ITS2 and matK barcodes as sufficient to differentiate between three kudzu species (P. montana, P. lobata, and P. phaseoloides), and in so doing identified the wild and commercial kudzu species used previously for preliminary gene identification in the puerarin pathway [16]. We have also provided molecular tools for more in-depth differential expression analysis of natural product pathways between transcriptomes of P. m. lobata and P. phaseoloides, as well as the identification of microsatellites for further use to aid in identification of the two species.

Methods

Chemicals

Daidzin, genistein, and genistin were purchased from Cayman Chemical Company (Ann Arbor, MI). All other standards were purchased from Indofine Chemical Company (Hillsborough, NJ). HPLC solvents were from Fisher Scientific (Waltham, MA). Other chemicals were purchased from Sigma-Aldrich (St. Louis, MO) unless otherwise indicated.

Seeds

Oklahoma wild kudzu seeds were collected (under Texas Department of Agriculture permit no. 14-NIPP-01) from P Street SE, near the junction with Springdale Road, in Ardmore, OK (34.159, −97.108). The kudzu from Oklahoma had previously been identified as P. montana [73]. Kudzu Kingdom seeds were ordered from Kudzu Kingdom, a division of SunTop Inc., in Kodak, TN. Texas wild kudzu seeds were collected (under Texas Department of Agriculture permit no. 19-NIPP-01) off Copeland Road under Batman The Ride at Six Flags Over Texas in Arlington, TX (32.759, −97.067). The kudzu from Texas had previously been identified as P. m. lobata and validated by Texas Invaders (Site Record 19,737). BR seeds were ordered from the company BRSeeds in Araçatuba, São Paulo, Brazil as P. phaseoloides. P. montana (Lour.) Merr. var. lobata (Willd.) collected in the United States (PI 434246); P. montana (Lour.) Merr. var. lobata (Willd.) collected in Kanagawa, Japan (PI 9227); P. montana (Lour.) Merr. var. montana donated from Taiwan (PI 298615); and Neustanthus phaseoloides (Roxb.) Benth. (formerly P. phaseoloides (Roxb.) Benth.) collected in Venezuela (PI 308576) were ordered through USDA GRIN-Global from the Plant Genetic Resources Conservation Unit in Griffin, GA (under Texas Department of Agriculture permit no. 19-NIPP-01 where applicable). N. phaseoloides (DLEG 890244) seeds collected from an unknown location were ordered through USDA GRIN-Global from the Desert Legume Program in Tucson, AZ. Seeds ordered through USDA GRIN-Global were verified by an ARS Systematic Botanist and are publicly available.

Seed sterilization, germination, and plant growth conditions

Seeds were scarified in sulfuric acid for 20 min (BR seeds, Kudzu Kingdom seeds, USDA P. phaseoloides, and USDA P. montana var. montana seeds) or 45 min (Texas, Oklahoma, and USDA P. montana lobata seeds of Japanese and US origin). They were then rinsed with copious amounts of water three times, dried, and sterilized in 20% (v/v) bleach for 5 min. The seeds were allowed to dry before being plated on water agar. The plates were placed in the dark at 4 °C for 5 days, then moved to a 24 °C light chamber and monitored for germination. Once germinated, the seeds were placed in a greenhouse with temperature settings from 20 °C to 28 °C and at least 14 h of light.

For root isoflavone analysis, a young vine was cut from the main plant and the cut tip dipped in IBA (indole 3-butyric acid) before being placed in damp soil. The cuttings were monitored and after 4 weeks were repotted. After 8 weeks the roots were washed of excess soil and harvested for isoflavone analysis.

DNA isolation

Tissues, including leaves and seeds, were collected and placed in 2 mL Eppendorf tubes with a single ball bearing. The tubes were placed in liquid nitrogen and the tissue was ground using a Retsch Mixer Mill 400 at 30 Hz for 15 s. The samples were then checked for degree of grinding and placed in liquid nitrogen. If the tissue was not thoroughly ground, it was run on the Retsch Mill again until efficient tissue grinding was achieved.

Tissue was suspended in 500 μL of 2X CTAB extraction buffer, vortexed for 5 s to mix, and placed in a 60 °C oven for 30 min with occasional mixing. Tissue was centrifuged at room temperature at 16,000 × g for 5 min. The upper liquid was transferred to a new tube, being careful to avoid the tissue debris. An equal volume of cold chloroform was added to the tubes, which were then vortexed for 5 s and centrifuged at 4 °C for 10 min at 12,000 × g. The upper aqueous phase was carefully transferred to a new tube, an equal volume of cold chloroform was added, and the mixture was vortexed for 5 s and then centrifuged at 4 °C for 10 min at 12,000 × g. The upper aqueous phase was collected, an equal volume of cold isopropanol was added, and the tube was incubated at room temperature for 10 min and then centrifuged at 4 °C for 10 min at 12,000 × g. The liquid was carefully poured off and 1 mL of 70% (v/v) ethanol was added to the tube, which was centrifuged for 1 min at room temperature at 12,000 × g. The liquid was again poured off, the tube re-centrifuged for 10 s, and the remaining liquid carefully removed, avoiding the pellet. The tube was briefly placed in a centrifuge with a cold trap (SpeedVac) to remove any residual ethanol. The pellet was resuspended in 50 μL ddH2O. The DNA concentration was measured on a NanoDrop™ 2000.

Flavonoid extraction

Root tissue was collected from plants and placed in a 2 mL Eppendorf tube with a single ball bearing. The tissue was placed in liquid nitrogen before being lyophilized on a Labconco freeze dryer for 3 days. The tube was then placed in liquid nitrogen and ground on a Retsch Mixer Mill 400 at 30 Hz for 15 s. Twenty mg of tissue was transferred to a new tube and the remaining tissue stored at −80 °C. The 20 mg of tissue was resuspended in 1.5 mL of 80% (v/v) methanol and sonicated for 1 h in an ice water ultrasonic bath (Branson, Danbury, CT). Following sonication, the tubes were placed on an end-over-end rotator at 4 °C overnight, then centrifuged for 20 min at 12,000 × g. The supernatant was transferred to a new tube, being careful to avoid the tissue debris pelleted at the bottom of the tube. The tubes were placed on a nitrogen evaporator (Organomation Associates Inc., Berlin, MA) to dry under a stream of air/nitrogen. After the contents of the tubes had dried, 250 μL of ddH2O was added and the tubes placed on an end-over-end rotator at 4 °C for 1 h.

Ethyl acetate extraction of flavonoids was performed twice by adding 2 times the volume of ethyl acetate to the tube, inverting to mix, and centrifuging at 12,000 × g for 10 min at 4 °C. The top layer was transferred to a new tube and dried under a stream of air/nitrogen on an Organomation nitrogen evaporator. The contents of the tubes were resuspended in 150 μL of 100% methanol. The samples were then analyzed by HPLC.

ITS2 metagenomic sequencing

The ITS2 region was sequenced in collaboration with the BioDiscovery Institute (BDI) Genomics Center (Denton, TX) and Salient Genomics LLC (Krum, TX). Total DNA was used to amplify the ITS2 regions with barcode and index adapters attached to ITS2 primer sequences ITS-p3/ITS-u4 [34]. The samples were prepped and run on an Illumina MiSeq (Illumina, Inc., San Diego, CA). Prior to sequencing, DNA from every accession was amplified with the ITS-p3/ITS-u4 primers to check amplicon size [34]. When run on a 1% agarose gel, all of the amplicons ran just under the 500 bp band of the ladder, consistent with the expected amplicon size of around 450 bp. However, the P. m. montana amplicon was slightly smaller than that of the other accessions, consistent with the sequencing results.

matK Sanger sequencing

DNA samples were amplified with matK primers [17] using NEB's Q5 Hot-start polymerase following the manufacturer's instructions, including extension time. The annealing temperature was calculated using NEB's Tm calculator. Following amplification, the samples were sent to Eurofins Genomics (Louisville, KY) for PCR clean-up and one-pass Sanger method sequencing. To confirm the amplicons prior to sequencing, they were run on a 1% agarose gel. All the amplicons ran between the 500 bp and 1000 bp bands of the ladder, consistent with the expected amplicon length of around 775 bp.

Barcode sequence analysis

Barcoding sequences were analyzed using Geneious Prime (San Diego, CA). Once the sequences were imported into Geneious Prime, they were paired and trimmed using the BBDuk plugin to remove Illumina adapters as well as low quality (below 30) and short (less than 100 bp) reads (for ITS2 sequences). The forward and reverse reads were merged using BBMerge. Merged sequences with a length between 430 and 480 bp were extracted (for ITS2 sequences). The reads were assembled de novo using the Geneious assembler and a consensus sequence was generated for each sample. The samples were aligned for each amplicon group to identify SNPs and indels between the three accessions.
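The final comparison step can be sketched as a column-by-column scan of two aligned consensus sequences. This is a minimal illustration, not the Geneious workflow used here. It assumes the sequences come from the same alignment and that gaps are written as "-", and the example sequences are hypothetical.

```python
def compare_aligned(seq_a, seq_b, gap="-"):
    """List SNP and indel columns (1-based) between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    snps, indels = [], []
    for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == b:
            continue  # identical column, nothing to report
        if a == gap or b == gap:
            indels.append(pos)  # one sequence has a gap at this column
        else:
            snps.append((pos, a, b))  # both have a base, but they differ
    return snps, indels

# Hypothetical aligned consensus sequences for two accessions.
consensus_a = "ACGT-ACGTA"
consensus_b = "ACCTTACG-A"
snps, indels = compare_aligned(consensus_a, consensus_b)
print(snps, indels)
```

Positions are reported in alignment coordinates, so adjacent gap columns would need to be merged to count multi-base insertions or deletions as single events.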

Barcoding phylogenetic trees

The phylogenetic trees were made using a pipeline built with phylogeny.fr. The pipeline settings used MUSCLE for the sequence alignment, Gblocks for the alignment curation, and ProtDist/FastDist + BioNJ for building the phylogenetic tree with a bootstrap value of 1000. The phylogenetic tree was viewed and edited with MEGA 11 [74–81].
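Bootstrap support rests on resampling alignment columns with replacement and rebuilding the tree from each resampled alignment. The sketch below shows only the resampling step, under the assumption that "a bootstrap value of 1000" means 1000 replicates. It is not the phylogeny.fr implementation, and the toy alignment and accession names are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement; returns a new alignment dict."""
    names = list(alignment)
    n_cols = len(alignment[names[0]])
    if any(len(alignment[name]) != n_cols for name in names):
        raise ValueError("all sequences must have the same aligned length")
    # Draw as many column indices as the alignment has columns.
    picked = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(alignment[name][c] for c in picked) for name in names}

# Hypothetical toy alignment of three accessions.
alignment = {
    "acc1": "ACGTACGT",
    "acc2": "ACGTACGA",
    "acc3": "ACCTACGA",
}
rng = random.Random(1)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["acc1"]))
```

A distance matrix and neighbor-joining tree would then be computed for each replicate, and the support for a clade is the fraction of replicate trees that contain it.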

HPLC analysis

Twenty μL samples were injected on an Agilent 1220 Infinity II with a C18 reverse phase column. The 50 min run used the solvents 0.1% (v/v) formic acid (A) and acetonitrile (B) with a gradient as follows: 0–5 min, 95% A; 5–10 min, 85% A; 10–25 min, 77% A; 25–30 min, 67% A; 30–35 min, 60% A; 35–40 min, 0% A; 40–45 min, 0% A; 45–50 min, 95% A, with a flow rate of 1 mL/min. Absorption was measured at 254 nm.
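For scripting or record keeping, the gradient can be written as a small lookup table. The sketch below assumes each listed percentage is held for its whole interval. The method text does not state whether the steps are holds or linear ramps, so this is one reading of it, and the function names are illustrative.

```python
# (start_min, end_min, percent_A) as listed in the method; solvent B is the remainder.
GRADIENT = [
    (0, 5, 95), (5, 10, 85), (10, 25, 77), (25, 30, 67),
    (30, 35, 60), (35, 40, 0), (40, 45, 0), (45, 50, 95),
]

def percent_a(t_min):
    """Return %A at time t_min, assuming each listed value is held for its interval."""
    for start, end, pct in GRADIENT:
        if start <= t_min < end:
            return pct
    if t_min == GRADIENT[-1][1]:  # include the final time point of the run
        return GRADIENT[-1][2]
    raise ValueError("time outside the 0-50 min run")

def percent_b(t_min):
    """Return %B (acetonitrile) as the complement of %A."""
    return 100 - percent_a(t_min)

def is_contiguous(steps):
    """Check that each interval starts where the previous one ends."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(steps, steps[1:]))

print(is_contiguous(GRADIENT), percent_a(12.5), percent_b(37))
```

Under this reading, the 35–45 min window is a 100% acetonitrile wash and the last 5 min return the column to the starting conditions.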

RNA extraction, cDNA library construction and Illumina sequencing

As described [82], each RNA library was prepared from 1 μg of total RNA isolated from one sample each of Kudzu Kingdom (P. phaseoloides) and Oklahoma (P. m. lobata) roots using TruSeq RNA Sample Prep Kits v2 (Illumina Inc., San Diego, CA), according to the manufacturer's instructions, at the Genomics Core Facility at the Noble Foundation. The prepped samples with individual indexes were pooled to run on one HiSeq 2000 lane targeting 100 bp paired-end reads. The HiSeq 2000 run was conducted at the Genomics Core Facility of the Oklahoma Medical Research Foundation, Oklahoma City.

Short read de novo assembly of transcriptomes

Processing of the 100 bp paired-end Illumina reads began by interleaving the read mates for each sample into a single file and trimming bases with quality scores of 20 or less from the end of each read. Reads less than 40 bp long after trimming were discarded along with their mates [82]. Each of the Pueraria root Illumina libraries was assembled separately using a combination of Velvet 1.2.10 [18] and Oases 0.2.08 [19]. To optimize the assembly towards higher contiguity and specificity, Velvet was run using different hash lengths (k-mers 31, 43, 55, 67, 79 and 91) with an average insert length of 300 bp. The results of the Velvet assemblies were then run through Oases using an insert length of 300 bp. Other parameters of Velvet and Oases were set as default.
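The k-mer sweep described above can be sketched as a list of command lines, one velveth/velvetg/oases triple per hash length. The flags shown (`-fastq`, `-shortPaired`, `-read_trkg yes`, `-ins_length`) are the standard ones from the Velvet and Oases manuals, but the exact invocations used in this work are not given in the text, so the arguments, the file name and the output prefix below are assumptions. The sketch only builds the commands and does not run them.

```python
KMERS = [31, 43, 55, 67, 79, 91]  # hash lengths listed in the text
INSERT_LENGTH = 300               # average insert length in bp, from the text

def assembly_commands(reads_fastq, prefix, kmers=KMERS, insert_length=INSERT_LENGTH):
    """Build velveth, velvetg and oases command lines for each k-mer length."""
    commands = []
    for k in kmers:
        out_dir = f"{prefix}_k{k}"
        # Hash the interleaved paired-end reads at this k-mer length.
        commands.append(["velveth", out_dir, str(k), "-fastq", "-shortPaired", reads_fastq])
        # Build the graph; Oases needs read tracking switched on.
        commands.append(["velvetg", out_dir, "-read_trkg", "yes",
                         "-ins_length", str(insert_length)])
        # Resolve the graph into transcripts and loci.
        commands.append(["oases", out_dir, "-ins_length", str(insert_length)])
    return commands

# Hypothetical input file and output prefix.
commands = assembly_commands("lobata_interleaved.fastq", "lobata")
for cmd in commands[:3]:
    print(" ".join(cmd))
```

Each command list could be passed to `subprocess.run` once the tools are installed. Velvet's default build supports k-mers only up to 31, so it typically has to be compiled with a MAXKMERLENGTH of at least 91 for this sweep.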

Annotation

The assembled transcript isoforms were searched against the NCBI NR database using blastx alignment (threshold 1e-6) [83], and further annotated with default parameter values using Blast2GO [84]. After the Blast2GO mapping process, EC numbers from the KEGG pathway [85] and GO terms were generated.
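As an illustration of how such a threshold is applied downstream, the sketch below keeps the best hit per query from BLAST tabular output (the 12-column "outfmt 6" layout, in which column 11 is the E-value). It assumes the 1e-6 threshold is an E-value cutoff. The rows and identifiers are hypothetical, and this is not the Blast2GO procedure itself.

```python
import csv
import io

def best_hits(tabular_text, max_evalue=1e-6):
    """Keep the lowest-E-value hit per query from BLAST tabular (outfmt 6) text."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue = float(row[10])  # column 11 of outfmt 6 is the E-value
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best

# Hypothetical hits: two for transcript tx1, one weak hit for tx2.
rows = [
    ["tx1", "protA", "90.0", "100", "10", "0", "1", "300", "1", "100", "1e-30", "200"],
    ["tx1", "protB", "60.0", "100", "40", "0", "1", "300", "1", "100", "1e-8", "80"],
    ["tx2", "protC", "40.0", "50", "30", "0", "1", "150", "1", "50", "0.01", "30"],
]
text = "\n".join("\t".join(r) for r in rows)
hits = best_hits(text)
print(hits)
```

Here tx2 is dropped because its only hit is weaker than the cutoff, and tx1 is assigned to the subject with the smaller E-value.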

SSR detection

In a pre-processing step, poly-T (poly-A) stretches from the 5′ (3′) end were removed by EST-trimmer scripts [86]. Parameters were set to remove (T)5 or (A)5 in a range of 50 bp on the 5′- or 3′-end, respectively. Sequences of less than 100 bp were discarded and sequences larger than 3000 bp were clipped at their 3′ side [30]. The trimmed sequences were then analyzed using MISA scripts [30] to identify simple sequence repeats (SSRs). Mono-, di-, tri-, tetra-, penta- and hexanucleotide repeats with a minimum of 10, 7, 5, 5, 5, and 5 subunits, respectively, were regarded as SSRs.
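The repeat thresholds above can be expressed as a short regular-expression search. The sketch below is not MISA itself. It only finds perfect repeats, skips motifs that are themselves repeats of a shorter unit, and does not handle compound SSRs, and the example sequence is hypothetical.

```python
import re

# Minimum number of repeat units per motif length, as given in the text.
MIN_REPEATS = {1: 10, 2: 7, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (1-based start, motif, repeat count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs built from a shorter unit (e.g. "AA" or "ATAT").
            if any(motif == motif[:d] * (unit_len // d)
                   for d in range(1, unit_len) if unit_len % d == 0):
                continue
            hits.append((match.start() + 1, motif, len(match.group(0)) // unit_len))
    return sorted(hits)

# Hypothetical transcript fragment with an (AAG)6 and an (AT)7 repeat.
sequence = "GGC" + "AAG" * 6 + "TTCA" + "AT" * 7 + "CG"
print(find_ssrs(sequence))
```

Dividing the total length of the searched sequences by the number of SSRs found gives a frequency in the same "one SSR per x kb" form used in the Discussion.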

Mapping and quantification of sequence reads

As described [30], the Illumina sequence reads were mapped onto coding sequences of the Glycine max genome (version Gmax_275_Wm82.a2.v1, downloaded from the Phytozome website) by blastx [83] with a threshold of 1e-6. To reduce multiple-mapping problems, coding sequences from primary transcripts without alternative splice sites in the Glycine max genome were used [32]. The blastx output was parsed with in-house Perl scripts to count the number of reads mapped to each Glycine max protein and then to calculate the RPKM value for every Glycine max protein in each library.
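The RPKM calculation follows the standard definition, reads per kilobase of transcript per million mapped reads. The sketch below uses hypothetical gene identifiers, counts and lengths. Which length (coding sequence or protein) and which read total the in-house scripts normalized by is not stated, so those choices are assumptions here.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)

# Hypothetical read counts and coding-sequence lengths for two soybean genes.
counts = {"geneA": 500, "geneB": 50}
lengths_bp = {"geneA": 1000, "geneB": 2000}
total_mapped = 1_000_000  # hypothetical number of mapped reads in the library

values = {gene: rpkm(counts[gene], lengths_bp[gene], total_mapped) for gene in counts}
print(values)
```

Because the value is scaled by both feature length and library size, it can be compared between the P. phaseoloides and P. m. lobata libraries even though they differ in sequencing depth.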

Abbreviations

BLAST: Basic local alignment search tool; BOLD: Barcode of life database; CHO: Carbohydrate; COI: Cytochrome oxidase I; CTAB: Cetyltrimethylammonium bromide; DNA: Deoxyribonucleic acid; EC: Enzyme nomenclature; EST: Expressed sequence tag; GO: Gene ontology; GPS: Global positioning system; GRIN: Germplasm resource information network; HPLC: High performance liquid chromatography; IBA: Indole 3-butyric acid; ITS: Internal transcribed spacer; ITS2: Internal transcribed spacer 2; KEGG: Kyoto encyclopedia of genes and genomes; matK: Maturase K; mAU: Milli-absorbance units; MISA: Microsatellite identification tool; NCBI: National center for biotechnology information; NGS: Next generation sequencing; NR: Non-redundant protein; OPP: Oxidative pentose phosphate; PCR: Polymerase chain reaction; PERL: Practical extraction and reporting language; PS: Photosynthesis; RNA: Ribonucleic acid; RPKM: Reads per kilobase of transcript per million; SNP: Single nucleotide polymorphism; SRA: Sequence read archive; SSR: Simple sequence repeat; USDA: United States Department of Agriculture.

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1186/s12870-021-03383-x.

Additional file 1: Supplemental Figure 1. Images of vines and whole plant morphology. Supplemental Figure 2. Leaves from USDA PI 9227 P. m. lobata plants. Supplemental Figure 3. Percent composition of six common isoflavones in each of the seven accessions. Supplemental Figure 4. Quality measurements for Velvet/Oases assemblies. Supplemental Figure 5. Length distribution of the assembled transcripts in P. phaseoloides and P. m. lobata. Supplemental Figure 6. Pathway representation analysis of the soybean transcripts mapped by Pueraria reads. Supplemental Table 1. ITS2 nucleotide changes between P. m. lobata and P. phaseoloides. Supplemental Table 2. ITS2 insertions/deletions between P. m. lobata and P. phaseoloides. Supplemental Table 3. ITS2 nucleotide changes between P. m. lobata and P. m. montana. Supplemental Table 4. ITS2 insertions/deletions between P. m. lobata and P. m. montana. Supplemental Table 5. ITS2 nucleotide changes between P. phaseoloides and P. m. montana. Supplemental Table 6. ITS2 insertions/deletions between P. phaseoloides and P. m. montana. Supplemental Table 7. Assembly statistics (Velvet/Oases) for P. phaseoloides and P. m. lobata. Supplemental Table 8. Statistics of Pueraria reads mapped to soybean by BLAST.

Additional file 2: Supplemental Dataset 1. Putative SSRs from transcripts of P. phaseoloides and P. m. lobata.

Acknowledgements

We thank the Desert Legume Program and the Plant Genetic Resources Conservation Unit, Griffin, GA in connection with GRIN-Global for supplying seeds. We thank Awinash Bhatkar of the Texas Department of Agriculture for supplying the kudzu transportation permit. We thank Sebastien Santini (CNRS/AMU IGS UMR7256) and the PACA Bioinfo platform (supported by IBISA) for the availability and management of the phylogeny.fr website used to build neighbor-joining phylogenetic trees for barcode sequence comparison.

Authors’ contributions

LMA performed DNA barcoding analysis. XR performed RNA-seq analysis. RAD conceived experiments, and funded and guided research. All authors read and approved the final manuscript.

Funding

This work was supported by the University of North Texas using start-up funds

awarded to Dr. Richard Dixon. The funding body played no role in the design

of the study and collection, analysis, and interpretation of data and in writing

the manuscript.

Availability of data and materials

The DNA barcoding sequences are available on the BOLD system

with the processIDs KUDZU002–21 to KUDZU046–21. Sequence data

from this article can be found in the NCBI Sequence Read Archive

(SRA) repository, NCBI SRA accession No. SRX768865. The assembled

transcriptomes can be found at NCBI, accession numbers 10672212

and 10671973.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Author details

1 BioDiscovery Institute and Department of Biological Sciences, University of North Texas, 1155 Union Circle #305220, Denton, TX 76203-5017, USA. 2 College of Life Sciences, Hubei University, Wuhan 430068, Hubei Province, China.

Received: 12 July 2021 Accepted: 5 December 2021


Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Endorsement and phylogenetic analysis of some Fabaceae plants based on DNA barcoding

Article

Full-text available

Jun 2022
MOL BIOL REP

Background DNA barcoding have been considered as a tool to facilitate species identification based on its simplicity and high-level accuracy in compression to the complexity and subjective biases linked to morphological identification of taxa. MaturaseK gene ( MatK gene) of the chloroplast is very vital in the plant system which is involved in the group II intron splicing. The main objective of this study is to determine the relative utility of the “ MatK ” chloroplast gene for barcoding in 15 legume as a tool to facilitate species identification based on their simplicity and high-level accuracy linked to morphological identification of taxa. Methods and Results MatK gene sequences were submitted to GenBank and the accession numbers were obtained with sequence length ranging from 730 to 1545 nucleotides. These DNA sequences were aligned with database sequence using PROMALS server , Clustal Omega server and Bioedit program. Maximum likelihood and neighbor-joining algorithms were employed for constructing phylogeny. Overall, these results indicated that the phylogenetic tree analysis and the evolutionary distances of an individual dataset of each species were agreed with a phylogenetic tree of all each other consisting of two clades, the first clade comprising (Enterolobium contortisiliquum, Albizia lebbek), Acacia saligna , Leucaena leucocephala, Dichrostachys Cinerea, (Delonix regia, Parkinsonia aculeata), (Senna surattensis, Cassia fistula, Cassia javanica) and Schotia brachypetala were more closely to each other, respectively. The remaining four species of Erythrina humeana, (Sophora secundiflora, Dalbergia Sissoo, Tipuana Tipu) constituted the second clade. Conclusion Moreover, their sequences could be successfully utilized in single nucleotide polymorphism or as part of the sequence as DNA fragment analysis utilizing polymerase chain reaction in plant systematic. 
Therefore, MatK gene is considered promising a candidate for DNA barcoding in the plant family Fabaceae and provides a clear relationship between the families.

High integrity Pueraria montana var. lobata genome and population analysis revealed the genetic diversity of Pueraria genus

Article

Full-text available

May 2024

Pueraria montana var. lobata (P. lobata) is a traditional medicinal plant belonging to the Pueraria genus of Fabaceae family. Pueraria montana var. thomsonii (P. thomsonii) and Pueraria montana var. montana (P. montana) are its related species. However, evolutionary history of the Pueraria genus is still largely unknown. Here, a high-integrity, chromosome-level genome of P. lobata and an improved genome of P. thomsonii were reported. It found evidence for an ancient whole-genome triplication and a recent whole-genome duplication shared with Fabaceae in three Pueraria species. Population genomics of 121 Pueraria accessions demonstrated that P. lobata populations had substantially higher genetic diversity, and P. thomsonii was probably derived from P. lobata by domestication as a subspecies. Selection sweep analysis identified candidate genes in P. thomsonii populations associated with the synthesis of auxin and gibberellin, which potentially play a role in the expansion and starch accumulation of tubers in P. thomsonii. Overall, the findings provide new insights into the evolutionary and domestication history of the Pueraria genome and offer a valuable genomic resource for genetic improvement of these species.

Identification of Treculia africana L. varieties using Internal Transcribed Spacer Region 1 (ITS 1) and Internal Transcribed Spacer Region 2 (ITS 2) DNA barcodes

Preprint

Full-text available

May 2024

Background Treculia africana L. (African breadfruit), is an underutilized, underexploited, and endangered species of southern Nigeria. It has been identified and classified using anatomical features, but there is insufficient information on its molecular identification and classification. There is a need to complement the morphological identification of the plant with molecular methods. Results To identify 86 accessions of Treculia africana var inversa and Treculia africana var africana, Internal Transcribed Spacer Region ITS-2 and Internal Transcribed Spacer Region lTS- 1 DNA barcodes were used. In this study, we observed that to determine the homology between sequences obtained and the Genbank database, the National Center for Biotechnology Information (NCBI) basic alignment search tool (BLAST) did not reveal any match. An alignment of the accessions with KU855474.1 Artocarpus altilis showed similarities via molecular evolutionary genetic analysis (mega 11). Conclusions The alignment revealed that the Treculia accessions were related and genetically similar to Artocarpus species, members of the Moraceae family, indicating that the accessions belong to the same family. However, the two varieties of Treculia could not be distinguished with ITS Barcodes. The molecular data of Treculia species need to be populated on the gene bank to support future molecular studies and also a combination of DNA barcodes is recommended for identification purposes.

Relevance of genetic and active ingredient content differences in Leonurus japonicus Houtt from different origins

Preprint

Full-text available

Feb 2023

Leonurus japonicus Houtt. (Labiatae), a perennial herb, is used to treat cardiovascular, uterine, and gynecological diseases. In the present study, a phylogenetic tree was constructed based on the ITS + psbA - trnH + rbcL + rpoB concatenation sequence, and partial least squares-discriminant analysis (PLS-DA) was performed based on high-performance liquid chromatography. The phylogenetic tree and PLS-DA were combined to correlate genetic and chemical differences among L. japonicus derived from different origins. The results showed that the concatenation sequence could distinguish among L. japonicus from different origins. Moreover, chemical analysis revealed intergroup differences, but the results were not of sufficiently high quality as that of molecular phylogeny. Furthermore, the results of combined chemical and phylogenetic analyses suggested that differences in metabolites are influenced by not only genetic differences but also environmental factors. These results provide valuable information for the artificial cultivation of L. japonicus and new ideas for improving its quality.

Chromosome-Level Genome Assembly and Multi-Omics Dataset Provide Insights into Isoflavone and Puerarin Biosynthesis in Pueraria lobata (Wild.) Ohwi

Article

Full-text available

Nov 2022

Pueraria lobata (wild.) Ohwi is a leguminous plant and one of the traditional Chinese herbal medicines. Its puerarin extract is widely used in the pharmaceutical industry. This study reported a chromosome-level genome assembly for P. lobata and its characteristics. The genome size was ~939.2 Mb, with a contig N50 of 29.51 Mbp. Approximately 97.82% of the assembled sequences were represented by 11 pseudochromosomes. We identified that the repetitive sequences accounted for 63.50% of the P. lobata genome. A total of 33,171 coding genes were predicted, of which 97.34% could predict the function. Compared with other species, P. lobata had 757 species-specific gene families, including 1874 genes. The genome evolution analysis revealed that P. lobata was most closely related to Glycine max and underwent two whole-genome duplication (WGD) events. One was in a gamma event shared by the core dicotyledons at around 65 million years ago, and another was in the common ancestor shared by legume species at around 25 million years ago. The collinearity analysis showed that 61.45% of the genes (54,579 gene pairs) in G. max and P. lobata had collinearity. In this study, six unique PlUGT43 homologous genes were retrieved from the genome of P. lobata, and no 2-hydroxyisoflavanone 8-C-glucoside was found in the metabolites. This also revealed that the puerarin synthesis was mainly from the glycation of daidzein. The combined transcriptome and metabolome analysis suggested that two bHLHs, six MYBs and four WRKYs were involved in the expression regulation of puerarin synthesis structural genes. The genetic information obtained in this study provided novel insights into the biological evolution of P. lobata and leguminous species, and it laid the foundation for further exploring the regulatory mechanism of puerarin synthesis.

Evaluation of pathways to the C‐glycosyl isoflavone puerarin in roots of kudzu (Pueraria montana lobata)

Article

Full-text available

Sep 2022

Kudzu (Pueraria montana lobata) is used as a traditional medicine in China and Southeast Asia but is a noxious weed in the Southeastern United States. It produces both O‐ and C‐glycosylated isoflavones, with puerarin (C‐glucosyl daidzein) as an important bioactive compound. Currently, the stage of the isoflavone pathway at which the C‐glycosyl unit is added remains unclear, with a recent report of direct C‐glycosylation of daidzein contradicting earlier labeling studies supporting C‐glycosylation at the level of chalcone. We have employed comparative mRNA sequencing of the roots from two Pueraria species, one of which produces puerarin (field collected P. montana lobata) and one of which does not (commercial Pueraria phaseoloides), to identify candidate uridine diphosphate glycosyltransferase (UGT) enzymes involved in puerarin biosynthesis. Expression of recombinant UGTs in Escherichia coli and candidate C‐glycosyltransferases in Medicago truncatula were used to explore substrate specificities, and gene silencing of UGT and key isoflavone biosynthetic genes in kudzu hairy roots employed to test hypotheses concerning the substrate(s) for C‐glycosylation. Our results confirm UGT71T5 as a C‐glycosyltransferase of isoflavone biosynthesis in kudzu. Enzymatic, isotope labeling, and genetic analyses suggest that puerarin arises both from the direct action of UGT71T5 on daidzein and via a second route in which the C‐glycosidic linkage is introduced to the chalcone isoliquiritigenin. Comparative RNA sequencing, isotopic labeling, and genetic gain‐ and loss‐of‐function experiments define the role of UGT71T5 in the biosynthesis of C‐glycosyl isoflavones in kudzu (Pueraria montana lobata).

DNA Sequencing Technologies and DNA Barcoding

Chapter

Apr 2024

DNA barcodes are short, standardized DNA segments that geneticists can use to identify all living taxa. On the other hand, DNA barcoding identifies species by analyzing these specific regions against a DNA barcode reference library. In its initial years, DNA barcodes sequenced by Sanger’s method were extensively used by taxonomists for the characterization and identification of species. But in recent years, DNA barcoding by next-generation sequencing (NGS) has found broader applications, such as quality control, biomonitoring of protected species, and biodiversity assessment. Technological advancements have also paved the way to metabarcoding, which has enabled massive parallel sequ.encing of complex bulk samples using high-throughput sequencing techniques. In future, DNA barcoding along with high-throughput techniques will show stupendous progress in taxonomic classification with reference to available sequence data.

Inhibiting Protein Aggregation Using Cellulose Nanocrystal in MALDI-TOF MS Analysis: Improving the Sensitivity and Repeatability of Intact Protein in Pueraria

Article

Dec 2023
J AGR FOOD CHEM

Relevance of genetic and active ingredient content differences in Leonurus japonicus Houtt from different origins

Article

Full-text available

Jul 2023
GENET RESOUR CROP EV

Leonurus japonicus Houtt. (Lamiaceae) is a perennial herb, which is commonly used in the treatment of cardiovascular, uterine, and gynecological diseases. In the present study, we constructed a phylogenetic tree based on the ITS + psbA-trnH + rbcL + rpoB concatenation sequence and performed partial least squares-discriminant analysis was used high-performance liquid chromatography. The results indicated that the concatenation sequence could distinguish among L. japonicus from different origins. Additionally, chemical analysis revealed intergroup differences, albeit of lower quality than the molecular phylogeny. By combining both methods, we were able to correlate the genetic and chemical differences among L. japonicus derived from different origins. Furthermore, our combined chemical and phylogenetic analyses suggested that differences in metabolites are not solely influenced by genetic differences but also environmental factors. These findings can be valuable for the artificial cultivation of L. japonicus and provide new insights for improving its quality.

Chromosome-level and graphic genomes provide insights into metabolism of bioactive metabolites and cold-adaption of Pueraria lobata var. montana

Article

Full-text available

Aug 2022

Pueraria lobata var. montana (P. montana) belongs to the genus Pueraria and originated in Asia. Compared with its sister P. thomsonii, P. montana has stronger growth vigor and cold-adaption, but contains less bioactive metabolites such as puerarin. To promote the investigation of metabolic regulation and genetic improvement of Pueraria, the present study reports a chromosome-level genome of P. montana with length of 978.59 Mb and scaffold N50 of 80.18 Mb. Comparative genomics analysis showed that P. montana possesses smaller genome size than that of P. thomsonii owing to less repeat sequences and duplicated genes. A total of 6,548 and 4,675 variety-specific gene families were identified in P. montana and P. thomsonii, respectively. The identified variety-specific and expanded/contracted gene families related to biosynthesis of bioactive metabolites and microtubules are likely the causes for the different characteristics of metabolism and cold-adaption of P. montana and P. thomsonii. Moreover, a graphic genome was constructed based on 11 P. montana accessions. Total 92 structural variants were identified and most of which are related to stimulus-response. In conclusion, the chromosome-level and graphic genomes of P. montana will not only facilitate the studies of evolution and metabolic regulation, but also promote the breeding of Pueraria.

Kudzu isoflavone C‐glycosides: Analysis, biological activities, and metabolism

Article

Aug 2021

Radix Pueraria (the root of kudzu, Pueraria lobata) is a popular traditional Chinese medicine used in dietary supplements in Western markets and has potential health benefits. Kudzu roots are rich in isoflavone C- and O-glycosides, of which puerarin (daidzein 8-C-glucoside) is the most abundant isoflavone. Puerarin is a unique isoflavone in that it is resistant to intestinal hydrolysis and has a wide range of effects in preventing metabolic diseases. Our previous studies indicate that chronic exposure to a diet enriched in puerarin significantly reduces serum total cholesterol, arterial blood pressure, insulin resistance and hyperglycemia in ovariectomized, stroke-prone spontaneously hypertensive rats (SP-SHR), a model of metabolic syndrome. Further, our studies demonstrate that puerarin is absorbed as the intact glucoside and acutely improves glucose tolerance, indicating that it has potential for the prevention and treatment of diabetes. This paper reviews recent progress in the understanding of biological activities and metabolism and in the analysis of puerarin in kudzu root extracts or supplements.

Development of genic SSR marker resources from RNA-seq data in Camellia japonica and their application in the genus Camellia

Article

May 2021

Camellia is a genus of flowering plants in the family Theaceae, and several species in this genus have economic importance. Although a great number of molecular markers have been developed for molecular-assisted breeding in the genus Camellia in the past decade, the number of simple sequence repeats (SSRs) publicly available for plants in this genus is insufficient. In this study, a total of 28,854 potential SSRs were identified with a frequency of 4.63 kb. A total of 172 primer pairs were synthesized and preliminarily screened in 10 C. japonica accessions, and of these primer pairs, 111 were found to be polymorphic. Fifty-one polymorphic SSR markers were randomly selected to perform further analysis of the genetic relationships of 89 accessions across the genus Camellia. Cluster analysis revealed major clusters corresponding to those based on taxonomic classification and geographic origin. Furthermore, all the genotypes of C. japonica separated and consistently grouped well in the genetic structure analysis. The results of the present study provide high-quality SSR resources for molecular genetic breeding studies in camellia plants.
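
As a rough illustration of how SSRs can be mined from assembled transcript sequences, the sketch below scans a sequence for perfect di- to hexanucleotide repeats with a regular expression. The abstract does not state which tool or thresholds were used, so the `MIN_REPEATS` table and the `find_ssrs` function are hypothetical choices made only for this example.

```python
import re

# Hypothetical minimum repeat counts per motif length (illustrative only;
# not the thresholds used in any of the studies listed here).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # Group 2 captures the motif; \2 requires at least min_rep - 1 more copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (mononucleotide run) or "ATAT" (dinucleotide repeat).
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // motif_len))
    return sorted(hits)


example = "GGG" + "AT" * 7 + "CCC" + "GAA" * 5 + "TT"
ssrs = find_ssrs(example)
```

Running `find_ssrs` over every unigene and dividing the total assembly length by the number of hits would give a density figure of the "one SSR per N kb" kind, under whatever thresholds are chosen.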

MEGA11: Molecular Evolutionary Genetics Analysis Version 11

Article

Apr 2021
MOL BIOL EVOL

The Molecular Evolutionary Genetics Analysis (MEGA) software has matured to contain a large collection of methods and tools of computational molecular evolution. Here, we describe new additions that make MEGA a more comprehensive tool for building timetrees of species, pathogens, and gene families using rapid relaxed-clock methods. Methods for estimating divergence times and confidence intervals are implemented to use probability densities for calibration constraints for node-dating and sequence sampling dates for tip-dating analyses, which will be supported by new options for tagging sequences with spatiotemporal sampling information, an expanded interactive Node Calibrations Editor, and an extended Tree Explorer to display timetrees. We have now added a Bayesian method for estimating neutral evolutionary probabilities of alleles in a species using multispecies sequence alignments and a machine learning method to test for the autocorrelation of evolutionary rates in phylogenies. The computer memory requirements for the maximum likelihood analysis are reduced significantly through reprogramming, and the graphical user interface (GUI) has been made more responsive and interactive for very big datasets. These enhancements will improve the user experience, quality of results, and the pace of biological discovery. Natively compiled GUI and command-line versions of MEGA11 are available for Microsoft Windows, Linux, and macOS from www.megasoftware.net.

Comparative transcriptome analysis of roots, stems, and leaves of Pueraria lobata (Willd.) Ohwi: identification of genes involved in isoflavonoid biosynthesis

Article

Feb 2021

Background Pueraria lobata (Willd.) Ohwi is a valuable herb used in traditional Chinese medicine. Isoflavonoids are the major bioactive compounds in P. lobata, namely puerarin, daidzin, glycitin, genistin, daidzein, and glycitein, which have pharmacological properties including anti-cardiovascular disease, anti-hypertensive, anti-inflammatory, and anti-arrhythmic activities. Methods To characterize the corresponding genes of the compounds in the isoflavonoid pathway, RNA sequencing (RNA-Seq) analyses of roots, stems, and leaves of P. lobata were carried out on the BGISEQ-500 sequencing platform. Results We identified 140,905 unigenes in total, of which 109,687 were annotated in public databases, after assembling the transcripts from all three tissues. Multiple genes encoding key enzymes, such as IF7GT and transcription factors, associated with isoflavonoid biosynthesis were identified and then further analyzed. Quantitative real-time PCR (qRT-PCR) results of some genes encoding key enzymes were consistent with our RNA-Seq analysis. Differentially expressed genes (DEGs) were determined by analyzing the expression profiles of roots compared with other tissues (leaves and stems). This analysis revealed numerous DEGs that were either uniquely expressed or up-regulated in the roots. Finally, quantitative analyses of isoflavonoid metabolites occurring in the three P. lobata tissue types were done via high-performance liquid chromatography and tandem mass spectrometry (HPLC-MS/MS). Our comprehensive transcriptome investigation substantially expands the genomic resources of P. lobata and provides valuable knowledge on both gene expression regulation and promising candidate genes that are involved in plant isoflavonoid pathways.

Quality Assessment of Licorice Based on Quantitative Analysis of Multicomponents by Single Marker Combined with HPLC Fingerprint

Article

Jan 2021
EVID-BASED COMPL ALT

Licorice is a commonly used traditional Chinese medicine and natural sweetening agent, rich in numerous bioactive compounds. Moreover, it is one of the oldest and most frequently employed folk medicines in both eastern and western countries. It is prescribed for the treatment of asthma, fever, and cough. However, with the increasing demand for licorice, its quality and safety have become important issues. The compound content of licorice varies significantly among materials from different geographical origins. In this study, a reasonable and feasible evaluation method for the quality assessment of licorice was developed based on the analysis of high-performance liquid chromatography (HPLC) fingerprint, combined with the quantitative analysis of multicomponents by single marker (QAMS) method. Glycyrrhizic acid was selected as the internal reference substance, and ten components were simultaneously determined based on relative correction factors. The contents of eleven components in 21 batches of licorice were determined by the QAMS and the ESM (external standard method); there was no significant difference between the quantitative results of the QAMS and the ESM method, and the cosine value (Cir > 0.9999) confirmed the consistency of the two methods. The contents of the eleven components in the 21 batches of licorice samples were used for further chemometric analysis. All of the samples of licorice from various geographical origins were divided into five categories based on hierarchical cluster analysis, which indicated the crucial influence of geographical origin on licorice. This study showed that QAMS combined with HPLC fingerprint and chemometrics methods could effectively control the quality of licorice. Hence, QAMS is a feasible and promising method for promoting the quality control standardization process of herbal medicines.

The Gene Ontology resource: enriching a GOld mine

Article

Jan 2021

The Gene Ontology Consortium (GOC) provides the most comprehensive resource currently available for computable knowledge regarding the functions of genes and gene products. Here, we report the advances of the consortium over the past two years. The new GO-CAM annotation framework was notably improved, and we formalized the model with a computational schema to check and validate the rapidly increasing repository of 2838 GO-CAMs. In addition, we describe the impacts of several collaborations to refine GO and report a 10% increase in the number of GO annotations, a 25% increase in annotated gene products, and over 9,400 new scientific articles annotated. As the project matures, we continue our efforts to review older annotations in light of newer findings, and, to maintain consistency with other ontologies. As a result, 20 000 annotations derived from experimental data were reviewed, corresponding to 2.5% of experimental GO annotations. The website (http://geneontology.org) was redesigned for quick access to documentation, downloads and tools. To maintain an accurate resource and support traceability and reproducibility, we have made available a historical archive covering the past 15 years of GO data with a consistent format and file structure for both the ontology and annotations.

The development of SSR markers based on RNA-sequencing and its validation between and within Carex L. species

Article

Jan 2021
BMC PLANT BIOL

Background Carex L. is one of the largest genera in the Cyperaceae family and an important vascular plant in the ecosystem. However, the genetic background of Carex is complex and the classification is not clear. In order to investigate the gene function annotation of Carex, RNA-sequencing analysis was performed. Simple sequence repeats (SSRs) were generated based on the Illumina data and then were utilized to investigate the genetic characteristics of the 79 Carex germplasms. Results In this study, 36,403 unigenes with a total length of 41,724,615 bp were obtained and annotated based on GO, KOG, KEGG, NR databases. The results provide a theoretical basis for gene function exploration. Out of 8776 SSRs, 96 pairs of primers were randomly selected. One hundred eighty polymorphic bands were amplified with a polymorphism rate of 100% based on 42 pairs of primers with higher polymorphism levels. The average band number was 4.3 per primer, the average distance value was 0.548, and the polymorphic information content ranged from 0.133 to 0.494. The number of observed alleles (Na), effective alleles (Ne), Nei’s (1973) gene diversity (H), and the Shannon information index (I) were 2.000, 1.376, 0.243, and 0.391, respectively. NJ clustering divided the accessions into three groups, and the accessions from New Zealand showed similar genetic attributes and clustered into one group. UPGMA and PCoA analysis also revealed the same result. The analysis of molecular variance (AMOVA) revealed greater genetic diversity within accessions than between accessions based on geographic origin cluster and NJ cluster. Moreover, the fingerprints of 79 Carex species were established in this study. Different combinations of primer pairs can be used to identify multiple Carex at one time, which overcomes the difficulties of traditional identification methods.
Conclusions The transcriptomic analysis shed new light on the function categories from the annotated genes and will facilitate future gene functional studies. The genetic characteristics analysis indicated that gene flow was extensive among 79 Carex species. These markers can be used to investigate the evolutionary history of Carex and related species, as well as to serve as a guide in future breeding projects.

Transcriptome Sequencing and Development of Novel Genic SSR Markers From Pistacia vera L.

Article

Sep 2020

In this study, we aimed to develop novel genic simple sequence repeat (eSSR) markers and to study phylogenetic relationships among Pistacia species. Transcriptome sequencing was performed in different tissues of Siirt and Atl cultivars of pistachio (Pistacia vera). A total of 37.5 Gb of data were used in the assembly. The number of total contigs and unigenes was calculated as 98,831, and the N50 length was 1,333 bp after assembly. A total of 14,308 dinucleotide, trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide SSR motifs (4–17) were detected, and the most abundant SSR repeat types were trinucleotide (29.54%), dinucleotide (24.06%), hexanucleotide (20.67%), pentanucleotide (18.88%), and tetranucleotide (6.85%), respectively. Overall, 250 primer pairs were designed randomly and tested in eight Pistacia species for amplification. Of them, 233 generated polymerase chain reaction products in at least one of the Pistacia species. A total of 55 primer pairs that had amplifications in all tested Pistacia species were used to characterize 11 P. vera cultivars and 78 wild Pistacia genotypes belonging to nine Pistacia species (P. khinjuk, P. eurycarpa, P. atlantica, P. mutica, P. integerrima, P. chinensis, P. terebinthus, P. palaestina, and P. lentiscus). A total of 434 alleles were generated from 55 polymorphic eSSR loci with an average of 7.89 alleles per locus. The mean number of effective alleles was 3.40 per locus. Polymorphism information content was 0.61, whereas observed (Ho) and expected heterozygosity (He) values were 0.39 and 0.65, respectively. UPGMA (unweighted pair-group method with arithmetic averages) and STRUCTURE analysis divided 89 Pistacia genotypes into seven populations. The closest species to P. vera was P. khinjuk. P. eurycarpa was closer to P. atlantica than to P. khinjuk. P. atlantica–P. mutica and P. terebinthus–P. palaestina pairs of species were not clearly separated from each other, and they were suggested as the same species.
The present study demonstrated that eSSR markers can be used in the characterization and phylogenetic analysis of Pistacia species and cultivars, as well as genetic linkage mapping and QTL (quantitative trait locus) analysis.
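
The per-locus diversity statistics quoted in SSR studies of this kind (expected heterozygosity, effective number of alleles, polymorphism information content) are commonly computed from allele frequencies. The following is a minimal sketch using the standard textbook formulas; the function name `diversity_stats` and the example allele counts are hypothetical and are not taken from any of the studies listed here.

```python
from itertools import combinations


def diversity_stats(allele_counts):
    """Compute He, Ne and PIC for one locus from allele counts or frequencies."""
    total = sum(allele_counts)
    p = [c / total for c in allele_counts]      # normalize to frequencies
    homozygosity = sum(x * x for x in p)        # sum of p_i^2
    he = 1.0 - homozygosity                     # expected heterozygosity
    ne = 1.0 / homozygosity                     # effective number of alleles
    # PIC = 1 - sum p_i^2 - sum_{i<j} 2 p_i^2 p_j^2
    pic = he - sum(2.0 * a * a * b * b for a, b in combinations(p, 2))
    return {"He": he, "Ne": ne, "PIC": pic}


# Hypothetical locus with two equally frequent alleles.
stats = diversity_stats([10, 10])
```

Averaging these values over all genotyped loci gives summary figures comparable in kind to those reported in such abstracts.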

Characterizing specimens of kudzu and related taxa with RAPD’s

Article

Jan 2003
CASTANEA
