FEATURED PAPER
ITS and ETS Sequence Data and Phylogeny Reconstruction in Allopolyploids
and Hybrids
Douglas E. Soltis,¹,⁵ Evgeny V. Mavrodiev,¹ Jeff J. Doyle,² Jason Rauscher,²,⁴ and Pamela S. Soltis³
¹Department of Botany, University of Florida, Gainesville, Florida 32611 U.S.A.
²L. H. Bailey Hortorium, Department of Plant Biology, Cornell University, Ithaca, New York 14853 U.S.A.
³Florida Museum of Natural History, University of Florida, Gainesville, Florida 32611 U.S.A.
⁴Department of Biology, University of Puerto Rico, Rio Piedras, San Juan, P.R. 00931-3360
⁵Author for correspondence (dsoltis@botany.ufl.edu)
Communicating Editor: Mark P. Simmons
Abstract—The impact of unknowingly including a hybrid or an allopolyploid in which rDNA homogenization (via gene loss, concerted
evolution, or some other mechanism) has not occurred to completion in a phylogenetic analysis of internal transcribed spacer (ITS) or external
transcribed spacer (ETS) sequences is unclear. To investigate the impact of polymorphic sites on phylogeny reconstruction, we used ITS and
ETS sequence data for diploids and allotetraploids in Tragopogon, as well as ITS data for diploid and allopolyploid species of Paeonia and
Glycine, and for diploids and their hybrids in Rubus. Only very general predictions can be made regarding the placement of these polymorphic
sequences. The polymorphic sequences of hybrids and allopolyploids appear (1) with a diploid parent (but not necessarily the one with which
it shares more apomorphies), (2) at or near the base of the clade containing one or both parents, or (3) in a basal position relative to all other
ingroup taxa in the data set. The inclusion of a polymorphic sequence may be accompanied by an increase in the number of shortest trees,
less resolution in the strict consensus, and a decrease in bootstrap support for some nodes; CI and RI values are little, if at all, affected. In
no case did the addition of a sequence from a hybrid or allopolyploid alter the overall topology in a major way. Our results generally parallel
those of phylogenetic studies that include F1 hybrids and their parents and use morphology.
Keywords—allopolyploids, ETS, hybrids, ITS, phylogeny reconstruction, rDNA homogenization.
ITS sequence data have been the tool of choice for exam-
ining lower-level relationships (typically within genera or
among a group of closely related genera) in angiosperms for
over a decade (reviewed in Baldwin et al. 1995; Soltis and
Soltis 1998; Hershkovitz et al. 1999; Álvarez and Wendel
2003). In fact, ITS sequences are among the most numerous of
all angiosperm sequences submitted to GenBank, outnum-
bering even the plastid gene rbcL (Hershkovitz et al. 1999;
Álvarez and Wendel 2003). Álvarez and Wendel (2003) esti-
mated that between 1997 and 2002, 66% of all plant phylo-
genetic publications involving comparisons at the generic
level and below included ITS sequence data and that 34% of
all published phylogenetic hypotheses involved exclusively
ITS sequences.
More recently, the external transcribed spacer (ETS) region
of the nuclear ribosomal DNA (rDNA) cistron has become
widely used to infer phylogeny at the same taxonomic levels
as ITS sequences (Clevinger and Panero 2000; Markos and
Baldwin 2001; Lee et al. 2002, 2003; Saar et al. 2003; Urbatsch
et al. 2003; Mavrodiev et al. 2005). The ETS region has a rate
of molecular evolution similar to that of ITS, and as part of
the rDNA cistron, ETS is subject to the same molecular ge-
netic processes as ITS.
The rDNA cistron is widely used in phylogeny reconstruc-
tion because: 1) its high copy number makes it technically
simple to obtain sequences; and 2) molecular evolutionary
processes often “homogenize” repeats so that the rDNA be-
haves like a single-copy gene. We use the term “homogeni-
zation” to refer to several possible processes that yield a
single rDNA type among the thousands of tandemly re-
peated copies at a locus (or even among loci). Of particular
interest here is the presence of only one of two possible pa-
rental rDNA contributions in an allopolyploid. Mechanisms
of homogenization (Álvarez and Wendel 2003) include loss
of repeats at a homoeologous locus (e.g. some Glycine L. al-
lopolyploids, Joly et al. 2004) and concerted evolution (Zim-
mer et al. 1980) of repeats from one diploid parent to those
from the other parent, as reported in species of Gossypium L.
(Wendel et al. 1995a), Nicotiana L. (Kovarik et al. 2004; Lim et
al. 2000; Volkov et al. 1999), Cardamine L. (Franzke and Mum-
menhoff 1999), Triticum L. (Flavell and O’Dell 1976), Glycine
(Rauscher et al. 2004), and Senecio L. (Abbott and Lowe 2004).
Following the union of divergent rDNA repeats from dip-
loid progenitors in an allopolyploid, homogenization of
rDNA loci can be incomplete. Hence, both parental arrays
may be present in an allopolyploid, as in some allopolyploids
of Glycine (Doyle and Beachy 1985; Rauscher et al. 2004),
Triticum (Appels and Dvorak 1982), Paeonia L. (Sang et al.
1995; Zhang and Sang 1999), Krigia Schreb. (Kim and Jansen
1994), Brassica napus L. (Bennett and Smith 1991), and Arabi-
dopsis suecica Norrl. ex O. E. Schulz (O’Kane et al. 1996). The
recently formed allopolyploids Tragopogon mirus Ownbey
and T. miscellus Ownbey exemplify incomplete homogeniza-
tion, with concerted evolution “caught in the act”; both pa-
rental diploid ITS units were detected in these plants, but one
parental type was in low frequency and had been largely
converted to the other diploid type (Kovarik et al. 2005). In
other allopolyploids in Tragopogon L., the mechanism of ho-
mogenization has not yet been investigated, but homogeni-
zation has apparently occurred nearly to completion; the con-
tribution of one parent is not evident in DNA sequence chro-
matograms, but is detected as a second, distinctive, albeit
rare sequence via the sequencing of numerous clones
(Mavrodiev et al. in press). Similarly, in only one of the al-
lopolyploid species of Glycine subgenus Glycine do direct se-
quences of ITS amplicons reveal the contribution of both ho-
moeologous rDNA loci in all individuals. Other species are
polymorphic for the homoeologous loci, and in all individu-
als where only one repeat class is observed in direct sequenc-
ing, the “missing” locus is still present and can be amplified
with homoeologue-specific primers (Rauscher et al. 2002,
2004). In one species, the repeat of one parental diploid predominates
in all but one individual studied; in that individual, the other
parental repeat is prevalent. Segregational analysis of hybrids between
these two types of individuals shows that the imbalance is due to
concerted evolution in some cases, and to repeat loss in others (Joly et al. 2004).
Systematic Botany (2008), 33(1): pp. 7–20
© Copyright 2008 by the American Society of Plant Taxonomists
Because of incomplete homogenization and other molecular
genetic processes, such as sequence variation within individuals
(perhaps due to ancient or recent duplication),
presence of pseudogenes, and recombination (e.g. Buckler et
al. 1997; Franzke and Mummenhoff 1999; Hughes and Pe-
tersen 2001; Kovarik et al. 2005; Kuzoff et al. 1999; Leitch and
Bennett 1997; Wendel et al. 1995b), caution must accompany
the use of rDNA spacer sequences in phylogeny reconstruc-
tion (e.g. Soltis and Soltis 1998; Doyle and Gaut 2000; Álvarez
and Wendel 2003; Bailey et al. 2003). When homogeniza-
tion does not occur (or does not occur to completion), and
paralogous copies tracking different histories exist within a
single genome, the classic paralogy/orthology (Fitch 1970)
problem exists for phylogeny reconstruction. Such polymor-
phism can be very useful in tracing hybrid origins in diploid
or allopolyploid groups, when repeats from both parents are
retained. Unfortunately, phylogenetic analyses routinely em-
ploy ITS, and now ETS, sequences without a firm under-
standing of the chromosome numbers and ploidy of the spe-
cies and/or populations under investigation. Taxa or popu-
lations may be included in a phylogenetic study without the
investigator being aware that the entity in question is a dip-
loid, or unsuspected hybrid, or allopolyploid. This lack of a
firm cytogenetic background can be particularly acute in
poorly studied groups, such as genera occurring in the trop-
ics.
Despite repeated cautions about the use of ITS and ETS (i.e.
rDNA spacers) sequences in phylogeny reconstruction, in-
cluding problems inherent in the study of hybrids and allo-
polyploids (reviewed in Álvarez and Wendel 2003; Bailey et
al. 2003; see also Kovarik et al. 2005), these spacers remain the
regions of choice for phylogenetic systematics at lower taxo-
nomic levels. Although the limitations and problems with
these rDNA regions have been well reviewed, the impact of
unknowingly including a hybrid or an allopolyploid that
combines the contributions of its parents (i.e. is polymorphic
at nucleotide sites) in a phylogenetic analysis is unclear.
Note, as well, that not all polymorphism in these rDNA re-
gions is attributable to hybridization or allopolyploidy; poly-
morphisms may exist among the hundreds of tandemly re-
peated copies at a locus, or there may be multiple loci.
Using morphology, McDade (1990, 1992) conducted phy-
logenetic analyses of hybrids and known diploid parents in
Aphelandra R. Br. (Acanthaceae) and examined the placement
of the hybrids and the impact of the hybrids on the overall
topology. Analyses of the topological effects of recombination
and incomplete concerted evolution have been conducted
using molecular data (e.g. Posada et al. 2002; Sanderson
and Doyle 1992; Doyle 1995). Doyle (1995) used an
example from Drosophila adh to show the effect of recombi-
nation on allele trees (elaborating on a comment made by
Schaeffer and Miller (1992) that high recombination rates re-
sulted in allele trees with no resolution). However, there are
few investigations of the phylogenetic fate of hybrids and
allopolyploids with polymorphic rDNA spacer regions. Ni-
eto Feliner et al. (2001) examined the impact of artificial hy-
brids (F1, F2, and backcrosses) in phylogenetic inference of
ITS sequences in Armeria (Plumbaginaceae), and, depending
on the analysis, found hybrids to be placed either near the
base of the tree or in or near clades containing the parents. In
an ITS analysis of Rubus, a hybrid taxon with polymorphic
sites appeared as sister to one of its parents (Alice and Camp-
bell 1999). In Amelanchier Medik., a suspected hybrid ap-
peared as sister to the clade containing the parent with which
it shares more apomorphies (Campbell et al. 1997).
We ask: 1) Where will a direct (polymorphic) ITS or ETS
sequence of a hybrid or allopolyploid appear in a phyloge-
netic analysis of the group under investigation? 2) Does the
inclusion of a polymorphic sequence from an allopolyploid
or hybrid alter the overall topology? 3) Are there effects
caused by these sequences that could be used to identify
unknown allopolyploids or hybrids? If homogenization has
occurred to completion in the direction of one or the other
diploid parent, obviously the allopolyploid will appear with
that diploid parent in a phylogenetic tree. However, in those
instances in which homogenization has not occurred, or has
not occurred to completion, researchers may incorrectly as-
sume that homoeologous ITS/ETS sequences will be de-
tected in the sequencing process, particularly if the two pa-
rental sequences are of different lengths, or if there is a large
amount of polymorphism. In the former case, messy se-
quences may result, alerting the investigator to the possibility
of polymorphism, and possibly hybridization. However, if
length differences are not apparent, then detecting a hybrid
or allopolyploid becomes more problematic. In some cases,
both parental contributions are evident in the chromatogram
of the allopolyploid, but the contribution of one parent is
very weak compared to that of the second parent. Using
default settings, base-calling software accompanying automated
sequencers does not score sites as polymorphic unless
the two peaks are close to equal in height. Without careful
study of the actual chromatogram, the polymorphic sites that
result from two parental contributions might be overlooked
(Fig. 1). Hence, polymorphisms can be missed in situations in
which the contribution of one parent is diminished via con-
certed evolution, loss, unequal amplification, or some other
process.
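The base-calling behavior described above can be illustrated with a minimal sketch. It is not the algorithm of any particular sequencer software; the function name `call_base`, the `min_ratio` parameter, and the peak heights are all hypothetical. It shows only how a strict peak-ratio threshold hides a weak second parental contribution.

```python
# Hypothetical two-peak base caller; names and threshold values are illustrative only.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("AC"): "M",
    frozenset("GT"): "K", frozenset("CG"): "S", frozenset("AT"): "W",
}

def call_base(peaks, min_ratio):
    """Return a base call from a dict of peak heights, e.g. {"A": 900, "G": 150}.

    The site is scored as polymorphic (a two-base IUPAC code) only when the
    second-highest peak is at least `min_ratio` times the highest peak.
    """
    ranked = sorted(peaks.items(), key=lambda kv: kv[1], reverse=True)
    (top_base, top_height), (second_base, second_height) = ranked[0], ranked[1]
    if top_height > 0 and second_height / top_height >= min_ratio:
        return IUPAC[frozenset((top_base, second_base))]
    return top_base

# One site where the second parental type contributes a much smaller G peak.
site = {"A": 1000, "C": 10, "G": 180, "T": 5}
strict_call = call_base(site, min_ratio=0.8)   # peaks must be nearly equal
relaxed_call = call_base(site, min_ratio=0.1)  # a weak second peak is enough
print(strict_call, relaxed_call)
```

With the strict threshold the site is called as a plain A, so the minor parental contribution is lost; only the relaxed threshold scores it as R.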
When the contributions of both diploid parents are de-
tected by the base-calling program, those sites that differ
between the diploid parents will be scored as polymorphic in
the sequence chromatogram. But an investigator may not
necessarily suspect hybridization or allopolyploidy if the
number of polymorphic sites is low (e.g. less than 5 or 6 over
300 bp) and the biology of the study group is poorly known.
At some point, the number of such polymorphic sites should
alert the investigator to the need to obtain sequences for each variant,
for example by cloning (e.g. Johnson and Johnson 2006).
With respect to whether or not the allopolyploid or hybrid
sequence alters the overall topology, McDade (1990) con-
cluded that the inclusion of hybrids does not lead to unre-
solved trees with rampant homoplasy (pg. 1685), as some
had predicted. McDade (1990) made two predictions con-
cerning the phylogenetic placement of hybrids. First, be-
cause the hybrids expressed both maternal and paternal, and
ancestral and derived conditions with equal probability, a
given hybrid will, on average, share more derived features
with its parent that has the most derived character states.
Thus, when parental species differ in the number of derived
features, the hybrid will be placed nearest to its most derived
parent (pg. 1694–1695). Second, possession of intermediate
states should result in the placement of hybrids in interme-
diate cladistic positions, i.e., between the clades to which
their parental species belong (pg. 1695). We can test the first
prediction of McDade (1990) using rDNA sequence data.
However, the second hypothesis (intermediacy) does not ap-
ply to molecular data. Whereas morphological characters are
often the product of several loci (QTL), and a hybrid may be
more similar to one diploid parent than the other, when se-
quences are combined an allopolyploid exhibits both paren-
tal contributions. There is no intermediate state between an
A and a G, for example. With a small number of poly-
morphic sites, one might predict that a diploid hybrid or
allopolyploid may group with either parent in different,
equally optimal trees. This was the case in Collomia (Polem-
oniaceae) where the clade containing the parental species
collapsed in the strict consensus when direct allopolyploid
sequences with polymorphic sites were used because the se-
quences cluster with either parent, not just one (Johnson and
Johnson 2006). As the number of polymorphic sites increases,
the hybrid or polyploid could appear in a completely novel
position well removed from both parents.
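The collapse described for Collomia can be sketched with a toy example. The taxon names (H for the hybrid, A and B for its parents, A2 and B2 for their respective relatives, Out for the outgroup) and both trees are hypothetical, and trees are written as nested tuples for simplicity. A strict consensus keeps only the clades found in every equally short tree.

```python
def clades(tree):
    """Collect every clade (as a frozenset of taxon names) in a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):  # a leaf is a taxon name
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    walk(tree)
    return found

def strict_consensus_clades(trees):
    """Return the clades present in every tree; only these survive a strict consensus."""
    shared = clades(trees[0])
    for tree in trees[1:]:
        shared &= clades(tree)
    return shared

# Two hypothetical, equally short trees: the hybrid H groups with parent A
# in the first tree and with parent B in the second.
tree_1 = (((("H", "A"), "A2"), ("B", "B2")), "Out")
tree_2 = ((("A", "A2"), (("H", "B"), "B2")), "Out")

consensus = strict_consensus_clades([tree_1, tree_2])
for clade in sorted(consensus, key=len):
    print(sorted(clade))
```

In this toy case only the ingroup as a whole (and the full taxon set) is shared by both trees, so all resolution among the parents and their relatives is lost in the strict consensus.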
To investigate these issues, we employed ITS and, in some
cases, ETS sequence data available for diploid and allopoly-
ploids in Tragopogon (Asteraceae; Mavrodiev et al. 2005, in
press), Glycine (Fabaceae; Joly et al. 2004; Rauscher et al.
2004), and Paeonia (Paeoniaceae; Sang et al. 1995, 1997). Al-
though we focus here on allopolyploids, we also analyzed
clonally reproducing hybrids in Rubus L. (Alice and Camp-
bell 1999). We do not address sexually reproducing diploid
hybrids; these exhibit additional confounding signals due to
the complexities involved in segregation.
MATERIALS AND METHODS
Sequence Data. Tragopogon comprises approximately 150 species from
Eurasia (reviewed in Mavrodiev et al. 2005; Soltis et al. 2004). Most mem-
bers of the genus are diploid, but there are several polyploids in Eurasia,
as well as two recently formed allopolyploids in North America (T. mirus
and T. miscellus). We focused our analyses on the allotetraploids (2n = 24)
T. mirus, T. miscellus, and T. latifolius Boiss.; we also included the putative
allopolyploid T. acanthocarpus Boiss. Homogenization of rDNA copy
types is not complete in any of these Tragopogon allopolyploids. The
contributions of both diploid parents are evident in direct sequencing
chromatograms. After cloning, two distinct ITS and ETS types were de-
tected, representing the genomes of the inferred parents.
The parents of T. mirus and T. miscellus are well known; these allotet-
raploids formed recently (in the last 70 yr) and recurrently (Ownbey 1950;
reviewed in Soltis et al. 2004). We examined two populations each of T.
mirus (2601 and 2602) and T. miscellus (2604 and 2605). We also included
samples of the diploid parents (T. dubius Scop., T. pratensis L., T. porrifolius
L.) from the geographic areas of allopolyploid formation. These diploid
sequences also match the parental contributions obtained via cloning and
sequencing of ITS from the polyploids.
Tragopogon latifolius is a Eurasian species that comprises both diploid
and tetraploid populations (Nazarova 1991). Recent analyses of cloned
ITS and ETS sequences revealed that this tetraploid is an allotetraploid,
but diploid T. latifolius is not one of the parents (hence, nomenclatural
changes are needed here; discussed in Mavrodiev et al. in press). The
parents of tetraploid T. latifolius are found in two distinct clades of the
Tragopogon topology: one parent is an as-yet-undetermined diploid (e.g.
T. porphyrocephalus Rech. f., T. fibrosum Freyn et Sint. ex Freyn, T. albinerve
Freyn et Sint, T. armeniacus S. Kuthath., or T. bakhtiaricus Rech. f.). It is
hard to differentiate among these species as possible parents due to the
similarity or identity of their ITS/ETS sequences. The second parent is
either T. graminifolius DC. or T. pusillus M. Bieb., two species that have
nearly identical ITS/ETS sequences (Mavrodiev et al. in press). We in-
cluded here two different populations of tetraploid T. latifolius (201, 67),
as well as cloned sequences of these two allotetraploid populations
(Mavrodiev et al. in press).
The chromosome number of Tragopogon acanthocarpus is unknown; this
rare species clearly exhibits the combination of two divergent ITS/ETS
types and is either a hybrid or, more likely, an allotetraploid, with T.
pusillus or T. undulatus Jacq. as one parent, plus a member of a clade of T.
dasyrhynchus Artemczuk, T. brevirostris DC., T. podolicus Bess. ex DC., T.
reticulatus Boiss. et Huet, T. ruthenicus Bess. ex Claus, T. dubjanskyi Krasch.
et S. Nikit. A hybrid or allopolyploid origin of this taxon is also supported
by morphology (Mavrodiev et al. in press).
We primarily used ITS sequence data to examine the phylogenetic
position of direct sequences from allotetraploid Tragopogon species
among a data set of representative diploid members of the genus. For
several of the examples from Tragopogon, we also used ETS sequence data.
ETS and ITS have similar rates of evolution in Tragopogon and yield
similar topologies. We employed previously published ITS and ETS se-
quence data sets (and alignment) for Tragopogon (Mavrodiev et al. 2005).
These data sets involved 62 diploid species of Tragopogon, as well as
sequences of Lactuca L. and Scorzonera L. as outgroups. We included the
cloned sequences of T. acanthocarpus and T. latifolius in our phylogenetic
analyses; these clones represent the putative diploid parental contribu-
tions to each allopolyploid. We also included the scored, direct sequence
chromatogram of the ITS or ETS sequences obtained from each allopoly-
ploid.
We used a similar approach in our phylogenetic analyses of Paeonia
(Paeoniaceae), a genus of 35 species comprising diploid as well as allo-
polyploid species (reviewed in Sang et al. 1997). We again used previ-
ously published ITS sequence data (kindly supplied by T. Sang). Zhang
and Sang (1999) provided ITS sequence data and a phylogenetic analysis
of diploid members of the genus; Sang et al. (1995) used ITS sequence
data to document the parentage of allopolyploids in the genus. Poly-
ploids in Paeonia exhibit essentially equal contributions of their diploid
parents in sequencing gels (Sang et al. 1995). Sang et al.'s (1997) ITS tree
for diploid species of Paeonia revealed several well-supported clades of
diploid species. Sang et al. (1995) showed that allotetraploid species of the
genus clearly combined the contributions of their diploid parents; there
was no evidence of concerted evolution. To examine the impact on phy-
logeny reconstruction, we scored as polymorphisms those sites in the
allopolyploids that differed between the two divergent ITS copies and
included these sequences in a phylogenetic analysis.
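The scoring step just described, in which sites that differ between the two divergent ITS copies are coded as polymorphic, might be sketched as follows. The function name `additive_sequence`, the example sequences, and the choice to code a gap opposite a base as N are assumptions made for illustration and are not taken from the original data sets.

```python
# IUPAC codes for every unordered pair of nucleotides.
PAIR_TO_IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("AC"): "M",
    frozenset("GT"): "K", frozenset("CG"): "S", frozenset("AT"): "W",
}

def additive_sequence(copy_1, copy_2):
    """Merge two aligned copies into one sequence with IUPAC codes at differing sites."""
    if len(copy_1) != len(copy_2):
        raise ValueError("sequences must be aligned to the same length")
    merged = []
    for a, b in zip(copy_1.upper(), copy_2.upper()):
        if a == b:
            merged.append(a)  # identical in both copies (including shared gaps)
        elif a in "ACGT" and b in "ACGT":
            merged.append(PAIR_TO_IUPAC[frozenset((a, b))])  # additive site
        else:
            merged.append("N")  # assumption: a gap or ambiguity in one copy is left unresolved
    return "".join(merged)

# Hypothetical aligned fragments standing in for the two parental ITS types.
combined = additive_sequence("ACGTTA", "ACATCA")
print(combined)  # ACRTYA
```

The merged string can then be added to the alignment as the direct sequence of the allopolyploid, with R and Y marking the two additive sites.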
Rubus is a genus of approximately 250 species and is well known for its
complexity due to hybridization, apomixis, and polyploidy (Grant 1981).
FIG. 1. The recently formed allopolyploids T. mirus and T. miscellus
exemplify the challenge presented by incomplete homogenization of
rDNA spacers. In these taxa, concerted evolution is underway, but is
incomplete (Kovarik et al. 2005); hence, the ITS/ETS contribution of one
parent is much less than that of the other parent. Portion of sequence
chromatogram from the allotetraploid T. miscellus compared to its diploid
parents (T. dubius and T. pratensis) showing a polymorphic site and the
much smaller contribution of T. dubius at that site.
Alice and Campbell (1999) conducted a phylogenetic analysis of Rubus
(Rosaceae) using ITS sequence data. We obtained an ITS sequence from a
known hybrid in Rubus (R. caesius L. × R. idaeus L.; sequence kindly
provided by L. Alice) and added this polymorphic sequence to the Alice
and Campbell (1999) ITS sequence data matrix. We also considered the
placement of R. ursinus, a putative hybrid (Alice and Campbell 1999).
Following Alice and Campbell (1999), sequences of Fallugia Endl., Geum
L., and Waldsteinia Willd. were employed as outgroups.
Glycine (Fabaceae) includes the cultivated soybean (G. max Merr.) and
its wild progenitor (G. soja Siebold et Zucc.), both annual (2n = 40) species
native to Asia classified as subgenus Soja (Moench) F. J. Herm. The re-
mainder of the genus comprises subg. Glycine, a group of around 26
perennials mostly native to Australia that includes both diploid (2n = 38,
40) and polyploid (2n = 78, 80) species. There are eight known allopoly-
ploid species, each recently formed (< 50,000 yr), and with a different
combination drawn from eight diploid genomes; most show evidence of
multiple origins (Doyle et al. 2002, 2004a, b). Six of these comprise the G.
tomentella Hayata allopolyploid complex, which includes a number of
unnamed species at the diploid (G. tomentella D1, D3, D5A, D5B) and
polyploid (G. tomentella T1–T6) levels.
The rDNA ITS constitutions of these polyploids were identified using
information from diploid progenitors to design homoeologue-specific
primers (Rauscher et al. 2002, 2004). Allopolyploid plants differed in the
degree to which they retained both homoeologous repeat classes. In most
plants, direct sequencing of PCR products from universal primers pro-
duced clean sequence, indicating a great excess of one homoeologous
repeat class. However, in all cases the second homoeologue could be
amplified with specific primers. Moreover, individual plants of a poly-
ploid species varied with respect to which homoeologue predominated.
Both repeat loss and concerted evolution could be responsible for nu-
merical inequalities between homoeologues (Joly et al. 2004).
We used the sequences and alignments produced by Rauscher et al.
(2004) and Joly et al. (2004), which were either: 1) clean direct sequences
using universal primers on individuals in which one homoeologous re-
peat comprised > 95% of the amplified rDNA; 2) direct sequences ob-
tained using homoeologue-specific primers to amplify the missing mi-
nor homoeologous repeats from individuals with a predominant repeat
type; or 3) direct sequences using homoeologue-specific primers from
both repeat types in plants with detectable amounts of both homoeologues
where direct sequencing with universal primers produced
messy sequences. In the present study, we also used these messy
direct sequences, which were not reported in either Rauscher et al. (2004)
or Joly et al. (2004). A subset of accessions used in these comprehensive
studies was used here to illustrate relevant issues.
The T1 allopolyploid species of the G. tomentella complex comprises
two classes with respect to rDNA ITS sequences, represented here by
accessions 1288 and 1392. In both classes, a single rDNA homoeologue
predominates; in 1392 it is the repeat from the G. tomentella D1 species,
whereas in 1288 and all other sampled T1 accessions it is the D3 repeat.
Whereas the D1 and D3 genomes are closely related, the T2 allopolyploid
represents a wider cross, involving D3 and the more distantly related
A-genome species, G. syndetika B. E. Pfeil et Craven (formerly G. tomen-
tella D4). In all T2 accessions, the G. syndetika repeat predominates; ac-
cession 1286 was used here. The T4 allopolyploid is complex, with some
plants (e.g. 1348) having a great preponderance of D3 repeats, others (e.g.
2469) in which repeats related to G. tomentella D5B and associated diploid
species (G. pindanica Tindale et Craven, G. pullenii B. E. Pfeil, Tindale et
Craven, G. hirticaulis Tindale et Craven) predominate, and others in
which both homoeologues are detectable by direct sequencing. Polymor-
phic accessions are found among accessions from two different origins of
the T4 genome combination, and thus fall into two classes in terms of
polymorphisms in direct sequences; one accession from each of these was
used here (accessions 1469 and 2468, with 10 and 12 polymorphisms,
respectively, six of which are shared). All accessions of the T5 allopoly-
ploid contain detectable levels of both homoeologues; 1969 was used here
(13 polymorphisms).
Phylogenetic Analyses. Phylogenetic analyses to assess the effects of
polymorphisms were conducted using the maximum parsimony criterion
in PAUP* 4.0 b10 (Swofford 2002). Phylogenetic data matrices are depos-
ited in TreeBASE (study number S3592). PAUP* provides several options
under “parsimony settings”: uncertainty, polymorphism, and
variable. The default is uncertainty, and this is standardly applied by
researchers. However, if we are trying to examine the effect of including
an allopolyploid in a phylogenetic analysis, then dual base calls are likely
due to true polymorphism (i.e. it is additivity of the diploid parental
sequences); hence, the correct specification would be polymorphism.
However, to mimic standard usage, we should specify uncertainty. We
therefore conducted paired analyses using either uncertainty or poly-
morphism options, respectively.
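The difference between the two codings can be made concrete for a single site on a fixed tree. The sketch below uses Fitch parsimony and assumes, as a simplification that may not match what PAUP* does internally, that polymorphism coding adds k − 1 within-taxon changes for a taxon showing k states. The taxon names (P1 and P2 for the parents, H for the allopolyploid, Out for the outgroup) and the tree are hypothetical.

```python
IUPAC_STATES = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "M": {"A", "C"},
    "K": {"G", "T"}, "S": {"C", "G"}, "W": {"A", "T"},
}

def fitch_length(tree, states):
    """Minimum number of changes for one site on a rooted binary tree (Fitch parsimony).

    `tree` is a nested tuple of taxon names; `states` maps each taxon to an IUPAC code.
    An ambiguity code at a leaf is read as a set of possible states ("uncertainty").
    """
    def walk(node):
        if isinstance(node, str):
            return set(IUPAC_STATES[states[node]]), 0
        (left_set, left_cost), (right_set, right_cost) = walk(node[0]), walk(node[1])
        common = left_set & right_set
        if common:
            return common, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return walk(tree)[1]

def polymorphism_length(tree, states):
    """Fitch length plus (k - 1) assumed within-taxon changes per taxon with k states."""
    extra = sum(len(IUPAC_STATES[code]) - 1 for code in states.values())
    return fitch_length(tree, states) + extra

# Parent P1 has A, parent P2 has G, and the allopolyploid H is additive (R = A/G).
tree = ((("P1", "H"), "P2"), "Out")
states = {"P1": "A", "H": "R", "P2": "G", "Out": "G"}

uncertainty_steps = fitch_length(tree, states)
polymorphism_steps = polymorphism_length(tree, states)
print(uncertainty_steps, polymorphism_steps)  # 1 2
```

Under these assumptions the additive site costs one step with uncertainty coding and two with polymorphism coding, while the relationships favored by the site are the same under either coding.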
However, coding is actually more complicated than noted. Apparent
polymorphism in a diploid may be true uncertainty on the part of the
sequence detection system, in which case it should be scored differently
than multistate coding due to true polymorphism. Alternatively, a true
polymorphism may be present if the diploid has two rDNA loci that have
different sequences. In actuality, a data set could be a mixture of poly-
morphisms with codings that reflect true uncertainty or true polymor-
phism. We will not deal with these issues here, however, because it is
not possible to specify uncertainty for some characters and polymor-
phism for others.
In all analyses, gaps were treated as missing data. For those sites scored
as polymorphic, we used IUPAC terminology (e.g. R, M). For all but the
Glycine datasets, maximum parsimony (MP) analyses were conducted
using heuristic searches with 100 random addition replicates with no
more than 100 trees saved per replicate, TBR (tree-bisection-reconnection)
branch swapping, and the MulTrees option (saving all optimal trees) in
effect. Internal support for clades was assessed using the bootstrap (Fel-
senstein 1985). For Tragopogon we used the fast bootstrap because of the
size of the data sets. Fast bootstrap estimates were obtained using 1000
replicates, each with 100 random taxon addition replicates. For the Rubus
and Paeonia data sets, support for clades was estimated using 100 boot-
strap replicates each with 10 random addition replicates, saving no more
than 1500 trees per bootstrap replicate, TBR branch swapping, and the
MulTrees option in effect. For Glycine, the small size of the data sets
allowed branch and bound parsimony searches to be conducted both for
identifying most parsimonious trees and for bootstrap analyses, with the
exception of the data set that included polymorphic sequences. For that
data set, the bootstrap analysis used TBR swapping with maxtrees set to
100 per bootstrap replicate. For each analysis, individual trees were ex-
amined to investigate the positions occupied by hybrid or allopolyploid
taxa. In addition, strict consensus, majority rule, and Adams consensus
(Adams 1972; which typically preserves more structure than strict meth-
ods) were used to summarize the results of parsimony searches. Adams
consensus results are presented here primarily for Glycine and to a lesser
extent for Rubus, taxa for which it appears to be an effective approach.
However, Adams trees were not of utility in Paeonia and were difficult to
interpret with the larger, complex data sets posed by Tragopogon; the
problems with Adams consensus trees are well known (e.g. reviewed in
Wilkinson 1994).
RESULTS
General. Analyses employing polymorphic and uncertainty
codings yielded the same results in terms of topologies.
Likewise, bootstrap support values were not affected by
the two codings. However, the two codings do result in dif-
ferent tree lengths (these values are higher when the poly-
morphic coding is used); ensemble consistency index (CI;
Kluge and Farris 1969) values also typically differed slightly.
The trees presented (Figs. 2–8) were obtained using the uncertainty
coding. We provide the number of shortest trees,
tree length, CI (including parsimony-uninformative sites),
and ensemble retention index (RI; Farris 1989) obtained with
both options below, giving the polymorphic values first (in-
dicated by P) followed by the uncertainty values (indicated
by U).
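Why the two codings can give different tree lengths without changing the topology can be illustrated with a simplified sketch. It assumes a single unordered character scored on a rooted binary tree with Fitch parsimony. A multistate tip is treated either as an ambiguity (uncertainty) or as a terminal that really contains all of its states, so that one extra step per additional state is charged within that terminal (polymorphism). This is our simplified reading of the two options, not the code used by PAUP*; the function names, trees, and site are hypothetical.

```python
def fitch_length(tree, states):
    """Fitch parsimony length of one unordered character.

    tree   : nested 2-tuples of taxon names (rooted, binary)
    states : dict mapping taxon name -> set of states scored for it
    """
    def post_order(node):
        if isinstance(node, str):          # terminal taxon
            return set(states[node]), 0
        left_set, left_len = post_order(node[0])
        right_set, right_len = post_order(node[1])
        shared = left_set & right_set
        if shared:                         # no change needed at this node
            return shared, left_len + right_len
        return left_set | right_set, left_len + right_len + 1
    return post_order(tree)[1]

def length_uncertainty(tree, states):
    """Multistate tips are ambiguous: any one of the states may be true."""
    return fitch_length(tree, states)

def length_polymorphism(tree, states):
    """Multistate tips hold all states: add within-terminal steps (assumed k - 1)."""
    extra = sum(len(s) - 1 for s in states.values())
    return fitch_length(tree, states) + extra

# Hypothetical site: hybrid H shows both the G of A/B and the A of C/D.
site = {"A": {"G"}, "B": {"G"}, "C": {"A"}, "D": {"A"}, "H": {"A", "G"}}
tree_h_with_a = ((("A", "H"), "B"), ("C", "D"))
tree_h_with_c = ((("C", "H"), "D"), ("A", "B"))
```

Under these assumptions the extra steps charged by the polymorphism coding do not depend on the tree, so both placements of H remain equally short under either coding while the reported length is larger under P than under U.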
Tragopogon mirus / T. miscellus. The parents of T. mirus
and T. miscellus are T. dubius and T. porrifolius, and T. dubius
and T. pratensis, respectively (P. Soltis et al. 1995; Soltis et al.
2004). The number of polymorphic sites in T. mirus and T.
miscellus that reflect the additivity of the divergent diploid
genomes depends on the polyploid population. For ITS in T.
mirus populations 2601 and 2602 there were 13 such sites. For
ITS in T. miscellus populations 2604 and 2605 there were 12
and 13 such changes, respectively. For ETS there were 17
polymorphic sites in the populations of T. mirus and 16 for T.
miscellus. The direct sequences of both populations examined
of both T. mirus and T. miscellus appeared in a clade with T.
dubius, the parent with which they both share more apomor-
phies; also in this small clade are T. pterocarpus DC., T. major
Jacq., T. pterodes Panc., and T. charadzae S. Kuthath. (Fig. 2).
Furthermore, in 98% of the shortest trees obtained, the direct
sequences of the polyploids appeared with the cloned se-
quence of the parent T. dubius. Identical results were obtained
when T. mirus and T. miscellus were analyzed independently.
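Counting sites that reflect the additivity of the two parental genomes can be sketched as below. The example assumes three aligned sequences of equal length without gaps, unambiguous bases in both parents, and two-base IUPAC ambiguity codes in the direct sequence of the polyploid. The function name and the toy sequences are hypothetical and are not data from this study.

```python
# Bases represented by each symbol (two-base IUPAC ambiguity codes only).
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"},
    "W": {"A", "T"}, "K": {"G", "T"}, "M": {"A", "C"},
}

def additive_sites(parent1, parent2, direct):
    """Return 0-based positions where the direct sequence combines both parents."""
    sites = []
    for i, (p1, p2, d) in enumerate(zip(parent1, parent2, direct)):
        parents_differ = p1 != p2
        # Additive: the polyploid shows exactly the two parental bases.
        if parents_differ and IUPAC[d] == IUPAC[p1] | IUPAC[p2]:
            sites.append(i)
    return sites

# Hypothetical aligned fragments.
parent_a = "ACGTAC"
parent_b = "ATGCAC"
polyploid_direct = "AYGYAC"

found = additive_sites(parent_a, parent_b, polyploid_direct)
```

Here `found` lists the two positions at which the parents differ and the direct sequence is scored with the corresponding ambiguity code.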
When the polymorphic sequences of the two allopolyploids
were removed, the length of the shortest trees decreased
with the polymorphic coding (from 426 to 400, P), but
was unchanged with the uncertainty coding (312 to 312, U); the
CI value changed slightly (0.880 to 0.887, P) or remained the
same (0.846, U); the RI value also changed slightly (0.876 to
0.877, P; and 0.876 to 0.877, U). The strict consensus of shortest
trees was unchanged with the removal of the polyploid sequences.
When the polyploid sequences were removed, two
additional nodes (both involving the larger clade in which
the polyploid sequences appeared) that did not initially receive
bootstrap values greater than 50% achieved values of
62% and 66%; other nodes were essentially unchanged (Fig. 2).
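Because the ensemble CI and RI are computed from summed step counts, they can shift when a sequence is added or removed. A minimal sketch of the two indices follows, using the standard definitions CI = M/S and RI = (G - S)/(G - M), where M, S, and G are the minimum possible, observed, and maximum possible numbers of steps summed over characters. The function names and the three-character example are hypothetical.

```python
def ensemble_ci(min_steps, obs_steps):
    """Ensemble consistency index: summed minimum steps / summed observed steps."""
    return sum(min_steps) / sum(obs_steps)

def ensemble_ri(min_steps, obs_steps, max_steps):
    """Ensemble retention index: (G - S) / (G - M) over summed step counts."""
    m, s, g = sum(min_steps), sum(obs_steps), sum(max_steps)
    return (g - s) / (g - m)

# Hypothetical per-character step counts on one tree.
min_steps = [1, 1, 2]
obs_steps = [1, 2, 3]
max_steps = [3, 4, 5]

ci = ensemble_ci(min_steps, obs_steps)
ri = ensemble_ri(min_steps, obs_steps, max_steps)
```

With these invented counts the observed length is 6 steps against a minimum of 4 and a maximum of 12.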
Tragopogon latifolius 4x / T. acanthocarpus. The direct
ITS sequence of tetraploid T. latifolius has nine polymorphic
sites resulting from the combination of its two parents and
appeared as part of a small clade that contained one group of
cloned sequences from the tetraploid, as well as T. bakhtiari-
cus, T. albinerve, T. armeniacus, T. kotschyi Boiss., T. ketzhovelii
S. Kuthath., and T. porphyrocephalus (Fig. 3). In 84% of the
shortest trees the direct sequence of the polyploid appeared
with T. bakhtiaricus. The second group of cloned sequences of
the polyploid appeared in a clade with T. segetum S. Kuthath.,
T. serotinus D. Sosn., T. pusillus, as well as some clones of T.
acanthocarpus (see below).
Phylogenetic analyses of a comparable ETS data set
yielded similar results (Fig. 4). The direct ETS sequence of T.
latifolius (population 201) has 16 polymorphic sites resulting
from the combination of its two parents; the direct sequence
appeared in the strict consensus tree as a member of a clade
that contained one group of the cloned sequences from this
allotetraploid, as well as one of its likely diploid parents (see
above; T. porphyrocephalus, T. albinerve, T. armeniacus, T. kotschyi,
T. bakhtiaricus, as well as T. fibrosum; Fig. 4). The second
set of cloned sequences appeared as sister to T. pusillus and
some of the clones of T. acanthocarpus (bootstrap = 72%).
The direct ITS sequence of T. acanthocarpus had 10 polymorphic
sites, and its phylogenetic position was variable. In
most of the shortest trees (74%), the direct sequence appeared
as sister to the larger of the two major subclades of Tragopogon
recovered. In a small subset of shortest trees, the direct
sequence of T. acanthocarpus appeared as part of a clade with
T. segetum, T. serotinus, T. pusillus (one of its putative parents),
some clones of T. acanthocarpus, and some clones of T. latifolius
(Fig. 3). The other clones of T. acanthocarpus are part of a
clade with T. brevirostris, T. dasyrhynchus, T. dubjanskyi, T.
podolicus, T. reticulatus, and T. ruthenicus.
FIG. 2. Summary tree depicting one of > 1,000,000 shortest trees, the
maximum number that could be stored (length = 426; CI = 0.887; RI =
0.872), obtained in an analysis of ITS sequences (using uncertainty
coding) representing the genus Tragopogon, with the direct polymorphic
sequence of the allotetraploids T. miscellus and T. mirus included. The
allotetraploid T. miscellus and its parents (T. dubius, accession # 1 and T.
pratensis), as well as the allotetraploid T. mirus and its parents (T. dubius,
accession # 2 and T. porrifolius), are shown in bold; the direct sequences of
the two tetraploids are boxed. Branches that collapse in the strict consensus
are indicated by arrows. Numbers above branches are branch lengths;
values below are bootstrap values. Larger clades have been reduced, with
the number of terminals provided inside the triangle used to represent
that clade.
FIG. 3. Summary tree depicting one of > 1,000,000 shortest trees, the
maximum number that could be stored (length = 324; CI = 0.840; RI =
0.864), obtained in an analysis of ITS sequences (using uncertainty
coding) representing the genus Tragopogon, with the direct, polymorphic
sequence of the allotetraploid T. latifolius, and also of the putative hybrid
or allopolyploid T. acanthocarpus, as well as clones of both included (number
of clones examined given in parentheses). The allotetraploid T. latifolius
and T. acanthocarpus and clones of both are shown in bold; the direct
sequences of T. latifolius and T. acanthocarpus are boxed. Branches that
collapse in the strict consensus are indicated by arrows. Numbers above
branches are branch lengths; values below are bootstrap values. Larger
clades have been reduced, with the number of terminals provided inside
the triangle used to represent that clade.
The direct ETS sequence chromatogram of T. acanthocarpus
has 21 polymorphic sites resulting from the combination of
its two parents. In the strict consensus of shortest trees, the
direct sequence appeared as sister to a clade of taxa including
some clones of T. acanthocarpus and T. pusillus (one of its
putative parents; these form a clade with bootstrap = 53%),
and T. segetum, T. sosnowskyi, T. charadzae, and T. graminifo-
lius. The other clones of T. acanthocarpus are part of a clade
that includes T. brevirostris, T. dasyrhynchus, T. dubjanskyi, T.
podolicus, T. reticulatus, T. undulatus, T. floccosus Waldst. et
Kit., and T. ruthenicus.
When the polymorphic sequences of T. latifolius and T.
acanthocarpus were removed from the ITS data set (Fig. 3), the
tree length decreased (from 431 to 408, P; 324 to 321, U); the CI
and RI values changed only slightly, from 0.879 to 0.878 (P; 0.840 to
0.844, U), and from 0.864 to 0.868 (P; 0.864 to 0.867, U), respectively.
The strict consensus of shortest trees was better resolved
with the removal of the polymorphic ITS sequences,
with one clade of nine taxa retrieved that was not present
when the polymorphic sequences were included. The bootstrap
value for one node increased substantially with the
removal of the polymorphic sequences (from less than 50% to
64%); this node involved the larger clade in which the direct
sequence of the allopolyploid T. latifolius appeared (Fig. 3).
When the polymorphic sequences of T. latifolius and T.
acanthocarpus were removed from the ETS data set (Fig. 4),
the tree length decreased with polymorphic coding (from
420 to 385, P; 341 to 341, U); the CI values were only slightly, if at
all, modified, from 0.867 to 0.855 (P; 0.836 to 0.836, U), and RI values
were unchanged (0.897, P; 0.896, U). The bootstrap values
for two nodes increased substantially with the removal of the
polymorphic sequences (from 59 to 69%; from 79 to 94%); these
nodes involved the two larger clades in which the direct sequences of
the allopolyploid T. latifolius and T. acanthocarpus appeared
(Fig. 4).
Paeonia. We conducted separate phylogenetic analyses in
which one of the three allotetraploids, P. banatica Rochel, P.
russi (P. russoi Biv. is correct, but Sang et al. 1997 and others
have used russi Bivona; we will use P. russi here to avoid
confusion in comparisons with Sang et al. 1997), or P. pereg-
rina Mill. was included. Sang et al. (1995, 1997) indicated that
the parents of P. banatica are P. mairei H. Lév. and a member
of the clade consisting of P. arietina Anders., P. humilis Retz.,
P. officinalis L., P. parnassica Tzanoud., and P. tenuifolia L.
(which have identical, or nearly identical sequences; Sang et
al. 1995, 1997). In our analysis, P. banatica appeared in four
general positions in the 44 shortest trees obtained: as part of
a polytomy with P. brownii Dougl. and a clade of P. mairei, P.
japonica (Makino) Miyabe et Takeda, and P. obovata Maxim.
(Fig. 5A); sister to a clade of P. mairei, P. japonica, and P.
obovata (Fig. 5B); sister to a large clade of all species but P.
brownii, P. mairei, P. japonica, and P. obovata (Fig. 5C); part of
a basal polytomy (Fig. 5D). Thus, depending on which short-
est tree was examined, the direct, polymorphic sequence of
the polyploid P. banatica variously appeared in a clade with
one parent, the other parent, or in a position removed from
both parents. In the strict, majority-rule, and Adams consensus
of shortest trees, P. banatica appeared in an unresolved
position as part of a polytomy. With the removal of P. banatica
the number of shortest trees dropped from 44 to 4; the
strict consensus was unchanged; the CI dropped from 0.950 to
0.923 (P; 0.923 to 0.923, U), and the RI was unchanged (0.968, P
and U); two bootstrap values increased dramatically (from
65 to 100% and 64 to 98%).
FIG. 4. Summary tree depicting one of 1,000,000 shortest trees, the
maximum number that could be stored (length = 341; CI = 0.836; RI =
0.897), obtained in an analysis of ETS sequences (using uncertainty
coding) representing the genus Tragopogon, with the direct, polymorphic
sequences of the allotetraploid T. latifolius and also of T. acanthocarpus, as
well as clones of both included (number of clones examined given in
parentheses). The allotetraploid T. latifolius, T. acanthocarpus, and clones
of each are shown in bold; the direct sequences of T. latifolius and T.
acanthocarpus are boxed. Branches that collapse in the strict consensus are
indicated by arrows. Numbers above branches are branch lengths; values
below are bootstrap values. Larger clades have been reduced, with the
number of terminals provided inside the triangle used to represent that
clade.
FIG. 5. Four (A–D) of 44 shortest trees (length = 40; CI = 0.950; RI = 0.968) obtained in an analysis of ITS sequences (using uncertainty coding)
representing the genus Paeonia (Sang et al. 1995), with the direct, polymorphic sequences of the allotetraploid P. banatica included. These trees illustrate
the different phylogenetic positions observed for the allotetraploid. The allotetraploid is shown in bold. Numbers above branches are branch lengths;
values below are bootstrap values.
The placement of the allopolyploid P. russi was also un-
stable in our phylogenetic analyses. Based on Sang et al.
(1995), the parents of P. russi are P. lactiflora Pall. and P.
mairei. In our analyses, the direct sequence of P. russi vari-
ously appeared as: part of a trichotomy with P. brownii and a
clade of P. mairei, P. japonica, and P. obovata (Fig. 6A); sister to
a clade of P. brownii plus P. mairei, P. japonica, and P. obovata
(Fig. 6B); sister to P. mairei, P. japonica, and P. obovata (Fig.
6C); or part of a basal polytomy (Fig. 6D). In the strict, Ad-
ams, and majority-rule consensus trees, P. russi appeared in
an unresolved position as part of a polytomy. Thus, the direct
sequence of P. russi often appeared as sister to a clade containing
one parent (P. mairei), or in a position well removed
from both parents, but it never appeared in a position associated
with the parent P. lactiflora. With the removal of P.
russi, the number of shortest trees decreased from 16 to 4; the
strict consensus was unchanged; the CI dropped from 0.946 to
0.923 (P; 0.926 to 0.923, U), and the RI was essentially unchanged
(0.969 to 0.968, P; 0.968 to 0.969, U); one bootstrap value
increased dramatically (from 71 to 98%).
FIG. 6. Four (A–D) of 16 shortest trees (length = 37; CI = 0.946; RI = 0.969) obtained in an analysis of ITS sequences (using uncertainty coding)
representing the genus Paeonia (Sang et al. 1995), with the direct, polymorphic sequence of the allotetraploid P. russi; and one of eight shortest trees
(length = 35; CI = 0.950; RI = 0.970) of P. peregrina (6D only) included. These trees illustrate the different phylogenetic positions observed for the
allotetraploid P. russi; P. peregrina appears in only one position (6D). The allotetraploid is shown in bold. Numbers above branches are branch lengths;
values below are bootstrap values.
FIG. 7. Summary tree (based on the 3839 shortest trees; length = 447; CI = 0.660; RI = 0.748) obtained in an analysis of ITS sequences (using
uncertainty coding) representing the genus Rubus (Alice and Campbell 1999). This tree is based on the shortest trees showing the three placements
of the direct, polymorphic sequence of the hybrid R. caesius × idaeus included. The hybrid (R. caesius × idaeus) and its parents (R. caesius and R. idaeus) are
shown in bold, and the hybrid is boxed. Numbers below branches are bootstrap values; tr. # refers to the tree number. Larger clades have been reduced,
with the number of terminals provided inside the triangle used to represent that clade.
The remaining allopolyploid examined, P. peregrina,
showed a different pattern from the other two polyploid
species of Paeonia described above. The parents of P. peregrina are P.
anomala L. and a member of a clade of P. arietina, P. humilis,
P. officinalis, P. parnassica, and P. tenuifolia (which have identical,
or nearly identical sequences). In the strict, Adams, and
majority-rule consensus trees from our analyses, the polymorphic
P. peregrina sequence appeared in an unresolved
position as part of a polytomy within Paeonia (Fig. 6D). With
the removal of P. peregrina, the number of shortest trees decreased
from eight to four; the CI dropped from 0.943 to 0.923
(P; 0.923 to 0.923, U), and the RI was essentially unchanged
(0.970 to 0.968, P; 0.968 to 0.970, U); there were no changes in
bootstrap support or the strict consensus tree.
Rubus. ITS sequences representing two hybrids, R. ursinus
Cham. et Schltdl. 197 and R. caesius × R. idaeus, were
added to the original data set of Alice and Campbell (1999).
Although the parentage of one of these hybrids, R. ursinus
197, is unclear, we included it in our analyses because it
illustrates a category of result seen with other polyploids and
hybrids examined. Rubus ursinus (197) is thought to be a
hybrid species between R. macraei A. Gray (or a close relative)
and an unidentified species of subgenus Rubus (represented
here by R. robustus C. Presl., R. cuneifolius Pursh, R. caesius L.,
R. ulmifolius Schott; the clade also contains R. alpinus Macfad.,
see Alice and Campbell 1999). In our analyses, R. ursinus
always appeared in a small clade as sister to one putative
parent, R. macraei, with R. occidentalis L. their sister (Fig. 7).
The putative hybrid species R. ursinus did not appear closely
related to the subgenus Rubus clade in any of the shortest
trees.
FIG. 8. Glycine phylogenies inferred from rDNA ITS sequences. a. Phylogeny of rDNA ITS sequences (using uncertainty coding) from Glycine
diploid and polyploid accessions involved in the G. tomentella polyploid complex, rooted with a member of the B-genome (G. stenophita). Strict consensus
topology from 19 equally parsimonious trees identified by branch-and-bound search (L = 63; CI with/without autapomorphies = 0.831/0.784; RI = 0.933).
Bootstrap support is from 100 replicate branch and bound searches. Sequences from diploids are labeled with the species name and the accession
number; for G. tomentella diploids, the informal species designation (D1, D3, D5B) is given. Sequences from polyploids are in bold print; the G. tomentella
taxon name (T1, T2, T4, T5) is given for each, followed by the accession number. Sequences from polyploids are either: 1) direct sequences, labeled as
direct, with the homoeologue designation given in parentheses following the accession number (D1, D3, D5B = G. tomentella diploid species; cla = G.
clandestina; syn = G. syndetika; e.g. T2-1286-direct (syn) is the nonpolymorphic product obtained by direct sequencing of the amplicon obtained from
T2 accession 1282 using universal ITS primers, and was contributed by the G. syndetika diploid progenitor), or 2) were produced using
homoeologue-specific primers, in which case the homoeologue is given immediately after the accession number (e.g. T2-1286-D3 is the sequence from T2 accession
1286, obtained by using a homoeologue-specific primer set to amplify the D3 homoeologue, which was expected to be present in this plant but was not
observed by direct sequencing of amplicons produced using universal ITS primers). b. Glycine rDNA ITS phylogeny including direct sequences from
diploids and those polyploids whose direct sequences showed no polymorphism. Strict consensus of six trees identified by branch-and-bound parsi-
mony searching (L = 61; CI = 0.853/0.775; RI = 0.905). Designations as in Fig. 8a. Bootstrap support is from 100 branch-and-bound searches. c. Glycine
rDNA ITS phylogeny including all sequences shown in b, plus three heterogeneous direct sequences (indicated by stars) in which characters differing
between homoeologues are coded as polymorphisms. Adams consensus of 2,010 trees (uncertainty coding: L = 61; CI = 0.853/0.775; RI = 0.908;
polymorphism coding: L = 97; CI = 0.907/0.864; RI = 0.908) identified by branch and bound parsimony analysis. Numbers along branches show
bootstrap support (100 replicates; heuristic searching, TBR branch-swapping, maxtrees = 100); with the exception of numbers not in parentheses, all other
clades collapse in the strict consensus. Clades supported by numbers in parentheses appear in the bootstrap majority rule consensus tree.
The phylogenetic position inferred by our analyses of the
second hybrid analyzed in Rubus, R. caesius × R. idaeus, was
unstable. Its position also was not affected by the presence of
the sequence for the hybrid R. ursinus 197, so this sequence
was retained in the topologies depicted here. In the strict
consensus of 3839 shortest trees obtained, the hybrid R. cae-
sius × R. idaeus appeared as part of a large polytomy. In a
small majority of the trees (51%), R. caesius × R. idaeus ap-
peared as sister to a clade of R. idaeus + R. saxatilis L.; i.e. close
to one parent (R. idaeus). Examination of individual trees re-
vealed considerable variation in placement, including (Fig.
7): 1) placement with R. idaeus + R. saxatilis; 2) part of a clade
with R. phoenicolasius Maxim., R. occidentalis, R. ursinus, and
R. macraei (as sister to the last three species); 3) sister to a
clade composed of R. pectinellus Maxim., R. nepalensis Hort.,
R. lineatus Reinw. ex Blume, R. lambertianus Ser., R. tricolor
Focke, R. tephrodes Hance, and R. assamensis Focke. However,
in none of the trees examined did R. caesius × R. idaeus appear
in a clade with its remaining parent, R. caesius.
When the two hybrids (R. ursinus 197 and R. caesius × R.
idaeus) were removed from the ITS data set (Fig. 7), the tree
length decreased from 525 to 488 (P; 447 to 447, U); the number of
trees decreased (from 3839 to 211, P; 3839 to 211, U), and the CI
and RI decreased slightly or were essentially unchanged
(0.710 to 0.689, 0.748 to 0.745, respectively, P; 0.660 to 0.660, 0.748 to
0.745, respectively, U). The strict consensus was better resolved
with the removal of these two sequences; furthermore,
the Adams consensus (not shown) placed R. caesius ×
R. idaeus as part of a polytomy, suggesting that it might be a
problematic sequence. The bootstrap value for one clade (R.
alpinus, R. robustus, R. cuneifolius, R. caesius, and R. ulmifolius)
increased substantially with the removal of the two sequences
(from 77 to 89%); this clade contains R. caesius, one of
the parental taxa. When R. caesius × R. idaeus was removed,
with R. ursinus (the position of which did not vary in phylogenetic
analyses) retained, the results were similar to those
above. The tree length was 499 (P; 447, U); the number of
trees was 212 (P; 212, U); CI and RI = 0.695 and 0.748 (P; 0.660
and 0.748, U), respectively; the bootstrap support for the
clade of R. alpinus, R. robustus, R. cuneifolius, R. caesius, and R.
ulmifolius was similarly affected. With R. ursinus removed,
the results were similar to those obtained with both R. ursinus
and R. caesius × R. idaeus in the data set, further indicating
that R. caesius × R. idaeus is the unstable and problematic
entry. The tree length was 514 (P; 447, U); the number of trees
was 3838 (P; 3838, U); CI and RI = 0.704 and 0.745 (P; 0.660
and 0.745, U), respectively; the bootstrap support for the
clade of R. alpinus, R. robustus, R. cuneifolius, R. caesius, and R.
ulmifolius was 79%.
Glycine. Analyses were first conducted using the two
nonpolymorphic (clean) homoeologue sequences from
each of the eight allopolyploid accessions, obtained by direct
sequencing of PCR products produced either using universal
primers on plants with a great excess of one homoeologue
repeat type, or using homoeologue-specific primers on plants
with detectable amounts of both homoeologue classes. Parsimony
analysis identified a small number of trees (19) with
high ensemble consistency and retention indices, a well-resolved
strict consensus tree nearly identical to the Adams
consensus tree, and bootstrap support comparable to analyses
of diploids alone (Fig. 8a).
The effect of the unequal contribution of homoeologous
genomes to rDNA repeat number in allopolyploids was ex-
plored by excluding sequences that had been produced by
using homoeologue-specific primers either 1) to obtain low-
abundance repeats or 2) to obtain sequences from accessions
that yielded polymorphic sequences using universal primers.
This led to the elimination of both homoeologues from three
accessions (T4: 1469 and 2468; T5: 1969) and one homoeo-
logue from each of five other accessions (T1: 1288 and 1392;
T2: 1286; T4: 1348 and 2469). Analysis of the single remaining
sequences from these five accessions, along with sequences
from diploids, identified a small number of parsimonious
trees, with a well-resolved strict consensus topology identical
to that of the Adams consensus tree, in which sequences from
polyploids joined their putative diploid progenitors with
strong bootstrap support (Fig. 8b). Allopolyploid plants be-
haved as nonhybrid, homozygous diploids. In the absence of
additional information, the T2 individual (1286) would be
classified as G. syndetika. Individuals of both T1 (1288) and T4
(1348) would be classified as G. tomentella D3 diploids despite
morphological differences, whereas second individuals of
these two allopolyploid species would be classified as G. to-
mentella D1 (in the case of T1 accession 1392) or part of the G.
tomentella D5B species clade (in the case of T4 accession 2468).
The three heterogeneous direct sequences (T4: 1469 and
2468; T5: 1969) were then added to the data set from Fig. 8b
and the data set analyzed using branch-and-bound search-
ing; polymorphisms were coded using IUPAC ambiguity
codes. Inclusion of these sequences led to increases in homo-
plasy and in the number of equally parsimonious trees, and
to low resolution of the strict consensus (Fig. 8c). The Adams
consensus topology placed these sequences sister to the
clades to which the two homoeologues obtained using ho-
moeologue-specific primers belonged (Fig. 8a): sequences
from T4 accessions 1469 and 2468 formed a polytomy in the
Adams consensus tree with the D3 and D5B clades, and T5
accession 1969 formed a polytomy with the A-genome clade
(G. syndetika and G. clandestina J. C. Wendl.) and the clade
containing all remaining diploids other than the outgroup.
Inspection of equally parsimonious trees revealed that these
polymorphic sequences grouped with or near sequences
from one of the parental diploids in different trees.
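Coding a heterogeneous direct sequence with IUPAC ambiguity codes can be sketched as follows. The example assumes the two homoeologue sequences are aligned, of equal length, free of gaps, and contain only unambiguous bases; the function name and the toy sequences are hypothetical and are not Glycine data.

```python
# IUPAC symbol for each single base or pair of bases.
AMBIGUITY = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def code_direct_sequence(homoeologue_1, homoeologue_2):
    """Combine two aligned homoeologues into one IUPAC-coded sequence."""
    if len(homoeologue_1) != len(homoeologue_2):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        AMBIGUITY[frozenset((base_1, base_2))]
        for base_1, base_2 in zip(homoeologue_1, homoeologue_2)
    )

# Hypothetical homoeologues differing at two positions.
coded = code_direct_sequence("ACGTAC", "ATGCAC")
```

Sites at which the homoeologues agree keep their base; sites at which they differ receive the two-base ambiguity symbol.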
DISCUSSION
ITS sequences are the longstanding tool of choice for phy-
logenetic analysis within genera or among closely related
genera. ETS, another rDNA region with a rate of change and
mechanisms of molecular evolution comparable to ITS, is
gaining popularity in phylogenetic systematics. A series of
reviews has pointed to the inherent molecular evolutionary
problems in using rDNA spacers for inferring phylogeny
(e.g. Soltis and Soltis 1998; Doyle and Gaut 2000; Álvarez and
Wendel 2003; Bailey et al. 2003). We add another potential
pitfall in using rDNA spacers to infer phylogeny. Because of
the enormous popularity of these regions for inferring phy-
logeny, species are sometimes included in analyses without
knowledge of chromosome number or ploidy. It is likely that
researchers have unknowingly included ITS sequences of unrecognized
diploid or polyploid hybrids in phylogenetic
studies; the opportunities for this inclusion are particularly
high in poorly studied groups. For example, in a survey of
papers published from 2000 to 2006 in Systematic Botany using
ITS sequences to infer phylogeny, chromosome counts are
available for just 181 of 588 species (30.7%).
If rDNA sequences of unrecognized hybrids or polyploids
are added to analyses, researchers can be misled by what is
not seen via direct sequencing, either due to complete ho-
mogenization of ITS/ETS in unsuspected allopolyploids or
hybrids, partial homogenization leading to major differences
in peak height, or segregational loss in later generation hy-
brids when examining one locus (ITS being an example used
here). Alternatively, if polymorphic sites are detected, but are
few in number in an unsuspected hybrid or allopolyploid, a
researcher may not pursue the issue further via cloning and
sequencing; the sites may be scored as polymorphic and the
sequence added to the matrix.
These concerns are made more urgent by the fact that in
this molecular era, few systematists make chromosome
counts, and a new generation of investigators is being trained
without cytological skills. One option is to employ flow cy-
tometry, rather than cytology, to estimate ploidy; however,
as with traditional chromosome cytology, it can also be a
time-consuming process to optimize this method for a new
group that has not been previously investigated (e.g. Hus-
band and Schemske 1998; Burton and Husband 1999; Seg-
raves et al. 1999). Hence, investigators may be unable to ob-
tain counts for those taxa for which chromosome counts are
unavailable in the literature. However, chromosome counts
alone will not reveal hybridity at the diploid level. Clearly, in
those groups in which polyploidy and/or hybridization are
known to occur, special concern is in order when ITS/ETS
sequences are employed. We stress that, in many ways, ITS/
ETS sequence data represent a weaker tool for examining
allopolyploid phylogenetics than low-copy nuclear loci be-
cause of the unique properties of the rDNA cistron (i.e. rDNA
homogenization). In permanent polyploid hybrids (i.e. allo-
polyploids), most single loci would be expected to exhibit a
combination of the two parental genomes (except in cases of
homoeologue loss following allopolyploidization; e.g. re-
viewed in Kovarik et al. 2005).
The situation observed in Tragopogon and Glycine is par-
ticularly worrisome in that there is no obvious evidence from
the direct ITS and ETS sequences that some plants are allo-
polyploids. In the recently formed allotetraploids T. mirus
and T. miscellus, the contribution of one diploid parent (typi-
cally T. dubius) has been largely (but incompletely) converted
to the other diploid type via concerted evolution (Kovarik et
al. 2005); sites where the parental diploids differ are repre-
sented by an obvious major peak in chromatograms and a
very weak minor peak (Fig. 1). In Glycine, the situation is
even worse: in most cases, only by using primers specific for
the missing homoeologue could the contribution from a
second diploid parent be detected. On the basis
of ITS alone, such plants look like diploids. Moreover, the
predominant homoeologue can differ in plants with the same
two homoeologous genomes and similar morphologies,
which would lead to erroneous conclusions about species
relationships and taxonomy.
The major focus of this manuscript is to consider the phy-
logenetic impact of adding a hybrid or allopolyploid having
polymorphic sites to a data matrix. McDade (1992) asked the
same question using morphological data. Our analyses indi-
cate that if a hybrid or allopolyploid with even a small or
moderate number of sites scored as polymorphic is included
in a phylogenetic analysis, few general predictions can be
made regarding the placement of these polymorphic sequences.
The polymorphic sequence of a hybrid or allopolyploid
may occur in diverse phylogenetic positions, including
with one diploid parent, or in one of several basal positions:
at or near the base of the clade containing one or both par-
ents, or in a basal position relative to all other taxa in the data
set. Also noteworthy are instances of phylogenetic instability,
in which a hybrid or allopolyploid appears in a diverse array
of positions, some of which are not associated with either
parent.
Our findings, in part, mirror the results of McDade (1992).
McDade included 17 F1 hybrids of known parentage in a
broader phylogenetic analysis of the genus Aphelandra (Acanthaceae).
However, the F1 hybrids did not necessarily appear
with either parental species in her analyses. Approximately
two-thirds of the hybrids analyzed were placed as sister to a
clade that included the parent with which the hybrid shared
more apomorphies. In our molecular analyses we sometimes
observed a placement of an allotetraploid with the parent
with which it shares more apomorphies. For example, the
direct polymorphic sequences of both Tragopogon mirus and
T. miscellus appeared in a clade with T. dubius (Fig. 2),
the parent with which both allopolyploids share more apo-
morphies. However, in Rubus, R. caesius × R. idaeus appears in
several positions, but in none of the trees examined does R.
caesius × R. idaeus appear in a clade with R. caesius, the parent
with which it shares more apomorphies (Fig. 7).
We have also shown that the inclusion of direct, polymor-
phic sequences in a data set of diploid taxa does not have a
major impact on the topology of the diploids. However, the
inclusion of a polymorphic sequence (via a hybrid or allo-
polyploid) may be accompanied by an increase in the num-
ber of shortest trees, less resolution in the strict consensus,
and a large decrease in bootstrap support for some nodes. It
is also noteworthy that CI and RI values often decrease
slightly, or remain essentially unchanged with the removal of
hybrid or polyploid taxa. Nieto Feliner et al. (2001) made a
similar observation in their analysis of artificial hybrids and
ITS sequences in Armeria. These results differ, however, from
the classic cladistic analysis of diploids and interspecific F1
hybrids based on morphological characters by McDade
(1992). In McDade (1992), hybrids exhibited a range of mor-
phologies (similar to one parent, intermediate, transgressive);
here, in contrast, with DNA sequence data the hybrid or
polyploid is additive of the parents.
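As an illustration of what additivity means for a directly sequenced hybrid or allopolyploid, the following Python sketch combines two aligned parental sequences into one polymorphic sequence, scoring each site that differs between the parents with the corresponding IUPAC ambiguity code. This is a minimal sketch under stated assumptions: the function name `additive_sequence` and the toy sequences are hypothetical, only unambiguous A/C/G/T input is handled, and it models complete additivity with no homogenization.

```python
# Map each set of one or two bases to its IUPAC nucleotide code.
BASES_TO_CODE = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def additive_sequence(parent_a, parent_b):
    """Return the fully additive sequence of two aligned parental sequences.

    Sites at which the parents agree keep that base; sites at which they
    differ receive the IUPAC code for both bases (a polymorphic site).
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parental sequences must be aligned to the same length")
    sites = []
    for a, b in zip(parent_a.upper(), parent_b.upper()):
        key = frozenset((a, b))
        if key not in BASES_TO_CODE:
            raise ValueError(f"unsupported characters at a site: {a!r}, {b!r}")
        sites.append(BASES_TO_CODE[key])
    return "".join(sites)


# Hypothetical toy parents differing at sites 2, 4, and 5.
polymorphic = additive_sequence("ATGTAC", "ACGAGC")
```

Partial homogenization toward one parent could be mimicked by replacing some ambiguity codes in the result with that parent's base, which is one way to explore how incomplete concerted evolution changes the signal in such a sequence.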
McDade (1992) concluded that "cladistics may not be specially
useful in distinguishing hybrids from normal taxa" and that
"hybridization is one of many sources of error in cladistic
analyses" (pg. 1329; see also Hull 1979). Our molecular data
for allopolyploids and hybrids echo these concerns, especially
with the use of rDNA or other genes that typically undergo
homogenization. Just as McDade's results caution against the
use of putative hybrids in phylogenetic analyses based on
morphology, our data join a wealth of already available data
indicating that caution should be employed in analyses involving
ITS and ETS sequence data of diploid or allopolyploid hybrids.
ACKNOWLEDGMENTS. This work was supported in part by NSF grants
DGE-0209500, MCB-0346437, and DEB-0614421 (DES and PSS) and DEB-
0516673 (JJD). We thank L. Alice and T. Sang for kindly sharing published
data sets for use in this investigation.
18 SYSTEMATIC BOTANY [Volume 33
LITERATURE CITED
Abbott, R. J. and A. J. Lowe. 2004. Origins, establishment and evolution of two new polyploid species of Senecio in the British Isles. Biological Journal of the Linnean Society 82: 467–474.
Adams, E. N. 1972. Consensus techniques and the comparison of taxonomic trees. Systematic Zoology 21: 390–397.
Alice, L. A. and C. S. Campbell. 1999. Phylogeny of Rubus (Rosaceae) based on nuclear ribosomal DNA internal transcribed spacer region sequences. American Journal of Botany 86: 81–97.
Álvarez, I. and J. F. Wendel. 2003. Ribosomal ITS sequences and plant phylogenetic inference. Molecular Phylogenetics and Evolution 29: 417–434.
Appels, R. and J. Dvorak. 1982. The wheat ribosomal DNA spacer region: its structure and variation in populations and among species. Theoretical and Applied Genetics 63: 337–348.
Bailey, C. D., T. G. Carr, S. A. Harris, and C. E. Hughes. 2003. Characterization of angiosperm rDNA polymorphism, paralogy, and pseudogenes. Molecular Phylogenetics and Evolution 29: 435–455.
Baldwin, B. G., M. J. Sanderson, J. M. Porter, M. F. Wojciechowski, C. S. Campbell, and M. J. Donoghue. 1995. The ITS region of nuclear ribosomal DNA: A valuable source of evidence on angiosperm phylogeny. Annals of the Missouri Botanical Garden 82: 247–277.
Bennett, R. I. and A. G. Smith. 1991. Use of a genomic clone for ribosomal RNA from Brassica oleracea in RFLP analysis of Brassica species. Plant Molecular Biology 16: 685–688.
Buckler, E. S., A. Ippolito, and T. P. Holtsford. 1997. The evolution of ribosomal DNA: divergent paralogues and phylogenetic implications. Genetics 145: 821–832.
Burton, T. L. and B. C. Husband. 1999. Population cytotype structure in the polyploid Galax urceolata (Diapensiaceae). Heredity 82: 381–390.
Campbell, C. S., M. F. Wojciechowski, B. G. Baldwin, L. A. Alice, and M. J. Donoghue. 1997. Persistent nuclear ribosomal DNA sequence polymorphism in the Amelanchier agamic complex (Rosaceae). Molecular Biology and Evolution 14: 81–90.
Clevinger, J. A. and J. L. Panero. 2000. Phylogenetic analysis of Silphium and subtribe Engelmanniinae (Asteraceae: Heliantheae) based on ITS and ETS sequence data. American Journal of Botany 87: 565–572.
Doyle, J. J. 1995. The irrelevance of allele tree topologies for species delimitation, and a nontopological alternative. Systematic Botany 20: 574–588.
Doyle, J. J. and R. N. Beachy. 1985. Ribosomal gene variation in soybean (Glycine max) and its relatives. Theoretical and Applied Genetics 70: 369–376.
Doyle, J. J. and B. S. Gaut. 2000. Evolution of genes and taxa: a primer. Plant Molecular Biology 42: 1–23.
Doyle, J. J., J. L. Doyle, J. Rauscher, and A. H. D. Brown. 2004a. Evolution of the perennial soybean polyploid complex (Glycine subgenus Glycine): a study of contrasts. Biological Journal of the Linnean Society 82: 583–597.
Doyle, J. J., J. L. Doyle, J. Rauscher, and A. H. D. Brown. 2004b. Diploid and polyploid reticulate evolution throughout the history of the perennial soybeans (Glycine subgenus Glycine). The New Phytologist 161: 121–132.
Doyle, J. J., J. L. Doyle, A. H. D. Brown, and R. G. Palmer. 2002. Genomes, multiple origins, and lineage recombination in the Glycine tomentella (Leguminosae) polyploid complex: histone H3D gene sequences. Evolution; International Journal of Organic Evolution 56: 1388–1402.
Farris, J. S. 1989. The retention index and rescaled consistency index. Cladistics 5: 417–419.
Felsenstein, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution; International Journal of Organic Evolution 39: 783–791.
Fitch, W. M. 1970. Distinguishing homologous from analogous proteins. Systematic Zoology 19: 99–113.
Flavell, R. B. and M. O'Dell. 1976. Ribosomal RNA genes in homeologous chromosomes of groups 5 and 6 in hexaploid wheat. Heredity 37: 377–385.
Franzke, A. and K. Mummenhoff. 1999. Recent hybrid speciation in Cardamine (Brassicaceae) – conversion of nuclear ribosomal ITS sequences in statu nascendi. Theoretical and Applied Genetics 98: 831–834.
Grant, V. 1981. Plant speciation. New York: Columbia University Press.
Hershkovitz, M. A., E. A. Zimmer, and W. J. Hahn. 1999. Ribosomal DNA sequences and angiosperm systematics. Pp. 268–326 in Molecular systematics and plant evolution, eds. P. M. Hollingsworth, R. M. Bateman, and R. J. Gornall. London: Taylor and Francis.
Hughes, K. W. and R. H. Petersen. 2001. Apparent recombination or gene conversion in the ribosomal ITS region of a Flammulina (Fungi, Agaricales) hybrid. Molecular Biology and Evolution 18: 94–96.
Hull, D. L. 1979. The limits of cladism. Systematic Zoology 28: 416–440.
Husband, B. C. and D. W. Schemske. 1998. Cytotype distribution at a diploid-tetraploid contact zone in Chamerion (Epilobium) angustifolium (Onagraceae). American Journal of Botany 85: 1688–1694.
Johnson, L. A. and R. L. Johnson. 2006. Morphological delimitation and molecular evidence for allopolyploidy in Collomia wilkenii (Polemoniaceae), a new species from northern Nevada. Systematic Botany 31: 349–360.
Joly, S., J. T. Rauscher, S. L. Sherman-Broyles, A. H. D. Brown, and J. J. Doyle. 2004. Evolutionary dynamics and preferential expression of homoeologous 18S-5.8S-26S nuclear ribosomal genes in natural and artificial Glycine allopolyploids. Molecular Biology and Evolution 21: 1409–1421.
Kim, K. J. and R. K. Jansen. 1994. Comparisons of phylogenetic hypotheses among different data sets in dwarf dandelions (Krigia): additional information from internal transcribed spacer sequences of nuclear ribosomal DNA. Plant Systematics and Evolution 190: 157–159.
Kluge, A. and J. S. Farris. 1969. Quantitative phyletics and the evolution of anurans. Systematic Zoology 18: 1–32.
Kovarik, A., R. Matyasek, K. Y. Lim, K. Skalická, B. Koukalová, S. Knapp, M. Chase, and A. R. Leitch. 2004. Concerted evolution of 18-5.8-26S rDNA repeats in Nicotiana allotetraploids. Biological Journal of the Linnean Society 82: 615–625.
Kovarik, A., J. C. Pires, A. Leitch, K. Y. Lim, A. Sherwood, R. Matyasek, J. Rocca, D. E. Soltis, and P. S. Soltis. 2005. Rapid concerted evolution in two allopolyploids of recent and recurrent origin. Genetics 169: 931–944.
Kuzoff, R. K., D. E. Soltis, L. Hufford, and P. S. Soltis. 1999. Phylogenetic relationships within Lithophragma (Saxifragaceae): hybridization, allopolyploidy, and ovary diversification. Systematic Botany 24: 598–615.
Lee, J., B. G. Baldwin, and L. D. Gottlieb. 2002. Phylogeny of Stephanomeria and related genera (Compositae–Lactuceae) based on analysis of 18S-26S nuclear rDNA ITS and ETS sequences. American Journal of Botany 89: 160–168.
Lee, J., B. G. Baldwin, and L. D. Gottlieb. 2003. Phylogenetic relationship among the primary North American genera of Cichorieae (Compositae) based on analysis of 18S-26S nuclear rDNA ITS and ETS sequences. Systematic Botany 28: 616–626.
Leitch, I. J. and M. D. Bennett. 1997. Polyploidy in angiosperms. Trends in Plant Science 2: 470–476.
Lim, K. Y., A. Kovarik, R. Matyasek, M. Bezdek, C. P. Lichtenstein, and A. R. Leitch. 2000. Gene conversion of ribosomal DNA in Nicotiana tabacum is associated with undermethylated, decondensed and probably active gene units. Chromosoma 109: 161–172.
Markos, S. and B. G. Baldwin. 2001. Higher-level relationship and major lineages of Lessingia (Compositae, Astereae) based on nuclear rDNA internal and external transcribed spacer (ITS and ETS) sequences. Systematic Botany 26: 168–183.
Mavrodiev, E. V., M. Tancig, A. M. Sherwood, M. A. Gitzendanner, J. Rocca, P. S. Soltis, and D. E. Soltis. 2005. Phylogeny of Tragopogon L. (Asteraceae) based on internal and external transcribed spacer sequence data. International Journal of Plant Sciences 166: 117–133.
Mavrodiev, E. V., P. S. Soltis, and D. E. Soltis. Parentage of six Old World polyploids in Tragopogon L. (Asteraceae: Scorzonerinae) based on ITS, ETS, and plastid sequence data. Taxon (In press).
McDade, L. A. 1990. Hybrids and phylogenetic systematics I. Patterns of character expression in hybrids and their implications for cladistic analysis. Evolution; International Journal of Organic Evolution 44: 1685–1700.
McDade, L. A. 1992. Hybrids and phylogenetic systematics II. The impact of hybrids on cladistic analysis. Evolution; International Journal of Organic Evolution 46: 1329–1346.
Nazarova, E. A. 1991. Karyotypical evolution in genus Tragopogon L. (Lactuceae, Asteraceae). Pp. 116–134 in Proceedings of Academy of Science of Armenia: flora, vegetation and vegetable resources of Armenia, 13, ed. A. L. Takhtajan. Yerevan: Academy of Science of Armenia (In Russian with Armenian abstract).
Nieto Feliner, G. N., J. F. Aguilar, and J. A. Rosselló. 2001. Can extensive reticulation and concerted evolution result in a cladistically structured molecular data set? Cladistics 17: 301–312.
O'Kane, S. L., B. A. Schaal, and I. A. Al-Shehbaz. 1996. The origins of Arabidopsis suecica as indicated by nuclear rDNA sequences. Systematic Botany 21: 559–566.
Ownbey, M. 1950. Natural hybridization and amphiploidy in the genus Tragopogon. American Journal of Botany 37: 487–499.
Posada, D., K. A. Crandall, and E. C. Holmes. 2002. Recombination in evolutionary genomics. Annual Review of Genetics 36: 75–97.
Rauscher, J. T., J. J. Doyle, and A. H. D. Brown. 2002. Internal transcribed spacer repeat-specific primers and the analysis of hybridization in the Glycine tomentella (Leguminosae) polyploid complex. Molecular Ecology 11: 2691–2702.
Rauscher, J. T., J. J. Doyle, and A. H. D. Brown. 2004. Multiple origins and rDNA internal transcribed homoeolog evolution in the Glycine tomentella (Leguminosae) allopolyploid complex. Genetics 166: 987–998.
Saar, D. E., N. O. Polans, and P. D. Sorensen. 2003. A phylogenetic analysis of the genus Dahlia (Asteraceae) based on internal and external transcribed spacer regions of nuclear ribosomal DNA. Systematic Botany 28: 627–639.
Sanderson, M. J. and J. J. Doyle. 1992. Reconstruction of organismal phylogenies from multigene families: paralogy, concerted evolution, and homoplasy. Systematic Biology 41: 4–17.
Sang, T., D. J. Crawford, and T. F. Stuessy. 1995. Documentation of reticulate evolution in peonies (Paeonia) using internal transcribed spacer sequences of nuclear ribosomal DNA: implications for biogeography and concerted evolution. Proceedings of the National Academy of Sciences of the United States of America 92: 6813–6817.
Sang, T., D. J. Crawford, and T. F. Stuessy. 1997. Chloroplast DNA phylogeny, reticulate evolution, and biogeography of Paeonia (Paeoniaceae). American Journal of Botany 84: 1120–1136.
Schaeffer, S. W. and E. L. Miller. 1992. Estimates of gene flow in Drosophila pseudoobscura determined from nucleotide sequence analysis of the alcohol dehydrogenase region. Genetics 132: 471–480.
Segraves, K. A., J. N. Thompson, P. S. Soltis, and D. E. Soltis. 1999. Multiple origins of polyploidy and the geographic structure of Heuchera grossulariifolia. Molecular Ecology 8: 253–262.
Soltis, D. E. and P. S. Soltis. 1998. Choosing an approach and appropriate gene for phylogenetic analysis. Pp. 1–42 in Molecular systematics of plants II. DNA sequencing, eds. D. E. Soltis, P. S. Soltis, and J. J. Doyle. Boston: Kluwer Academic Publishers.
Soltis, D. E., P. S. Soltis, J. C. Pires, A. Kovarik, J. Tate, and E. Mavrodiev. 2004. Recent and recurrent polyploidy in Tragopogon (Asteraceae): genetic, genomic, and cytogenetic comparisons. Biological Journal of the Linnean Society 82: 485–501.
Soltis, P. S., G. M. Plunkett, S. J. Novak, and D. E. Soltis. 1995. Genetic variation in Tragopogon species: additional origins of the allotetraploids T. mirus and T. miscellus (Compositae). American Journal of Botany 82: 1329–1341.
Swofford, D. L. 2002. PAUP* Phylogenetic Analysis Using Parsimony (*And Other Methods). Version 4.10b. Sunderland: Sinauer Associates.
Urbatsch, L. E., R. P. Roberts, and V. Karaman. 2003. Phylogenetic evaluation of Xylothamia, Gundlachia, and related genera (Asteraceae, Astereae) based on ETS and ITS rDNA sequence data. American Journal of Botany 90: 634–649.
Volkov, R. A., N. V. Borisjuk, I. I. Panchuk, D. Schweizer, and V. Hemleben. 1999. Elimination and rearrangement of parental rDNA in the allotetraploid Nicotiana tabacum. Molecular Biology and Evolution 16: 311–320.
Wendel, J. F., A. Schnabel, and T. Seelanan. 1995a. Bidirectional interlocus concerted evolution following allopolyploid speciation in cotton (Gossypium). Proceedings of the National Academy of Sciences of the United States of America 92: 280–284.
Wendel, J. F., A. Schnabel, and T. Seelanan. 1995b. An unusual ribosomal DNA sequence from Gossypium gossypioides reveals ancient, cryptic, intergenomic introgression. Molecular Phylogenetics and Evolution 4: 298–313.
Wilkinson, M. 1994. Common cladistic information and its consensus representation: reduced Adams and reduced cladistic consensus trees and profiles. Systematic Biology 43: 343–368.
Zhang, D. and T. Sang. 1999. Physical mapping of ribosomal RNA genes in peonies (Paeonia, Paeoniaceae) by fluorescent in situ hybridization: implications for phylogeny and concerted evolution. American Journal of Botany 86: 735–740.
Zimmer, E. A., S. L. Martin, S. M. Beverley, Y. W. Kan, and A. C. Wilson. 1980. Rapid duplication and loss of genes coding for the α chains of hemoglobin. Proceedings of the National Academy of Sciences of the United States of America 77: 2158–2162.
... They present several advantages for exploring the origin and evolution of nGMO species. To begin with, cT-DNAs are well-defined, highly specific, and well Inconsistencies can be explained by the difficult interpretation of the results of phylogenetic analysis based on ITS and matK, in taxon, where hybridization and polyploidization played a significant evolutionary role [33]. ...
Article
Full-text available
A variety of plant species found in nature contain agrobacterial T-DNAs in their genomes which they transmit in a series of sexual generations. Such T-DNAs are called cellular T-DNAs (cT-DNAs). cT-DNAs have been discovered in dozens of plant genera, and are suggested to be used in phylogenetic studies, since they are well-defined and unrelated to other plant sequences. Their integration into a particular chromosomal site indicates a founder event and a clear start of a new clade. cT-DNA inserts do not disseminate in the genome after insertion. They can be large and old enough to generate a range of variants, thereby allowing the construction of detailed trees. Unusual cT-DNAs (containing the rolB/C-like gene) were found in our previous study in the genome data of two Vaccinium L. species. Here, we present a deeper study of these sequences in Vaccinium L. Molecular-genetic and bioinformatics methods were applied for sequencing, assembly, and analysis of the rolB/C-like gene. The rolB/C-like gene was discovered in 26 new Vaccinium species and Agapetes serpens (Wight) Sleumer. Most samples were found to contain full-size genes. It allowed us to develop approaches for the phasing of cT-DNA alleles and reconstruct a Vaccinium phylogenetic relationship. Intra- and interspecific polymorphism found in cT-DNA makes it possible to use it for phylogenetic and phylogeographic studies of the Vaccinium genus.
... In this study, we reconstructed the phylogeny of Eragrostis using plastid data and ITS regions based on 66 representative samples from the worldwide distributions. ITS region is often used to trace the origin and evolution of allopolyploids at low taxonomic level (Soltis et al., 2008), and plastid data can also be employed to determine the maternal parents of E. tef assuming their maternally inherited attribution in grasses (Mogensen, 1988). We further estimated genome sizes of Eragrostis 66 samples based on flow cytometric experiments, including previously identified taxa (E. ...
Article
Full-text available
Background Biologists have long debated the drivers of the genome size evolution and variation ever since Darwin. Assumptions for the adaptive or maladaptive consequences of the associations between genome sizes and environmental factors have been proposed, but the significance of these hypotheses remains controversial. Eragrostis is a large genus in the grass family and is often used as crop or forage during the dry seasons. The wide range and complex ploidy levels make Eragrostis an excellent model for investigating how the genome size variation and evolution is associated with environmental factors and how these changes can ben interpreted. Methods We reconstructed the Eragrostis phylogeny and estimated genome sizes through flow cytometric analyses. Phylogenetic comparative analyses were performed to explore how genome size variation and evolution is related to their climatic niches and geographical ranges. The genome size evolution and environmental factors were examined using different models to study the phylogenetic signal, mode and tempo throughout evolutionary history. Results Our results support the monophyly of Eragrostis. The genome sizes in Eragrostis ranged from ~0.66 pg to ~3.80 pg. We found that a moderate phylogenetic conservatism existed in terms of the genome sizes but was absent from environmental factors. In addition, phylogeny-based associations revealed close correlations between genome sizes and precipitation-related variables, indicating that the genome size variation mainly caused by polyploidization may have evolved as an adaptation to various environments in the genus Eragrostis. Conclusion This is the first study to take a global perspective on the genome size variation and evolution in the genus Eragrostis. Our results suggest that the adaptation and conservatism are manifested in the genome size variation, allowing the arid species of Eragrostis to spread the xeric area throughout the world.
... Allopolyploids are known to bias ordinary tree reconstructions due to reticulate evolution and mingling of different evolutionary histories in consensus sequences Dauphin et al., 2018;. Nevertheless, allopolyploids resulting from crosses of less diverged progenitor species are expected to not tremendously impact the reconstructed tree phylogeny Soltis et al., 2008), and present DNA sequence subgenome dominance is probably also Nuclear-plastid (i.e., cytonuclear) discordance is a strong signal for hybridization events, as demonstrated at the diploid but also at the mixed-ploidy level Dauphin et al., 2018;Carter et al., 2019;Stull et al., 2020), and as already explained herein for the probably homoploid R. flabellifolius and allopolyploid R. marsicus and their respective polyploid apomicts (chapters 1, 5: Karbstein et al., 2020bKarbstein et al., , 2021b. Interestingly, Hodel et al. (2021) also investigated a well-resolved nuclear-gene tree backbone despite cytonuclear discordance in the cherry and plum genus (Prunus), which is shaped by past hybridization and allopolyploidization events. ...
Thesis
--- Thesis available on: https://ediss.uni-goettingen.de/handle/21.11130/00-1735-0000-0008-5946-6 --- Polyploidy, the presence of two or more full genomic complements, repeatedly occurs across the tree of life. In plants, not only the economic but particularly the evolutionary importance is overwhelming. Polyploidization events, probably connected to key innovations (e.g., vessel elements or the carpel), occurred frequently in the evolutionary history of flowering plants, which are the most species-rich group in the plant kingdom (ca. 370,000 species) and contain 30–70% neopolyploids. Polyploidy and hybridization (i.e., allopolyploidy) are particularly considered to create biotypes with novel genomic compositions and to be key factors for subsequent speciation and macroevolution. In plants, both processes are frequently connected to apomixis, i.e., the reproduction via asexually-formed seeds. However, the enigmatic phenomenon of plant speciation accompanied by polyploidy and apomixis is still poorly understood despite tremendous progress in the field of genomics. The question of “What is a species?” is of highest priority for evolutionary biologists: Species are the fundamental units for biodiversity, and further evolutionary and ecological research relies on well-defined entities. Evolutionarily young plant species complexes offer a unique opportunity to study plant speciation and accompanying processes. They usually comprise a few sexual progenitor species, and numerous polyploid, partly apomictic, hybrid derivatives. In apomictic lineages, the lack of recombination and cross-fertilization can result in numerous clonal lineages with fixed morphological and ecological traits (agamospecies). Nevertheless, even recognizing and delimiting the sexual progenitors of species complexes is methodically challenging due to low genetic divergence, possible hybrid origins, ongoing gene flow, and/or incomplete lineage sorting (ILS). 
Integrative approaches using both genomic and morphometric data for disentangling the young progenitors are still lacking so far. The biogeography and evolution of those plant complexes is even more challenging. Apomicts frequently occupy larger areas or more northern regions compared to their sexual relatives, a phenomenon called geographical parthenogenesis (GP). GP patterns usually have a Pleistocene context because climatic range shifts in temperate to boreal zones offered frequent opportunities for interspecific hybridization, probably giving rise to apomixis in the Northern Hemisphere. Factors shaping GP patterns are still controversially discussed. GP has been widely attributed to advantages of apomicts caused by polyploidy and uniparental reproduction, i.e., fixed levels of high heterozygosity leading to increased stress tolerance, and self-fertility leading to better colonizing capabilities. On the one hand, complex interactions of genome-wide heterozygosity, ploidy, reproduction mode (sexual versus asexual), and climatic environmental factors shaping GP have not been studied enough. On the other hand, potential disadvantages of sexual progenitors due to their breeding system on fitness and genetic diversity have received even less attention. Finally, alongside biogeography, the reticulate relationships and genome composition and evolution of young, large polyploid plant species complexes have not yet been deciphered comprehensively. Besides challenges attributed to numerous numbers of polyploidization and hybridization events, bioinformatic analyses are also often hampered by missing information on progenitors, ploidy levels, and reproduction modes. The European apomictic polyploid Ranunculus auricomus (goldilock buttercup) plant complex is well-suited to study all the aforementioned issues. 
The majority of goldilock buttercups probably arose from hybridization of a few sexual progenitors, leading to more than 800 described, morphologically highly diverse agamospecies. Sexuals are estimated to have speciated less than 1.0 million years ago, and agamospecies are probably much younger. In this thesis, using R. auricomus as a model system, I examined the recalcitrant and hitherto poorly understood phylogenetic, genomic, and biogeographical relationships of young polyploid apomictic plant complexes. I developed a comprehensive theoretical and bioinformatic workflow, starting with analyzing the evolution of the sexual progenitor species, continuing with unraveling reproduction modes and biogeography of apomictic polyploids, and ending up with revealing the reticulate origins and genome composition and evolution of the polyploid complex. Spanning up to 251 populations and 87 R. auricomus taxa Europe-wide, this work gathered data of 97,312 genomic loci (RADseq), 663 nuclear genes (target enrichment), and 71 plastid regions, and 1,474 leaf ploidy, 4,669 reproductive seed, 284 reproductive crossing (seed sets), as well as 1,593 geometric morphometric measurements. First of all, phylogenomics based on RADseq, nuclear gene, and geometric morphometric data supported the lumping of the twelve described sexual morphospecies into five newly circumscribed progenitor species. These species represent clearly distinguishable genetic main lineages or clusters, which are both well geographically isolated and morphologically differentiated: R. cassubicifolius s.l., R. envalirensis s.l., R. flabellifolius, R. marsicus, and R. notabilis s.l. Mainly within-clade reticulate relationships, missing geographical isolation, and a lack of distinctive morphological characters led to this taxonomic treatment. Interestingly, allopatric speciation events took place ca. 
0.83–0.58 million years ago during a period of severe climatic oscillations, and were probably triggered by vicariance processes of a widespread European forest-understory ancestor. Sexual species re-circumscriptions were additionally supported by population crossing experiments. Besides inbreeding depression, outbreeding benefits, and sudden self-compatibility, crossings also revealed a lack of reproductive barriers among some of the formerly described morphospecies. Moreover, flow cytometric ploidy and reproductive, RADseq, and environmental data were combined into a genetically informed path analysis based on Generalized Linear Mixed Models (GLMMs). The analysis unveiled a complex European GP scenario, whereby diploids compared to polyploids showed significantly higher sexuality (percent of sexual seeds), more petals (petaloid nectary leaves), and up to three times less genome-wide heterozygosity. Surprisingly, sexuality was positively associated with solar radiation and isothermality, and heterozygosity was positively related to temperature seasonality. Results fit the southern distribution of diploid sexuals and suggest a higher resistance of polyploid apomicts to more extreme climatic conditions. Finally, a self-developed, multidisciplinary workflow incorporating all previously gathered data demonstrated, for the first time, the predominantly allopolyploid origin, genome composition, and post-origin genome evolution of the R. auricomus complex. Taxa were organized in only three to five supported, north-south distributed clades or cluster, each usually containing diploid sexual progenitor species. Allopolyploidizations involved two to three different diploid sexual subgenomes per event. Only one autotetraploid event was detected. Allotetraploids were characterized by subgenome dominance and enormous post-origin evolution, i.e., Mendelian segregation of hybrid generations, back-crossing to parents, and/or gene flow due to facultative sexuality of apomicts. 
Four diploid sexual progenitors and a previously unknown, nowadays extinct progenitor, probably gave rise to the more than 800 taxa of the European R. auricomus complex. Analyses also showed that the majority of analyzed polyploid agamospecies are non-monophyletic and similar morphotypes probably originated multiple times. The lack of monophyly suggests a comprehensive taxonomic revision of the entire complex. In the General Discussion, I combine my thesis results with existing plant studies on diploid sexual and polyploid apomictic phylogenetics, biogeography, and composition and genome evolution of young species complexes. I explain the taxonomic conclusions and how species complexes link micro- and macroevolutionary processes. Finally, I give conclusions of my thesis and an outlook of the project and the field of polyploid phylogenetics.
Article
Full-text available
Tragopogon (Asteraceae) includes two recently and repeatedly formed allopolyploids, T. mirus and T. miscellus, both of which formed in western North America following the human-mediated introduction of three diploids from Europe: T. dubius, T. porrifolius, and T. pratensis. We recently investigated the genetics of the introduction history to North America of T. dubius, the shared parent of both allopolyploids. Here, we investigate the introduction of T. pratensis into North America, the second diploid parent of T. miscellus. Using ITS sequence data, we found that T. pratensis as currently defined in the narrow sense is polyphyletic and comprises at least four different major ITS types in its native range. Of these native range ITS patterns, two have been introduced from Europe into North America and now occur widely across Canada and the U.S.A. Although the allotetraploid T. miscellus formed multiple times in western North America, only one of these ITS types was involved in the recurrent formations. These results for T. pratensis parallel our findings for T. dubius and further suggest that not all genotypes of these two species may be able to participate in the formation of allopolyploids. Our phylogenetic analyses reveal that several entities traditionally considered part of T. pratensis in the narrow sense are genetically distinct and mark unique lineages that may ultimately merit recognition as separate species. This proclivity for genetically distinct entities (potential cryptic species) within species recognized based on morphology appears common in Tragopogon. To unravel the complexities of what is referred to as “T. pratensis”, more intensive phylogenetic analyses involving many more samples from across the geographic range of the species are required, as are detailed assessments of taxonomy, morphology, and cytology.
Article
Full-text available
Hybridization and polyploidy are key evolutionary forces in plant diversification, and their co-occurrence in the context of allopolyploid speciation is often associated with increased ability to colonize new environments and invasiveness. In the genus Ulex (Fabaceae), the European gorse (Ulex europaeus subsp. europaeus) is the only invasive and the only polyploid that has recently spread in different eco-geographical regions across the world. Understanding what confers such ecological advantages to this species, compared to its diploid and polyploid congeners, first requires clarification of the ecogeographical and evolutionary context of its formation. To achieve this, the geographical distributions of all Ulex spp. were estimated from species occurrence records, and phylogenetic analyses including all Ulex spp. were performed based on four nuclear (ITS and ETS nrDNA) and plastid (rps12 intron and trnK-matK) regions. The resulting trees were dated using a secondary calibration. Patterns of DNA sequence variation and dated phylogenetic trees were then interpreted in light of previous knowledge of chromosome numbers in Ulex to infer past events of polyploid speciation in the genus. We show that: (1) most current Ulex spp. radiated in the Iberian Peninsula during the past 1–2 Myr; (2) the history of Ulex was punctuated by multiple whole-genome duplication events; and (3) U. europaeus subsp. europaeus is the only gorse taxon that was formed by hybridization of two well-differentiated lineages (which separated c. 5 Mya) with wide climatic ranges (currently represented by Ulex minor and Ulex europaeus subsp. latebracteatus), possibly contributing to the invasive nature and wider climatic range of U. europaeus subsp. europaeus. These findings provide a much-needed evolutionary framework in which to explore the adaptive consequences of genome mergers and duplication in Ulex.
Article
The genus Vaccinium includes almost 500 species, among them the economically important cranberries (V. macrocarpon Ait. and V. oxycoccos L.), lingonberry (V. vitis-idaea L.), bilberry (V. myrtillus L.), and blueberries (V. uliginosum L., V. angustifolium Ait., V. corymbosum L., and V. virgatum Ait.). Although many of these species have long been used by humans as medicine and food, their active selection began only in the 20th century, and a classification of the genus based on morphological characters was developed in connection with that work. Many of these data remain relevant today. The development of molecular phylogenetics prompted a revision of the old classification and revealed a number of difficulties that prevent phylogenetic relationships within the genus from being determined unambiguously. Today the genus includes 33 sections, but the species composition of the sections and the evolutionary relationships among them remain controversial. This review discusses approaches to studying the structure of the genus Vaccinium, from classical to phylogenomic, along with the main results obtained with these approaches and their prospects.
Article
This work represents a morphological and molecular study of Salicornia and Sarcocornia species growing in the southern dryland of Tunisia. Internal transcribed spacers of the rDNA (ITS) data of six specimens from seven locations are analyzed. Flowers and seeds of Sarcocornia and Salicornia specimens are also compared. The results confirm the presence of Sarcocornia fruticosa (L.) A.J. Scott and two newly recorded species (Sarcocornia alpini (Lag.) Rivas Mart. and Salicornia emerici Duval-Jouve) in Tunisia. Flowers and seeds can be used to discriminate between the different specimens. Sarcocornia flowers have horizontal arrangement while Salicornia ones have triangular arrangement. The rounded and black seeds of S. fruticosa are the biggest. S. emerici seeds are light brown and elongated while those of S. alpini are flattened and dark brown.
Article
Tragopogon (Asteraceae) is an evolutionary model for the study of whole‐genome duplication, with two recently and repeatedly formed allopolyploids, T. mirus and T. miscellus, and many additional polyploid species. Tragopogon mirus and T. miscellus formed in western North America following the introduction of three diploids from Europe: T. dubius, T. porrifolius, and T. pratensis. Of these diploids, T. dubius is a shared parent of both tetraploids and is broadly defined and widely distributed in Eurasia. Because human‐mediated intercontinental introductions may lead to hybridization with local species, and associated polyploidization, the introduction history of T. dubius from Europe to North America provides further opportunity to investigate both the extent and consequences of plant introductions. Using ITS sequence data, we show that the morphologically diverse, broadly defined T. dubius comprises a complex of at least 10 different ITS types in its native range, six of which have been introduced from Europe into North America. Significantly, although the two allotetraploid species have each formed multiple times on geographical scales from local to regional, recurrent formation is the result of repeated hybridization involving only one of these ITS subtypes. These results reinforce earlier data suggesting that not all diploid genotypes can form allopolyploids. Several entities traditionally considered part of T. dubius s.l. are now recognized as distinct species (e.g., T. lainzii), and it is likely that other distinct ITS genotypes identified here may also mark unique lineages that ultimately merit recognition as separate species. However, more intensive phylogenetic analyses involving many more samples from across the geographic range of T. dubius are required, as are detailed assessments of taxonomy, morphology, and cytology.
Article
Oresitrophe and Mukdenia (Saxifragaceae) are epilithic sister genera used in traditional Chinese medicine. The taxonomy of Mukdenia, especially of M. acanthifolia, has been controversial. To address this, we produced plastid and mitochondrial data using genome skimming for M. acanthifolia and M. rossii, including three individuals of each species. We assembled complete plastomes, mitochondrial CDS and nuclear ribosomal ETS/ITS sequences using these data. Comparative analysis shows that the plastomes of Mukdenia and Oresitrophe are relatively conservative in terms of genome size, structure, gene content, RNA editing sites and codon usage. Five plastid regions that represent hotspots of change (trnH‐psbA, psbC‐trnS, trnM‐atpE, petA‐psbJ and ccsA‐ndhD) are identified within Mukdenia, and six regions (trnH‐psbA, petN‐psbM, trnM‐atpE, rps16‐trnQ, ycf1 and ndhF) contain a higher number of species‐specific parsimony‐informative sites that may serve as potential DNA barcodes for species identification. To infer phylogenetic relationships between Mukdenia and Oresitrophe, we combined our data with published data based on three different datasets. The monophyly of each species (O. rupifraga, M. acanthifolia and M. rossii) and the inferred topology ((M. rossii, M. acanthifolia), O. rupifraga) are well supported in trees reconstructed using the complete plastome sequences, but M. acanthifolia and M. rossii did not form a separate clade in the trees based on ETS+ITS data, while the mitochondrial CDS trees are not well‐resolved. We found low recovery of genes in the Angiosperms353 target enrichment panel from our unenriched genome skimming data. Hybridization or incomplete lineage sorting may be the cause of discordance between trees reconstructed from organellar and nuclear data. Considering its morphological distinctiveness and our molecular phylogenetic results, we strongly recommend that M. acanthifolia be treated as a distinct species. 
Article
Lineage recombination is an important source of genetic and morphological variation in species-rich groups of plants. Tetraploids that are intermediate in morphology and ecology with respect to sympatric diploids are regularly hypothesized to be the products of hybridization. Arctostaphylos mewukka is one such intermediate tetraploid long regarded as the result of hybridization and genome duplication among divergent and geographically overlapping diploids widely distributed across the western slope of the Sierra Nevada. Here we set out to test this hypothesis leveraging the notion that allopolyploids arise repetitively and may show signs of reciprocal organellar exchange among species between maternal and paternal progenitors. We compared nuclear ribosomal and plastid sequence data acquired from samples within and outside this target species complex. Molecular sequence data show striking patterns indicative of widespread reticulation and chloroplast capture events across the genus Arctostaphylos. Results support the notion that outcrossing, long-lived woody plant species such as members of the genus Arctostaphylos can retain a secured morphological identity despite ongoing influence of interspecific gene flow that would otherwise render species boundaries vulnerable to dissolution.
Article
Origin and rearrangement of ribosomal DNA repeats in natural allotetraploid Nicotiana tabacum are described. Comparative sequence analysis of the intergenic spacer (IGS) regions of Nicotiana tomentosiformis (the paternal diploid progenitor) and Nicotiana sylvestris (the maternal diploid progenitor) showed species-specific molecular features. These markers allowed us to trace the molecular evolution of parental rDNA in the allopolyploid genome of N. tabacum; at least the majority of tobacco rDNA repeats originated from N. tomentosiformis, which endured reconstruction of subrepeated regions in the IGS. We infer that after hybridization of the parental diploid species, rDNA with a longer IGS, donated by N. tomentosiformis, dominated over the rDNA with a shorter IGS from N. sylvestris; the latter was then eliminated from the allopolyploid genome. Thus, repeated sequences in allopolyploid genomes are targets for molecular rearrangement, demonstrating the dynamic nature of allopolyploid genomes.
Article
The apparent recency of diversification of Californian Lessingia (Compositae, Astereae) makes the genus a particularly interesting group for evolutionary investigation. Here we focus on the major evolutionary lineages within Lessingia (sensu Lane 1992) and the higher-level relationships of the genus and presumed close relatives using sequence data from the 18S–26S nuclear ribosomal DNA (nrDNA) internal transcribed spacer (ITS) region and the 3′ end (561–563 bp) of the external transcribed spacer (ETS). We present new 3′ ETS primers that are useful across Astereae and examine the phylogenetic utility of the 3′ ETS in Lessingia and close relatives. In Lessingia, the 3′ ETS region appears to have evolved up to 1.4 times more rapidly by nucleotide substitution than has the ITS region. Our results show that data from the ETS greatly augment data from the ITS region; the combined data set yields the best resolved and best supported molecular trees for Lessingia. These topologies lead us to five conclusions regarding the phylogenetic relationships of Lessingia (sensu Lane 1992) and closely related genera: (1) Lessingia may not be monophyletic when Benitoa is included within the genus, (2) among the taxa sampled, Benitoa and Hazardia appear to be the closest living relatives of Lessingia s. s. and L. filaginifolia (= Corethrogyne), (3) a sister group relationship exists between the radiate, perennial L. filaginifolia (= Corethrogyne) and the discoid, annual members of the genus (Lessingia s. s.), (4) different corolla coloration (pink/white vs. yellow) diagnoses the two major clades of Lessingia s. s., and (5) the “yellow group” of lessingias comprises three distinct, morphologically diagnosable lineages which span the currently accepted circumscriptions of two taxa (L. lemmonii and L. glandulifera). Communicating Editor: Kathleen A. Kron
Article
Phylogenetic analyses of internal transcribed spacer (ITS), external transcribed spacer (ETS), and 5.8S gene sequences of 18S–26S nuclear rDNA from all 23 genera of Cichorieae with centers of diversity in North America (and Picrosia from South America) show that all but three of the genera (Glyptopleura, Krigia, and Phalacroseris) belong to a series of seven clades that are well supported by bootstrap values >90%. Phalacroseris, endemic to California, with a single species (P. bolanderi), is sister to a well-supported (>95% bootstrap) clade that includes all other principally North American genera (plus Picrosia). The seven clades with major support and their component genera are: 1) the Lygodesmia Clade: Chaetadelpha, Lygodesmia, and Shinnersoseris; 2) the Pinaropappus Clade: Marshalljohnstonia and Pinaropappus; 3) the Pyrrhopappus Clade: Picrosia and Pyrrhopappus; 4) the Microseris Clade: Agoseris, Microseris, Nothocalais, Stebbinsoseris, and Uropappus; 5) the Stephanomeria Clade: Munzothamnus, Pleiacanthus, Prenanthella, Rafinesquia, and Stephanomeria; 6) the Malacothrix 1 Clade: Atrichoseris and various species of Malacothrix; and 7) the Malacothrix 2 Clade: Anisocoma, Calycoseris, and various other species of Malacothrix. The rDNA sequence data provide < 80% bootstrap support for other, larger groups that combine two or more of the seven major clades, except for one uniting all 24 ingroup genera and one uniting the Lygodesmia Clade and Pyrrhopappus Clade. The present analysis shows that Malacothrix, a genus of 22 species, is not monophyletic. None of the clades corresponds precisely to a suprageneric taxon of Cichorieae proposed previously, although taxa constituting each clade belong to a common subtribe or subgroup in classifications by Bremer, Jeffrey, and Stebbins, with two to three exceptions. As a group, the 24 genera represent a single, major radiation of Cichorieae based in North America.
Article
Relationships among the various diploid and polyploid taxa that comprise Glycine tomentella have been hypothesized from crossing studies, isozyme data, and repeat length variation for the 5S nuclear ribosomal gene loci. However, several key questions have persisted, and detailed phylogenetic evidence from homoeologous nuclear genes has been lacking. The histone H3-D locus is single copy in diploid Glycine species and has been used to elucidate relationships among diploid races of G. tomentella, providing a framework for testing genome origins in the polyploid complex. For all six G. tomentella polyploid races (T1–T6), alleles at two homoeologous histone H3-D loci were isolated and analyzed phylogenetically with alleles from diploid Glycine species, permitting the identification of all of the homoeologous genomes of the complex. Allele networks were constructed to subdivide groups of homoeologous alleles further, and two-locus genotypes were constructed using these allele classes. Results suggest that some races have more than one origin and that interfertility within races has led to lineage recombination. Most alleles in polyploids are identical or closely related to alleles in diploids, suggesting recency of polyploid origins and spread beyond Australia. These features parallel the other component of the Glycine subgenus Glycine polyploid complex, G. tabacina, one of whose races shares a diploid genome with a G. tomentella polyploid race.
Article
Genetic diversity in the introduced diploids Tragopogon dubius, T. porrifolius, and T. pratensis and their neoallotetraploid derivatives T. mirus and T. miscellus was estimated to assess the numbers of recurrent, independent origins of the two tetraploid species in the Palouse region of eastern Washington and adjacent Idaho. These tetraploid species arose in this region, probably within the past 50–60 yr, and provide one of the best models for the study of polyploidy in plants. The parental species of both T. mirus and T. miscellus have been well documented, and each tetraploid species has apparently formed multiple times. However, a recent survey of the distributions of these allotetraploids revealed that both tetraploid species have expanded their ranges considerably during the past 50 yr, and several new populations of each species were discovered. Therefore, to evaluate the possibility that these recently discovered populations are of recent independent origin, a broad analysis of genetic diversity in T. mirus, T. miscellus, and their diploid progenitors was conducted. Analyses of allozymic and DNA restriction site variation in all known populations of T. mirus and T. miscellus in the Palouse and several populations of each parental diploid species revealed several distinct genotypes in each tetraploid species. Four isozymic multilocus genotypes were observed in T. mirus, and seven were detected in T. miscellus. Tragopogon mirus possesses a single chloroplast genome, that of T. porrifolius, and two distinct repeat types of the 18S-26S ribosomal RNA genes. Populations of T. miscellus from Pullman, Washington, have the chloroplast genome of T. dubius; all other populations of T. miscellus have the chloroplast DNA of T. pratensis. All populations of T. miscellus combine the ribosomal RNA repeat types of T. dubius and T. pratensis, as demonstrated previously. When all current and previously published data are considered, both T. mirus and T. miscellus appear to have formed numerous times even within the small geographic confines of the Palouse, with estimates of five to nine and two to 21 independent origins, respectively. Such recurrent polyploidization appears to characterize most polyploid plant species investigated to date (although this number is small) and may contribute to the genetic diversity and ultimate success of polyploid species.
Article
I examined three aspects of the cladistic treatment of a set of 17 F1 hybrids of known parental origin: (1) impact of hybrids on consistency index (CI) and number of most parsimonious trees (Trees), (2) placement of hybrids in cladograms, and (3) impact of hybrids on hypotheses of relationship among species. The hybrids were added singly and in randomly selected sets of two to five to a data set composed of Central American species of Aphelandra (including the parents of all hybrids). Compared to analyses with the same number of OTUs all of which were species, the analyses with hybrids yielded results with significantly higher CI. There was no difference in Trees between analyses with hybrids versus species. There was thus no evidence that hybrids would appear to be more problematic for cladistic methods than species. Accordingly, hybrids will not be readily identifiable as taxa that cause marked change in these indices. About % of the hybrids were placed as the cladistically basal members of the lineage that included the most apomorphic parent. Relatively apomorphic hybrids were placed proximate to the most derived parent (ca. 13% of hybrids). Other placements occurred more rarely. The most frequent placements of hybrids thus did not distinguish them from normal intermediate or apomorphic taxa. When analyses with hybrids yielded multiple most parsimonious trees, these were no more different from each other than were the equally parsimonious trees that resulted from analyses with species. Most analyses with one or two hybrids resulted in minor or no change in topology. When hybrids caused topological change, they frequently caused rearrangements of weakly supported portions of the cladogram that did not include their parents. When they disrupted the cladistic placement of their parents, they often caused their parents to change positions, with at least one topology bringing the parental lineages into closer proximity with the hybrid placed between them. 
Hybrids between parents from the two main lineages of the group caused total cladistic restructuring. In fact, the degree of relationship between a hybrid's parents (measured by both cladistic and patristic distance) was strongly correlated with CI (negatively) and with the degree of disturbance to cladistic relationships (positively). Thus, hybrids between distantly related parents resulted in cladograms with low CI and major topological changes. This study suggests that hybrids are unlikely to cause breakdown of cladistic structure unless they are between distantly related parents. However, these results also indicate that cladistics may not be specially useful in distinguishing hybrids from normal taxa. The applicability of these results to other kinds of hybrids is examined and the likely cladistic treatment of hybrids using other sources of data is discussed.
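The consistency index (CI) used throughout the abstract above can be illustrated with a short calculation. The sketch below assumes the standard ensemble definition (the minimum possible number of character-state changes summed over characters, divided by the number of changes observed on the tree); the abstract itself does not give the formula, and the function name and step counts are hypothetical values chosen only for illustration.

```python
def consistency_index(min_steps, observed_steps):
    """Ensemble consistency index: sum of minimum steps / sum of observed steps.

    min_steps[i]      -- fewest changes character i could require on any tree
    observed_steps[i] -- changes character i actually requires on the tree at hand
    Both lists are hypothetical inputs; a real analysis would obtain them
    from a parsimony program.
    """
    if len(min_steps) != len(observed_steps):
        raise ValueError("one entry per character is required in both lists")
    total_observed = sum(observed_steps)
    if total_observed == 0:
        raise ValueError("the tree must require at least one step")
    return sum(min_steps) / total_observed


# Toy example: four characters, two of which show extra (homoplastic) steps.
toy_min = [1, 1, 2, 1]
toy_observed = [1, 2, 3, 1]
toy_ci = consistency_index(toy_min, toy_observed)
```

A CI of 1 means no homoplasy on the tree; lower values mean more extra steps, which is why the abstract reports that hybrids between distantly related parents are associated with low CI.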
Article
The recently-developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies. It involves resampling points from one's own data, with replacement, to create a series of bootstrap samples of the same size as the original data. Each of these is analyzed, and the variation among the resulting estimates taken to indicate the size of the error involved in making estimates from the original data. In the case of phylogenies, it is argued that the proper method of resampling is to keep all of the original species while sampling characters with replacement, under the assumption that the characters have been independently drawn by the systematist and have evolved independently. Majority-rule consensus trees can be used to construct a phylogeny showing all of the inferred monophyletic groups that occurred in a majority of the bootstrap samples. If a group shows up 95% of the time or more, the evidence for it is taken to be statistically significant. Existing computer programs can be used to analyze different bootstrap samples by using weights on the characters, the weight of a character being how many times it was drawn in bootstrap sampling. When all characters are perfectly compatible, as envisioned by Hennig, bootstrap sampling becomes unnecessary; the bootstrap method would show significant evidence for a group if it is defined by three or more characters.
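The resampling step described in this abstract (keep every taxon, sample characters with replacement, and express each replicate as per-character weights) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the function names and the toy alignment are hypothetical, and the tree-building and majority-rule consensus steps are left out.

```python
import random
from collections import Counter


def bootstrap_weights(n_chars, rng):
    """Draw n_chars character indices with replacement.

    Returns a list in which entry i is the number of times character i
    was drawn, i.e. its weight in this bootstrap replicate.
    """
    draws = Counter(rng.randrange(n_chars) for _ in range(n_chars))
    return [draws.get(i, 0) for i in range(n_chars)]


def resample_alignment(alignment, weights):
    """Build one pseudo-replicate: all taxa kept, column i repeated weights[i] times."""
    return {
        taxon: "".join(state * w for state, w in zip(seq, weights))
        for taxon, seq in alignment.items()
    }


# Hypothetical toy alignment: three taxa, eight aligned characters.
toy_alignment = {
    "taxon_A": "ACGTACGT",
    "taxon_B": "ACGTTCGA",
    "taxon_C": "ACCTTCGA",
}
rng = random.Random(42)  # fixed seed so the sketch is repeatable
weights = bootstrap_weights(8, rng)
replicate = resample_alignment(toy_alignment, weights)
```

Each replicate has the same number of characters as the original data. In a full analysis, a tree would be estimated from every replicate and the proportion of replicates containing a given group reported as its bootstrap support.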
Article
The genus Dahlia presently consists of 35 species, primarily from Mexico. Species are usually placed in four sections: Pseudodendron, Epiphytum, Entemophyllon, and Dahlia, based largely on morphological characters, supplemented with cytological, geographical, and biochemical data. Combined molecular sequence data from both the internal and external transcribed spacer regions (ITS and ETS), located within the nuclear ribosomal gene repeat unit, are used to infer a phylogeny of the genus. Section Entemophyllon forms a very well-defined clade based on these data. Dahlia merckii and D. tubulata are positioned between sect. Entemophyllon and the remaining taxa. Sections Pseudodendron and Epiphytum are closely allied with each other and a few species from sect. Dahlia to form the variable root clade (VRC), which incorporates all species with unusual underground structures, along with some species exhibiting the more typical tuberous type. The remaining species of sect. Dahlia form a well-defined clade, the core Dahlia clade (CDC).