
Resolving spatial complexities of hybridization in the context of the gray zone of speciation in North American ratsnakes (Pantherophis obsoletus complex)

Abstract

Estimating species divergence with gene flow has been crucial for characterizing the gray zone of speciation, which is the period of time where lineages have diverged but have not yet achieved strict reproductive isolation. However, estimates of divergence times and gene flow often ignore spatial information, for example the formation and shape of hybrid zones. Using population genomic data from the eastern ratsnake complex (Pantherophis obsoletus), we infer phylogeographic groups, gene flow, changes in demography, the timing of divergence, and hybrid zone widths. We examine the spatial context of diversification by linking migration and timing of divergence to the size, shape, and types of hybridization (e.g., F1, backcrosses) in hybrid zones. Rates of migration between lineages are associated with the width and shape of hybrid zones. Timing of divergence is not related to migration rate across species pairs and is therefore a poor proxy for inferring position in the gray zone. Artificial neural network approaches are applied to understand how landscape features and past climate have influenced population genetic structure among these lineages prior to hybridization. The Mississippi River produced the deepest divergence in this complex, whereas Pleistocene climate and elevation secondarily structured lineages.
Frank T. Burbrink1*
Marcelo Gehara1,2
Edward A. Myers1,3

1Department of Herpetology, The American Museum of Natural History, Central Park West and 79th Street, New York, NY 10024 USA
2Department of Biological Sciences, Rutgers University Newark, 195 University Ave, Newark, NJ 07102
3Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA

*Corresponding author – fburbrink@amnh.org
bioRxiv preprint doi: https://doi.org/10.1101/2020.05.05.079467; this version posted May 7, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Key Words: Eastern Nearctic, migration, isolation, neural networks, reproductive isolation, cline, hybrid zone

Wide-ranging species complexes that cross numerous biogeographic barriers provide opportunities to better understand how changing landscapes affect diversification, gene flow, demography, and the formation of hybrid zones. Lineages within species complexes ranging across heterogeneous landscapes are likely at different stages of the speciation process, reflecting how specific environmental and biogeographic barriers have uniquely altered gene flow and other demographic processes in a particular complex (Myers et al. 2020). Even at the same barrier within related groups of organisms, the timing of divergence and degree of gene flow can be species specific and dependent on when each species encountered the barrier (Riddle 2016; Myers et al. 2019). Alternatively, generalist species that are less constrained to specific habitats may not show a correlation between genetic variation and landscape features (Joseph and Wilke 2007; Makowsky et al. 2009; Lourenço et al. 2017).

Lineage divergence can be placed in the context of the gray zone of speciation, which defines the range of time over which speciation proceeds from early population differentiation with unfettered gene flow to near-complete reproductive isolation defined by low rates of gene flow (de Queiroz 2007; Hewitt 2008; Jackson et al. 2016; Roux et al. 2016). The ability to delimit species may therefore be correlated with their position in the gray zone, which has previously been assessed by relating measures of genetic isolation and migration (the genealogical divergence index; GDI) to species-delimitation probabilities (Jackson et al. 2016; Leaché et al. 2019). Moreover, given a model of speciation that proceeds by Dobzhansky-Muller incompatibilities and genetic drift alone, the degree of gene flow might be predicted by the age of lineage formation relative to population size and thus position within the gray zone (Orr 1995; Gavrilets 2004; Singhal and Moritz 2013). Clearly some bias exists here; the youngest divergences with enhanced gene flow may never proceed to the later stages of the gray zone because they may rapidly collapse into a single lineage (Garrick et al. 2019), such as in the case where physical barriers to gene flow disappear. On the other hand, strong selection may prevent young lineages from exchanging alleles at particular loci, shortening the time in the gray zone (Mayr 1963; Barton 2010; Feder et al. 2012; Roux et al. 2016; Edwards et al. 2020). Therefore, finding a threshold that can determine if two species are unique given only generations since divergence may fail in cases where rates of gene flow between lineages change drastically over time and space (Gourbière and Mallet 2010; Nosil et al. 2017).

Conceptualizing the gray zone without a spatial demographic component may therefore be inadequate for understanding speciation and delimiting species. Isolation-migration models (Hey and Nielsen 2004; Hey 2010) are coalescent estimators of historical processes and can assess gene flow through time, reflecting position in the gray zone of speciation. These models, however, lack the spatial component necessary to understand the underlying geographic and environmental processes influencing migration. Studies of hybrid zones, on the other hand, can clarify the location of migration and rates of contemporary gene flow, provided selection does not eliminate F1 hybrids or backcrosses. If species pairs rapidly adapt to unique habitats across their distribution, then age of divergence and degree of gene flow may be unrelated relative to neutral processes when compared across taxa (Nosil and Crespi 2006; Agrawal et al. 2011; Nosil 2012; Karrenberg et al. 2019). Therefore, divergence over the landscape can show how selection at a barrier can be discordant from the timing of origin of the lineages (Barton and Hewitt 1985; Harrison 1993; Jiggins and Mallet 2000; Gay et al. 2008; Seehausen et al. 2014; Stankowski et al. 2017). Combining spatial information with genetic data will help clarify how genetic variation is structured by geographic distance, environment, and the presence of hybrid zones. This should provide a clearer understanding of why lineages maintain particular rates of gene flow along a biogeographic barrier that has resulted in a hybrid zone. Moreover, investigating gene flow in space determines if lineages were generated via isolation and migration or are simply artificially partitioned groups with continuous gene flow over the landscape defined by isolation by distance (IBD; Wright 1943; Frantz et al. 2010; Bradburd et al. 2018).

To understand the factors shaping reproductive isolation in a species complex, we ask whether divergence time predicts rates of gene flow or cline width across space in eastern ratsnakes (Pantherophis obsoletus complex). This complex is a wide-ranging group of four species found throughout the forested regions of the Eastern Nearctic and nearby Chihuahuan Desert that have diverged at three unique biogeographic barriers. The primary divergence in this complex occurred at the Mississippi River, which has consistently been identified as one of the main biogeographic barriers in the ENA (Robison 1986; Burbrink et al. 2000; Soltis et al. 2006; Brandley et al. 2010; Zellmer et al. 2012; Myers et al. 2020). Further east, the Appalachian Mountains and Apalachicola/Chattahoochee River System (AARS) have noted barrier effects in other organisms and likely contributed to the divergence between P. alleghaniensis and P. spiloides (Walker and Avise 1998; Burbrink et al. 2000; Soltis et al. 2006). West of the Mississippi River, divergences occurred at the transition from temperate forests to the rocky areas on the western edge of the Edwards Plateau into the Chihuahuan Desert, isolating the species P. obsoletus and P. bairdi (Fig. 1; Lawson and Lieb 1990; Burbrink et al. 2000; Burbrink 2001). Since the Pliocene, the ENA has experienced numerous climate change events associated with glacial cycles that forced species into refugia and compressed populations; upon climate amelioration, populations expanded into formerly glaciated areas (Hewitt 2000; Bintanja and van de Wal 2008; Burbrink et al. 2016). Using dense population sampling and genomic-scale data, we combine isolation-migration estimates with spatial information to understand species divergence in the context of geographic distance, contemporary and historical environments, and biogeographic boundaries. We also examine how migration rates and divergence dates relate to the width of hybrid zones at biogeographic boundaries and to the types of hybrids (F1, backcrosses) found in them. We also investigate demographic changes through time to understand if population expansion occurs among all lineages, as predicted given glacial cycling. Addressing these questions provides an integrated view of phylogeographic history over a physically and historically complex landscape and yields a clearer understanding of speciation processes and delimitation in the gray zone.

Methods

Dataset
We sampled 288 individuals broadly covering the range of all species within the Pantherophis obsoletus complex (Fig. 1; Dryad XXX). DNA was extracted from all samples using Qiagen DNeasy Blood & Tissue Kits and samples were screened for quality using broad-range Qubit Assays. We used services from RAPiD Genomics (https://www.rapid-genomics.com/services/) to generate 5472 baits and to sequence 5060 ultraconserved element (UCE) loci following the protocols of Faircloth et al. (2012) and Sun et al. (2014). These markers have been used to address phylogeographic/population genetic and deeper phylogenetic questions (Branstetter and Longino 2019; Younger et al. 2019). We mapped UCE reads to a Chromium 10x Pantherophis spiloides genome (in prep), removed loci containing >50% missing samples, and removed individual specimens missing >30% of all alignments (details on the assembly of UCE loci are available in Supporting Information Material 1). We produced both phased locus datasets, used in PHRAPL and PipeMaster, and a dataset of a single SNP per locus for all other analyses. We further filtered the unlinked SNP dataset to remove alleles with a minor allele frequency <0.1 and used these data for population structuring analyses.

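The two filters described above (a missing-data threshold and a minor allele frequency threshold) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the genotype encoding (alternate-allele counts per diploid individual, `None` for a missing call), the function names, and the toy SNPs are all assumptions made for illustration, and the missing-data threshold is applied per SNP here for simplicity.

```python
from typing import Dict, List, Optional

# Assumed encoding: count of the alternate allele (0, 1, or 2); None = missing.
Genotype = Optional[int]


def missing_fraction(calls: List[Genotype]) -> float:
    """Fraction of individuals with no genotype call at one SNP."""
    return sum(g is None for g in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Minor allele frequency among the called diploid genotypes."""
    called = [g for g in calls if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(snps: Dict[str, List[Genotype]],
                max_missing: float = 0.5,
                min_maf: float = 0.1) -> Dict[str, List[Genotype]]:
    """Keep SNPs that pass both the missing-data and the MAF thresholds."""
    return {
        name: calls
        for name, calls in snps.items()
        if missing_fraction(calls) <= max_missing
        and minor_allele_frequency(calls) >= min_maf
    }


# Toy data: three hypothetical SNPs scored in four individuals.
toy = {
    "snp_a": [0, 1, 2, 1],           # polymorphic, fully called
    "snp_b": [0, 0, 0, None],        # monomorphic among called genotypes
    "snp_c": [None, None, None, 1],  # 75% missing
}
kept = filter_snps(toy)
```

In this toy example only `snp_a` passes both thresholds; `snp_b` fails the frequency filter and `snp_c` fails the missing-data filter.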
Geographic groupings
We estimated population structure using Discriminant Analysis of Principal Components (DAPC; Jombart et al. 2010) in adegenet v2.1.2 (Jombart 2008). We compared these results to estimated effective migration surfaces (EEMS v; Petkova et al. 2016), which model effective migration rates over geography and identify regions of low migration where genetic dissimilarity increases rapidly, thus providing a view of the location of population clusters relative to biogeographic barriers that is distinct from DAPC (details on these population grouping methods and other methods for assessing structure are available in Supporting Information Material 1). We also examined fixation indices (FST), estimated following Nei (1973) in adegenet v2.1.2 (Jombart 2008), across all loci for spatially adjacent lineage pairs from DAPC.

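For readers unfamiliar with the fixation index, the following Python sketch computes Nei's (1973) heterozygosity-based statistic, (HT - HS)/HT, for a single biallelic locus and two populations. It is a simplified illustration under stated assumptions (equal population weights, known allele frequencies, no sample-size correction) and is not the adegenet implementation; the function names are hypothetical.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def nei_fst(p1: float, p2: float) -> float:
    """(HT - HS) / HT for two equally weighted populations at one locus."""
    # HS: mean within-population expected heterozygosity.
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    # HT: expected heterozygosity of the pooled population.
    h_t = expected_heterozygosity((p1 + p2) / 2.0)
    if h_t == 0.0:
        return 0.0  # locus is monomorphic overall; treated as zero here
    return (h_t - h_s) / h_t
```

Two populations fixed for alternative alleles give a value of 1, and two populations with identical allele frequencies give 0.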
Isolation, migration, and historical demography
We generated a coalescent-based species tree with SNAPP v1.3.0 (Bryant et al. 2012) to understand relationships among the four taxa identified from clustering analyses and to test species delimitation. Because generating a tree in SNAPP using all individuals was computationally intractable, we used four individuals per taxon that were sampled from different regions of the taxon’s distribution. We estimated SNAPP trees using BEAST v2.5.2 (Bouckaert et al. 2014; details on parameter setting for SNAPP are available in Supporting Information Material 1). We assessed four alternative species delimitation models including four taxa, three taxa (collapsing P. bairdi/P. obsoletus or P. alleghaniensis/P. spiloides), and two taxa (collapsing P. bairdi/P. obsoletus and P. alleghaniensis/P. spiloides).

Using the four inferred lineages, we examined which models best described the origins of these groups given divergence time, historical demographic change, and migration between spatially adjacent taxa. We tested 2,300 candidate isolation-migration models, including all possible topologies with or without migration between all spatially adjacent pairs, using the program PHRAPL (Jackson et al. 2016) implemented in R. To further test that four genetic lineages exist, we ran pairwise comparisons between the best selected model and models of three populations where sister species were collapsed into a single entity. Specifically, we compared the model of four Pantherophis species against one model where P. obsoletus and P. bairdi were collapsed, and another where P. spiloides and P. alleghaniensis were collapsed (details on PHRAPL runs are available in Supporting Information Material 1).

To estimate gene flow, timing of divergence, and demographic change, we used PipeMaster (Gehara et al. 2017; Gehara et al. in review) to simulate genetic data and perform approximate Bayesian computation (ABC) and supervised machine learning. Here we used the best model selected in the PHRAPL analysis as a template (same topology and migration parameters) to generate three competing models: (i) an isolation-migration model with constant population size for each lineage and constant migration, (ii) demographic change along each lineage with constant migration through time, and (iii) population size change with migration occurring after the Last Glacial Maximum. We simulated 100,000 data sets of 54 summary statistics and performed ABC rejection with a 0.01 tolerance level to select the best of these three models (Table S1). We then took the selected model and performed rejection using a 0.1 tolerance level with a neural network regression using a single layer and 20 nodes to estimate the model parameters.

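The logic of ABC rejection for model choice can be sketched in a few lines of Python. This is a toy illustration, not PipeMaster: the two simulators, the two-dimensional "summary statistics", the Euclidean distance on unscaled statistics, and all names are assumptions made here. Simulations from every candidate model are pooled, the fraction closest to the observed statistics (the tolerance) is retained, and each model's posterior probability is approximated by its share of the retained simulations.

```python
import math
import random


def abc_model_choice(observed, simulators, n_sims, tolerance, rng):
    """Approximate posterior model probabilities by ABC rejection.

    observed   : list of observed summary statistics
    simulators : dict mapping model name -> function(rng) -> list of statistics
    n_sims     : number of simulations per model
    tolerance  : fraction of all simulations to accept (closest to observed)
    """
    draws = []
    for name, simulate in simulators.items():
        for _ in range(n_sims):
            stats = simulate(rng)
            draws.append((math.dist(stats, observed), name))
    draws.sort(key=lambda d: d[0])
    n_keep = max(1, int(round(tolerance * len(draws))))
    accepted = [name for _, name in draws[:n_keep]]
    return {name: accepted.count(name) / n_keep for name in simulators}


# Two hypothetical models whose statistics are centered at 0 and at 3.
toy_models = {
    "constant_size": lambda rng: [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)],
    "size_change": lambda rng: [rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)],
}
posterior = abc_model_choice(
    observed=[3.0, 3.0],
    simulators=toy_models,
    n_sims=1000,
    tolerance=0.01,
    rng=random.Random(1),
)
```

In practice summary statistics are usually scaled before distances are computed, and the retained simulations can then be passed to a regression step (here, the neural network regression mentioned above) to estimate parameters.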
Spatial population genetics
We used artificial neural networks (ANN; Lek and Guégan 1999; Legendre and Fortin 2010; Legendre et al. 2011) to understand how contemporary and historical aspects of the landscape and climate generated genetic diversity. We first compared niche models of current climates between adjacent species pairs to determine if contemporary niches were significantly different prior to using these variables for understanding their effects on genetic structure (see Supporting Information Material 1 for details). We then used Nei’s distance (Nei 1972) to estimate genetic distance among all individuals in the R package adegenet. For predictor variables, we used 1) geographic distances measured as pairwise great-circle distances among all points, to account for isolation by distance, using the R package fossil (Vavrek 2020), 2) a binary categorization of isolation east and west of the Mississippi River, 3) elevation, and 4) bioclimatic variables at 2.5 arc-minute resolution representing current climate, the Last Glacial Maximum (21 kya), the Last Interglacial (130 kya), Pleistocene Marine Isotope Stage 19 (787 kya), and Mid-Pliocene Warming (3.2 Ma), all obtained from the PaleoClim database (Brown et al. 2018). For the bioclimatic data and elevation, we extracted parameter values given the latitude and longitude for each individual sample. To reduce dimensionality of the bioclimatic variables, we transformed them using principal components analysis in the R package raster (Hijmans et al. 2014). Spatial distances were transformed using principal coordinates analysis in the R package ape (Paradis and Schliep 2019). For principal coordinate and component analyses we retained axes that together account for >95% of variance.

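As an illustration of the geographic-distance predictor, the sketch below computes pairwise great-circle distances from latitude and longitude using the haversine formula. It is a generic Python example, not the routine used in the fossil package; the Earth radius of 6371 km and the function names are assumptions made here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2.0) ** 2)
    # Guard against rounding slightly above 1 before taking the arcsine.
    return 2.0 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def pairwise_distances(coords):
    """Symmetric matrix of distances for a list of (lat, lon) tuples."""
    n = len(coords)
    return [[great_circle_km(*coords[i], *coords[j]) for j in range(n)]
            for i in range(n)]
```

A matrix of this kind is what would then be summarized by principal coordinates analysis before entering a model as a spatial predictor.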
We used regression-based ANN to determine which variables predict genetic distances and to rank those with the highest model importance. Machine learning methods can infer non-linear interactions among many predictor variables, regardless of the distribution or variable type, and can generate a range of models of varying complexity (Lek et al. 1996; Zhang 2010; Libbrecht and Noble 2015; Sheehan et al. 2016). To conduct the ANN regressions we used the R package caret (Kuhn 2008) and partitioned the data into the standard 70% training and 30% test sets (Lek et al. 1996; Zhang 2010; Burbrink et al. 2020). This analysis was run using 1,000 maximum iterations to ensure convergence. We resampled the data using the default 25 bootstrap replicates to reach convergence over the following parameters: weight decay, root mean squared error, r², and mean absolute error. We assessed the power of these models by recomposing the test and training sets 100 times and comparing three test statistics (root mean squared error, r², and mean absolute error) to those from randomized response variables for each of these 100 re-estimated models. We ranked the most important variables over these replicates after accounting for multicollinearity among variable effects using the R package Rnalytica (see Supporting Information Material 1). We also compared results from our ANN analysis to those using redundancy analyses (RDA; see Supporting Information Material 1 for details).

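The power assessment described above (repeatedly re-drawing 70/30 training and test sets and comparing test statistics against those obtained with a randomized response) can be sketched as follows. To keep the example short and dependency-free, a one-predictor least-squares line stands in for the neural network, and only root mean squared error is tracked; both are simplifications, and all function names and the toy data are hypothetical.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a single predictor."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def holdout_rmse(x, y, rng, train_frac=0.7):
    """Fit on a random 70% of the data; return RMSE on the held-out 30%."""
    idx = list(range(len(x)))
    rng.shuffle(idx)
    cut = int(train_frac * len(idx))
    train, test = idx[:cut], idx[cut:]
    slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
    squared = [(y[i] - (slope * x[i] + intercept)) ** 2 for i in test]
    return math.sqrt(sum(squared) / len(squared))


def compare_to_randomized(x, y, n_rep=100, seed=0):
    """Mean hold-out RMSE for the real response and for a shuffled response."""
    rng = random.Random(seed)
    real, null = [], []
    for _ in range(n_rep):
        real.append(holdout_rmse(x, y, rng))
        shuffled = y[:]
        rng.shuffle(shuffled)  # break any link between predictor and response
        null.append(holdout_rmse(x, shuffled, rng))
    return sum(real) / n_rep, sum(null) / n_rep


# Toy data with a strong linear signal plus a small deterministic wobble.
x = [float(i) for i in range(50)]
y = [2.0 * xi + (i % 3 - 1) for i, xi in enumerate(x)]
real_rmse, null_rmse = compare_to_randomized(x, y)
```

A model with genuine predictive power should show a clearly lower hold-out error on the real response than on the randomized one.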
Hybrid-zone dynamics
To understand if the composition and width of hybrid zones differed among adjacent lineages, we first estimated the composition of individuals (parentals, F1s, backcrosses) and their locations. We used snapclust (Beugin et al. 2018) in adegenet to assign pure parentals (P1 and P2), backcrosses, and F1s between the following spatially adjacent pairs of taxa, herein referred to by geographic area: 1) eastern (P. alleghaniensis/P. spiloides), 2) central (P. spiloides/P. obsoletus), and 3) western (P. obsoletus/P. bairdi). This method maximizes the likelihood of two fixed panmictic populations using a geometric approach with an expectation-maximization algorithm. We first estimated group membership using the “k-means” option in the function. To check the probability that snapclust identified hybrids correctly with these data, we isolated pure parental individuals (membership >0.95), used the hybridize function to generate the same number of hybrids and parentals, and then determined the probability that snapclust finds the correct number of hybrids and parentals.

We then fit clines in the program HZAR v.2-9 (Derryberry et al. 2014) for the same species pairs using individual genetic assignments from DAPC to determine if cline widths differ among groups. We compared these to ancestry alternatively estimated in snapclust (see Supporting Information Material 1 for details). Using the Gaussian cline model, we estimated the center and width of the cline and determined if these sigmoidal distributions have significant tails by fitting the following models: 1) no tails, 2) right tail only, 3) left tail only, 4) mirrored tails, and 5) both tails estimated independently (see Derryberry et al. 2014). We compared the fit of these models to our data using AICc, ran the MCMC chains for 5 × 10^6 generations, thinned by 5 × 10^3 generations, and assessed stationarity using ESS >200 in the R package CODA (Plummer et al. 2006). We note that making transects through multiple locations along each hybrid zone would better define the width throughout the length of the cline, though this was not possible here for all clines.

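To make the cline parameters concrete, the following Python sketch evaluates a standard sigmoid cline without tails, in which the width is the inverse of the maximum slope at the center, together with the small-sample AIC used for model comparison. It is a generic illustration and not HZAR code; the tailed models add exponential tails that are not shown, and the function names are assumptions.

```python
import math


def sigmoid_cline(x, center, width, p_min=0.0, p_max=1.0):
    """Expected ancestry (or allele frequency) at position x on a transect.

    With p_min = 0 and p_max = 1 the slope at the center equals 1 / width.
    """
    return p_min + (p_max - p_min) / (1.0 + math.exp(-4.0 * (x - center) / width))


def aicc(log_likelihood, n_params, n_obs):
    """AIC with the small-sample correction: AIC + 2k(k + 1) / (n - k - 1)."""
    aic = 2.0 * n_params - 2.0 * log_likelihood
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
```

For example, `sigmoid_cline(100.0, center=100.0, width=126.0)` returns 0.5 at the cline center, and narrower widths produce a steeper transition between the two parental values.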
Using equations from Barton and Gale (1993), as applied in Bailey et al. (2015), we addressed selection against hybridization for each cline. Using estimates of maximum lifetime dispersal (σ) for ratsnakes, here 0.937-4.3 km (Weatherhead and Blouin-Demers pers. comm.), we predicted the width of the cline under neutrality over a range of generation times (T) since the origin of each sister pair and group using the equation $w = 2.51\,\sigma\sqrt{T}$. We also predicted selection at the center of the cline, also using σ and estimates of cline width (w) from HZAR, using the equation $s^{*} = (2\sigma/w)^{2}$. We estimated these quantities over the range of dispersal estimates and estimates for the widths of clines.

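The two cline calculations can be expressed as a short Python sketch. It assumes the neutral-diffusion expectation w = 2.51σ√T for cline width and the relationship s* = (2σ/w)² for selection at the cline center; the second form in particular is an assumption of this sketch and should be checked against Barton and Gale (1993). Function names are hypothetical, distances are in kilometers, and T is in generations.

```python
import math


def neutral_cline_width(sigma_km, generations):
    """Expected cline width under neutral diffusion: w = 2.51 * sigma * sqrt(T)."""
    return 2.51 * sigma_km * math.sqrt(generations)


def generations_to_reach(width_km, sigma_km):
    """Generations of contact needed for a neutral cline to reach a given width."""
    return (width_km / (2.51 * sigma_km)) ** 2


def selection_at_center(sigma_km, width_km):
    """Assumed effective selection at the cline center: s* = (2 * sigma / w)^2."""
    return (2.0 * sigma_km / width_km) ** 2


# Example using the dispersal bounds given in the text and a hypothetical
# 100 km wide cline.
for sigma in (0.937, 4.3):
    t = generations_to_reach(100.0, sigma)
    s = selection_at_center(sigma, 100.0)
    print(f"sigma={sigma} km: T={t:.0f} generations, s*={s:.4f}")
```

Because both quantities depend on the square of σ, the roughly fivefold range in dispersal estimates translates into a roughly twentyfold range in inferred contact time and selection.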
All scripts and all genetic and spatial data are available on Dryad XX.

Results

Data and population genetic structure
After filtering loci for presence in 70% or more individuals, we generated 2,491 UCE loci. For all downstream analyses we randomly sampled one SNP per locus, yielding a total of 846 SNPs for 238 individuals with an average of 13.96% missing data.

Both EEMS and DAPC found population structure that generally matched the geographic ranges of P. alleghaniensis, P. spiloides, P. obsoletus, and P. bairdi (Fig. 1 and Fig. S1; Burbrink 2001). All three EEMS runs generated similar acceptance proportions for all proposal types (12-51%). Total DAPC assignment probabilities without a priori species groupings were 0.975 (P. bairdi = 1.0, P. obsoletus = 0.96, P. spiloides = 1.0, and P. alleghaniensis = 0.97). Similarly, all three EEMS runs suggest low migration at the Mississippi River (MR) and in the area separating P. bairdi and P. obsoletus in west Texas. East of the MR, estimated low migration occurred near the Appalachian Mountains, though this area is a complex mixture of isolation and gene flow (Fig. 1).

These estimates of population structure all showed that P. alleghaniensis occurs mostly in the Florida Peninsula and north along the east coast to Virginia/Delaware (not as far north as in Burbrink 2001); P. spiloides is found from western Florida to the Mississippi River, up through the Midwest, and east to the northeastern US; P. obsoletus is found west of the Mississippi River in the forested regions of the Midwest; and P. bairdi is distributed in the Chihuahuan Desert of west Texas (Figs. 1-2).

Isolation-migration processes
Our SNAPP analyses produced a phylogeny with a root separating the taxa east and west of the Mississippi River, followed by a sister relationship between P. bairdi and P. obsoletus and another between P. spiloides and P. alleghaniensis (Fig. 2), consistent with Burbrink et al. (2000). Bayes factors (BF) showed the four-taxon model as decisively superior to the three-taxon (BF = 35.40) and two-taxon models (BF = 2022.46).

Using those four lineages we filtered models of divergence using PHRAPL. The model with all four taxa and migration between spatially adjacent lineages was preferred, with the same topology as found using SNAPP. ΔAIC between the best ranked model and the next model was >20, suggesting high confidence in model selection (Table S2).

The demographic model estimated from PipeMaster that incorporated historical population size change and constant migration (IMD) best fit the data (posterior probability 1.0; see PCA plots in Fig. S2; Figs. 2 & 3). Median migration rates were highest between eastern groups (P. alleghaniensis to P. spiloides = 2.79 and P. spiloides to P. alleghaniensis = 3.79 individuals/generation), lower in the central groups across the Mississippi River (P. spiloides to P. obsoletus = 1.96 and P. obsoletus to P. spiloides = 1.28 individuals/generation), and lowest in western groups (P. bairdi to P. obsoletus = 0.55 and P. obsoletus to P. bairdi = 0.65 individuals/generation; Table 1). Divergence occurred earliest in the central lineages at the Mississippi River at 1.08 My (95% CI = 0.553 to 1.674 My), then the eastern lineages at 0.650 My (95% CI = 0.225 to 0.983 My) and western lineages at 0.367 My (95% CI = 0.367 to 0.875 My; Table 1). Divergence dates and average rates of migration were poorly correlated (ρ = 0.264, P = 0.83, n = 3). For all taxa we inferred population expansion from ancestral Ne sizes ranging from 5,200 to 5,600 to modern sizes of 138,000 to 281,000. The posterior distributions for all parameters were different from the uniform priors, though wide distributions of several parameters suggested some uncertainty in estimates (Fig. 3).

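The weak relationship between divergence date and migration can be checked directly from the medians reported above. The Python sketch below averages the two directional migration rates for each pair and computes a product-moment correlation with the divergence dates; treating the reported statistic as a Pearson correlation is an assumption of this sketch, and the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Order: eastern, central, western pairs (medians as reported in the text).
divergence_my = [0.650, 1.08, 0.367]
mean_migration = [
    (2.79 + 3.79) / 2,  # P. alleghaniensis / P. spiloides
    (1.96 + 1.28) / 2,  # P. spiloides / P. obsoletus
    (0.55 + 0.65) / 2,  # P. bairdi / P. obsoletus
]
r = pearson_r(divergence_my, mean_migration)
```

With these inputs the coefficient comes out close to the reported value; with only three species pairs, any such correlation has very little power.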
Spatial population genetics and hybrid zones
We used ANN to determine what features of the landscape and environmental layers through time best predict genetic structure (Fig. 4). Contemporary niches between taxa were significantly different for all taxa (see Supporting Information Material 1 for details; Fig. S3). Estimates of accuracy using ANN were high (>90%) for all comparisons. Most genetic structure can be predicted by the Mississippi River, which showed 100% variable importance (Table S3). Distance also played a role in structuring these data (although it was partialed out in RDA analyses, which yielded similar conclusions regarding other landscape variables; see Supporting Information Material 1). For those lineages east of the Mississippi River, population structure was predicted by climate at the Last Glacial Maximum and elevation (variable importance = 100). For those lineages west of the Mississippi River, climate during Pleistocene Marine Isotope Stage 19 and elevation structured these lineages (variable importance = 100).

Using snapclust, we showed that the number of hybrids of all types decreases across species pair comparisons from east to west (Fig. 5). The parentals, hybrids, and backcrosses were predicted with a mean accuracy of 0.99 in simulations. All hybrid types were spatially oriented between parental types. For the eastern groups the number of hybrids and backcrosses approached the number of parental individuals (49%), for the central groups the ratio of hybrids and backcrosses to parentals was 5.9%, and for the western groups no hybrids were detected (Fig. 5). It is important to note, however, that sampling was more concentrated between P. alleghaniensis/P. spiloides compared to the other pairs.

For the eastern lineages, HZAR showed strong support for a right-tailed model (ΔAIC_DAPC = -31.96 to -107.42), whereas central lineages at the Mississippi River (ΔAIC_DAPC = -4.26 to -8.869) and western lineages at the Chihuahuan Desert and Edwards Plateau (ΔAIC_DAPC = -5.51 to -221.35) both supported a no-tails model (Fig. 6; Tables S4 & S5). However, estimating tailed models for the latter two comparisons may have required more samples than were available. The width of clines decreased from the eastern lineages (width = 126 km, 1st and 3rd quartiles = 114-138 km; tail length = 53 km, 1st and 3rd quartiles = 44-61 km), to the central lineages (width = 98 km, 1st and 3rd quartiles = 89-107 km), to the western lineages (width = 8 km, 1st and 3rd quartiles = 4-15 km; Table 1). Sharply declining clines characterized the central and western lineages, whereas the cline for eastern lineages showed a long right tail extending north.
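To make the meaning of cline width concrete, the sketch below evaluates a standard no-tails sigmoid cline, p(x) = 1 / (1 + exp(-4(x - c) / w)), in which the width w is the inverse of the maximum slope at the center c. The widths are the estimates reported above; the center at 0 km and the function names are assumptions for illustration, and the code does not reproduce the HZAR model fitting or the tailed models.

```python
import math


def sigmoid_cline(x_km, center_km, width_km):
    """Expected frequency (or ancestry) at position x for a no-tails sigmoid cline."""
    return 1.0 / (1.0 + math.exp(-4.0 * (x_km - center_km) / width_km))


def max_slope(center_km, width_km, step_km=1e-3):
    """Numerical slope at the cline center; analytically this equals 1 / width."""
    low = sigmoid_cline(center_km - step_km, center_km, width_km)
    high = sigmoid_cline(center_km + step_km, center_km, width_km)
    return (high - low) / (2.0 * step_km)


def span_10_to_90_km(width_km):
    """Distance over which the cline rises from 0.1 to 0.9 (width * ln(9) / 2)."""
    return width_km * math.log(9.0) / 2.0


# Cline widths (km) reported in the text; the center is set to 0 for illustration.
widths_km = {"eastern": 126.0, "central": 98.0, "western": 8.0}

for name, width in widths_km.items():
    print(
        f"{name}: slope at center = {max_slope(0.0, width):.5f} per km, "
        f"10-90% transition = {span_10_to_90_km(width):.1f} km"
    )
```

Under this parameterization a narrower cline simply means a steeper transition at the center, so the eastern, central, and western estimates differ by more than an order of magnitude in how quickly ancestry changes across space.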
Hybrid zone widths under neutrality should be larger than the observed widths, given that migration between sister pairs likely occurred prior to the LGM (Fig. S4). Under neutrality, the contact times required for each pair to reach its observed width all fall within the Holocene: 516-10,785 ybp (eastern lineages), 261-5,424 ybp (central lineages), and 3-27 ybp (western lineages). We also showed that selection against hybrids at the center of hybrid zones had medians of 0.3% (0.02-0.6%) for eastern, 0.5% (0.03-1%) for central, and 79% (1.5-100%) for western lineages.
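The link between cline width, dispersal, contact time, and selection can be sketched with two textbook approximations from hybrid zone theory (e.g., Barton and Gale 1993): a neutral cline widens as w = sqrt(2π) σ sqrt(t), roughly 2.51 σ sqrt(t), with t in generations, and a tension zone maintained by selection s against hybrids has width w ≈ σ sqrt(8 / s). These formulas and the dispersal value σ = 2.5 km per generation are assumptions made here for illustration; the parameterization and dispersal estimates used for the results above may differ.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)  # about 2.51


def neutral_width_km(sigma_km, generations):
    """Width of a neutral cline after a given number of generations of contact."""
    return SQRT_2PI * sigma_km * math.sqrt(generations)


def generations_to_reach_width(width_km, sigma_km):
    """Generations of neutral diffusion needed to reach the observed width."""
    return (width_km / (SQRT_2PI * sigma_km)) ** 2


def selection_from_width(width_km, sigma_km):
    """Selection against hybrids implied by w = sigma * sqrt(8 / s), capped at 1."""
    return min(1.0, 8.0 * sigma_km ** 2 / width_km ** 2)


sigma_km = 2.5  # hypothetical dispersal distance per generation
observed_widths_km = {"eastern": 126.0, "central": 98.0, "western": 8.0}

for name, width in observed_widths_km.items():
    t_gen = generations_to_reach_width(width, sigma_km)
    s = selection_from_width(width, sigma_km)
    print(f"{name}: neutral contact time = {t_gen:.1f} generations, s = {s:.4f}")
```

Because s scales with the inverse square of the width, the narrow western cline implies selection orders of magnitude stronger than the wide eastern cline for any fixed dispersal distance. With the hypothetical σ used here, the values are of the same order as the medians reported above.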
When using a single SNP per locus, we showed that 30% of loci from central lineages, 10.9% of loci from western lineages, and only 1.9% of loci from eastern lineages attained an Fst > 0.35. Given demographic histories between comparisons, this Fst threshold is generally considered to indicate extreme population disconnection, beyond which advantageous loci will be exchanged between populations (Wright 1978; Frankham et al. 2010).
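As a rough illustration of the Fst threshold, the sketch below computes a per-locus Fst for two equally weighted populations as (H_T - H_S) / H_T, reports the fraction of loci above 0.35, and converts an Fst value to migrants per generation under Wright's island model, Fst ≈ 1 / (4Nm + 1). The estimator, the island-model assumption, and the allele frequencies are all hypothetical choices for illustration and are not necessarily those used for the results above.

```python
def locus_fst(p1, p2):
    """Fst for one biallelic locus from allele frequencies in two populations."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # locus is monomorphic across both populations
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def fraction_above(fst_values, threshold=0.35):
    """Proportion of loci whose Fst exceeds the threshold."""
    return sum(f > threshold for f in fst_values) / len(fst_values)


def island_model_nm(fst):
    """Migrants per generation implied by Fst = 1 / (4Nm + 1)."""
    return (1.0 / fst - 1.0) / 4.0


# Hypothetical allele frequencies (population 1, population 2) at four loci.
loci = [(0.9, 0.1), (0.6, 0.4), (0.5, 0.5), (1.0, 0.2)]
fst_values = [locus_fst(p1, p2) for p1, p2 in loci]

print("per-locus Fst:", [round(f, 3) for f in fst_values])
print("fraction above 0.35:", fraction_above(fst_values))
print("Nm at Fst = 0.35:", round(island_model_nm(0.35), 3))
```

Under the island model an Fst of 0.35 corresponds to fewer than one migrant every two generations, which gives a sense of how isolated a locus must be to pass the threshold.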

Discussion
Gray Zone Dynamics
To understand the factors shaping reproductive isolation in a young species complex, we ask whether divergence time predicts rates of gene flow or cline width across space in eastern ratsnakes. This species complex is composed of four divergent lineages, with P. alleghaniensis and P. spiloides meeting in the southeastern US at the intersection of subtropical and temperate areas, P. spiloides and P. obsoletus separated at the Mississippi River, and P. obsoletus and P. bairdi meeting at the junction of forested and rocky habitats on the western edge of the Edwards Plateau. All three spatially adjacent pairs of taxa show different migration rates, and each has a unique cline shape. The western lineages (P. obsoletus, P. bairdi) show very low migration rates, a cline equivalent to only 0.35% of the parental range, fixation (Fst > 0.35) in 10.9% of loci, and likely selection against hybrids. In comparison, the central lineages (P. obsoletus, P. spiloides) show more migration and a cline occupying 5% of the parental ranges, but with 35% of loci showing fixation and possibly lower selection. Finally, the eastern lineages (P. alleghaniensis, P. spiloides) have incomplete reproductive isolation (showing numerous F1 and backcrosses), with only 1.9% of loci showing fixation and high migration rates with a clinal tail (Figs. 3 & 6). Where reproductive isolation occurs given low numbers of migrants/generation, fixation among loci increases and clines are steeper. The inverse is also true: when reproductive isolation is incomplete, higher rates of gene flow, less fixation of loci, and variable cline widths occur. Importantly, however, reproductive isolation, gene flow estimates, and hybrid clines show no association with timing of divergence.
Hybrid zone dynamics help in understanding the interactions between pairs of taxa because characteristics of the hybrid zone reveal whether there is selection against hybrids (tension zones), selection for hybrids (bounded superiority), or ecological gradients with reduced fitness. They also reveal whether selection is endogenous (genomic) or exogenous (environmental; Barton and Hewitt 1985; Barton and Gale 1993). Our models indicate that migration at the Mississippi River and in the eastern lineages occurred through the Pleistocene, suggesting that some form of hybrid zone existed through major climate change cycles. Given this age and assuming selective neutrality, we demonstrate that these hybrid zones should be much larger, in most cases exceeding the ranges of parental taxa (Fig. S4). The widths of the current hybrid zones are maintained over time, preserving the historical and geographic identity of the parental taxa. While there is some selection against hybrids as estimated from hybrid zone width and dispersal, depending on the species-pair comparison, it is possible that other factors constrain these hybrid zones, such as fitness changes along an ecological gradient or isolation by environment (Case and Taper 2000; Case et al. 2005; McEntee et al. 2018). This hypothesis is supported by significant differences among niches for all adjacent pairs of lineages (Fig. S3).

Influence of barriers and environment on population structure
These ratsnake lineages are structured by a complex of physical barriers, historical climate profiles, and geographic distance. Machine learning approaches show that the Mississippi River has served as a strong barrier isolating lineages of ratsnakes, forming groups east and west of this river during the mid-Pleistocene at 1.08 My (95% CI = 0.553 to 1.674 My; Figs. 1 & 4, Table 1). These results are consistent with previous mtDNA and morphological conclusions (Burbrink et al. 2000; Burbrink 2001). That the Mississippi River has served as a strong barrier to gene flow is supported by historical estimates of increased drainage (Cox et al. 2014) and by numerous unrelated organisms that show genetic disconnection across the river (Robison 1986; Burbrink 2002; Soltis et al. 2006; Burbrink et al. 2008; Pyron and Burbrink 2009; Brandley et al. 2010; Satler and Carstens 2017; Myers et al. 2020). Hybridization, while low (1-2 individuals/generation; Fig. 3), does occur in a zone (~100 km) east of the Mississippi River between the central lineages (P. obsoletus and P. spiloides; Fig. 6). Furthermore, divergence in this group can be predicted by elevation, geographic distance, and mid-Pleistocene climate change (787 kya). The timing of the mid-Pleistocene climate change falls within the timing of divergence estimated from PipeMaster (Fig. 4).
The Mississippi River is responsible for the earliest divergence in this complex, but other geographic features are also important. East of the Mississippi River, divergence between P. alleghaniensis and P. spiloides was influenced by elevation, climate at the LGM, and geographic distance after accounting for variable inflation (Fig. 4). For these taxa, divergence occurred at the Appalachian Mountains and Apalachicola/Chattahoochee River System, which suggests that changes in elevation at the Appalachian Mountains, in part, played a role in isolating lineages or maintaining divergence. Finer-scale testing is necessary to tease apart the effects of these river systems, ancient embayments, and connections to uplifted areas. Although the timing of divergence predates the LGM, it is possible that repeated 100,000-year glacial cycles isolated these taxa into refugia, as has been suggested previously for ratsnakes and other organisms (Waltari et al. 2007; Noss et al. 2015). Alternatively, Pantherophis alleghaniensis may have diverged in an isolated Florida, due to sea level changes during interglacials, and subsequently expanded north and west. The concentration of the yellow, striped morphology in the range of P. alleghaniensis in Florida and parts of the southeastern coast of the US indicates that this isolation likely resulted in these morphologically unique taxa (Schultz 1996; Burbrink 2001). This area of isolation, defined as a suture zone between the Florida peninsula and the continental US, has been found for many taxa (Remington 1968; Swenson and Howard 2005; Ruane et al. 2015). While these two forms can be delimited with genetic data and occupy distinctly different geographic regions, reproductive isolation is not complete. However, niches are significantly different: P. alleghaniensis is typically found in the Florida peninsula and coastal plain environments (Bailey 1995; Burbrink 2001), whereas P. spiloides is found throughout the remainder of the forested habitats east of the Mississippi River, including the ecoregions defined as interior river valleys and hills, interior plateau, Appalachian habitats, southeastern and southcentral plains, and Mississippi River valley. Uncertainty in identifying taxa using morphology where they meet is likely due to extensive gene flow in hybrid zones at barriers following postglacial range expansion (Figs. 1, 3; Burbrink 2001; Gibbs et al. 2006).
Divergence between P. bairdi and P. obsoletus occurred at the western edge of the Edwards Plateau, with the former extending westward into xeric forested or Chihuahuan Desert scrub habitats, always associated with rocky habitats, whereas the latter extends east into Texas, preferring forested habitats (Lawson and Lieb 1990; Werler and Dixon 2000). Population genetic structure between these taxa is associated with mid-Pleistocene climate, elevation, and geographic distance (Fig. 4). Dates of origin of these taxa are within the mid-Pleistocene, suggesting that past climate played a strong role in promoting population divergence. During this period, the Chihuahuan Desert was much drier relative to the late Pleistocene (Graham and Mead 1987; Metcalfe et al. 2002; Metcalfe 2006), and this increased aridity could have driven ecological divergence between this species pair. While divergence time is not the oldest here, migration rates are lowest and the hybrid zone is narrow. Previous research shows limited hybridization with backcrosses between these taxa occurring in a narrow region of the southwestern Edwards Plateau (Lawson and Lieb 1990; Vandewege et al. 2012).
Despite each lineage occupying different habitats, population responses to Pleistocene climate change were similar (Fig. 3). Given the divergence time for each group, Ne has expanded 50-fold since the Pleistocene. These estimates are consistent with other predictions in the ENA, where 75% of the tested vertebrates show population size expansion and 75% of tested snakes show synchronous expansion of Ne, likely coinciding with the retreat of glaciers at various times in the Pleistocene or during the Holocene (Bintanja and van de Wal 2008; Burbrink et al. 2016). However, Pantherophis bairdi has a range restricted to a small area of west Texas and isolated populations in northeastern Mexico, where habitats may have been indirectly affected by glacial cycles.

Taxonomy and delimitation in the genomic age
Delimiting species using genetic data, while useful, has proved controversial (Bauer et al. 2011; Sukumaran and Knowles 2017; Leaché et al. 2019). In particular, most genetic delimitation methods fail to account for gene flow (but see Flouris et al. 2019), do not consider spatial information, and place individuals into predefined groups prior to testing (O’Meara 2010). These methods all assume species diverge at an evolutionary time scale where mutations accumulate through drift or selection (Rannala 2015). While simulations show that migration above a threshold of one individual per 10 generations will cause model-based delimitation methods to fail to find two species (Zhang et al. 2011), it is not clear how pulses of migration or inconsistent migration affect species transitioning through the gray zone. Importantly, Wright (1931) indicated that up to 1,000 individuals migrating/generation may not prevent populations from drifting, given effective population size and spatial segregation.
Modern thresholds for species delimitation (Jackson et al. 2016; Leaché et al. 2019) use GDI (>0.7), estimates of species divergence time over population size (2τ/θ > 1), absolute divergence time (10⁴), and maximum migration rates (M = Nm < 1), metrics that are all likely influenced by sampling effort, location of samples sequenced, and detection of hybrids. For example, if large hybrid zones exist but remain poorly sampled, then estimates of migration will be reduced, GDI will be high, and MSC estimators will suggest multiple species. Conversely, if only hybrid zones are sampled, then all metrics will predict only a single species. For all pairs of ratsnakes, 2τ/θ > 1 and divergence time is always greater than 10⁴. However, migration rates remain high between adjacent pairs in the eastern lineages. This forces us to address how we are defining a species in the context of hybrid zones, how we infer reproductive isolation beyond MSC methods, and how selection is maintaining hybrid zones. For all lineages of ratsnakes, hybrid zones are smaller than predicted under neutrality if these hybrid zones formed in the Pleistocene. It is possible that these hybrid zones formed after the retreat of the last glacial cycle during the Holocene. However, geographic lineage identity has been maintained for >133,000 generations in all pairs, regardless of migration, and niches are unique among all lineages. Moreover, where hybrid zones extend over large areas, gene flow likely varies throughout space and may have changed over time (Barton and Hewitt 1985; Mallet 2007), particularly in response to glacial cycles. Thus, identifying and delimiting species in the context of admixture is complex. Still, maintaining lineage and geographic identity, even with a hybrid zone, suggests that selection may have maintained unique evolutionary trajectories.
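The heuristic thresholds listed above can be compared with a short sketch. It assumes the genealogical divergence index is defined as gdi = 1 - exp(-2τ/θ), as in the delimitation literature cited above, so that 2τ/θ > 1 corresponds to gdi > 1 - 1/e, about 0.63, which is slightly more permissive than the 0.7 cutoff. The function names and the parameter values are hypothetical and are not estimates from this study.

```python
import math


def gdi(tau, theta):
    """Genealogical divergence index, assuming gdi = 1 - exp(-2 * tau / theta)."""
    return 1.0 - math.exp(-2.0 * tau / theta)


def heuristic_checks(tau, theta, migrants_per_generation):
    """Evaluate three of the thresholds named in the text for one lineage pair."""
    return {
        "gdi_above_0.7": gdi(tau, theta) > 0.7,
        "2tau_over_theta_above_1": 2.0 * tau / theta > 1.0,
        "M_below_1": migrants_per_generation < 1.0,
    }


# Hypothetical parameter values: (tau, theta, M = Nm).
examples = {
    "deep divergence, low migration": (0.0020, 0.0010, 0.2),
    "moderate divergence, high migration": (0.0006, 0.0010, 3.0),
}

for label, (tau, theta, m) in examples.items():
    print(label, round(gdi(tau, theta), 3), heuristic_checks(tau, theta, m))
```

The second hypothetical pair passes the 2τ/θ criterion while failing both the gdi and migration criteria, which mirrors the situation described above in which divergence-based and migration-based thresholds disagree.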
Here we have demonstrated that these four species, delimited previously using mtDNA and morphology (Burbrink et al. 2000; Burbrink 2001), can be recognized as distinct taxa given methods of spatial grouping and species delimitation, though the range of the eastern-most lineage is restricted much further south than previously thought. However, migration rates between P. spiloides and P. alleghaniensis (Fig. 3) are high enough to raise some concern that these taxa should not be recognized, given that they may not represent independent evolutionary trajectories (de Queiroz 2007; Zhang et al. 2011; Rannala 2015). Unique niches in the subtropical areas of the southeastern US and morphology provide some evidence for isolation between these taxa: the striped forms (with variable ground colors of orange, yellow, green, and gray) are indicative of P. alleghaniensis, whereas adults with dark saddle patterns on gray, olive, brown, or black ground colors (or completely black with no pattern), found west of Florida and east of the Mississippi River, throughout the Midwest, and up to the northeastern US and Canada, represent P. spiloides. However, hybridization north and west of Florida is likely represented by a dulling of this yellow ground color and mixed-patterned individuals (Schultz 1996; see Supporting Information Material 1 for more information on taxonomy).

Future Directions
To better understand the complexity of speciation processes, an even more integrated whole-genomic approach with dense population sampling should be implemented. Although integrating spatial information with estimates of timing and gene flow among groups presents a more comprehensive picture of speciation over a complex landscape, several areas of investigation remain. For instance, estimating gene flow and cline shape using detailed transects across various intersections along hybrid zones would provide a better understanding of how migration changes in different habitats at different latitudes and what type of hybrid zone is present. Additionally, better integrating spatial, habitat, and migration estimates would allow researchers to test dynamic speciation models that could be used to understand how migration changes over the landscape at various times throughout the history of these taxa. Finally, using whole-genome data along clines will help determine whether selection is occurring in the hybrid zones and determine the function of loci showing evidence of selection and resisting gene flow.

Author contributions: FTB and EAM contributed to data collection, all authors contributed to analyses, FTB wrote the initial draft, and all authors edited subsequent versions.

Acknowledgments: We thank R. Glor, R. Brown, and L. Welton at KU; C. Austin and D. Dittman at LSUMN; K. Krysko at FLMNH; D. Shepard at Louisiana Tech University; and A. McKelvy for providing tissues. We also thank L. and G. Derryberry for help with HZAR and E. Chen for help extracting DNA. EAM was funded by the Gerstner Scholar/Theodore Roosevelt and Peter Buck/Walter Rathbone Bacon fellowships. We thank D. Frost, A. Leaché, A. Bauer, and K. de Queiroz for helpful discussions on species delimitation, type specimens, and admixed type specimens.

Data Accessibility Statement: All data and unique scripts are available on Dryad XXX.
Literature Cited
Agrawal, A. F., J. L. Feder, and P. Nosil. 2011. Ecological Divergence and the Origins of Intrinsic Postmating Isolation with Gene Flow. Int. J. Ecol. 2011:1–15.
Bailey, R. G. 1995. Description of the ecoregions of the United States. 2nd ed. USDA Forest Service.
Bailey, R. I., M. R. Tesaker, C. N. Trier, and G.-P. Saetre. 2015. Strong selection on male plumage in a hybrid zone between a hybrid bird species and one of its parents. J. Evol. Biol. 28:1257–1269.
Barton, N. H., and K. Gale. 1993. Genetic analysis of hybrid zones. Pp. 13–45 in Hybrid Zones and the Evolutionary Process. Oxford University Press, New York.
Barton, N. H. 2010. What role does natural selection play in speciation? Philos. Trans. R. Soc. Lond. B. Biol. Sci. 365:1825–40.
Barton, N. H., and G. M. Hewitt. 1985. Analysis of Hybrid Zones. Annu. Rev. Ecol. Syst. 16:113–148.
Bauer, A. M., J. F. Parham, R. M. Brown, B. L. Stuart, L. Grismer, T. J. Papenfuss, W. Böhme, J. M. Savage, S. Carranza, J. L. Grismer, P. Wagner, A. Schmitz, N. B. Ananjeva, and R. F. Inger. 2011. Availability of new Bayesian-delimited gecko names and the importance of character-based species descriptions. Proc. R. Soc. B Biol. Sci. 278:490.
Beugin, M.-P., T. Gayet, D. Pontier, S. Devillard, T. Jombart, and T. Hansen. 2018. A fast likelihood solution to the genetic clustering problem. Methods Ecol. Evol. 9:1006–1016.
Bintanja, R., and R. S. W. van de Wal. 2008. North American ice-sheet dynamics and the onset of 100,000-year glacial cycles. Nature 454:869–72.
Bouckaert, R., J. Heled, D. Kühnert, T. Vaughan, C.-H. Wu, D. Xie, M. A. Suchard, A. Rambaut, and A. J. Drummond. 2014. BEAST 2: A software platform for Bayesian evolutionary analysis. PLoS Comput. Biol. 10:e1003537.
Bradburd, G. S., G. M. Coop, and P. L. Ralph. 2018. Inferring continuous and discrete population genetic structure across space. Genetics 210:33–52.
Brandley, M. C., T. J. Guiher, R. A. Pyron, C. T. Winne, and F. T. Burbrink. 2010. Does dispersal across an aquatic geographic barrier obscure phylogeographic structure in the diamond-backed watersnake (Nerodia rhombifer)? Mol. Phylogenet. Evol. 57:552–560.
Branstetter, M. G., and J. T. Longino. 2019. Ultra-conserved element phylogenomics of New World Ponera (Hymenoptera: Formicidae) illuminates the origin and phylogeographic history of the endemic exotic ant Ponera exotica. Insect Syst. Divers. 3.
Brown, J. L., D. J. Hill, A. M. Dolan, A. C. Carnaval, and A. M. Haywood. 2018. PaleoClim, high spatial resolution paleoclimate surfaces for global land areas. Sci. Data 5:180254.
Bryant, D., R. Bouckaert, J. Felsenstein, N. A. Rosenberg, and A. RoyChoudhury. 2012. Inferring species trees directly from biallelic genetic markers: bypassing gene trees in a full coalescent analysis. Mol. Biol. Evol. 29:1917–1932.
Burbrink, F. T. 2002. Phylogeographic analysis of the cornsnake (Elaphe guttata) complex as inferred from maximum likelihood and Bayesian analyses. Mol. Phylogenet. Evol. 25:465–476.
Burbrink, F. T. 2001. Systematics of the eastern ratsnake complex (Elaphe obsoleta). Herpetol. Monogr. 1–53.
Burbrink, F. T., Y. L. Chan, E. A. Myers, S. Ruane, B. T. Smith, and M. J. Hickerson. 2016. Asynchronous demographic responses to Pleistocene climate change in Eastern Nearctic vertebrates. Ecol. Lett. 19:1457–1467.
Burbrink, F. T., F. Fontanella, R. A. Pyron, T. J. Guiher, and C. Jimenez. 2008. Phylogeography across a continent: The evolutionary and demographic history of the North American racer (Serpentes: Colubridae: Coluber constrictor). Mol. Phylogenet. Evol. 47:274–288.
Burbrink, F. T., F. G. Grazziotin, R. A. Pyron, D. Cundall, S. Donnellan, F. Irish, J. S. Keogh, F. Kraus, R. W. Murphy, B. Noonan, C. J. Raxworthy, S. Ruane, A. R. Lemmon, E. M. Lemmon, and H. Zaher. 2020. Interrogating genomic-scale data for Squamata (lizards, snakes, and amphisbaenians) shows no support for key traditional morphological relationships. Syst. Biol. 69:502–520.
Burbrink, F. T., R. Lawson, and J. B. Slowinski. 2000. Mitochondrial DNA phylogeography of the polytypic North American rat snake (Elaphe obsoleta): A critique of the subspecies concept. Evolution (N. Y). 54:2107–2118.
Case, T. J., R. D. Holt, M. A. McPeek, and T. H. Keitt. 2005. The community context of species’ borders: ecological and evolutionary perspectives. Oikos 108:28–46.
Case, T. J., and M. L. Taper. 2000. Interspecific competition, environmental gradients, gene flow, and the coevolution of species’ borders. Am. Nat. 155:583–605.
Cox, R. T., D. N. Lumsden, and R. B. Van Arsdale. 2014. Possible relict meanders of the Pliocene Mississippi River and their implications. J. Geol. 122:609–622.
De Queiroz, K. 2007. Species concepts and species delimitation. Syst. Biol. 56:879–886.
Derryberry, E. P., G. E. Derryberry, J. M. Maley, and R. T. Brumfield. 2014. hzar: hybrid zone analysis using an R software package. Mol. Ecol. Resour. 14:652–663.
Edwards, S., R. Hopkins, and J. Mallet. 2020. Speciation. Pp. 296–318 in S. Scheiner and D. Mindell, eds. The Theory of Evolution. University of Chicago Press, Chicago.
Faircloth, B. C., J. E. McCormack, N. G. Crawford, M. G. Harvey, R. T. Brumfield, and T. C. Glenn. 2012. Ultraconserved Elements Anchor Thousands of Genetic Markers Spanning Multiple Evolutionary Timescales. Syst. Biol. 61:717–726.
Feder, J. L., S. P. Egan, and P. Nosil. 2012. The genomics of speciation-with-gene-flow. Trends Genet. 28:342–350.
Flouris, T., X. Jiao, B. Rannala, and Z. Yang. 2019. A Bayesian implementation of the multispecies coalescent model with introgression for phylogenomic analysis. Mol. Biol. Evol., doi: 10.1093/molbev/msz296.
Frankham, R., J. D. Ballou, and D. A. Briscoe. 2010. Introduction to conservation genetics. Cambridge University Press.
Frantz, A. C., L. C. Pope, T. R. Etherington, G. J. Wilson, and T. Burke. 2010. Using isolation-by-distance-based approaches to assess the barrier effect of linear landscape elements on badger (Meles meles) dispersal. Mol. Ecol. 19:1663–1674.
Garrick, R. C., J. D. Banusiewicz, S. Burgess, C. Hyseni, and R. E. Symula. 2019. Extending phylogeography to account for lineage fusion. J. Biogeogr. 46:268–278.
Gavrilets, S. 2004. Fitness landscapes and the origin of species. Princeton University Press.
Gay, L., P.-A. Crochet, D. A. Bell, and T. Lenormand. 2008. Comparing clines on molecular and phenotypic traits in hybrid zones: a window on tension zone models. Evolution (N. Y). 62:2789–2806.
Gehara, M., A. A. Garda, F. P. Werneck, E. F. Oliveira, E. M. da Fonseca, F. Camurugi, F. d. M. Magalhães, F. M. F. M. Lanna, J. W. Sites, R. Marques, R. Silveira-Filho, V. A. São Pedro, G. R. Colli, G. C. Costa, and F. T. Burbrink. 2017. Estimating synchronous demographic changes across populations using hABC and its application for a herpetological community from northeastern Brazil. Mol. Ecol. 26:4756–4771.
Gibbs, H. L., S. J. Corey, G. Blouin-Demers, K. A. Prior, and P. J. Weatherhead. 2006. Hybridization between mtDNA-defined phylogeographic lineages of black ratsnakes (Pantherophis sp.). Mol. Ecol. 15:3755–3767.
Gourbière, S., and J. Mallet. 2010. Are species real? The shape of the species boundary with exponential failure, reinforcement, and the “missing snowball.” Evolution (N. Y). 64:1–24.
Graham, R. W., and J. I. Mead. 1987. Environmental fluctuations and evolution of mammalian faunas during the last deglaciation in North America. Pp. 371–402 in North America and Adjacent Oceans During the Last Deglaciation. Geological Society of America, Boulder, Colorado 80301.
Harrison, R. G. 1993. Hybrid zones and the evolutionary process. Oxford University Press.
Hewitt, G. 2000. The genetic legacy of the Quaternary ice ages. Nature 405:907–913.
Hewitt, G. M. 2008. Speciation, hybrid zones and phylogeography - or seeing genes in space and time. Mol. Ecol. 10:537–549.
Hey, J. 2010. Isolation with migration models for more than two populations. Mol. Biol. Evol. 27:905–920.
Hey, J., and R. Nielsen. 2004. Multilocus methods for estimating population sizes, migration rates and divergence time, with applications to the divergence of Drosophila pseudoobscura and D. persimilis. Genetics 167:747–760.
Hijmans, R. J., J. van Etten, and J. Cheng. 2014. Geographic Data Analysis and Modeling [R package raster version 2.5-8]. Comprehensive R Archive Network (CRAN).
Jackson, N. D., B. C. Carstens, A. E. Morales, and B. C. O’Meara. 2016. Species delimitation with gene flow. Syst. Biol. 66:799–812.
Jiggins, C. D., and J. Mallet. 2000. Bimodal hybrid zones and speciation. Trends Ecol. Evol. 15:250–255.
Jombart, T. 2008. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics 24:1403–1405.
Jombart, T., S. Devillard, and F. Balloux. 2010. Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genet. 11:94.
Joseph, L., and T. Wilke. 2007. Lack of phylogeographic structure in three widespread Australian birds reinforces emerging challenges in Australian historical biogeography. J. Biogeogr. 34:612–624.
Karrenberg, S., X. Liu, E. Hallander, A. Favre, J. Herforth-Rahmé, and A. Widmer. 2019. Ecological divergence plays an important role in strong but complex reproductive isolation in campions (Silene). Evolution (N. Y). 73:245–261.
Kuhn, M. 2008. Caret: Classification and Regression Training Package. R package version 6.0-77.
Lawson, R., and C. S. Lieb. 1990. Variation and hybridization in Elaphe bairdi (Serpentes: Colubridae). J. Herpetol. 24:280.
Leaché, A. D., T. Zhu, B. Rannala, and Z. Yang. 2019. The spectre of too many species. Syst. Biol. 68:168–181.
Legendre, P., and M. J. Fortin. 2010. Comparison of the Mantel test and alternative approaches for detecting complex multivariate relationships in the spatial analysis of genetic data. Mol. Ecol. Resour. 10:831–844.
Legendre, P., J. Oksanen, and C. J. F. ter Braak. 2011. Testing the significance of canonical axes in redundancy analysis. Methods Ecol. Evol. 2:269–277.
Lek, S., M. Delacoste, P. Baran, I. Dimopoulos, J. Lauga, and S. Aulagnier. 1996. Application of neural networks to modelling nonlinear relationships in ecology. Ecol. Modell. 90:39–52.
Lek, S., and J. F. Guégan. 1999. Artificial neural networks as a tool in ecological modelling, an introduction. Ecol. Modell. 120:65–73.
Libbrecht, M. W., and W. S. Noble. 2015. Machine learning applications in genetics and genomics. Nat. Rev. Genet. 16:321–332.
Lourenço, C. R., K. R. Nicastro, C. D. McQuaid, R. M. Chefaoui, J. Assis, M. Z. Taleb, and G. I. Zardi. 2017. Evidence for rangewide panmixia despite multiple barriers to dispersal in a marine mussel. Sci. Rep. 7:10279. Nature Publishing Group.
Makowsky, R., J. Chesser, and L. J. Rissler. 2009. A striking lack of genetic diversity across the wide-ranging amphibian Gastrophryne carolinensis (Anura: Microhylidae). Genetica 135:169–183. Springer Netherlands.
Mallet, J. 2007. Hybrid speciation. Nature 446:279–283. Nature Publishing Group.
Mayr, E. 1963. Animal species and evolution. Belknap Press of Harvard University Press, Cambridge.
McEntee, J. P., J. G. Burleigh, and S. Singhal. 2018. Dispersal predicts hybrid zone widths across animal diversity: Implications for species borders under incomplete reproductive isolation. bioRxiv 472506. Cold Spring Harbor Laboratory.
Metcalfe, S. E. 2006. Late Quaternary environments of the northern deserts and central transvolcanic belt of Mexico. Ann. Missouri Bot. Gard. 93:258–273. Missouri Botanical Garden Press.
Metcalfe, S., A. Say, S. Black, R. McCulloch, and S. O’Hara. 2002. Wet Conditions during the Last Glaciation in the Chihuahuan Desert, Alta Babicora Basin, Mexico. Quat. Res. 57:91–101.
Myers, E. A., A. D. McKelvy, and F. T. Burbrink. 2020. Biogeographic barriers, Pleistocene refugia, and climatic gradients in the southeastern Nearctic drive diversification in cornsnakes (Pantherophis guttatus complex). Mol. Ecol. 29:797–811. John Wiley & Sons, Ltd.
Myers, E. A., A. T. Xue, M. Gehara, C. L. Cox, A. R. Davis Rabosky, J. Lemos-Espinal, J. E. Martínez-Gómez, and F. T. Burbrink. 2019. Environmental heterogeneity and not vicariant biogeographic barriers generate community-wide population structure in desert-adapted snakes. Mol. Ecol. 28:4535–4548.
Nei, M. 1973. Analysis of gene diversity in subdivided populations. Proc Natl Acad Sci USA 70:3321–3323.
Nosil, P. 2012. Ecological Speciation. Oxford University Press, Oxford.
Nosil, P., and B. J. Crespi. 2006. Ecological divergence promotes the evolution of cryptic reproductive isolation. Proc. R. Soc. B-Biological Sci. 273:991–997.
Nosil, P., J. L. Feder, S. M. Flaxman, and Z. Gompert. 2017. Tipping points in the dynamics of speciation. Nat. Ecol. Evol. 1:1. Nat Ecol Evol.
Noss, R. F., W. J. Platt, B. A. Sorrie, A. S. Weakley, D. B. Means, J. Costanza, and R. K. Peet. 2015. How global biodiversity hotspots may go unrecognized: lessons from the North American Coastal Plain. Divers. Distrib. 21:236–244.
O’Meara, B. C. 2010. New heuristic methods for joint species delimitation and species tree inference. Syst. Biol. 59:59–73.
Orr, H. A. 1995. The population genetics of speciation: the evolution of hybrid incompatibilities. Genetics 139:1805–13. Genetics Society of America.
Paradis, E., and K. Schliep. 2019. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 35:526–528.
Petkova, D., J. Novembre, and M. Stephens. 2016. Visualizing spatial population structure with estimated effective migration surfaces. Nat. Genet. 48:94–100. Nature Publishing Group.
Plummer, M., N. Best, K. Cowles, and K. Vines. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.
Pyron, R. A., and F. T. Burbrink. 2009. Lineage diversification in a widespread species: roles for niche divergence and conservatism in the common kingsnake, Lampropeltis getula. Mol. Ecol. 18:3443–3457.
Rannala, B. 2015. The art and science of species delimitation. Curr. Zool. 61:846–853.
Remington, C. L. 1968. Suture-zones of hybrid interaction between recently joined biotas. Pp. 321–428 in Evolutionary Biology. Springer US, Boston, MA.
Riddle, B. R. 2016. Comparative phylogeography clarifies the complexity and problems of continental distribution that drove A. R. Wallace to favor islands. Proc. Natl. Acad. Sci. U. S. A. 113:7970.
Robison, H. W. 1986. Zoogeographic implications of the Mississippi River Basin. Pp. 267–285 in The Zoogeography of North American Freshwater Fishes. Wiley, New York.
Roux, C., C. Fraïsse, J. Romiguier, Y. Anciaux, N. Galtier, and N. Bierne. 2016. Shedding light on the grey zone of speciation along a continuum of genomic divergence. PLoS Biol. 14:e2000234.
Ruane, S., O. Torres-Carvajal, and F. T. Burbrink. 2015. Independent demographic responses to climate change among temperate and tropical milksnakes (Colubridae: Genus Lampropeltis). PLoS One 10:e0128543.
Satler, J. D., and B. C. Carstens. 2017. Do ecological communities disperse across biogeographic barriers as a unit? Mol. Ecol. 26:3533–3545.
Schultz, K. D. 1996. A monograph of the colubrid snakes of the genus Elaphe Fitzinger. Koelz Scientific Books, Havlicuv.
Seehausen, O., R. K. Butlin, I. Keller, C. E. Wagner, J. W. Boughman, P. A. Hohenlohe, C. L. Peichel, G.-P. Saetre, C. Bank, A. Brannstrom, A. Brelsford, C. S. Clarkson, F. Eroukhmanoff, J. L. Feder, M. C. Fischer, A. D. Foote, P. Franchini, C. D. Jiggins, F. C. Jones, A. K. Lindholm, K. Lucek, M. E. Maan, D. A. Marques, S. H. Martin, B. Matthews, J. I. Meier, M. Most, M. W. Nachman, E. Nonaka, D. J. Rennison, J. Schwarzer, E. T. Watson, A. M. Westram, and A. Widmer. 2014. Genomics and the origin of species. Nat Rev Genet 15:176–192.
Sheehan, S., Y. S. Song, E. Buzbas, D. Petrov, A. Boyko, and A. Auton. 2016. Deep learning for population genetic inference. PLOS Comput. Biol. 12:e1004845.
Singhal, S., and C. Moritz. 2013. Reproductive isolation between phylogeographic lineages scales with divergence. Proceedings. Biol. Sci. 280:20132246.
Soltis, D. E., A. B. Morris, J. S. McLachlan, P. S. Manos, and P. S. Soltis. 2006. Comparative phylogeography of unglaciated eastern North America. Mol. Ecol. 15:4261–4293.
Stankowski, S., J. M. Sobel, and M. A. Streisfeld. 2017. Geographic cline analysis as a tool for studying genome-wide variation: a case study of pollinator-mediated divergence in a monkeyflower. Mol. Ecol. 26:107–122.
Sukumaran, J., and L. L. Knowles. 2017. Multispecies coalescent delimits structure, not species. Proc. Natl. Acad. Sci. U. S. A. 114:1607–1612.
Sun, K., K. A. Meiklejohn, B. C. Faircloth, T. C. Glenn, E. L. Braun, and R. T. Kimball. 2014. The evolution of peafowl and other taxa with ocelli (eyespots): a phylogenomic approach. Proc. R. Soc. B Biol. Sci. 281:20140823.
Swenson, N. G., and D. J. Howard. 2005. Clustering of contact zones, hybrid zones, and phylogeographic breaks in North America. Am. Nat. 166:581–591.
Vandewege, M. W., D. Rodriguez, J. P. Weaver, T. D. Hibbitts, M. R. J. Forstner, and L. D. Densmore. 2012. Evidence of hybridization between Elaphe bairdi and Elaphe obsoleta lindheimeri including comparative population genetics inferred from microsatellites and mitochondrial DNA. J. Herpetol. 46:56–63. Society for the Study of Amphibians and Reptiles.
Vavrek, M. J. 2020. Fossil. https://cran.r-project.org/web/packages/fossil/index.html.
Walker, D., and J. C. Avise. 1998. Principles of phylogeography as illustrated by freshwater and terrestrial turtles in the southeastern United States. Annu. Rev. Ecol. Syst. 29:23–58.
Waltari, E., R. J. Hijmans, A. T. Peterson, Á. S. Nyári, S. L. Perkins, and R. P. Guralnick. 2007. Locating Pleistocene refugia: comparing phylogeographic and ecological niche model predictions. PLoS One 2:e563.
Werler, J. E., and J. R. Dixon. 2000. Texas snakes: identification, distribution, and natural history. 1st ed. University of Texas Press, Austin.
Wright, S. 1978. Evolution and the Genetics of Populations. Vol. 4. Variability within and among Natural Populations. University Chicago Press.
Wright, S. 1931. Evolution in Mendelian populations. Genetics 16:97–159.
Wright, S. 1943. Isolation by Distance. Genetics 28:114–38. Genetics Society of America.
Younger, J. L., P. Dempster, Á. S. Nyári, T. O. Helms, M. J. Raherilalao, S. M. Goodman, and S. Reddy. 2019. Phylogeography of the Rufous vanga and the role of bioclimatic transition zones in promoting speciation within Madagascar. Mol. Phylogenet. Evol. 139:106535.
Zellmer, A. J., M. M. Hanes, S. M. Hird, and B. C. Carstens. 2012. Deep phylogeographic structure and environmental differentiation in the carnivorous plant Sarracenia alata. Syst. Biol. 61:763–777.
Zhang, C., D.-X. Zhang, T. Zhu, and Z. Yang. 2011. Evaluation of a bayesian coalescent method of species delimitation. Syst. Biol. 60:747–761.
Zhang, W. 2010. Computational Ecology: Artificial Neural Networks and Their Applications. World Scientific.
Table 1. Measures of divergence dates in millions of years (My), migration rates in individuals/generation (first number is first taxon into second taxon, e.g., Pantherophis alleghaniensis into P. spiloides, and second vice versa), and cline widths for each contacting group in the Pantherophis obsoletus complex.

| Group | Divergence Date (My) | Migration Rate (ind/gen) | Cline Width (km) |
| --- | --- | --- | --- |
| Eastern (P. alleghaniensis x P. spiloides) | 0.65 (0.367-0.875) | 2.79/3.79 | 125 (114-138) |
| Central (P. spiloides x P. obsoletus) | 1.08 (0.55-1.67) | 1.96/1.28 | 97 (88-107) |
| Western (P. obsoletus x P. bairdi) | 0.367 (0.37-0.88) | 0.65/0.55 | 8 (4-15) |
Figures and Legends
Fig. 1. Distribution and population delimitation for the ratsnakes of the Pantherophis obsoletus complex. (A) Map showing range and geographic features: Chihuahuan Desert Province, the Mississippi River, and the Appalachian Mountains and Apalachicola/Chattahoochee River System (AARS). (B) Estimated effective migration surfaces (EEMS) showing areas of both high (blue) and low (brown) migration. (C) Population structure using Discriminant Analysis of Principal Components (DAPC) shows the location of four geographically distinct lineages. (D) Bi-plots of PC space show relative distances among the four taxa using DAPC.
Fig. 2. The results of SNAPP species-tree estimation for the four taxa over all loci (A) and models showing Isolation Migration (IM), Isolation Migration with Demographic Change (IMD; best-fit model), and Isolation with Recent Migration and Demographic Change (IMRD) estimated in PipeMaster. Colored arrows represent migration among spatially adjacent pairs. (B) Photographs representing some of the typical color patterns seen in ratsnakes: 1) Pantherophis alleghaniensis (photo F. Burbrink), 2) P. alleghaniensis (photo N. Claunch), 3) P. spiloides (photo A. Meier), 4) P. obsoletus (photo D. Shepard), and 5) P. bairdi (photo E. Myers).
Fig. 3. Results from PipeMaster showing estimates and directionality of gene flow (A), divergence times between groups and taxon pairs (B), and changes in population sizes over time (C).
Fig. 4. Artificial neural network results showing accuracy predicting genetic structure among all lineages and species pairs (left) and ranked variable importance (filtered for multicollinearity in inset box, right).
Fig. 5. Results from estimating hybrid types between taxon pairs using snapclust showing the spatial distribution of parentals and hybrids along with their frequencies.
[Figure 5 image. Panels: Eastern Lineages, Central Lineages, Western Lineages. Map axes: Longitude (−110 to −70) and Latitude (25 to 45). Legend classes: parentals (P. alleghaniensis, P. spiloides, P. obsoletus, P. bairdi), F1, and backcrosses to P. alleghaniensis, P. spiloides, and P. obsoletus. Bar charts give the frequency of each class per panel.]
Fig. 6. Cline model results for all three species pairs using the Gaussian cline model in HZAR, given locations and admixture estimates from DAPC.
[Figure 6 image. Panels: Eastern Lineages (Pantherophis alleghaniensis x Pantherophis spiloides), Central Lineages (Pantherophis spiloides x Pantherophis obsoletus), Western Lineages (Pantherophis bairdi x Pantherophis obsoletus). Axes: Distance (km) versus Frequency (0.0 to 1.0). Annotated reference points: −80.57 longitude, 25.40 latitude (~Homestead, FL); −97.42 longitude, 35.18 latitude (~Oklahoma City, OK); −104.02 longitude, 30.68 latitude (~McDonald Observatory, TX). Annotated cline estimates: Center = 797 km, Width = 98 km; Center = 617 km, Width = 8 km; Center = 753 km, Width = 126 km, Right Tail = 53 km.]
Supplemental Material
DRYAD XXX
Genomic Dataset
Geographic Localities
R scripts

Supporting Information Material 1
Additional Methods

Table S1 – Priors and summary stats for PipeMaster; estimating migration, timing of divergence, and historical demography
Table S2 – PHRAPL Results (open with R read.table() or Excel)
Table S3 – RDA and ANN Results
Table S4 – HZAR Results for snapclust assignments
Table S5 – HZAR results for DAPC assignments

Fig. S1 – Delimiting populations using sparse nonnegative matrix factorization (SNMF) and spatial Principal Component analysis (sPCA) showing the location of four geographically distinct lineages.

Fig. S2 – PipeMaster model fit of observed data to simulations under IM, IMD, and IMD-LGM models for PC1-PC10.

Fig. S3 – Symmetric background niche similarity tests for spatially adjacent pairs (eastern, central, and western lineage comparisons) for the D and I statistics.

Fig. S4 – Prediction of cline width under neutrality given origin times for each species pair: A) Pantherophis alleghaniensis/P. spiloides, B) P. spiloides/P. obsoletus, and C) P. obsoletus/P. bairdi.

Supporting Information Material 1

Supplemental Methods and Results

Dataset

We sampled 288 individuals liberally covering the range of all species within the Pantherophis obsoletus complex (Fig. 1; Dryad XXX). DNA was extracted from all samples using Qiagen DNeasy Blood & Tissue Kits, and samples were screened for quality using broad-range Qubit Assays (https://www.thermofisher.com/us/en/home/industrial/spectroscopy-elemental-isotope-analysis/molecular-spectroscopy/fluorometers/qubit/qubit-assays.html). We used services from RAPiD Genomics (https://www.rapid-genomics.com/services/) to generate 5472 baits and to sequence 5060 ultraconserved element (UCE) loci following the protocols of Faircloth et al. (2012) and Sun et al. (2014).

Raw sequence reads were trimmed of adapter contamination using illumiprocessor (Faircloth, 2013), a wrapper around the trimmomatic package (Bolger et al., 2014). These sequence capture data were then assembled by mapping reads to all UCEs found within a Chromium 10x assembled Pantherophis spiloides genome (Burbrink and Myers, in prep) after 2,500 base pairs of flanking DNA around UCE baits were pulled from this genome assembly using the Phyluce commands phyluce_probe_run_multiple_lastzs_sqlite and phyluce_probe_slice_sequence_from_genomes (Faircloth, 2016). Sequence reads for each sample were then mapped to the ‘reference’ UCE dataset using bwa v0.7.17 (Li and Durbin, 2009). Samtools was used to convert sam format to bam files, and alignments were soft clipped using the CleanSam tool in Picard (http://broadinstitute.github.io/picard/).

To phase these data, we mapped reads back to each individual sample’s alignments. To do this, bam files were converted to fastq using samtools mpileup (Li et al., 2009), fastq files were converted to fasta format via seqtk (https://github.com/lh3/seqtk), and all missing sites were removed using seqkit (Shen et al., 2016). These newly created fasta files were then used as reference sequences for phasing the data following the seqcap_pop pipeline (Harvey et al., 2016). Briefly, we followed this pipeline and used GATK (McKenna et al., 2010) to mark duplicates; call, realign, and annotate/mask indels; call and annotate SNPs; restrict SNP calling to high-quality SNPs; and conduct read-backed phasing. Finally, we created a sample-specific fasta file of all phased loci for every sample using ‘add_phased_snps_to_seqs_filter.py’, available via the seqcap_pop pipeline (Harvey et al., 2016).

All samples were combined into locus-specific fasta files using a perl script developed in Myers et al. (2019). Locus-specific fasta files were aligned using muscle as implemented in the R package ape (Paradis and Schliep, 2019) after removing individuals from alignments that fell below the 25% quantile of the distribution of sequence lengths for each locus. All fasta files were conservatively trimmed of missing data, resulting in alignments with no missing sites (i.e., trimming to the shortest sequence in the alignment). We filtered these alignments by removing all loci that were missing >50% of sequenced samples and removed all individuals that were missing from >30% of all alignments. Finally, we created a vcf file from these alignments using vcftools (Danecek et al., 2011) and filtered it for minor allele frequency >10%, retaining only 1 SNP/locus for population assignment analyses (Linck and Battey, 2019).
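
The final filtering step (minor allele frequency >10%, one SNP per locus) can be illustrated with a minimal Python sketch. This is not the vcftools workflow used in the study: the data layout, the function names, and the choice to draw the retained SNP at random are assumptions made only for illustration, and the toy genotypes are hypothetical.

```python
import random
from collections import defaultdict


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one biallelic SNP.

    `genotypes` holds alternate-allele counts per diploid individual
    (0, 1, or 2), with None marking a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(snps, min_maf=0.10, seed=1):
    """Keep SNPs with MAF > min_maf, then retain one SNP per locus.

    `snps` is a list of (locus_id, position, genotypes) tuples.
    One passing SNP per locus is drawn at random; this is an assumption,
    since the source does not say how the single SNP was chosen.
    """
    rng = random.Random(seed)
    by_locus = defaultdict(list)
    for locus_id, position, genotypes in snps:
        if minor_allele_frequency(genotypes) > min_maf:
            by_locus[locus_id].append((locus_id, position, genotypes))
    return [rng.choice(by_locus[locus]) for locus in sorted(by_locus)]


# Toy data: two loci, three SNPs (hypothetical values).
toy_snps = [
    ("uce-1", 10, [0, 1, 2, 1, 0]),     # alt frequency 0.4, kept
    ("uce-1", 55, [0, 0, 0, 0, 1]),     # alt frequency 0.1, dropped (not > 0.10)
    ("uce-2", 7, [2, 2, 1, None, 2]),   # alt frequency 7/8, MAF 0.125, kept
]
kept = filter_snps(toy_snps)
```

Retaining a single SNP per locus is a common way to reduce the effect of physical linkage within a locus on clustering methods.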

Geographic groupings

We estimated population structure by comparing Discriminant Analysis of Principal Components (DAPC; Jombart et al., 2010) and spatial Principal Component analysis (sPCA; Jombart et al., 2008) in adegenet v2.1.2 (Jombart 2008), sparse nonnegative matrix factorization (SNMF; Frichot et al., 2014) in LEA v1.4.0 (Frichot and François 2015), and estimated effective migration surfaces (EEMS v; Petkova et al., 2016), which accounts for IBD. Each of these methods uses different assumptions for grouping individuals, and we compare congruence among them. The model-free method DAPC sequentially estimates K-means for clustering and selects the best fit to infer genetic clusters in the absence of prior group identification. Using the package adegenet (Jombart 2008) in R (R Core Team 2015), we first transformed the data using PCA, retaining a large number of PC axes (n=200), and then chose the number of discriminant functions yielding large F-statistics (>2000) and the optimal number of groups using BIC (Bayesian information criterion). Because using a large number of PCs may yield arbitrary solutions, we estimated the optimal a-score, which measures this bias by calculating the difference between the actual cluster assignment probabilities and randomly assigned probabilities (predicting 15-23 PCs). We also used cross-validation with a 90% training and 10% test dataset to estimate the average predicted success for each group.
Because DAPC does not account for spatial structure, such as IBD, we also used sPCA. Similar to DAPC, this method does not require data to be in Hardy-Weinberg or linkage equilibrium. sPCA also adds an element of space by isolating global structure, representing disconnected groups or clines, from local structure, representing repulsion (selection against similar genetic types co-occurring), and random noise. Using adegenet, we estimated sPCA given SNPs and locations of each sample using up to 15 positive eigenvalues (global structures) and 15 negative eigenvalues (local structures). We used Delaunay triangulation to build the connection network among individual locations. We also implemented Monte-Carlo tests with 100 permutations using the function spca_randtests to determine whether significant local or global spatial structures exist.
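
The logic of a Monte-Carlo permutation test for spatial structure can be sketched as follows. This is a stand-in under stated assumptions, not the sPCA test in adegenet: it uses Moran's I on a simple neighbor graph as the test statistic, and the site values, the chain-shaped network, and the function names are all hypothetical.

```python
import random


def morans_i(values, edges):
    """Moran's I with binary weights; `edges` lists undirected (i, j) pairs."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Each undirected edge contributes in both directions.
    cross = 2 * sum(dev[i] * dev[j] for i, j in edges)
    weight_sum = 2 * len(edges)
    variance_term = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_term)


def permutation_p_value(values, edges, n_perm=100, seed=1):
    """One-sided Monte-Carlo p-value for positive spatial autocorrelation.

    Values are shuffled across sites n_perm times; the p-value counts the
    observed statistic as one member of the reference distribution.
    """
    rng = random.Random(seed)
    observed = morans_i(values, edges)
    shuffled = list(values)
    n_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, edges) >= observed:
            n_as_extreme += 1
    return observed, (n_as_extreme + 1) / (n_perm + 1)


# Hypothetical scores along a chain of eight sites, rising smoothly (a cline).
site_values = [0.0, 0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 1.0]
chain_edges = [(i, i + 1) for i in range(len(site_values) - 1)]
observed_i, p_value = permutation_p_value(site_values, chain_edges)
```

With 100 permutations, the smallest p-value this convention can return is 1/101, so the resolution of the test is set by the number of permutations.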
We also assessed ancestry coefficients using SNMF in the package LEA (Frichot and François 2015). This methodology is comparable to ADMIXTURE (Alexander et al. 2009) and Structure (Pritchard et al. 2000), but runs 10-30x faster using genomic-scale data. This method produces least-squares estimates of ancestry proportions given K ancestral populations. We estimated ancestry coefficients over 1-6 K using 100 repetitions per K, 100 iterations per algorithm, and a tolerance of 0.05. To determine the appropriate number of groups, we used a cross-validation technique with the entropy criterion, which masks genotypes to fit a particular model to each K.
Finally, to compare these methods of grouping individuals into populations, we estimated migration rates over the landscape using EEMS (Petkova et al. 2016). This method models effective migration rates over geography to represent regions where migration is low in cases when genetic dissimilarity increases rapidly, thus providing a distinct view of the location of population clusters relative to biogeographic barriers as compared to DAPC, sPCA, and SNMF. We ran EEMS 3 times for 3 x 10^6 generations with burnin set at 1 x 10^6 generations, thinned by 9,999 generations, with 1,000 demes covering the known range of the P. obsoletus complex (Burbrink 2001).
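
As a rough check on how many posterior samples these settings retain, the arithmetic can be sketched in Python. The sketch assumes that the thinning value is the number of iterations skipped between two saved samples, so one sample is saved every (thin + 1) iterations after burn-in; that convention and the function name are assumptions, not something stated in the text.

```python
def retained_samples(n_iterations, n_burnin, n_thin):
    """Number of saved MCMC samples, assuming one sample is written
    every (n_thin + 1) iterations once burn-in is complete."""
    return (n_iterations - n_burnin) // (n_thin + 1)


per_run = retained_samples(3_000_000, 1_000_000, 9_999)  # settings from the text
across_runs = 3 * per_run  # three independent runs
```

Under this assumption each run keeps 200 samples, or 600 across the three runs.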
Both EEMS and DAPC found population structure that generally matched the geographic ranges of P. alleghaniensis, P. spiloides, P. obsoletus, and P. bairdi (Fig. 1 and Fig. S1; Burbrink, 2001). Total DAPC assignment probabilities were 0.975 (P. bairdi = 1.0, P. obsoletus = 0.96, P. spiloides = 1.0, and P. alleghaniensis = 0.97). Comparing these assignments to those from SNMF, we found that only P. alleghaniensis was misclassified as P. spiloides, 8.5% of the time. Similarly, SNMF cross-validation predicted four groups (cv = 0.599) as compared to two (cv = 0.627), three (cv = 0.607), and five (cv = 0.601). These four groups generally match the same ranges as those found in DAPC. sPCA showed the presence of four groups, with PC1 showing the separation of groups at the MR and the two lineages east of this river, whereas PC2-5 show divergence in all taxa. Importantly, global spatial structures were significant (observed spatial structure test statistic = 75.47, random expectation = 28.14; P < 0.009). Similarly, all three EEMS runs suggest low migration at the MR and in the area separating P. bairdi and P. obsoletus in west Texas. East of the MR, estimated low migration occurred near the Appalachian Mountains, though this area is a complex mixture of isolation and gene flow (Fig. 1). All three EEMS runs generated similar acceptance proportions for all proposal types (12-51%).

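
The choice of K from the SNMF cross-validation scores reported above amounts to picking the lowest value. A minimal Python sketch, using only the four scores given in the text (the variable names are illustrative):

```python
# Cross-validation (cross-entropy) scores reported in the text, keyed by K.
cross_validation = {2: 0.627, 3: 0.607, 4: 0.599, 5: 0.601}

# Lower cross-entropy indicates better prediction of masked genotypes.
best_k = min(cross_validation, key=cross_validation.get)

# Score differences relative to the best K show how flat the curve is.
delta = {k: round(score - cross_validation[best_k], 3)
         for k, score in cross_validation.items()}
```

The difference between K = 4 and K = 5 is only 0.002, which is one reason to check the SNMF result against DAPC, sPCA, and EEMS rather than rely on it alone.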
Running SNAPP

We set forward and reverse mutation rates to 1. We applied a gamma distribution (α = 2, β = 200) for the speciation rate (λ) prior, and the snapprior was set to α = 1, β = 250, and κ = 1. To test the four alternative species delimitation models, including four taxa, three taxa (collapsing P. bairdi/P. obsoletus or P. alleghaniensis/P. spiloides), and two taxa (collapsing P. bairdi/P. obsoletus and P. alleghaniensis/P. spiloides), we used a stepping-stone analysis with the PathSampleAnalyser with 72 steps, a chain length of 200,000, a burnin percentage of 50%, and a preBurnin of 10,000. We checked for stationarity using Tracer v1.7.1 (Drummond and Rambaut 2007), determining that effective sample sizes (ESS) were > 200 for all parameters.

Running PHRAPL

PHRAPL uses gene-tree distances between simulated and empirical data to rank models using approximate likelihoods and AICs. To generate the empirical observations, we estimated unrooted gene trees for each locus using IQ-Tree 1.6.12 (Nguyen et al. 2015), testing all substitution models per locus prior to estimation. We rooted the trees using midpoint rooting and performed 100 subsamples of two individuals per population. To reduce model space, we allowed only one population size parameter (all populations have equal size) and one migration parameter (all migration rates are equal). We then ran 1000 simulations per model, searching over a grid of four CollapseStarts values and five MigrationStarts values.
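
The ranking of models by AIC and the size of the start-value grid can be sketched in Python. This is not PHRAPL code: the start values, model names, log-likelihoods, and parameter counts below are hypothetical placeholders, and only the grid dimensions (four by five) come from the text.

```python
import math
from itertools import product


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_by_model):
    """Convert AIC values to relative model weights that sum to 1."""
    best = min(aic_by_model.values())
    relative = {m: math.exp(-0.5 * (a - best)) for m, a in aic_by_model.items()}
    total = sum(relative.values())
    return {m: r / total for m, r in relative.items()}


# Hypothetical start values: four collapse times and five migration rates.
collapse_starts = [0.3, 0.58, 1.11, 2.12]
migration_starts = [0.1, 0.22, 0.46, 1.0, 2.15]
grid = list(product(collapse_starts, migration_starts))  # 4 x 5 combinations

# Hypothetical approximate log-likelihoods and free-parameter counts.
fits = {
    "isolation_only": (-1250.0, 2),
    "isolation_with_migration": (-1240.0, 3),
}
aic_by_model = {m: aic(lnl, k) for m, (lnl, k) in fits.items()}
weights = akaike_weights(aic_by_model)
best_model = min(aic_by_model, key=aic_by_model.get)
```

Each grid point is one starting combination for the optimizer, so a four-by-five grid means 20 starting combinations per model.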
1024
Niche models
We determined whether the four taxa occupy distinct niches. We quantified niche similarity following Warren et al. (2008), using Schoener’s D (Schoener 1968) and the Hellinger-distance-based statistic I. As inputs, we used the georeferenced genetic samples, predictions of species identity from DAPC, and 19 bioclimatic variables cropped to the eastern half of the US (Fick and Hijmans 2017). Estimates of D and I between adjacent species pairs were calculated, and significance of niche identity was derived from 500 random permutations of lineage identity and re-estimation of the ecological niche model (ENM) using Maxent (Phillips and Dudik 2008) in the R package ENMTools (Warren et al. 2010). Because environmental differences between the regions occupied by a lineage pair may themselves account for differences between niches estimated from the exact locations of individuals, we used the background test to produce a null distribution of ENM similarity, run symmetrically between a lineage and randomized occurrence points within the range of the adjacent lineage (using a radius of 20 km and 1000 points). Therefore, if D and I from the identity tests were significantly lower than the background test null distribution, then similarity between species, provided that accessible habitat exists in the region, is rejected (Warren et al. 2008).
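The two overlap statistics and a permutation-style P value can be sketched in a few lines of Python. This is a minimal illustration rather than the ENMTools implementation: it assumes each ENM has been reduced to a list of suitability scores over the same grid cells, it uses the form I = 1 - 0.5 * sum((sqrt(px) - sqrt(py))^2), and the suitability and null values are invented for the example.

```python
import math


def normalize(suitability):
    """Rescale suitability scores so they sum to 1 across grid cells."""
    total = sum(suitability)
    return [s / total for s in suitability]


def schoener_d(px, py):
    """Schoener's D: 1 minus half the summed absolute differences."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))


def hellinger_i(px, py):
    """I statistic based on the Hellinger distance (assumed form, see lead-in)."""
    return 1.0 - 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(px, py))


def lower_tail_p(observed, null_values):
    """One-sided P value: share of null replicates at or below the observed value."""
    n_le = sum(1 for v in null_values if v <= observed)
    return (n_le + 1) / (len(null_values) + 1)


# Hypothetical suitability scores for two lineages over four grid cells.
lineage_a = normalize([1, 2, 3, 4])
lineage_b = normalize([4, 3, 2, 1])
d_obs = schoener_d(lineage_a, lineage_b)
i_obs = hellinger_i(lineage_a, lineage_b)

# Hypothetical null D values from permuted lineage labels.
null_d = [0.82, 0.79, 0.91, 0.85, 0.88]
p_value = lower_tail_p(d_obs, null_d)

print(round(d_obs, 3), round(i_obs, 3), round(p_value, 3))
```

Both statistics range from 0 (no overlap) to 1 (identical models), so an observed value in the lower tail of the null distribution indicates less overlap than expected.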
Identity tests for niche overlap were significant for both the D and I statistics (P < 1.6 x 10^-12 to 4.02 x 10^-214) for all adjacent pairs. In the background tests, observed D and I were significantly lower than the null distributions for all pairs of taxa (P = 0.003 to 0.019; Fig. S3). These results suggest that the niches of adjacent taxa overlap less than expected by chance.
Spatial genetics using RDA
We predicted genetic distances from environmental, elevation, river, and distance variables using redundancy analysis (RDA) in the R package vegan (Dixon 2003), which avoids problems associated with applying Mantel tests to spatial data (Legendre et al. 2015). We used this method as a comparison for the results from the ANN analyses. RDA is an asymmetric canonical analysis, used here to determine the significance of predictor axes (Legendre et al. 2011) in predicting genetic structure over a landscape (McGaughran et al. 2014; Noguerales et al. 2016). We ran this using the capscale function, in which dissimilarity data are ordinated with metric scaling and then passed to RDA. All variable distances were used to predict genetic distance, and the significance of each variable was tested using ANOVA-like permutation tests to examine the joint effect of predictor variables given partial effects from spatial distance to account for isolation by distance (IBD). Because some of these variables might be correlated, we assessed multicollinearity and reran RDA using the reduced set of predictor variables. To examine how multicollinearity affects model inference given linear dependence among these independent variables (De Veaux and Ungar 1994; Hastie et al. 2001; Shan et al. 2006; Dormann et al. 2013), we used stepwise variance inflation factors in the package Rnalytica (https://rdrr.io/github/software-analytics/Rnalytica/). The function stepwise.vif selects non-correlated variables by excluding, in turn, the variable with the highest variance inflation factor above five.
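The stepwise filtering described above can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not the Rnalytica code: VIFs are taken as the diagonal of the inverse correlation matrix of the predictors, the variable with the largest VIF is dropped while that VIF exceeds the threshold, and the predictor names and values are hypothetical.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    centered = [[v - sum(c) / n for v in c] for c in columns]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(data):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    names = list(data)
    if len(names) == 1:
        return {names[0]: 1.0}
    inv = invert(correlation_matrix([data[n] for n in names]))
    return {name: inv[i][i] for i, name in enumerate(names)}


def stepwise_vif(data, threshold=5.0):
    """Drop the predictor with the highest VIF until none exceeds the threshold."""
    kept = dict(data)
    while len(kept) > 1:
        scores = vif(kept)
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del kept[worst]
    return list(kept)


# Hypothetical predictors: 'current' is nearly a copy of 'lgm'.
predictors = {
    "elevation": [1.0, 4.0, 2.0, 8.0, 5.0, 7.0, 3.0, 6.0],
    "lgm": [2.0, 3.0, 5.0, 4.0, 7.0, 6.0, 9.0, 8.0],
    "current": [2.1, 2.9, 5.2, 4.1, 6.8, 6.1, 9.1, 7.9],
}
selected = stepwise_vif(predictors)
print(selected)
```

In this toy case one of the two nearly collinear climate layers is removed and the weakly correlated predictor is retained, which is the behavior the reduced RDA relies on.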
Estimates of accuracy using ANN were high (>90%) for all comparisons and yielded similar conclusions to RDA. RDA demonstrated that most of the structure across all taxa can be predicted by the Mississippi River, MIS19 (climate at 787 Kya), and elevation (P = 0.001-0.002), and these predictors similarly showed 100% variable importance using ANN (Table S3). Distance also played a role in structuring these data, though it was partialed out in RDA. For lineages east of the MR, population structure was predicted by LGM and elevation (RDA: P = 0.001; ANN: variable importance = 100). For lineages west of the MR, MIS19 and elevation structured these populations (RDA: P = 0.001; ANN: variable importance = 100).
Hybrid zone modeling in HZAR using snapclust and DAPC assignments:
We also fit clines for the same species pairs using individual genetic assignments, either as pure parental (0, 1), backcross (0.25, 0.75), and F1 (0.5, 0.5) classes or, alternatively, as adegenet prediction probabilities, in the program HZAR v.2-9 (Derryberry et al. 2014). Using the Gaussian cline model, we estimated the center and width of the cline and determined whether these sigmoidal distributions have significant tails by fitting the following models: 1) no tails, 2) right tail only, 3) left tail only, 4) mirrored tails, and 5) both tails estimated independently (see Derryberry et al. 2014). We compared the fit of these models to our data using AICc, ran the MCMC chains for 5 x 10^6 generations, thinned every 5 x 10^3 generations, and assessed stationarity using ESS > 200 in the R package coda (Plummer et al. 2006).
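To make the cline parameters and the model comparison concrete, the sketch below evaluates a tail-less sigmoid cline and ranks the five tail models by AICc. It is not HZAR code: the sigmoid form, in which width is the inverse of the maximum slope, is assumed here for illustration; the AICc values are those listed in Table S4 for P. alleghaniensis x P. spiloides (snapclust), and the center and width are the estimates from the same table.

```python
import math


def sigmoid_cline(x, center, width, p_min=0.0, p_max=1.0):
    """Expected ancestry at position x (km) for a sigmoid cline without tails."""
    return p_min + (p_max - p_min) / (1.0 + math.exp(-4.0 * (x - center) / width))


def aicc(log_likelihood, n_params, n_obs):
    """Small-sample corrected AIC."""
    aic = 2.0 * n_params - 2.0 * log_likelihood
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def rank_by_aicc(scores):
    """Return (model, delta AICc) pairs sorted from best to worst."""
    best = min(scores.values())
    return sorted(((m, s - best) for m, s in scores.items()), key=lambda t: t[1])


# AICc values from Table S4 (P. alleghaniensis x P. spiloides, snapclust).
table_s4_east = {
    "NONE": -65.158,
    "RIGHT": -67.067,
    "LEFT": -60.988,
    "MIRROR": -61.575,
    "BOTH": -62.885,
}
ranking = rank_by_aicc(table_s4_east)

# Expected ancestry at the estimated center (729 km) with width 141 km.
ancestry_at_center = sigmoid_cline(729.0, center=729.0, width=141.0)

# Number of MCMC samples retained after thinning 5 x 10^6 generations by 5 x 10^3.
retained_samples = (5 * 10**6) // (5 * 10**3)

print(ranking[0], ancestry_at_center, retained_samples)
```

With these values the right-tailed model ranks first, and the no-tails model trails it by about 1.91 AICc units.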
We found agreement between snapclust and DAPC assignments for the best-fit model and for parameter estimates. For the eastern lineages, HZAR showed strong support for a right-tailed model (ΔAIC_snapclust = -1.91 to -6.08, ΔAIC_DAPC = -31.96 to -107.42), whereas central lineages at the MR (ΔAIC_snapclust = -4.28 to -8.70, ΔAIC_DAPC = -4.26 to -8.869) and western lineages at the Desert and Edwards Plateau (ΔAIC_snapclust = -5.51 to -221.35, ΔAIC_DAPC = -5.51 to -221.35) both supported a no-tails model (Fig. 6; Tables S4 & S5).
Genomics and the ICZN
Another problem likely to be encountered by other researchers using genomic data is that previously named type specimens at particular type localities may be from regions that show some admixture. For instance, the type locality of the easternmost species, P. alleghaniensis, is “the summit of the Blue Ridge in Virginia and the Highlands of the Hudson” (Holbrook 1836; Schultz 1996). Both areas are likely composed of admixed individuals, and the International Code of Zoological Nomenclature (ICZN) forbids naming species based on hybrids (Article 1.3.3; ICZN 1999), though the intention of this article did not consider proportions of admixture. One possible solution would be to use the next available name from Florida, where no admixture is present (eliminating P. quadrivittata), which would be P. deckerti (type locality: Lower Matecumbe Key, FL; Brady 1932). Another solution would be to designate a neotype from a locality showing no admixture and retain P. alleghaniensis. If P. alleghaniensis and P. spiloides were not considered distinct, then P. alleghaniensis has priority by 18 years (Holbrook 1836; Duméril et al. 1854).
The type localities for the remaining taxa, P. spiloides (New Orleans, Louisiana; Duméril et al. 1854), P. obsoletus (On the Missouri River from the Vicinity Isle au Vache to Council Bluff; Say 1823), and P. bairdi (Fort Davis, Apache Mountains, Jeff Davis Co., Texas; Yarrow 1880), are all from regions that likely represent non-admixed individuals (Figs. 1 & 5; Burbrink 2001). Species delimitation methods, estimates of fixation for loci, and the observation that these four lineages have remained geographically distinct in the face of gene flow throughout the Pleistocene suggest that these taxa should be considered distinct species even though hybrid zones exist. We acknowledge difficulties in recognizing the eastern lineages as distinct, and one could argue for recognizing them as a single taxon, P. alleghaniensis.
Table S1. Priors and summary stats for PipeMaster for the IM, IMD, and IMD-LGM models, estimating migration, timing of divergence, and historical demography.

IM

| # | PARAMETER | PRIOR.1 | PRIOR.2 | DISTRIBUTION |
|---|---|---|---|---|
| 1 | Ne0.pop1 | 20000 | 500000 | runif |
| 2 | Ne0.pop2 | 20000 | 500000 | runif |
| 3 | Ne0.pop3 | 20000 | 500000 | runif |
| 4 | Ne0.pop4 | 20000 | 500000 | runif |
| 5 | mig0.1_2 | 2 | 5 | runif |
| 6 | mig0.1_3 | 0 | 0 | runif |
| 7 | mig0.1_4 | 0 | 0 | runif |
| 8 | mig0.2_1 | 2 | 5 | runif |
| 9 | mig0.2_3 | 2 | 5 | runif |
| 10 | mig0.2_4 | 0 | 0 | runif |
| 11 | mig0.3_1 | 0 | 0 | runif |
| 12 | mig0.3_2 | 2 | 5 | runif |
| 13 | mig0.3_4 | 2 | 5 | runif |
| 14 | mig0.4_1 | 0 | 0 | runif |
| 15 | mig0.4_2 | 0 | 0 | runif |
| 16 | mig0.4_3 | 2 | 5 | runif |
| 17 | join1_2 | 30000 | 1000000 | runif |
| 18 | join3_4 | 30000 | 1000000 | runif |
| 19 | join2_4 | 500000 | 1700000 | runif |

IMD

| # | PARAMETER | PRIOR.1 | PRIOR.2 | DISTRIBUTION |
|---|---|---|---|---|
| 1 | Ne0.pop1 | 10000 | 500000 | runif |
| 2 | Ne0.pop2 | 10000 | 500000 | runif |
| 3 | Ne0.pop3 | 10000 | 500000 | runif |
| 4 | Ne0.pop4 | 10000 | 500000 | runif |
| 5 | mig0.1_2 | 2 | 5 | rtnorm |
| 6 | mig0.1_3 | 0 | 0 | rtnorm |
| 7 | mig0.1_4 | 0 | 0 | rtnorm |
| 8 | mig0.2_1 | 2 | 5 | rtnorm |
| 9 | mig0.2_3 | 2 | 5 | rtnorm |
| 10 | mig0.2_4 | 0 | 0 | rtnorm |
| 11 | mig0.3_1 | 0 | 0 | rtnorm |
| 12 | mig0.3_2 | 2 | 5 | rtnorm |
| 13 | mig0.3_4 | 2 | 5 | rtnorm |
| 14 | mig0.4_1 | 0 | 0 | rtnorm |
| 15 | mig0.4_2 | 0 | 0 | rtnorm |
| 16 | mig0.4_3 | 2 | 5 | rtnorm |
| 17 | Ne1.pop1 | 1000 | 10000 | runif |
| 18 | Ne1.pop2 | 1000 | 10000 | runif |
| 19 | Ne1.pop3 | 1000 | 10000 | runif |
| 20 | Ne1.pop4 | 1000 | 10000 | runif |
| 21 | t.Ne1.pop1 | 3000 | 170000 | runif |
| 22 | t.Ne1.pop2 | 3000 | 170000 | runif |
| 23 | t.Ne1.pop3 | 3000 | 170000 | runif |
| 24 | t.Ne1.pop4 | 3000 | 170000 | runif |
| 25 | join1_2 | 30000 | 1000000 | runif |
| 26 | join3_4 | 30000 | 1000000 | runif |
| 27 | join2_4 | 500000 | 1700000 | runif |

IMD-LGM

| # | PARAMETER | PRIOR.1 | PRIOR.2 | DISTRIBUTION |
|---|---|---|---|---|
| 1 | Ne0.pop1 | 10000 | 500000 | runif |
| 2 | Ne0.pop2 | 10000 | 500000 | runif |
| 3 | Ne0.pop3 | 10000 | 500000 | runif |
| 4 | Ne0.pop4 | 10000 | 500000 | runif |
| 5 | mig0.1_2 | 2 | 5 | rtnorm |
| 6 | mig0.1_3 | 0 | 0 | rtnorm |
| 7 | mig0.1_4 | 0 | 0 | rtnorm |
| 8 | mig0.2_1 | 2 | 5 | rtnorm |
| 9 | mig0.2_3 | 2 | 5 | rtnorm |
| 10 | mig0.2_4 | 0 | 0 | rtnorm |
| 11 | mig0.3_1 | 0 | 0 | rtnorm |
| 12 | mig0.3_2 | 2 | 5 | rtnorm |
| 13 | mig0.3_4 | 2 | 5 | rtnorm |
| 14 | mig0.4_1 | 0 | 0 | rtnorm |
| 15 | mig0.4_2 | 0 | 0 | rtnorm |
| 16 | mig0.4_3 | 2 | 5 | rtnorm |
| 17 | Ne1.pop1 | 1000 | 10000 | runif |
| 18 | Ne1.pop2 | 1000 | 10000 | runif |
| 19 | Ne1.pop3 | 1000 | 10000 | runif |
| 20 | Ne1.pop4 | 1000 | 10000 | runif |
| 21 | t.Ne1.pop1 | 3000 | 170000 | runif |
| 22 | t.Ne1.pop2 | 3000 | 170000 | runif |
| 23 | t.Ne1.pop3 | 3000 | 170000 | runif |
| 24 | t.Ne1.pop4 | 3000 | 170000 | runif |
| 25 | mig1.1_2 | 0 | 0 | runif |
| 26 | mig1.1_3 | 0 | 0 | runif |
| 27 | mig1.1_4 | 0 | 0 | runif |
| 28 | mig1.2_1 | 0 | 0 | runif |
| 29 | mig1.2_3 | 0 | 0 | runif |
| 30 | mig1.2_4 | 0 | 0 | runif |
| 31 | mig1.3_1 | 0 | 0 | runif |
| 32 | mig1.3_2 | 0 | 0 | runif |
| 33 | mig1.3_4 | 0 | 0 | runif |
| 34 | mig1.4_1 | 0 | 0 | runif |
| 35 | mig1.4_2 | 0 | 0 | runif |
| 36 | mig1.4_3 | 0 | 0 | runif |
| 37 | t.mig1.1_2 | 0 | 8700 | runif |
| 38 | t.mig1.1_3 | 0 | 8700 | runif |
| 39 | t.mig1.1_4 | 0 | 8700 | runif |
| 40 | t.mig1.2_1 | 0 | 8700 | runif |
| 41 | t.mig1.2_3 | 0 | 8700 | runif |
| 42 | t.mig1.2_4 | 0 | 8700 | runif |
| 43 | t.mig1.3_1 | 0 | 8700 | runif |
| 44 | t.mig1.3_2 | 0 | 8700 | runif |
| 45 | t.mig1.3_4 | 0 | 8700 | runif |
| 46 | t.mig1.4_1 | 0 | 8700 | runif |
| 47 | t.mig1.4_2 | 0 | 8700 | runif |
| 48 | t.mig1.4_3 | 0 | 8700 | runif |
| 49 | join1_2 | 30000 | 1000000 | runif |
| 50 | join3_4 | 30000 | 1000000 | runif |
| 51 | join2_4 | 500000 | 1700000 | runif |

Table S2 – PHRAPL Results. Available upon request (open with R read.table() or Excel).
Table S3. Results from RDA (F and P values) and ANN (variable importance [VAR_IMP]) for climate, distance, Mississippi River, and elevation variables. The _VIF columns give the significance and importance of variables after variance inflation factors (VIF) are considered.

| VARIABLES | RDA F VALUE | RDA P VALUE | P_VALUE_VIF | NN_VAR_IMP | NN_VAR_IMP_VIF |
|---|---|---|---|---|---|
| ALL TAXA | | | | | |
| MISSISSIPPI | 14.1683 | 0.0001 | 0.001 | 100 | 100 |
| MPW | 1.3642 | 0.06 | NA | 31 | NA |
| LGM | 1.6455 | 0.015 | NA | 91 | NA |
| LIG | 1.4935 | 0.024 | NA | 6 | NA |
| MIS | 1.7603 | 0.005 | 0.001 | 81 | 100 |
| CURRENT | 1.8522 | 0.009 | NA | 0 | NA |
| ELEVATION | 0.9627 | 0.458 | 0.002 | 91 | 100 |
| DISTANCE | Factored | Factored | Factored | 100 | 100 |
| EAST_OF_THE_MS | | | | | |
| LGM | 1.7148 | 0.001 | 0.001 | 70 | 100 |
| LIG | 2.0434 | 0.001 | NA | 100 | NA |
| MIS | 3.1272 | 0.001 | NA | 41 | NA |
| CURRENT | 1.8641 | 0.001 | NA | 98 | NA |
| ELEVATION | 1.0464 | 0.275 | 0.001 | 0 | 100 |
| DISTANCE | Factored | Factored | Factored | 100 | 100 |
| WEST_OF_THE_MS | | | | | |
| LGM | 1.4705 | 0.013 | NA | 72 | NA |
| LIG | 1.1802 | 0.192 | NA | 100 | NA |
| MIS | 0.9094 | 0.632 | 0.001 | 79 | 100 |
| CURRENT | 0.9583 | 0.533 | NA | 91 | NA |
| ELEVATION | 0.626 | 0.984 | 0.31 | 90 | 100 |
| DISTANCE | Factored | Factored | Factored | 100 | 100 |

Table S4. Results from cline modeling using HZAR for each species pair grouping using snapclust ancestry estimates.

PANTHEROPHIS_ALLEGHANIENSIS_X_P.SPILOIDES

| MODEL | Center | Width | Tail_Right | Tail_Left | Tail_Mirror | AICc | ESS |
|---|---|---|---|---|---|---|---|
| NONE | 729 (723-735) | 141 (127-157) | - | - | - | -65.158 | 717-5200 |
| RIGHT | 729 (723-736) | 141 (126-157) | 859 (375-1599) | - | - | -67.067 | 530-37423 |
| LEFT | 729 (723-735) | 141 (126-157) | - | 1143 (593-1679) | - | -60.988 | 829-53146 |
| MIRROR | 729 (723-735) | 141 (126-157) | - | - | 880 (251-1636) | -61.575 | 321-53591 |
| BOTH | 730 (724-736) | 141 (126-157) | 769 (147-1562) | 1226 (719-1723) | - | -62.885 | 358-23637 |

PANTHEROPHIS_SPILOIDES_X_P.OBSOLETUS

| MODEL | Center | Width | Tail_Right | Tail_Left | Tail_Mirror | AICc | ESS |
|---|---|---|---|---|---|---|---|
| NONE | 801 (795-807) | 100 (90-110) | - | - | - | -535.0162 | 14399-95534 |
| RIGHT | 801 (795-808) | 100 (90-110) | 1290 (721-1816) | - | - | -530.734 | 676-79146 |
| LEFT | 801 (795-808) | 100 (90-110) | - | 1276 (709-1835) | - | -530.735 | 20873-100521 |
| MIRROR | 801 (795-807) | 100 (90-110) | - | - | 1307 (762-1848) | -530.732 | 10695-120834 |
| BOTH | 802 (795-808) | 99 (90-110) | 1299 (770-1830) | 885 (323-1638) | - | -526.3181 | 612-32572 |

PANTHEROPHIS_BAIRDI_X_P.OBSOLETUS

| MODEL | Center | Width | Tail_Right | Tail_Left | Tail_Mirror | AICc | ESS |
|---|---|---|---|---|---|---|---|
| NONE | 617 (575-670) | 7 (3-13) | - | - | - | -150.277 | 5-1e5 |
| RIGHT | 617 (575-670) | 10 (4-17) | 1204 (613-1803) | - | - | -144.766 | 14-109080 |
| LEFT | 188 (635-2094) | 70 (6-123) | - | 647 (440-970) | - | 71.072 | 3-523 |
| MIRROR | 642 (590-876) | 12 (5-182) | - | - | 757 (37-1578) | 58.219 | 1-915 |
| BOTH | 616 (574-660) | 9 (4-16) | 1218 (617-1810) | 1198 (609-1810) | - | -138.23 | 9-98456 |

Table S5. Results from cline modeling using HZAR for each species pair grouping using adegenet ancestry estimates.

PANTHEROPHIS_ALLEGHANIENSIS_X_P.SPILOIDES

| MODEL | Center | Width | Tail_Right | Tail_Left | Tail_Mirror | AICc | ESS |
|---|---|---|---|---|---|---|---|
| NONE | 938 (918-954) | 552 (507-599) | - | - | - | 62.806 | 1393-2686 |
| RIGHT | 753 (747-760) | 141 (126-157) | 53 (45-61) | - | - | -44.61 | 970-3782 |
| LEFT | 912 (883-939) | 610 (544-681) | - | 188 (59-1433) | - | -60.988 | 129-514 |
| MIRROR | 760 (746-911) | 155 (116-502) | - | - | 91 (66-177) | 64.53 | 6-350 |
| BOTH | 750 (740-757) | 120 (64-136) | 53 (47-60) | 819 (278-1559) | - | -12.65 | 33-1717 |

PANTHEROPHIS_SPILOIDES_X_P.OBSOLETUS

| MODEL | Center | Width | Tail_Right | Tail_Left | Tail_Mirror | AICc | ESS |
|---|---|---|---|---|---|---|---|
| NONE | 797 (790-803) | 97 (89-107) | - | - | - | -528.48 | 12066-16809 |
| RIGHT | 797 (790-803) | 98 (88-107) | 11335 (809-1857) | - | - | -524.22 | 363-29907 |
| LEFT | 797 (790-803) | 98 (89-107) | - | 1281 (718-1841) | - | -524.22 | 4278-36224 |
| MIRROR | 797 (790-803) | 98 (89-107) | - | - | 1302 (763-1851) | -524.22 | 3565-41252 |
| BOTH | 797 (791-804) | 97 (88-106) | 1327 (791-1864) | 890 (235-1646) | - | -519.79 | 184-7761 |

PANTHEROPHIS_BAIRDI_X_P.OBSOLETUS

| MODEL | Center | Width | Tail_Right | Tail_Left | Tail_Mirror | AICc | ESS |
|---|---|---|---|---|---|---|---|
| NONE | 617 (575-660) | 7 (4-13) | - | - | - | -150.277 | 10-34416 |
| RIGHT | 617 (575-661) | 10 (5-20) | 1204 (595-1793) | - | - | -144.766 | 6-24620 |
| LEFT | 652 (595-1874) | 7 (3-81) | - | 647 (465-1515) | - | 71.072 | 7-400 |
| MIRROR | 617 (574-661) | 11 (5-19) | - | - | 1219 (622-1807) | -144.76 | 9-38425 |
| BOTH | 617 (574-660) | 8 (3-13) | 1211 (619-1806) | 1235 (635-1824) | - | -138.29 | 2-367770 |

Fig. S1. Delimiting populations using sparse nonnegative matrix factorization (sNMF) and spatial principal component analysis (sPCA), showing the locations of four geographically distinct lineages.
Fig. S2 – PipeMaster model fit of observed data to simulations under IM, IMD, and IMD-LGM models for PC1-PC10. [Panels plot PC1 (x-axis) against each of PC2 through PC10 (y-axis); the legend distinguishes simulations under IM, IMD, and IMD-LGM from the observed data.]
Fig. S3 – Symmetric background niche similarity tests for spatially adjacent pairs (eastern, central, and western lineage comparisons) for the D and I statistics.
Fig. S4 – Prediction of cline width under neutrality given origin times for each species pair: A) Pantherophis alleghaniensis/P. spiloides, B) P. spiloides/P. obsoletus, and C) P. obsoletus/P. bairdi. [Each panel has axes for Time (Years x 100,000) and Cline Width (km), shows curves for median dispersal and max dispersal, and marks the Pleistocene and the actual cline width.]