
White TA, Perkins SE, Heckel G, Searle JB. Adaptive evolution during an ongoing range expansion: the invasive bank vole (Myodes glareolus) in Ireland. Mol Ecol 22: 2971-2985

Adaptive evolution during an ongoing range expansion:
the invasive bank vole (Myodes glareolus) in Ireland
THOMAS A. WHITE,*† SARAH E. PERKINS,‡ GERALD HECKEL†§ and JEREMY B. SEARLE*
*Department of Ecology and Evolutionary Biology, Cornell University, Corson Hall, Ithaca, NY 14853-2701, USA,
†Computational and Molecular Population Genetics (CMPG), Institute of Ecology and Evolution, University of Bern,
Baltzerstrasse 6, CH-3012, Bern, Switzerland, ‡School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum
Avenue, Cardiff, CF10 3AX, UK, §Swiss Institute of Bioinformatics, Genopode, CH 1015 Lausanne, Switzerland
Abstract
Range expansions are extremely common, but have only recently begun to attract
attention in terms of their genetic consequences. As populations expand, demes at the
wave front experience strong genetic drift, which is expected to reduce genetic diver-
sity and potentially cause ‘allele surfing’, where alleles may become fixed over a wide
geographical area even if their effects are deleterious. Previous simulation models
show that range expansions can generate very strong selective gradients on dispersal,
reproduction, competition and immunity. To investigate the effects of range expansion
on genetic diversity and adaptation, we studied the population genomics of the bank
vole (Myodes glareolus) in Ireland. The bank vole was likely introduced in the late
1920s and is expanding its range at a rate of ~2.5 km/year. Using genotyping-by-
sequencing, we genotyped 281 bank voles at 5979 SNP loci. Fourteen sample sites were
arranged in three transects running from the introduction site to the wave front of the
expansion. We found significant declines in genetic diversity along all three transects.
However, there was no evidence that sites at the wave front had accumulated more
deleterious mutations. We looked for outlier loci with strong correlations between
allele frequency and distance from the introduction site, where the direction of correla-
tion was the same in all three transects. Amongst these outliers, we found significant
enrichment for genic SNPs, suggesting the action of selection. Candidates for selection
included several genes with immunological functions and several genes that could
influence behaviour.
Keywords: allele frequency cline, genotyping-by-sequencing, nonmodel, outlier, population
genomics, RAD
Received 21 November 2012; accepted 3 April 2013
Introduction
Many empirical studies of the genetic consequences of
species introductions have tended to focus on the
introduction event itself (e.g. Tsutsui et al. 2000; Kolbe et al.
2004; Bossdorf et al. 2005) and fail to consider the
genetic consequences of the subsequent range expansion
(an integral part of successful establishment of
any invasive species) in an explicitly spatial context.
The genetic consequences of range expansion are not
only important for invasive species. Many, if not most,
species have recently experienced range expansions
(Excoffier et al. 2009); examples include the expansion
of species from refugia following glacial retreat or
advance (Hewitt 2000), recovery of species after perse-
cution or overexploitation (Lubina & Levin 1988), the
current movement of species due to climate change
(Parmesan & Yohe 2003), expansions associated with
geological events (Marshall et al. 1982), the spread of
species with novel adaptations, such as the expansion
of anatomically modern humans out of Africa (Fagundes
et al. 2007), and the spread of pathogens during
disease epidemics (Biek et al. 2007; Velo-Antón et al.
2012). Despite the frequency of range expansions, researchers
have only recently begun to appreciate their importance
in shaping the current distribution of
genetic diversity at both neutral and functional loci
(Prugnolle et al. 2005; Handley et al. 2007; Besold et al.
2008; Buckley et al. 2012; Velo-Antón et al. 2012; Waters
et al. 2012).
Correspondence: Thomas A. White,
E-mail: tawhite201@gmail.com
©2013 John Wiley & Sons Ltd
Molecular Ecology (2013) 22, 2971–2985 doi: 10.1111/mec.12343
Theoretical studies have shown that range expansions
are fundamentally different from purely demographic
expansions. As a population expands its range, it
undergoes a series of founder events, which can lead to
fluctuations in allele frequency and stochastic loss of
alleles (Slatkin & Excoffier 2012). Range expansions are
generally associated with decreasing allelic richness and
heterozygosity with increasing distance along the axis
of expansion (Estoup et al. 2004; Heckel et al. 2005;
Prugnolle et al. 2005; Handley et al. 2007; Besold et al.
2008; Parisod & Bonvin 2008; Velo-Antón et al. 2012).
This reduced genetic diversity and associated inbreeding
may negatively impact the fitness of individuals at
the expanding range margin. Edmonds et al. (2004) and
Klopfstein et al. (2006) demonstrated that neutral muta-
tions arising on the edge of a range expansion can
sometimes ‘surf’ on the wave of the advance and reach
higher frequencies than would be expected in a popula-
tion at equilibrium. Klopfstein et al. (2006) suggested
that this phenomenon could lead to increased rates of
evolution at range margins. However, Travis et al.
(2007) have shown with simulation models that deleteri-
ous mutations can also surf to high frequencies at
expanding range margins, including mutations having a
negative effect on reproductive rate and juvenile com-
petitive ability. Despite these theoretical insights, there
remain very few empirical studies that have tested their
predictions, and the distribution of genetic diversity in
expanding populations, and its significance, remains
poorly understood.
In addition to strong drift and allele surfing, range
expansions may generate very strong selection pres-
sures. Simulation modelling predicts that individuals at
the expanding wave front should experience selection
for increased dispersal and reproduction (Travis &
Dytham 2002). This is due to a combination of spatial
sorting (Shine et al. 2011) and natural selection (acting
over multiple generations; Travis et al. 2009) favouring
individuals at the edge of an expansion. Evolution of
dispersal and reproduction during range expansions
has now been documented in a number of taxa, includ-
ing plants (Cwynar & MacDonald 1987; Monty & Mahy
2010), amphibians (Phillips et al. 2006), humans (Mo-
reau et al. 2011) and insects (Simmons & Thomas 2004;
Hughes et al. 2007). The process of range expansion
can also influence host–parasite interactions. During a
range expansion, parasites and pathogens may lag
behind their hosts, due to both stochastic loss and low
host density at the wave front of the expansion (Phil-
lips et al. 2010). Where trade-offs exist, individuals at
the wave front should therefore invest less in intraspe-
cific competition (Burton et al. 2010) and immune
defence (Phillips et al. 2010). If such a lag does occur,
these traits may also experience relaxed selection at the
genic level, for example if specific antigen receptor
alleles are no longer required for parasite or pathogen
recognition. In longer established populations behind
the wave front of the expansion, host densities and
parasite burdens are expected gradually to return to
baseline levels, so here selection should favour invest-
ment in intraspecific competition and immunity over
dispersal. To the extent that dispersal, reproduction,
competitive ability and immunity are genetically deter-
mined, spatial sorting and natural selection should be
reflected by allele frequency clines along the axis of
expansion at loci influencing these traits (Hancock et al.
2010a).
However, detecting such adaptations at the genetic
level is expected to present a number of challenges.
Many of the traits in which we might expect to see
adaptation are polygenic, and much of the adaptation
is predicted to come from standing genetic variation
rather than new mutations (Barrett & Schluter 2008).
Therefore, adaptation is expected to occur via subtle
shifts in allele frequencies (Hancock et al. 2010a) rather
than hard sweeps (Novembre & Han 2012). Outlier
approaches based on F_ST values are unlikely to be
useful in this case (Hancock et al. 2010a), as selection
is unlikely to create large differences in allele frequencies
between populations. In addition, F_ST-based
approaches are unable to distinguish allele frequency
variation that is related to an underlying environmen-
tal variable or gradient (such as distance along the
axis of a range expansion) vs. variation that follows a
spatially incoherent pattern (Yang et al. 2012). The pre-
vious rationale underlying genome scans for selection
has been that drift and demographic processes affect
the entire genome, and therefore, unusual patterns at
particular loci should reflect the action of selection
(Zayed & Whitfield 2008). It is now known that ‘allele
surfing’ can generate clines in allele frequencies, but
this affects loci at random (Excoffier et al. 2009). Therefore,
a naïve genome scan may reveal many loci that
are putatively under selection but which are actually
false positives (Hofer et al. 2009). It is unlikely that
any method will be able to overcome this problem
completely, but it may be possible to minimize the
problem using replication. An allele frequency cline
at a locus in one region may be due to drift or selec-
tion caused by some underlying environmental
variable. However, the direction of drift or surfing is
independent between different ‘sectors’ of the expan-
sion (Hallatschek et al. 2007; Excoffier & Ray 2008).
Clines in the same direction in multiple regions are
therefore less likely to be due to drift.
Here, we report the results of one of the first popula-
tion genomic studies of an ongoing range expansion.
Our study system is the bank vole, Myodes glareolus, in
Ireland. The bank vole is a small rodent distributed
throughout much of Eurasia from Iberia to central Sibe-
ria and from the Mediterranean to Scandinavia, but not
recorded in Ireland until 1964 (Claassens & O’Gorman
1965). Previous studies of mtDNA variation and para-
site distribution support a single introduction event
involving a small number of founders arriving in the
late 1920s on the southern shore of the Shannon Estuary
(Fairley 1971; Ryan et al. 1996; Stuart et al. 2007). Stuart
et al. (2007) place the arrival in 1926 at the deep-water
port of Foynes, as this coincides with the importation of
heavy earth moving equipment from Germany prior to
the construction of the Shannon hydroelectricity
scheme. Since its introduction, the vole has occupied
approximately one-third of the island of Ireland and is
continuing to expand its range at a constant rate of
c. 2.5 km/year (White et al. 2012).
Using a genotyping-by-sequencing (GBS) approach,
we simultaneously identify and genotype a large panel
of SNPs for the bank vole in Ireland. We report changes
in genic and nongenic diversity over the course of the
range expansion and develop a new approach to iden-
tify loci under selection. Importantly, we identify
genetic signatures of adaptation to the process of range
expansion itself.
Methods
Sampling and DNA extraction
In autumn 2010 and summer 2011, 281 bank voles
were sampled from 14 sample sites in Ireland
(Table 1). These sites were arranged in three transects
running from the site of introduction at Foynes out to
the expansion front, to the north, the northeast and
the east (Fig. 1). Voles were euthanized by isoflurane
overdose followed by cervical dislocation. For each
vole, a piece of liver tissue was placed in an Eppen-
dorf tube with 95% ethanol. Genomic DNA was
extracted using the DNeasy kit from Qiagen.
Genotyping-by-sequencing (GBS)
Extracted DNA was sent to the Cornell Institute for
Genomic Diversity to conduct GBS. GBS (Elshire et al.
2011) is a simple technique for constructing reduced
representation libraries for the Illumina sequencing plat-
form and is conceptually similar to RAD sequencing
(Hohenlohe et al. 2010). Briefly, DNA from each indi-
vidual was separately digested using the restriction
enzyme PstI (CTGCAG). The fragmented DNA was
then ligated to a barcoded adaptor and a common
adaptor with appropriate sticky ends. The digestion
and ligation were carried out in a 96-well plate. The
wells each contained DNA from a different individual
and a barcoded adaptor unique to that well. One control
well did not contain any DNA. After ligation, the
wells were pooled into one Eppendorf tube and cleaned
using a Qiagen QIAquick PCR purification kit to form a
library. The library was then subjected to a PCR, using
long primers that matched the barcoded and common
adaptors. The PCR has two functions. One is to
perform a size-selection step, as the PCR preferentially
amplifies fragments of an ideal length for Illumina
sequencing. The second is that the long primers add a
length of sequence to the fragments in the library.
These sequences bind to the Illumina flow cell and are
also used to prime subsequent DNA sequencing reactions.
After PCR, the library was cleaned again using
a Qiagen QIAquick PCR purification kit. Each library
was then diluted and sent for sequencing using
single-end 100-bp reads on the Illumina HiSeq 2000 at
the Cornell Core Laboratories Center. To assess repeatability
of our approach, a second library was made
for 24 of the individuals, which was sequenced as
before.
Table 1 Sampling information, detailing the names of sample sites, their locations, the transects on which they fall, sample sizes (n),
start and end of trapping periods, distance from the introduction site at Foynes and three measures of genetic diversity for all 5979
SNPs: mean expected heterozygosity per locus (H_e), mean alleles per locus (A) and mean allelic richness per locus (A_rich)
Sample site   Latitude  Longitude  Transect  n   Trapping period    Distance from Foynes (km)*  H_e (all SNPs)  A (all SNPs)  A_rich (all SNPs)
Foynes        52.574    -9.140     All       20  09–12/07/2011      0                           0.357           1.959         1.759
Tulla         52.795    -8.731     N         20  27/07–02/08/2011   45                          0.288           1.845         1.622
Gort          53.138    -8.770     N         20  16–17/10/2010      84                          0.265           1.782         1.571
Tuam          53.497    -8.737     N         21  12–15/10/2010      124                         0.256           1.757         1.553
Cloonfad      53.708    -8.730     N         20  14–21/08/2011      148                         0.253           1.750         1.546
Limerick      52.660    -8.451     NE        20  13–14/07/2011      48                          0.307           1.893         1.661
Nenagh        52.861    -8.266     NE        20  01/11/2010         67                          0.277           1.834         1.602
Birr          53.131    -7.906     NE        20  28–30/10/2010      106                         0.254           1.755         1.547
Ballynahown   53.360    -7.871     NE        20  22–23/08/2011      129                         0.237           1.715         1.513
Adare         52.471    -8.765     E         20  21/11/2010         28                          0.342           1.956         1.734
Kilteely      52.495    -8.408     E         20  24–26/07/2011      50                          0.312           1.906         1.674
Cashel        52.479    -7.907     E         20  02/11/2010         87                          0.301           1.878         1.650
Windgap       52.438    -7.404     E         20  07–08/08/2011      119                         0.309           1.896         1.665
New Ross      52.418    -7.046     E         20  10–15/11/2010      146                         0.300           1.874         1.648
*Shortest straight-line distance, except that the shortest land route around the Shannon Estuary was incorporated for transects
incorporating this feature.
SNP genotyping and annotation pipeline
Raw sequence files from Illumina were converted into
individual genotypes using the UNEAK pipeline, avail-
able as part of the TASSEL 3.0 software (Bradbury et al.
2007). Briefly, the UNEAK pipeline keeps good reads
with a barcode, cut site and no ‘N’s in the first 64 bp of
the sequence after the barcode. Reads are then trimmed
to 64 bp after the barcode. Identical reads are clustered
into tags, and counts of these tags present in each bar-
coded individual are stored. Following this, all unique
tags are merged, and their counts in the whole sample
of individuals are stored. Pairwise alignment of tags is
then performed, and tag pairs with 1-bp mismatch are
considered as candidate SNPs. With a certain error
tolerance rate (here set to 0.03), only reciprocal pairs of
tags are retained as SNPs, according to standard proto-
cols of the Cornell Institute for Genomic Diversity. Fol-
lowing SNP identification, counts of each tag (or allele)
are output for each locus and each individual. After
running UNEAK, individual genotypes were recalled
following the approach of Lynch (2009), using a global
sequencing error rate of 0.03. The likelihood of each
genotype was calculated using a multinomial sampling
distribution, and a genotype was called if it had an AIC
value at least four lower than the next best genotype.
Otherwise, the genotype was coded as ‘missing’. To
filter out potential paralogs, we discarded loci with a
mean observed heterozygosity >0.75. This cut-off is
obviously somewhat arbitrary, but choosing different
cut-off values between 0.5 and 1 made little difference
to our results (data not shown). After filtering, we had
5979 loci that could be confidently called in at least 80%
of individuals. Although allele dropout can affect esti-
mates of genetic variation within and between popula-
tions (Gautier et al. 2012), the number of individuals
with missing data (which could reflect different levels
of allele dropout) had no effect on the patterns of diver-
sity reported here (data not shown).
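The genotype-calling rule described above can be illustrated with a minimal sketch. It assumes a biallelic locus, so that the multinomial reduces to a binomial over reads of allele A versus allele B, and it assumes a simple symmetric error model in which a read from a homozygote shows the other allele with probability equal to the error rate. The full model of Lynch (2009) is not reproduced here. Because the three genotype models are given the same number of parameters, the AIC difference reduces to twice the log-likelihood difference. The function name `call_genotype` and its arguments are hypothetical.

```python
import math

def call_genotype(n_a, n_b, error=0.03, min_delta_aic=4.0):
    """Call 'AA', 'AB' or 'BB' from read counts, or return None for 'missing'."""
    # Probability that a single read shows allele A under each genotype
    # (assumed symmetric error model, not the full Lynch 2009 model).
    p_a = {"AA": 1.0 - error, "AB": 0.5, "BB": error}
    aic = {}
    for genotype, p in p_a.items():
        # Binomial log-likelihood. The binomial coefficient is identical for
        # all genotypes and cancels in AIC differences, so it is omitted.
        log_lik = n_a * math.log(p) + n_b * math.log(1.0 - p)
        # Equal parameter counts are assumed, so the 2k term is dropped too.
        aic[genotype] = -2.0 * log_lik
    ranked = sorted(aic, key=aic.get)
    best, second = ranked[0], ranked[1]
    # Call the best genotype only if it beats the runner-up by the threshold.
    if aic[second] - aic[best] >= min_delta_aic:
        return best
    return None
```

Under these assumptions, two reads of the same allele are not enough to call a homozygote, whereas ten reads are; a locus with five reads of each allele is called heterozygous.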
Twenty-four individuals were analysed in two sepa-
rate GBS runs. Where individuals were assigned geno-
types from both runs, the genotype calls from the two
runs were compared. This analysis showed that repeat-
ability of genotyping was high (mean, 97.2%; SD, 1.4%).
Locus sequences were blasted against the RefSeq mammalian
RNA database using BLASTN (Altschul et al. 1997)
with parameters: word_size = 11; gapopen = 5; gapextend = 2;
penalty = -3; and reward = 2. Sequences were
also blasted against the SwissProt and NR databases using
BLASTX with default parameters. SwissProt was used preferentially,
to facilitate functional annotation using UniProt.
Loci were identified as putatively genic if they had
an expectation value e < 1 × 10^-5 in matches against the
RefSeq database or e < 1 × 10^-3 in matches against
SwissProt/NR databases. BLASTX was used to determine
whether the genic SNPs were synonymous or nonsynonymous (NS).
Fig. 1 Location of sample sites in Ireland. CD = Cloonfad,
TM = Tuam, GT = Gort, TA = Tulla, BN = Ballynahown, BR = Birr,
NH = Nenagh, LK = Limerick, NS = New Ross,
WP = Windgap, CL = Cashel, KY = Kilteely, AE = Adare,
FS = Foynes. Sites on the northern transect are marked with
squares, those on the north-eastern transect are marked with
circles, and those on the eastern transect with triangles. Foynes,
which is the introduction site and is on all three transects, is
marked with a cross. The dashed line shows the approximate
range limits of the bank vole in 2011.
Genetic diversity patterns
Mean expected heterozygosity (H_e), mean alleles per
locus (A) and mean allelic richness (A_rich) were calculated
for each population and each locus class [NS
SNPs, genic (not NS) SNPs and nongenic SNPs] using
the software ARLEQUIN 3.5 (Excoffier & Lischer 2010) and
HP-RARE (Kalinowski 2005). Measures of genetic diversity
were regressed onto the geographical distance
between the sampling locality and the point of introduction
(Foynes) using the ‘lm’ function in R 2.15
(R Core Team 2012). As the Shannon Estuary represents
a significant barrier to dispersal, we calculated distances
as the shortest path by land. To test for differences in
slopes and intercepts between the different SNP locus
classes, an ANCOVA was performed taking mean diversity
(H_e, A or A_rich) as the response variable and locus
class, distance and their interaction as the independent
variables.
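As an illustration of the regression of diversity on distance, the sketch below fits an ordinary least-squares line to the mean expected heterozygosity values and distances reported in Table 1, separately for each transect (with Foynes included in all three). It is a plain-Python stand-in for a simple linear model, not the R analysis itself, and it does not reproduce the significance tests or the ANCOVA. The names `TRANSECTS` and `fit_line` are hypothetical.

```python
# (distance from Foynes in km, mean expected heterozygosity) from Table 1.
TRANSECTS = {
    "N": [(0, 0.357), (45, 0.288), (84, 0.265), (124, 0.256), (148, 0.253)],
    "NE": [(0, 0.357), (48, 0.307), (67, 0.277), (106, 0.254), (129, 0.237)],
    "E": [(0, 0.357), (28, 0.342), (50, 0.312), (87, 0.301), (119, 0.309),
          (146, 0.300)],
}

def fit_line(points):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

if __name__ == "__main__":
    for name, points in TRANSECTS.items():
        slope, intercept = fit_line(points)
        print(f"{name}: slope = {slope:.6f} per km, intercept = {intercept:.3f}")
```

A negative slope corresponds to a loss of diversity with distance from the introduction site; comparing the fitted slopes between transects gives a rough sense of how strongly each transect has lost diversity.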
Identifying SNP outliers
Two general approaches were used to identify loci
potentially under selection relating to range expansion.
The first was to calculate the Spearman rank correlation
between allele frequency and the geographical distance
between the sampling locality and Foynes as the point
of introduction. This was done for the three transects
separately. We then took the absolute value of the mean
correlation coefficients across the three transects. Loci
were ranked by mean correlation coefficient, and an
empirical P-value was calculated as the rank divided by
the number of loci. We then identified potential outlier
loci as those with empirical P-values <0.05 and <0.01.
Using this approach, the Foynes population appeared
as the starting site in all three transects, so correlation
coefficients may have been disproportionately influ-
enced by the allele frequency at Foynes. Therefore, we
repeated the correlations, excluding Foynes from all
three transects. Mean correlation coefficients and P-values
were calculated as before. As the Foynes sample
does still contain relevant information, we considered
outliers to be those loci that appeared in the tails of the
distributions of the correlations both with and without
Foynes.
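The correlation-based scan described above can be sketched in a few functions. This is an illustrative reimplementation under simple assumptions, not the code used in the study: ties receive average ranks, a locus with no variation along a transect is scored as uncorrelated, and ties in the final ranking of loci are broken by input order. All function names are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    s_xx = sum((a - mx) ** 2 for a in rx)
    s_yy = sum((b - my) ** 2 for b in ry)
    if s_xx == 0 or s_yy == 0:
        return 0.0  # assumption: no variation is scored as no correlation
    return s_xy / (s_xx * s_yy) ** 0.5

def locus_score(freqs_by_transect, dists_by_transect):
    """Absolute value of the mean per-transect correlation for one locus."""
    rhos = [spearman(d, f) for f, d in zip(freqs_by_transect, dists_by_transect)]
    return abs(mean(rhos))

def empirical_p_values(scores):
    """Rank loci from strongest to weakest; P = rank / number of loci."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    p_values = [0.0] * len(scores)
    for rank, i in enumerate(order, start=1):
        p_values[i] = rank / len(scores)
    return p_values
```

Taking the absolute value after averaging, rather than before, is what rewards consistency: a locus whose allele frequency rises along two transects and falls along the third scores only one-third as high as a locus with perfect clines in the same direction on all three.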
The second approach to identify outliers was to use
the method of Coop et al. (2010), implemented in the
software Bayenv. This approach estimates the covari-
ance in allele frequencies between populations from a
set of control loci. In our case, this was the set of 5713
nongenic SNPs. For each of the 5979 SNPs, a Bayes
factor was then calculated for a model where an envi-
ronmental variable has a linear effect on allele frequen-
cies compared with a model given by the covariance
matrix alone. The environmental variable of interest
was the geographical distance from the point of intro-
duction at Foynes. Each locus was binned according to
the frequency of allele 1 (arbitrarily defined) over all
populations into one of 10 bins with a frequency interval
of 0.1. Within each frequency bin, loci were ranked by Bayes
factors, and an empirical P-value was calculated as the
rank divided by the number of loci in that bin. We then
identified potential outlier loci as those with empirical
P-values <0.05 and <0.01. Variance–covariance matrices
were compared within and between independent runs of
the programme to ensure convergence.
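The frequency-binned ranking of Bayes factors can be sketched as follows. The sketch assumes that the mean frequency of allele 1 and the Bayes factor for each locus have already been extracted from the Bayenv output (no Bayenv file format is assumed here), and that a frequency of exactly 1.0 is placed in the top bin. The function name `binned_empirical_p` is hypothetical.

```python
from collections import defaultdict

def binned_empirical_p(mean_freqs, bayes_factors, n_bins=10):
    """Empirical P-value of each locus within its allele-frequency bin."""
    bins = defaultdict(list)
    for i, freq in enumerate(mean_freqs):
        # Bin width is 1 / n_bins; a frequency of exactly 1.0 joins the top bin.
        b = min(int(freq * n_bins), n_bins - 1)
        bins[b].append(i)
    p_values = [None] * len(mean_freqs)
    for members in bins.values():
        # The largest Bayes factor in a bin receives rank 1.
        ranked = sorted(members, key=lambda i: -bayes_factors[i])
        for rank, i in enumerate(ranked, start=1):
            p_values[i] = rank / len(members)
    return p_values
```

Ranking within bins means that a locus is compared only with loci of similar overall allele frequency, so a Bayes factor that is unremarkable genome-wide can still be an outlier within its own bin, and vice versa.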
Putative functions and Gene Ontology (GO) Biological
Process terms were assigned to outlier loci using the
UniProt Knowledgebase (The UniProt Consortium 2012,
‘Reorganizing the protein space at the Universal
Protein Resource (UniProt)’) and PANTHER v7.2 (Thomas
et al. 2008).
Neutral simulations
A modified version of SPLATCHE (Ray et al. 2010) was
used to simulate neutral genetic diversity after a range
expansion in the bank vole. Ireland was represented as
a lattice of 1-km squares. Areas of land were defined as
potential bank vole habitat, whereas areas of sea or
lakes were defined as unsuitable. Simulated sample
sites were arranged according to the same coordinates
as our real sample sites and had the same sample sizes.
The range expansion began at Foynes and progressed
until all sample sites had been colonized. The forward
demographic part of the SPLATCHE simulation records for
each time step the population sizes in each deme and
migration events between demes. Samples of genes
were taken from each sample site, and SNP data were
simulated using a discrete time coalescent model. For
each demographic simulation, 5979 neutral SNP loci
were simulated. Allele frequencies were calculated for
each sample site, and these were correlated with dis-
tance from Foynes using Spearman rank correlation. As
with the real genetic data, we calculated these correla-
tions separately for each transect and took the absolute
mean correlation across all three transects. We also cal-
culated the correlations with and without the Foynes
sample site. For both these approaches, we recorded the
number of loci that had higher correlation coefficients
than our observed outliers at the 5% and 1% thresholds.
The strength of allele frequency correlations at neutral
loci will depend on the amount of genetic drift
experienced by the population as it expands, which in
turn will depend on the demographic model used. As
this is unknown, we performed 1000 different demo-
graphic simulations, whose parameters (founding num-
ber of individuals, carrying capacity of each deme,
growth rate per generation, migration rate, Allee effect
severity and Allee effect scale (see Stephens & Suther-
land 1999)) were drawn from uniform distributions that
we had previously found to generate close matches to
the observed SNP data (T. A. White, unpublished data).
Deleterious SNPs
The program PolyPhen-2 (Ramensky & Sunyaev 2009)
was used to predict the functional impact of each NS
SNP on the translated protein. This approach is based
on multiple alignments and biochemical and physical
characteristics of the amino acid replacements. In cases
where one of the alleles at a locus matched to a human
or rodent reference genome, this allele was used as the
reference, and the effect of changing this to the other
allele was assessed using PolyPhen-2. The functional
impact of each NS SNP was designated as ‘Benign’,
‘Possibly damaging’ or ‘Probably damaging’; we refer to the
latter two classes as potentially deleterious SNPs. If
neither allele matched to a reference sequence, the func-
tional impact of the substitution was left unclassified.
For SNPs classified as ‘Possibly damaging’ or ‘Probably
damaging’, the frequency of the potentially deleterious
allele was calculated for each population, and the rela-
tionship between these frequencies and geographical
distance from Foynes was determined using Spearman
rank correlation.
Results
Data quality and coverage
Illumina sequencing of 281 individuals on three lanes
resulted in 786 834 622 reads. Of these, 676 763 709
reads contained a unique barcode and cut site remnant
and contained no ‘N’s. These data were used in the
UNEAK pipeline. UNEAK identified 60 417 biallelic
SNP loci. However, many of these had low coverage or
were only present in a small number of individuals.
Over all of these loci, the mean coverage per locus per
individual was 3.3× (max coverage per individual
252.7×, min 0.004×). When loci with more than 20%
missing data were excluded, 6398 loci were retained,
with a mean coverage of 16.9× (max coverage per individual
252.7×, min 7.2×). Discarding loci with observed
heterozygosity >0.75 left 5979 loci, with a mean coverage
of 16.56× (max coverage per individual 168.0×,
min 7.2×). This last difference in coverage is consistent
with the idea that paralogs should have high coverage
and also high heterozygosity, as filtering them out
reduces the mean coverage but especially the maximum
coverage at a locus.
Locus classification
Our BLAST approach identified 266 (4.4%) loci as ‘genic’,
245 of which had a match in the RefSeq database and
124 in the SwissProt/NR databases. Results from BLASTX
also determined that 30 of the genic SNPs were NS.
Genetic diversity patterns
Genetic diversity declined significantly with distance
from Foynes (Table S1 and Fig. 2). This held regardless
of whether we used all SNPs, NS SNPs, genic (not NS)
SNPs or nongenic SNPs and whether diversity was measured
as H_e, A or A_rich. The slope of the regressions of H_e and A_rich
on distance was steeper for NS SNPs than for the other
SNP locus classes, and the intercept of the regression
was higher (i.e. there was greater diversity for NS SNPs
at Foynes). However, ANCOVA revealed no significant
effect of SNP locus class on the relationship between
distance and diversity or on the levels of diversity at
Foynes, regardless of which measure of diversity was
used (results not shown). When the three transects are
compared, it can be seen that the loss of diversity in the
eastern transect appears to be less severe than in the
northern and northeastern transects (Fig. 2).
The mean number of alleles is 1.959 at Foynes and in
the wave front populations is 1.874 at New Ross, 1.750
at Cloonfad and 1.715 at Ballynahown. However, when
the wave front populations are pooled, the mean num-
ber of alleles is 1.951. So, it appears that the loss of
diversity has been somewhat independent in the three
transects, as different subsets of alleles have been lost
in each.
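The reasoning behind the pooled comparison can be shown with a toy example. The data below are entirely hypothetical (four biallelic loci, three wave-front sites) and are chosen only to show how each site can lose alleles while the pooled sample retains nearly all of them; the function name `mean_alleles` is also hypothetical.

```python
def mean_alleles(sites):
    """Mean number of distinct alleles per locus after pooling the given sites."""
    n_loci = len(sites[0])
    total = 0
    for locus in range(n_loci):
        pooled = set()
        for site in sites:
            pooled |= site[locus]
        total += len(pooled)
    return total / n_loci

# Hypothetical allele sets per locus at three wave-front sites.
front_1 = [{"A"}, {"A"}, {"A", "B"}, {"A", "B"}]
front_2 = [{"A", "B"}, {"A", "B"}, {"B"}, {"A", "B"}]
front_3 = [{"A", "B"}, {"A", "B"}, {"A", "B"}, {"A"}]

if __name__ == "__main__":
    for name, site in (("front 1", front_1), ("front 2", front_2), ("front 3", front_3)):
        print(name, mean_alleles([site]))
    print("pooled", mean_alleles([front_1, front_2, front_3]))
```

In this toy case every site has lost at least one allele, yet no allele is missing from the pooled sample, because a different subset was lost at each site.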
Outlier loci
Using the Spearman rank correlation approach (includ-
ing Foynes in each transect), 21 of the 266 genic SNPs,
and 278 of the 5713 nongenic SNPs, had an empirical
P-value <0.05. This represented a 1.6-fold enrichment of
genic SNPs in the outliers (Fisher’s exact test, one-tailed
P=0.0245). Nine of the 266 genic SNPs, and 51 of the
5713 nongenic SNPs, had an empirical P-value <0.01.
This represented a 3.8-fold enrichment of genic SNPs
(Fisher’s exact test, one-tailed P=0.0012).
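The enrichment calculation can be sketched from the counts given above. The sketch assumes the test is a one-tailed Fisher's exact test on the 2 × 2 table of genic versus nongenic SNPs and outlier versus non-outlier status, which is equivalent to the upper tail of a hypergeometric distribution, and that fold enrichment is the ratio of the outlier proportions in the two classes. The function names are hypothetical.

```python
from math import comb

def fisher_one_tailed(genic_out, genic_total, nongenic_out, nongenic_total):
    """Probability of at least `genic_out` genic outliers under the null."""
    n_loci = genic_total + nongenic_total
    n_out = genic_out + nongenic_out
    upper = min(genic_total, n_out)
    # Sum the hypergeometric probabilities of the observed and more extreme tables.
    tail = sum(
        comb(genic_total, k) * comb(nongenic_total, n_out - k)
        for k in range(genic_out, upper + 1)
    )
    return tail / comb(n_loci, n_out)

def fold_enrichment(genic_out, genic_total, nongenic_out, nongenic_total):
    """Ratio of the outlier proportion in genic SNPs to that in nongenic SNPs."""
    return (genic_out / genic_total) / (nongenic_out / nongenic_total)

if __name__ == "__main__":
    # Counts from the text: 5% and 1% thresholds, correlations including Foynes.
    for counts in ((21, 266, 278, 5713), (9, 266, 51, 5713)):
        print(counts, fold_enrichment(*counts), fisher_one_tailed(*counts))
```

Exact integer arithmetic with `math.comb` avoids the overflow that floating-point binomial coefficients would hit at these sample sizes.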
When Foynes was excluded from this analysis, 21 of
the 266 genic SNPs, and 278 of the 5713 nongenic SNPs,
had an empirical P-value <0.05. This represented a
1.6-fold enrichment of genic SNPs in the outliers
(Fisher’s exact test, one-tailed P=0.0245). Seven of the
266 genic SNPs, and 53 of the 5713 nongenic SNPs, had
an empirical P-value <0.01, a 2.8-fold enrichment of
genic SNPs (Fisher’s exact test, one-tailed P=0.0164).
One hundred and sixty-two SNPs lay in the top 5%
of the distribution of correlation coefficients in both
correlations with and without Foynes. Of these, 12 were
genic SNPs, representing a not quite significant 1.7-fold
enrichment of genic SNPs (Fisher’s exact test, one-tailed
P=0.0564). Thirty-five SNPs appeared in the top 1% in
both analyses. Of these, seven were genic, representing
a highly significant 6-fold enrichment of genic SNPs
(Fisher’s exact test, one-tailed P=0.0004).
Thus, using our Spearman rank correlation approach,
we find a significant enrichment of genic SNPs amongst
those SNPs with the strongest correlations between
allele frequency and distance from Foynes, the point of
introduction. This is consistent with adaptation during
the range expansion, as we expect the targets of selec-
tion to be either genes or regulatory regions in close
linkage with genes. The 12 genic loci that are common
to both Spearman rank correlation approaches are listed
in Table 2.
Using Bayenv, 293 SNPs were identified as outliers
with P<0.05. Of these, 13 were genic SNPs. Fifty-four
SNPs were outliers with P<0.01, only one of which
was a genic SNP. There was no enrichment of genic
SNPs in either outlier set identified by Bayenv.
Forty-two SNPs were identified as outliers using both
our correlation approach and Bayenv, of which five
were genic SNPs. This represented a 2.9-fold
enrichment of genic SNPs in the outliers (Fisher’s exact
test, one-tailed P=0.0372).
A total of 20 genic outliers were identified using
either our correlation-based method or Bayenv. These
are listed in Table 2. Of these, 16 genes were assigned
GO terms under ‘biological processes’, of which four
genes had the GO term ‘immune system process’. In
the mouse genome, there are 24 935 genes that are
assigned biological process GO terms, of which 1421
have the GO term ‘immune system process’ (Eppig
et al. 2012). Therefore, assuming that a similar propor-
tion holds true for the bank vole, in our outliers there is
significant enrichment for genes involved in immunity
(Fisher’s exact test, one-tailed P=0.0205).
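As a rough check on the scale of this enrichment, the counts above give the number of immune genes expected among the outliers if they were drawn at the mouse background rate. The snippet is a worked calculation only. It does not reproduce the Fisher's exact test, because the exact contingency table used is not stated here, and the variable names are hypothetical.

```python
# Counts quoted in the text
outlier_genes_with_go = 16     # outlier genes with a biological process GO term
outlier_immune = 4             # of which 'immune system process'
genome_genes_with_go = 24935   # mouse genes with a biological process GO term
genome_immune = 1421           # of which 'immune system process'

background_rate = genome_immune / genome_genes_with_go
expected_immune = outlier_genes_with_go * background_rate
fold = (outlier_immune / outlier_genes_with_go) / background_rate

print(f"background rate of immune genes: {background_rate:.3f}")
print(f"expected immune genes among outliers: {expected_immune:.2f}")
print(f"observed relative to expected: {fold:.1f}-fold")
```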
Neutral simulations
For a range of reasonable demographic models for the
bank vole expansion, we found that, on average, the pro-
portion of simulated loci with more extreme correlation
coefficients than our observed 0.05 and 0.01 thresholds
was 0.041 and 0.008 for correlations including Foynes,
and 0.04 and 0.009 for correlations excluding Foynes. In
our real data, the proportion of loci falling in the 5% tail
of both distributions was 0.027, whilst the proportion fall-
ing in the 1% tail of both distributions was 0.006. In the
simulated neutral data, these proportions were 0.021 and
0.004, respectively. These results suggest that our
Fig. 2 Decline of genetic diversity with distance from the
introduction site of the bank vole in Ireland. Measured as
(a) mean expected heterozygosity (H_e), R^2 = 0.5345;
P = 0.003, (b) mean alleles per locus (A), R^2 = 0.5005;
P = 0.005, and (c) mean allelic richness (A_rich),
R^2 = 0.4911; P = 0.003. Sites on the northern transect are
marked with squares, those on the north-eastern transect
are marked with circles, and those on the eastern transect
with triangles. Foynes, which is the introduction site and is
on all three transects, is marked with a cross. In each panel
the x-axis is distance from introduction site (km).
Table 2 Outlier SNPs identified using our Spearman rank correlation approach and Bayenv. The column ‘outlier’ gives the method used to identify that SNP as an outlier. The
next four columns give the accession nos of the best matches and associated gene descriptions, in the mammalian RNA RefSeq database and the SwissProt/NR databases. ‘Type’
shows whether the SNP is synonymous (S) or nonsynonymous (NS) or whether it is located in a noncoding region of the gene (). The final two columns give the functions or
processes in which the genes are involved, according to the UniProt Knowledgebase, and Panther GO (Gene Ontology) classifications, respectively
| SNP | Outlier* | Match in mammalian RNA RefSeq database | Description | Significant match in SwissProt/NR protein database | Description | Type | UniProt knowledgebase function/process | PANTHER GO biological process |
|---|---|---|---|---|---|---|---|---|
| mg8017 | 1,3 | XM_002753731.1 | Hypothetical protein LOC100361418 | P06323.1\|TVA3_MOUSE | T-cell receptor alpha chain V | NS | Receptor activity | Immune system process |
| mg10984 | 1,3,5 | NM_133638.3 | ADAM metallopeptidase with thrombospondin type 1 motif, 19 (ADAMTS19) | NA | NA | | Proteolysis | Signal transduction; Cell-cell adhesion; Proteolysis |
| mg39858 | 1,4 | NM_006946.2 | Spectrin, beta, nonerythrocytic 2 (SPTBN2) | NA | NA | | Actin cytoskeleton organization; axon guidance | Cellular component morphogenesis |
| mg68377 | 1,3 | NM_181652.2 | Peroxiredoxin 5 (PRDX5) | NA | NA | | Intracellular redox signalling | Immune system process; Oxygen and reactive oxygen species metabolic process |
| mg71009 | 1,3,5 | NM_014899.3 | Rho-related BTB domain containing 3 (RHOBTB3) | NA | NA | | Retrograde transport, endosome to Golgi; Small GTPase-mediated signal transduction | G-protein coupled receptor protein signalling pathway |
| mg72604 | 1,3,5 | NM_001118890.1 | Glutaredoxin (thioltransferase) (GLRX) | NA | NA | | Cell redox homoeostasis | Sulphur metabolic process |
| mg81865 | 1,4,5 | NM_017415.2 | Kelch-like 3 (KLHL3) | Q5REP9.1\|KLHL3_PONAB | Kelch-like protein 3 | S | Protein ubiquitination | Neurological system process; Cellular component morphogenesis |
| mg96770 | 1,3 | NM_001160392.1 | tRNA phosphotransferase 1 (TRPT1) | NA | NA | | tRNA processing | Nucleobase, nucleoside, nucleotide and nucleic acid metabolic process |
| mg123985 | 1,3 | NM_005956.3 | Methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 1 (MTHFD1) | P11586.3\|C1TC_HUMAN | C-1-tetrahydrofolate synthase | NS | Folic acid metabolism; neural tube development | Purine base metabolic process; Cellular amino acid biosynthetic process |
| mg8197 | 2,4 | NM_025258.2 | Von Willebrand factor A domain containing 7 (VWA7) | NA | NA | | Glycoprotein | |
| mg17560 | 2,4 | NM_001004736.2 | Olfactory receptor, family 5, subfamily K, member 1 (OR5K1) | Q8NHB7.2\|OR5K1_HUMAN | Olfactory receptor 5K1 | S | Olfaction; sensory transduction | |
| mg83555 | 2,4,5 | NM_015125.3 | Capicua homolog (Drosophila) (CIC) | NA | NA | | Central nervous system development | Regulation of transcription from RNA polymerase II promoter |
| mg13786 | 5 | NM_001031749.2 | LY6/PLAUR domain containing 5 (LYPD5) | NA | NA | - | - | - |
| mg24029 | 5 | XM_002752883.1 | Leukotriene A4 hydrolase, transcript variant 2 (LTA4H) | Q6S9C8.3\|LKHA4_CHILA | Leukotriene A-4 hydrolase | S | Leukotriene biosynthesis; inflammatory response | Immune system process; Fatty acid biosynthetic process; Proteolysis |
| mg26799 | 5 | NM_001005217.1 | FSHD region gene 2 (FRG2) | ABB88900.1 | Oocyte-specific eukaryotic translation initiation factor 4E-like (Eif4e1b) | S | Protein biosynthesis | Translation |
| mg49438 | 5 | XM_001101962.2 | WNT5A wingless-type MMTV integration site family, member 5A | P22726.2\|WNT5B_MOUSE | Protein Wnt-5b | S | Wnt signalling pathway | G-protein coupled receptor protein signalling pathway; Cell-cell signalling |
| mg57185 | 5 | NM_021226.2 | Rho GTPase activating protein 22 (ARHGAP22) | NA | NA | | Positive regulation of GTPase activity; signal transduction | |
| mg59899 | 5 | NM_001012426.1 | Forkhead box P4 (FOXP4) | NA | NA | | Embryonic foregut morphogenesis; heart development; transcription, DNA dependent | Visual perception; Sensory perception; Cell cycle; Cell surface receptor linked signal transduction; Carbohydrate metabolic process; Regulation of transcription from RNA polymerase II promoter; Cellular component morphogenesis; Segment specification; Anterior/posterior axis specification; Ectoderm development; Mesoderm development; Embryonic development; Nervous system development |
observed data may contain more loci with extreme allele
frequency clines than expected under neutrality.
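The observed proportions quoted in this comparison follow directly from the outlier counts reported earlier. The short calculation below is a sketch for checking that arithmetic, with hypothetical variable names. It assumes the denominator is all 5979 genotyped SNPs (266 genic plus 5713 nongenic) and takes the simulated proportions as quoted in the text rather than recomputing them.

```python
total_snps = 266 + 5713   # genic plus nongenic SNPs
in_both_5pct = 162        # SNPs in the top 5% with and without Foynes
in_both_1pct = 35         # SNPs in the top 1% with and without Foynes

observed = {
    "5%": in_both_5pct / total_snps,
    "1%": in_both_1pct / total_snps,
}
# Proportions from the neutral simulations, as quoted in the text
simulated = {"5%": 0.021, "1%": 0.004}

for tail in observed:
    ratio = observed[tail] / simulated[tail]
    print(
        f"{tail} tail: observed {observed[tail]:.3f}, "
        f"simulated {simulated[tail]:.3f}, ratio {ratio:.2f}"
    )
```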
Deleterious mutations
Ten NS SNPs were classed by PolyPhen-2 as ‘Possibly
damaging’ or ‘Probably damaging’ (Table S2, Support-
ing information). Of these loci, two showed significant
negative correlations between the frequency of the dele-
terious allele and distance from the introduction site
(Fig. 3). These SNPs, mg8017 and mg134581, were
located in the T-cell receptor alpha V (TVA3) gene,
involved in antigen recognition, and the laminin
subunit alpha 2 (LAMA2) gene, respectively. Defects in
Lama2 are a cause of murine muscular dystrophy (Xu
et al. 1994). One SNP, mg123985, located in the C-1-
tetrahydrofolate synthase (C1TC) gene showed a signifi-
cant positive correlation (Fig. 3). Mutations in this gene
may impair foetal growth in mice (Beaudin et al. 2012).
Discussion
Using the bank vole invasion of Ireland as our study
system, we found evidence for adaptation during the
range expansion, despite an overall loss of genetic
diversity due to strong genetic drift at the wave front.
This suggests that selection pressures during range
expansions may be very strong. This is one of the first
studies to provide empirical genomic evidence for
adaptation to the process of range expansion in a wild
population.
We found that the eastern transect shows the least
reduction in genetic diversity, whilst the northern and
northeastern transects show similar patterns of greater
loss (Fig. 2). In the east of the country, there are few
barriers to dispersal, whilst in the north and north-
Table 2 Continued
| SNP | Outlier* | Match in mammalian RNA RefSeq database | Description | Significant match in SwissProt/NR protein database | Description | Type | UniProt knowledgebase function/process | PANTHER GO biological process |
|---|---|---|---|---|---|---|---|---|
| mg87917 | 5 | XM_002817845.1 | CAP-GLY domain-containing linker protein 2 (CLIP2) | O55156.1\|CLIP2_RAT | CAP-Gly domain-containing linker protein 2 | S | Control of brain-specific organelle translocations | Intracellular protein transport; Vesicle-mediated transport; Mitosis; Cellular component morphogenesis |
| mg122511 | 5 | NM_006162.3 | Nuclear factor of activated T cells (NFATC1) | NA | NA | | Transcription regulation | Immune system process; Regulation of transcription from RNA polymerase II promoter; Mesoderm development; Cellular defence response |
*1 = significant correlation of allele frequency with distance with P<0.01 (Foynes included); 2 = significant correlation of allele frequency with distance with P<0.05 (Foynes included); 3 = significant correlation of allele frequency with distance with P<0.01 (Foynes excluded); 4 = significant correlation of allele frequency with distance with P<0.05 (Foynes excluded); 5 = outlier with P<0.05 in Bayenv analysis.
Fig. 3 Change in frequencies of deleterious alleles (identified
using PolyPhen-2) with distance from the introduction site.
Only loci with significant correlations are shown. The SNP
mg8017 is shown with open circles, mg123985 with open
squares, and mg134851 with crosses. The x-axis is distance
from introduction site (km) and the y-axis is frequency of
deleterious allele.
east, the expanding population would have encoun-
tered substantial barriers to dispersal, including the
River Shannon to the north and unsuitable bog habi-
tat in the northeast. However, diversity in the north-
ern and northeastern transects shows a monotonic
decline, suggesting that the difference between
transects is due to some continuously acting process
and not a one-off founder event, such as might be
caused by crossing a semi-permeable barrier to dis-
persal. These two transects may additionally experi-
ence reduced lateral dispersal along most of their
length, due to the close proximity to the River Shan-
non (Fig. 1). We might expect lateral dispersal to
influence the amount of genetic diversity lost or
retained in a particular transect, if different alleles are
found in different transects. This appears to be the
case, as when we pooled the three populations at the
wave front of the expansion, the mean number of
alleles was almost as high as at the point of introduc-
tion (Foynes, Fig. 1), showing that different alleles
had been lost (or preserved) in different transects.
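The pooling argument can be illustrated with a toy calculation. The sketch below is hypothetical throughout: the allele sets are invented for illustration and are not the study data, and the function names are assumptions. It shows how three wave-front populations that have each lost alleles can, when pooled, recover the mean number of alleles per locus seen at the source, provided that different alleles were lost in each.

```python
def mean_alleles(population):
    """Mean number of distinct alleles per locus in one population."""
    return sum(len(alleles) for alleles in population) / len(population)


def pool(populations):
    """Union of the alleles seen at each locus across several populations."""
    return [set().union(*locus_sets) for locus_sets in zip(*populations)]


# Hypothetical alleles observed at four biallelic SNPs
source = [{"A", "G"}, {"C", "T"}, {"A", "C"}, {"G", "T"}]
front_north = [{"A"}, {"C", "T"}, {"A"}, {"G", "T"}]
front_northeast = [{"G"}, {"C"}, {"A", "C"}, {"G"}]
front_east = [{"A", "G"}, {"T"}, {"C"}, {"G", "T"}]

pooled_front = pool([front_north, front_northeast, front_east])

print("source:", mean_alleles(source))
print("north front:", mean_alleles(front_north))
print("north-east front:", mean_alleles(front_northeast))
print("east front:", mean_alleles(front_east))
print("pooled fronts:", mean_alleles(pooled_front))
```

If the same alleles had been lost in every transect, pooling would not raise the mean above that of the individual wave-front populations.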
We can consider at least three different types of selec-
tion acting on individuals in a range expansion that
could produce consistent allele frequency clines
between transects. One is spatial sorting. Individuals
that are more likely to disperse, or that disperse long
distances, are more likely to be found towards the wave
front of the range expansion. If dispersal strategy has a
genetic component, breeding between highly dispersive
individuals at the wave front may lead to an increase in
dispersal over time in wave front populations, the
so-called ‘Olympic Village Effect’ (Shine et al. 2011). If this
is the case, one would expect alleles contributing to a
highly dispersive phenotype to show a frequency cline,
increasing from the core to the wave front. Traditional
natural selection could generate such a pattern in two
ways. With positive selection at the expansion front,
individuals would disperse to a new habitat without
respect to genotype. In this new habitat, differential sur-
vival and/or fecundity would lead to changes in allele
frequency in the next generation. An alternative to this
model is relaxed selection at the expansion front, which
may be due to reduced density of conspecifics and
reduced parasite burdens in a deme. As time pro-
gresses, conspecific density and parasites in the deme
will increase, potentially leading to purifying selection
behind the expansion front. Both traditional selection
models will generate differences in allele frequency
between demes along the expansion axis. These types
of selection may be difficult to separate empirically.
Indeed, they are not mutually exclusive and may act to
reinforce or oppose one another. Whilst we also expect
expanding populations to be under selection due to
external environmental variables, such as climate
(Hancock et al. 2011), we predict that selection due to
range expansion processes will be much stronger, par-
ticularly over such a small scale in Ireland where envi-
ronmental variation is limited.
The major challenge to date in identifying genes
involved in adaptation during range expansion has
been in separating the signals of selection from drift
and allele surfing (Hofer et al. 2009). Here, we make use
of replicated transects to identify loci showing signifi-
cant allele frequency clines in the same direction in sev-
eral transects. Of course, this approach may fail to
detect some loci that are under selection, and some out-
liers may continue to reflect drift or allele surfing rather
than selection. However, our simulation modelling
showed that our data contained more extreme allele fre-
quency clines than expected under a neutral model,
suggesting the action of selection. In addition, the fact
that we observe an enrichment of genic vs. nongenic
SNPs in the outliers, and that this enrichment is stron-
ger when we consider the tail of the distribution with
P<0.01 vs. P<0.05, suggests that a majority of these
loci are good candidates for being under selection (Han-
cock et al. 2011).
Our simple outlier detection approach may work
better than Bayenv in this case. This is partly because
Bayenv computes only a Bayes factor for each SNP (com-
paring an environmental selection model to a null
model), meaning that only overall relationships can be
assessed, and information on the direction of relation-
ships in different transects is lost. A second reason is that
in a range expansion much of the population genetic var-
iation lies in the direction of the expansion. By removing
the average effect of the variance-covariance of allele
frequencies, Bayenv may also be removing signals of
adaptation to expansion. Other studies that have used Bayenv
successfully have not considered environmental vari-
ables in parallel with the direction of range expansion
(Eckert et al. 2010; Hancock et al. 2010b, 2011; Chen et al.
2012), and none have considered such a recent range
expansion as the one studied here.
It is predicted that during a range expansion, individ-
uals should experience selection for increased dispersal
(most likely due to spatial sorting; Burton et al. 2010;
Shine et al. 2011) and positive selection for reproduction
early and often at the expansion front (including rapid
growth and maturation; Moreau et al. 2011). They
should also experience relaxed selection on intraspecific
competition. It is likely that very many genes influence
these traits (but see Haag et al. (2005) and Matthews &
Butler (2011)), and it is difficult to make predictions
about the classes of genes that should appear as
outliers. For mammals, we might predict that changes
in dispersal, reproduction and competition might be
mediated via behavioural changes, particularly with
regard to how individuals interact with conspecifics. As
the bank vole in Ireland has experienced strong bottle-
necking and genetic drift during its introduction and
subsequent expansion, we expect strong linkage dis-
equilibrium between different regions of the genome.
Therefore, the outliers we identify may not be the tar-
gets of selection themselves, but merely linked to
regions under selection. Nevertheless, it is interesting
that many of the outliers we identified may be involved
in sensory perception and neural development (mg39858,
mg81865, mg123985, mg17560, mg83555, mg59899,
mg87917; see Table 2), although no GO terms
relating to these functions had more than one outlier
assigned to them. Other interesting outliers include
mg10984 [encoding ADAMTS19, a gene involved in sex-
ual differentiation and expressed predominantly in the
foetal ovary (Menke & Page 2002)] and mg26799
(encoding EIF4E1B, an oocyte-specific translation initiation
factor). These genes are potential candidates for
adaptation relating to differential investment in repro-
duction at the expansion front.
It should be easier to assign a mechanistic basis to
outlier genes involved in the immune response. At the
expansion front, we might expect both an increased
need to invest in other traits and a reduced need
to invest in immunity if parasites lag behind their hosts.
Here, hosts should divert fewer resources to
maintaining their immune systems (White & Perkins
2012). Reduced investments may be targeted at particu-
lar aspects of the immune response, due to different
development and use costs (Lee 2006). For example, if
selection at the expansion front favours rapid growth,
trade-offs may lead to reduced investment in immune
components with particularly high development costs
(van der Most et al. 2011), such as induced cell-medi-
ated and antibody responses (Tschirren et al. 2003). The
distribution of parasites (helminths and ectoparasites)
changes markedly along the axis of the bank vole range
expansion (S. E. Perkins, unpublished data), and so it is
interesting to see differential selection pressure on
immune response reflected at the genetic level. As a
class, we found that immune system genes were signifi-
cantly enriched amongst the outliers. Genic outliers
involved in immunity are mg8017 (a T-cell receptor),
mg68377 (PRDX5, involved in peroxisome signalling),
mg24029 (LKHA4, involved in the inflammatory
response) and mg122511 (NFATC1, which plays a role
in the inducible expression of cytokine genes in T cells,
especially in the induction of IL-2 and IL-4 gene tran-
scription; Table 2). Of course, there is a need to validate
outliers and candidate loci through functional assays,
association studies and quantitative genetic dissection.
This will not only filter out potential false positive out-
liers, but will also lead to a greater understanding of
the mechanistic underpinnings of adaptation to range
expansion.
In this study, ten NS SNPs were identified where one
of the alleles was predicted to have a damaging effect
on the final protein product. Two of these damaging
alleles had a significant decline in frequency with dis-
tance from the point of introduction, whilst one had a
significant increase in frequency. There is therefore no
evidence for extensive fixation or frequency increase in
deleterious alleles during the range expansion, although
this may be due to the small sample of NS SNPs with
predicted functional effects. Lohmueller et al. (2008)
found that non-African human populations had signifi-
cantly more deleterious mutations than Africans, a pat-
tern they interpreted as being due to founder events,
genetic drift and allele surfing as humans moved out of
Africa. In a simulation study, Travis et al. (2007) found
that deleterious alleles arising at the edge of an expan-
sion were much more likely to persist than if they had
arisen in a stationary population. However, this result
depended strongly on the growth (r), carrying capacity
and dispersal (m) parameters used in the simulations. A
high r value increases the chances that a deleterious
mutation will surf, whilst at higher m values, deleterious
alleles were less likely, and beneficial mutations
more likely, to surf. For small mammals, the costs of
dispersal are such that they should only disperse as far
as the nearest suitable unoccupied space. Emigration
may largely be driven by positive density-dependent
dispersal and agonistic behaviour from conspecifics
(Matthysen 2005; Hahne et al. 2011; Le Galliard et al.
2012), which is supported by our previous analysis of
the bank vole range expansion in Ireland (White et al.
2012). Positive density-dependent dispersal should tend
to reduce the rate of range expansion and minimize the
effect of genetic drift in demes at the expansion front.
Moreover, the simulation results of Travis et al. (2007)
were based on novel mutations arising near the expan-
sion front and did not consider standing variation. In
an introduced population such as the one considered
here, standing deleterious alleles may have increased in
frequency at the introduction site due to drift during
the initial bottleneck. Thereafter, selection by spatial
sorting may result in a kind of ‘spatial purging’. One
might expect that mutations having negative effects on
reproduction or dispersal might tend to be left behind
during a range expansion. Indeed, Travis et al. (2007)
found that mutations with a negative impact on fertility
were much less likely to surf than those with negative
effects on survival. The difference in deleterious allele
frequency between the expansion front and older estab-
lished populations may in general be less pronounced
than suggested by the findings of Lohmueller et al.
(2008).
This study used a genome-wide approach to track
changes in genetic diversity across a well-characterized
range expansion. Using both functional and neutral loci,
we found that the introduced bank vole population in
Ireland has lost a substantial proportion of its diversity
during the expansion. Due to changes in diversity along
the axis of expansion and the potential for allele surf-
ing, traditional outlier approaches to detect loci under
selection are likely to return many false positives. Here,
we introduced a new test to detect loci under direc-
tional selection during the expansion. Using a correla-
tion-based approach, we identified a number of genes
under selection during the range expansion. It appears
that the bank vole has been able to respond adaptively
to the range expansion in spite of the general loss of
genetic diversity. However, there is no evidence that
populations at the expansion front carry more deleteri-
ous mutations than those at the range core, and this
may be because spatial purging is also important in
removing deleterious alleles as the population expands
its range. This is of relevance to many other species
expanding their ranges, for example due to climate
change, as it suggests that fitness does not necessarily
decline towards the wave front of the expansion.
The bank vole in Ireland represents an excellent sys-
tem with which to test hypotheses associated with
range expansions. The range is continuing to expand
without any human interference, and the history of the
expansion has been well characterized demographically
(White et al. 2012). In Ireland, the bank vole is expand-
ing into a landscape with relatively minor environmen-
tal perturbations, as shown by the consistent and
similar declines in genetic diversity across all three
transects. As the bank vole is amenable to laboratory
breeding and manipulation, the system also offers the
possibility to study the mechanics of an invasion/range
expansion of a small mammal experimentally.
To date, much work in population genetics and
genomics has used analytical models developed for
populations at approximate equilibrium. As many, if
not most, species have undergone recent range expan-
sions, we believe that it is of general relevance to con-
sider whether range expansions could have influenced
the genetic variation seen in any particular study sys-
tem, and use statistical models and simulations appro-
priate to such cases.
Acknowledgements
This research was supported by a Marie Curie FP7-PEOPLE-
2009-IOF and a Marie Curie FP7-PEOPLE-2009-IEF within the
7th European Community Framework Programme. TW was
also supported by a Heredity Fieldwork Grant from The Genet-
ics Society and a Percy Sladen Memorial Fund Grant from the
Linnean Society. GH acknowledges support from Swiss
National Science Foundation grant 31003A_127377/1. Colin
Lawton, Michael Field-May, Sam Grathoff, Libby Nixon, Nia
Thomas and Sophie Watson assisted in the collection of speci-
mens. The authors would like to thank Rob Elshire, Sharon
Mitchell and Charlotte Acharya in the Buckler lab at Cornell for
help with genotyping-by-sequencing, Rodrigo Vega for help
with laboratory work, Robert Bukowski at the Cornell Compu-
tational Biology Service Unit for bioinformatics advice and
Laurent Excoffier for access to computing facilities. The editor
and reviewers provided helpful comments and suggestions.
References
Altschul SF, Madden TL, Schäffer AA et al. (1997) Gapped
BLAST and PSI-BLAST: a new generation of protein database
search programs. Nucleic Acids Research, 25, 3389-3402.
Barrett RD, Schluter D (2008) Adaptation from standing
variation. Trends in Ecology and Evolution, 23, 38-44.
Beaudin AE, Perry CA, Stabler SP, Allen RH, Stover PJ (2012)
Maternal Mthfd1 disruption impairs fetal growth but does
not cause neural tube defects in mice. American Journal of
Clinical Nutrition, 95, 882-891.
Besold J, Schmitt T, Tammaru T, Cassel-Lundhagen A (2008)
Strong genetic impoverishment from the centre of distribution
in southern Europe to peripheral Baltic and isolated
Scandinavian populations of the pearly heath butterfly.
Journal of Biogeography, 35, 2090-2101.
Biek R, Henderson JC, Waller LA, Rupprecht CE, Real LA
(2007) A high-resolution genetic signature of demographic
and spatial expansion in epizootic rabies virus.
Proceedings of the National Academy of Sciences USA, 104,
7993-7998.
Bossdorf O, Auge H, Lafuma L et al. (2005) Phenotypic and
genetic differentiation between native and introduced plant
populations. Oecologia, 144, 1-11.
Bradbury PJ, Zhang Z, Kroon DE et al. (2007) TASSEL: software
for association mapping of complex traits in diverse
samples. Bioinformatics, 23, 2633-2635.
Buckley J, Butlin RK, Bridle JR (2012) Evidence for evolutionary
change associated with the recent range expansion of the
British butterfly, Aricia agestis, in response to climate change.
Molecular Ecology, 21, 267-280.
Burton OJ, Phillips BL, Travis JMJ (2010) Trade-offs and the
evolution of life-histories during range expansion. Ecology
Letters, 13, 1210-1220.
Chen J, Källman T, Ma X et al. (2012) Disentangling the roles
of history and local selection in shaping clinal variation of
allele frequencies and gene expression in Norway Spruce
(Picea abies). Genetics, 191, 865-881.
Claassens AJM, O’Gorman F (1965) The bank vole Clethrionomys
glareolus Schreber a mammal new to Ireland. Nature, 205, 923-924.
Coop G, Witonsky D, Di Rienzo A, Pritchard JK (2010) Using
environmental correlations to identify loci underlying local
adaptation. Genetics, 185, 1411-1423.
Cwynar LC, MacDonald GM (1987) Geographical variation of
lodgepole pine in relation to population history. American
Naturalist, 129, 463-469.
Eckert AJ, Bower AD, González-Martínez SC et al. (2010) Back
to nature: ecological genomics of loblolly pine (Pinus taeda,
Pinaceae). Molecular Ecology, 19, 3789-3805.
Edmonds CA, Lillie AS, Cavalli-Sforza LL (2004) Mutations
arising in the wave front of an expanding population.
Proceedings of the National Academy of Sciences USA, 101, 975-979.
Elshire RJ, Glaubitz JC, Sun Q et al. (2011) A robust, simple
genotyping-by-sequencing (GBS) approach for high diversity
species. PLoS ONE, 6, e19379.
Eppig JT, Blake JA, Bult CJ, Kadin JA, Richardson JE; the
Mouse Genome Database Group (2012) The Mouse Genome
Database (MGD): comprehensive resource for genetics and
genomics of the laboratory mouse. Nucleic Acids Research, 40,
D881-D886.
Estoup A, Beaumont M, Sennedot F, Moritz C, Cornuet JM
(2004) Genetic analysis of complex demographic scenarios:
spatially expanding populations of the cane toad, Bufo
marinus. Evolution, 58, 2021-2036.
Excoffier L, Lischer HEL (2010) Arlequin suite ver. 3.5: a new
series of programs to perform population genetics analyses under
Linux and Windows. Molecular Ecology Resources, 10, 564-567.
Excoffier L, Ray N (2008) Surfing during population expansions
promotes genetic revolutions and structuration. Trends
in Ecology and Evolution, 23, 347-351.
Excoffier L, Foll M, Petit RJ (2009) Genetic consequences of
range expansions. Annual Review of Ecology, Evolution, and
Systematics, 40, 481-501.
Fagundes NJR, Ray N, Beaumont M et al. (2007) Statistical
evaluation of alternative models of human evolution. Proceedings
of the National Academy of Sciences USA, 104, 17614-17619.
Fairley JS (1971) Malareus penicilliger mustelae: a flea new to
Ireland. Entomologist’s Monthly Magazine, 107, 44.
Gautier M, Gharbi K, Cezard T et al. (2013) The effect of
RAD allele dropout on the estimation of genetic variation
within and between populations. Molecular Ecology, 22,
3165-3178.
Haag CR, Saastamoinen M, Marden JH, Hanski I (2005) A
candidate locus for variation in dispersal rate in a butterfly
metapopulation. Proceedings of the Royal Society of London.
Series B, Biological Sciences, 272, 2449-2456.
Hahne J, Jenkins T, Halle S, Heckel G (2011) Establishment
success and resulting fitness consequences for vole dispersers.
Oikos, 120, 95-105.
Hallatschek O, Hersen P, Ramanathan S, Nelson DR (2007)
Genetic drift at expanding frontiers promotes gene segregation.
Proceedings of the National Academy of Sciences USA, 104,
19926-19930.
Hancock AM, Alkorta-Aranburu G, Witonsky DB, Di Rienzo A
(2010a) Adaptations to new environments in humans: the
role of subtle allele frequency shifts. Philosophical Transactions
of the Royal Society of London. Series B, Biological Sciences,365,
24592468.
Hancock AM, Witonsky DB, Ehler E et al. (2010b) Human
adaptations to diet, subsistence, and ecoregion are due to
subtle shifts in allele frequency. Proceedings of the National
Academy of Sciences USA,107, 89248930.
Hancock AM, Witonsky DB, Alkorta-Aranburu G et al. (2011)
Adaptations to climate-mediated selective pressures in
humans. PLoS Genetics,7, e1001375.
Handley LJL, Manica A, Goudet J, Balloux F (2007) Going the
distance: human population genetics in a clinal world. Trends
in Genetics,23, 432439.
Heckel G, Burri R, Fink S, Desmet J-F, Excoffier L (2005)
Genetic structure and colonization processes in European
populations of the common vole Microtus arvalis.Evolution,
59, 22312242.
Hewitt G (2000) The genetic legacy of the Quaternary ice ages.
Nature,405, 907913.
Hofer T, Ray N, Wegmann D, Excoffier L (2009) Large allele
frequency differences between human continental groups
are more likely to have occurred by drift during range
expansions than by selection. Annals of Human Genetics,73,
95108.
Hohenlohe PA, Bassham S, Etter PD, et al. (2010) Population
genomics of parallel adaptation in threespine stickleback
using sequenced RAD tags. PloS Genetics,6, e1000862.
Hughes CL, Dytham C, Hill JK (2007) Modelling and analysing
evolution of dispersal in populations at expanding range
boundaries. Ecological Entomology,32, 437445.
Kalinowski ST (2005) HP-RARE 1.0: a computer program for
performing rarefaction on measures of allelic richness. Molec-
ular Ecology Notes,5, 187189.
Klopfstein S, Currat M, Excoffier L (2006) The fate of mutations surfing on the wave of a range expansion. Molecular Biology and Evolution, 23, 482-490.
Kolbe JJ, Glor RE, Rodríguez Schettino L et al. (2004) Genetic variation increases during biological invasion by a Cuban lizard. Nature, 431, 177-181.
Le Galliard J-F, Rémy A, Ims RA, Lambin X (2012) Patterns and processes of dispersal behaviour in arvicoline rodents. Molecular Ecology, 21, 505-523.
Lee KA (2006) Linking immune defenses and life history at the levels of the individual and the species. Integrative and Comparative Biology, 46, 1000-1015.
Lohmueller KE, Indap AR, Schmidt S et al. (2008) Proportionally more deleterious genetic variation in European than in African populations. Nature, 451, 994-997.
Lubina JA, Levin SA (1988) The spread of a reinvading species: range expansion in the California sea otter. American Naturalist, 131, 526-543.
Lynch M (2009) Estimation of allele frequencies from high-coverage genome-sequencing projects. Genetics, 182, 295-301.
Marshall LG, Webb SD, Sepkoski JJ, Raup DM (1982) Mammalian evolution and the great American interchange. Science, 215, 1351-1357.
Matthews LJ, Butler PM (2011) Novelty-seeking DRD4 polymorphisms are associated with human migration distance out-of-Africa after controlling for neutral population gene structure. American Journal of Physical Anthropology, 145, 382-389.
Matthysen E (2005) Density-dependent dispersal in birds and mammals. Ecography, 28, 403-416.
Menke DB, Page DC (2002) Sexually dimorphic gene expression in the developing mouse gonad. Gene Expression Patterns, 2, 359-367.
Monty A, Mahy G (2010) Evolution of dispersal traits along an invasion route in the wind-dispersed Senecio inaequidens (Asteraceae). Oikos, 119, 1563-1570.
Moreau C, Bherer C, Vezina H et al. (2011) Deep human genealogies reveal a selective advantage to be on an expanding wave front. Science, 334, 1148-1150.
van der Most PJ, de Jong B, Parmentier HK, Verhulst S (2011) Trade-off between growth and immune function: a meta-analysis of selection experiments. Functional Ecology, 25, 74-80.
©2013 John Wiley & Sons Ltd
Novembre J, Han E (2012) Human population structure and the adaptive response to pathogen-induced selection pressures. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 367, 878-886.
Parisod C, Bonvin G (2008) Fine-scale genetic structure and marginal processes in an expanding population of Biscutella laevigata L. (Brassicaceae). Heredity, 101, 536-542.
Parmesan C, Yohe G (2003) A globally coherent fingerprint of climate change impacts across natural systems. Nature, 421, 37-42.
Phillips BL, Brown GP, Webb JK, Shine R (2006) Invasion and the evolution of speed in toads. Nature, 439, 803.
Phillips BL, Kelehear C, Pizzatto L et al. (2010) Parasites and pathogens lag behind their host during periods of host range advance. Ecology, 91, 872-881.
Prugnolle F, Manica A, Charpentier M et al. (2005) Pathogen-driven selection and worldwide HLA class I diversity. Current Biology, 15, 1022-1027.
R Core Team (2012) R: A language and environment for statistical
computing. R Foundation for Statistical Computing, Vienna,
Austria, ISBN 3-900051-07-0, http://www.R-project.org/.
Ramensky VE, Sunyaev SR (2009) Computational analysis of human genome polymorphism. Molecular Biology, 43, 260-268.
Ray N, Currat M, Foll M, Excoffier L (2010) SPLATCHE2: a spatially-explicit simulation framework for complex demography, genetic admixture and recombination. Bioinformatics, 26, 2993-2994.
Ryan A, Duke E, Fairley JS (1996) Mitochondrial DNA in bank voles Clethrionomys glareolus in Ireland: evidence for a small founder population and localized founder effects. Acta Theriologica, 41, 45-50.
Shine R, Brown GP, Phillips BL (2011) An evolutionary process that assembles phenotypes through space rather than through time. Proceedings of the National Academy of Sciences USA, 108, 5708-5711.
Simmons AD, Thomas CD (2004) Changes in dispersal during species’ range expansions. American Naturalist, 164, 378-395.
Slatkin M, Excoffier L (2012) Serial founder effects during range expansion: a spatial analog of genetic drift. Genetics, 191, 171-181.
Stephens PA, Sutherland WJ (1999) Consequences of the Allee effect for behaviour, ecology and conservation. Trends in Ecology and Evolution, 14, 401-405.
Stuart P, Mirmin L, Cross TF et al. (2007) The origin of Irish bank voles Clethrionomys glareolus assessed by mitochondrial DNA analysis. Irish Naturalists’ Journal, 28, 440-446.
Thomas PD, Campbell MJ, Kejariwal A et al. (2008) PANTHER: a library of protein families and subfamilies indexed by function. Genome Research, 13, 2129-2141.
Travis JMJ, Dytham C (2002) Dispersal evolution during invasions. Evolutionary Ecology Research, 4, 1119-1129.
Travis JMJ, Munkemuller T, Burton OJ et al. (2007) Deleterious mutations can surf to high densities on the wave front of an expanding population. Molecular Biology and Evolution, 24, 2334-2343.
Travis JMJ, Mustin K, Benton TG, Dytham C (2009) Accelerating invasion rates result from the evolution of density-dependent dispersal. Journal of Theoretical Biology, 259, 151-158.
Tschirren B, Fitze PS, Richner H (2003) Sexual dimorphism in susceptibility to parasites and cell-mediated immunity in great tit nestlings. Journal of Animal Ecology, 72, 839-845.
Tsutsui ND, Suarez AV, Holway DA, Case TJ (2000) Reduced genetic variation and the success of an invasive species. Proceedings of the National Academy of Sciences USA, 97, 5948-5953.
UniProt Consortium (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Research, 40, D71-D75.
Velo-Antón G, Rodríguez D, Savage AE et al. (2012) Amphibian-killing fungus loses genetic diversity as it spreads across the New World. Biological Conservation, 146, 213-218.
Waters JM, Fraser CI, Hewitt GM (2012) Founder takes all: density-dependent processes structure biodiversity. Trends in Ecology and Evolution, 28, 78-85.
White TA, Perkins SE (2012) The ecoimmunology of invasive species. Functional Ecology, 26, 1313-1323.
White TA, Lundy MG, Montgomery WI et al. (2012) Range expansion in an invasive small mammal: influence of life-history and habitat quality. Biological Invasions, 14, 2203-2215.
White TA, Perkins SE, Heckel G, Searle JB (2013) Data from:
adaptive evolution during an ongoing range expansion: the
invasive bank vole (Myodes glareolus) in Ireland. Dryad Digital
Repository. doi:10.5061/dryad.fb782.
Xu H, Wu XR, Wewer UM, Engvall E (1994) Murine muscular dystrophy caused by a mutation in the laminin alpha 2 (Lama2) gene. Nature Genetics, 8, 297-302.
Yang W-Y, Novembre J, Eskin E, Halperin E (2012) A model-based approach for analysis of spatial structure in genetic data. Nature Genetics, 44, 725-731.
Zayed A, Whitfield CW (2008) A genome-wide signature of positive selection in ancient and recent invasive expansions of the honey bee Apis mellifera. Proceedings of the National Academy of Sciences USA, 105, 3421-3426.
T.A.W., G.H. and J.B.S. designed and planned the
study. T.A.W. and S.E.P. carried out the fieldwork in
Ireland. T.A.W. carried out the analyses. T.A.W., S.E.P.,
G.H. and J.B.S. wrote the manuscript.
Data accessibility
Genotype data are available via Dryad doi:10.5061/dryad.fb782 (White et al. 2013). Illumina reads are available from the Sequence Read Archive accession SRP020629.
Supporting information
Additional supporting information may be found in the online ver-
sion of this article.
Table S1 Results of linear regression of genetic diversity on
distance from the introduction site at Foynes.
Table S2 Potentially deleterious alleles identified by PolyPhen-
2 and correlations with distance from the introduction site.
... To understand how colonization occurs, and which candidate traits may favour expansion in a non-native environment, we need to monitor an expansion in real time. An ongoing rodent colonization of Ireland by a continental vole species [20], that started a century ago and is currently covering more than half of Ireland, offers a unique opportunity to study colonization processes in an ecologically homogeneous area with a single, poorer, competitor (the wood mouse, Apodemus sylvaticus), and a genetically homogeneous invader population from a single origin. The bank vole (Myodes glareolus) is believed to have been introduced to Ireland at Foynes port, in the west of Ireland, from Germany during the construction of a hydroelectric dam in 1919/1920 [21]. ...
... We investigated behavioural traits of both the non-native bank vole and the single native small rodent woodland species the wood mouse at 8 site replicates (figure 1a) with comparable woodland vegetation in the different colonization zones, testing 414 rodents in 533 behavioural tests. At the established zone (2 sites) voles have been present for 80-100 years [20] and coexist with relatively low numbers of mice; at the edge (4 sites) mice and voles coexist both in relatively even numbers; and at the pre-arrival zone (2 sites) mice were found in relatively high numbers (figure 1a, and electronic supplementary material, table S1) but no voles. ...
... urbanization [3,50]) and challenging environments by increasing their behavioural plasticity at both the genotypic and phenotypic level. A previous study of the bank vole colonization of Ireland found significant enrichment for genic SNPs (single nucleotide polymorphisms) to correlate with distance from the site of introduction, including genes 'that could influence behaviour and a gene involved in sexual differentiation' [20], which was confirmed at the phenotypic level by the present study. Furthermore, behavioural plasticity is related to risk taking behaviour in bank voles [51] and our results show that both are relevant traits in dispersal processes. ...
Animal behaviour can moderate biological invasion processes, and the native fauna's ability to adapt. The importance and nature of behavioural traits favouring colonization success remain debated. We investigated behavioural responses associated with risk-taking and exploration, both in non-native bank voles (Myodes glareolus, N = 225) accidentally introduced to Ireland a century ago, and in native wood mice (Apodemus sylvaticus, N = 189), that decline in numbers with vole expansion. We repeatedly sampled behavioural responses in three colonization zones: established bank vole populations for greater than 80 years (2 sites), expansion edge vole populations present for 1-4 years (4) and pre-arrival (2). All zones were occupied by wood mice. Individuals of both species varied consistently in risk-taking and exploration. Mice had not adjusted their behaviour to the presence of non-native voles, as it did not differ between the zones. Male voles at the expansion edge were initially more risk-averse but habituated faster to repeated testing, compared to voles in the established population. Results thus indicate spatial sorting for risk-taking propensity along the expansion edge in the dispersing sex. In non-native prey species the ability to develop risk-averse phenotypes may thus represent a fundamental component for range expansions.
... They provide a cost-effective approach to generate high-throughput genome-wide sequencing data, allowing thousands of SNPs to be discovered, sequenced, and genotyped, with or without a genome (see review Davey et al., 2011). Thus, RRS approaches offer many advantages to conservation genetic studies of non-model species (Hess et al., 2013;White et al., 2013). ...
Captive insurance populations act as a safeguard against species extinction and can provide a source for future reintroductions into the wild. As more species become threatened globally there is increased pressure on captive insurance populations to support the conservation of their wild counterparts. Consequently, the genetic composition of captive populations becomes increasingly important. Captive insurance populations should be both representative of the wild population throughout its range and genetically and demographically viable for long-term management. However, many captive populations have been established with a small number of founders, with unknown source location or relationships. The risks associated with small populations, such as genetic drift, loss of genetic diversity, and increased inbreeding are then exacerbated in captivity if founders are sourced from existing small wild populations, where the likelihood that founders are unrelated is low. Therefore, existing captive populations may not be suitable as a future insurance for species. Kea (Nestor notabilis) are a large species of parrot native to Te Waipounamu o Aotearoa (the South Island of New Zealand), listed as ‘Nationally Endangered’ under the Aotearoa New Zealand Threat Classification System. Despite current conservation strategies, the wild population is thought to still be declining, which has raised interest in the role the captive kea flock might play as a future source for wild reintroductions (i.e., an insurance population). Kea have been held in captivity since at least the early 1960s, with uncontrolled, sporadic breeding and a historical lack of regulations prior to the full protection of the species in 1986. Currently, the 59 kea held in captivity in New Zealand have an incomplete pedigree, skewed sex ratio, skewed founder representation, and unknown genetic structure, which undermines their suitability as a potential insurance population for future management.
In this thesis, genome-wide single nucleotide polymorphism (SNP) data generated through genotyping-by-sequencing (GBS) is used to examine the population genetic structure, genetic diversity, relatedness, and levels of inbreeding in the captive and wild kea populations. Overall, the genetic data presented here will help achieve the long-term goals and sustainable management of kea by i) determining if the current captive population is a viable insurance population for the species, ii) informing genetic management of captive kea to optimise genetic diversity retention, and minimise inbreeding, and iii) investigating the genetic structure and diversity of wild kea at a higher resolution. Our analyses found greatest support for two genetic clusters in the wild kea population (north and south of the South Island), with a steady gradient of admixture between the two. These data provide support at a higher resolution to previous genetic studies on the wild population. Wild kea to the north of the South Island show lower levels of genetic diversity and higher levels of inbreeding relative to the rest of the wild kea population, and the captive population. Long-term population monitoring and genetic analyses will be essential to accurately examine trends in genetic diversity and inbreeding of wild kea populations, particularly if populations continue to decline. The pedigree reconstruction using relatedness estimates derived from the GBS data was largely congruent with the studbook pedigree. Notably, however, three putatively unrelated individuals appear to be first-order relatives (parent-offspring or full-sibling). This finding is indicative of a rejection to the typical assumption that founders are unrelated, even for an endangered species with a sizeable wild population. Additionally, an understanding of the relationships among kea in the current captive population has implications for the future genetic management and breeding of the population. 
Despite no signs of inbreeding or reduced genetic diversity, the captive kea population is not genetically representative of the wild kea population throughout its range, nor is it viable for long-term management. It is recommended that the captive population be supplemented with additional wild founders sourced from the more genetically diverse, and currently underrepresented, southern end of the South Island. This study highlights the importance of proactive genetic assessments and the integration of genetic information into captive and wild species management, particularly when establishing or supplementing captive insurance populations.
... While range expansion could be considered beneficial for a species, there can be genetic consequences such as low genetic diversity at the leading edge. In the case of a species following a stepping-stone dispersal pattern during range expansion, genetic diversity can diminish with distance (Excoffier et al., 2009;White et al., 2013). Surprisingly, the gentoo penguin colonies sampled exhibited very little evidence of founder effects. ...
Many species are shifting their ranges in response to climate-driven environmental changes, particularly in high-latitude regions. However, the patterns of dispersal and colonization during range shifting events are not always clear. Understanding how populations are connected through space and time can reveal how species navigate a changing environment. Here, we present a fine-scale population genomics study of gentoo penguins (Pygoscelis papua), a presumed site-faithful colonial nesting species that has increased in population size and expanded its range south along the Western Antarctic Peninsula. Using whole genome sequencing, we analysed 129 gentoo penguin individuals across 12 colonies located at or near the southern range edge. Through a detailed examination of fine-scale population structure, admixture, and population divergence, we inferred that gentoo penguins historically dispersed rapidly in a stepping-stone pattern from the South Shetland Islands leading to the colonization of Anvers Island, and then the adjacent mainland Western Antarctic Peninsula. Recent southward expansion along the Western Antarctic Peninsula also followed a stepping-stone dispersal pattern coupled with limited post-divergence gene flow from colonies on Anvers Island. Genetic diversity appeared to be maintained across colonies during the historical dispersal process, and range-edge populations are still growing. This suggests large numbers of migrants may provide a buffer against founder effects at the beginning of colonization events to maintain genetic diversity similar to that of the source populations before migration ceases post-divergence. These results coupled with a continued increase in effective population size since approximately 500–800 years ago distinguish gentoo penguins as a robust species that is highly adaptable and resilient to changing climate.
... We only retained bi-allelic SNPs and called genotypes if individuals had a read depth of at least five at the locus. After SNP calling, we removed all loci with complex indels, a minor allele frequency of less than 5 per cent, more than 20 per cent missing data or observed heterozygosity greater than 50 per cent, which may indicate loci that contain paralogues merged together (White et al. 2013). Individuals with more than 50 per cent missing data were also removed (seven individuals, all from Saxenhofer et al. (2022)). ...
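The filtering rules quoted in the excerpt above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the cited authors' pipeline: the genotype coding (0, 1 or 2 copies of the alternate allele, `None` for a missing call) and the function names `call_genotype`, `keep_locus` and `keep_individual` are hypothetical, and the bi-allelic and complex-indel filters are assumed to have been applied upstream.

```python
from typing import Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def call_genotype(genotype: int, read_depth: int, min_depth: int = 5) -> Genotype:
    """Keep a genotype only if the individual has at least min_depth reads at the locus."""
    return genotype if read_depth >= min_depth else None


def keep_locus(genotypes: Sequence[Genotype], min_maf: float = 0.05,
               max_missing: float = 0.20, max_het: float = 0.50) -> bool:
    """Apply the missing-data, minor-allele-frequency and heterozygosity filters."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return False  # no individual was genotyped at this locus
    n_missing = len(genotypes) - len(called)
    if n_missing / len(genotypes) > max_missing:
        return False  # more than 20 per cent missing data
    alt_freq = sum(called) / (2 * len(called))
    if min(alt_freq, 1 - alt_freq) < min_maf:
        return False  # minor allele frequency below 5 per cent
    # Observed heterozygosity above 50 per cent may indicate merged paralogues.
    obs_het = sum(1 for g in called if g == 1) / len(called)
    return obs_het <= max_het


def keep_individual(genotypes: Sequence[Genotype], max_missing: float = 0.50) -> bool:
    """Drop individuals with more than 50 per cent missing genotypes."""
    if not genotypes:
        return False
    n_missing = sum(1 for g in genotypes if g is None)
    return n_missing / len(genotypes) <= max_missing


# Ten individuals at one locus: MAF = 0.25, observed heterozygosity = 0.3, no missing data.
locus = [0, 1, 2, 0, 1, 0, 0, 1, 0, 0]
print(keep_locus(locus))  # True
```

The thresholds are exposed as keyword arguments so that the same sketch can be reused with other cut-offs; the boundary cases follow the wording of the excerpt (exactly 20 per cent missing data, or exactly 50 per cent heterozygosity, is retained).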
Evolutionary divergence of viruses is most commonly driven by co-divergence with their hosts or through isolation of transmission after host-shifts. It remains mostly unknown, however, whether divergent phylogenetic clades within named virus species represent functionally equivalent byproducts of high evolutionary rates or rather incipient virus species. Here, we test these alternatives with genomic data from two widespread phylogenetic clades in Tula orthohantavirus (TULV) within a single evolutionary lineage of their natural rodent host, the common vole Microtus arvalis. We examined voles from 42 locations in the contact region between clades for TULV infection by RT-PCR. Sequencing yielded 23 TULV Central North and 21 TULV Central South genomes which differed by 14.9-18.5% at the nucleotide and 2.2-3.7% at the amino acid level without evidence of recombination or reassortment between clades. Geographic cline analyses demonstrated an abrupt (<1 km wide) transition between the parapatric TULV clades in continuous landscape. This transition was located within the Central mitochondrial lineage of M. arvalis and genomic SNPs showed gradual mixing of host populations across it. Genomic differentiation of hosts was much weaker across the TULV Central North to South transition than across the nearby hybrid zone between two evolutionary lineages in the host. We suggest that these parapatric TULV clades represent functionally distinct, incipient species which are likely differently affected by genetic polymorphisms in the host. This highlights the potential of natural viral contact zones as systems for investigating of the genetic and evolutionary factors enabling or restricting the transmission of RNA viruses.
... The DNA samples were submitted to a genomic library preparation protocol adapted from Elshire et al. [29] at the Ecomol Genomic Service of the ESALQ-USP (Piracicaba, Brazil). For the digestion of genomic DNA, the PstI enzyme was chosen because it presented satisfactory in vitro test results and in library constructions with many animal species [35,36], including birds [37]. The digestion reaction was performed separately for each sample, using 10 µL of DNA at a concentration of 10 ng/µL, 3 µL of buffer, 1 µL of PstI enzyme (20 U/µL) (New England Biolabs, Ipswich, MA, USA), and 16 µL of water. ...
Simple Summary: The Brazilian merganser, a critically endangered duck species in South America, was studied using a population genomics approach. This research focused on the genetic diversity of the mergansers in the four remaining wild populations located in Central Brazil. The results showed that there is a low genetic diversity and high levels of inbreeding in individuals across all locations, with a moderate level of genetic differentiation between them. These findings highlight the need for immediate conservation actions to prevent the decline of the Brazilian merganser population and genetic erosion. Genetic monitoring can help implement appropriate in situ and ex situ management strategies to increase the species’ long-term survival in its natural environment.

Abstract: The Brazilian merganser (Mergus octosetaceus) is one of the most endangered bird species in South America and comprises less than 250 mature individuals in wild environments. This is a species extremely sensitive to environmental disturbances and restricted to a few “pristine” freshwater habitats in Brazil, and it has been classified as Critically Endangered on the IUCN Red List since 1994. Thus, biological conservation studies are vital to promote adequate management strategies and to avoid the decline of merganser populations. In this context, to understand the evolutionary dynamics and the current genetic diversity of remaining Brazilian merganser populations, we used the “Genotyping by Sequencing” approach to genotype 923 SNPs in 30 individuals from all known areas of occurrence. These populations revealed a low genetic diversity and high inbreeding levels, likely due to the recent population decline associated with habitat loss. Furthermore, it showed a moderate level of genetic differentiation between all populations located in four separated areas of the highly threatened Cerrado biome.
The results indicate that urgent actions for the conservation of the species should be accompanied by careful genetic monitoring to allow appropriate in situ and ex situ management to increase the long-term species’ survival in its natural environment.
... In their study of human dispersal out of Africa, Deshpande et al. (2009) found that a smaller number of founders strengthens the cumulative bottleneck effect, which depletes genetic diversity, so that diversity falls sharply along the axis of range expansion. The expansion of bank voles (Myodes glareolus) in Northern Ireland was accompanied by a decline in allelic diversity and heterozygosity along the axis of range expansion as a result of successive founder effects and stochastic processes (White et al., 2013). Thus, a decrease in genetic diversity along a spatial axis is one of the signatures of range expansion and an indicator of its direction. ...
Human activity gives rise to new global processes, including range shifts caused by landscape transformation, biological invasions and climate change. During range expansion, a species or population occupies new areas, a process known as colonization. Research into the causes and processes that accompany colonization, and into its consequences, has developed rapidly over the past 20 years at the interface between fields of biology such as spatial ecology, movement ecology, invasion ecology, metapopulation theory, behavioural ecology, evolutionary ecology, population genetics and personality theory. In our review we summarize the theoretical concepts and empirical studies aimed at answering two main questions: what distinguishes colonists from their conspecifics, and what is specific about the demographic and genetic processes that take place on the wave of population expansion?
... A source population was then established at the West coast, and has been estimated to expand its range at about 2.5 km a year (White et al. 2012), 5 times faster compared to dispersal within an occupied area with normal density (e.g., Smal and Fairley 1984;Tegelström and Jaarola 1998;Gliwicz and Ims 2000). Genetic comparison among source and edge populations indicated spatial sorting, although the phenotypic differences remained unclear (White et al. 2013). ...
Whether introduced into a completely novel habitat or slowly expanding their current range, the degree to which animals can efficiently explore and navigate new environments can be key to survival, ultimately determining population establishment and colonization success. We tested whether spatial orientation and exploratory behavior are associated with non-native spread in free-living bank voles (Myodes glareolus, N = 43) from a population accidentally introduced to Ireland a century ago. We measured spatial orientation and navigation in a radial arm maze, and behaviors associated to exploratory tendencies and risk-taking in repeated open-field tests, at the expansion edge and in the source population. Bank voles at the expansion edge re-visited unrewarded arms of the maze more, waited longer before leaving it, took longer to start exploring both the radial arm maze and the open field, and were more risk-averse compared to conspecifics in the source population. Taken together, results suggest that for this small mammal under heavy predation pressure, a careful and thorough exploration strategy might be favored when expanding into novel environments.
... The DNA samples were submitted to a genomic library preparation protocol adapted from Elshire et al. [29] at the Ecomol Genomic Service of the ESALQ-USP (Piracicaba, Brazil). For the digestion of genomic DNA, the PstI enzyme was chosen because it presented satisfactory in vitro test results and in library constructions with many animal species [41,42], including birds [43]. The digestion reaction was performed separately for each sample, using 10 µl of DNA at a concentration of 10 ng/µl, 3 µl of buffer, 1 µl of PstI enzyme (20 U/µl) (New England Biolabs) and 16 µl of water. ...
The Brazilian merganser (Mergus octosetaceus) is one of the most endangered bird species in South America, comprising fewer than 250 mature individuals in the wild. This species is extremely sensitive to environmental disturbances and restricted to a few “pristine” freshwater habitats in Brazil, and it has been classified as Critically Endangered on the IUCN Red List since 1994. Understanding its current genetic diversity to promote in situ and ex situ management strategies was considered urgent for conservation of the remaining populations. To understand the evolutionary dynamics of the remaining Brazilian merganser populations, we used the "Genotyping by Sequencing" approach to characterize 923 SNPs in 31 individuals from all known areas of occurrence. The remaining populations of the Brazilian merganser show low genetic diversity and high inbreeding levels, likely due to recent population decline associated with habitat loss. Furthermore, there was a moderate level of genetic differentiation between all populations located in four separate areas of the highly threatened Cerrado biome. The results indicate that urgent actions for conservation of the species should be accompanied by careful genetic monitoring to allow appropriate in situ and ex situ management to increase the long-term species survival in its natural environment.
Evidence for divergent selection and adaptive variation across the landscape can provide insight into a species' ability to adapt to different environments. However, despite recent advances in genomics, it remains difficult to detect the footprints of climate-mediated selection in natural populations. Here, we analysed ddRAD sequencing data (21,892 SNPs) in conjunction with geographic climate variation to search for signatures of adaptive differentiation in twelve populations of the bank vole (Clethrionomys glareolus) distributed across Europe. To identify the loci subject to selection associated with climate variation, we applied multiple genotype-environment association methods, two univariate and one multivariate, and controlled for the effect of population structure. In total, we identified 213 candidate loci for adaptation, 74 of which were located within genes. In particular, we identified signatures of selection in candidate genes with functions related to lipid metabolism and the immune system. Using the results of redundancy analysis, we demonstrated that population history and climate have joint effects on the genetic variation in the pan-European metapopulation. Furthermore, by examining only candidate loci, we found that annual mean temperature is an important factor shaping adaptive genetic variation in the bank vole. By combining landscape genomic approaches, our study sheds light on genome-wide adaptive differentiation and the spatial distribution of variants underlying adaptive variation influenced by local climate in bank voles.
A species' success during the invasion of new areas hinges on an interplay between the demographic processes common to invasions and the specific ecological context of the novel environment. Evolutionary genetic studies of invasive species can investigate how genetic bottlenecks and ecological conditions shape genetic variation in invasions, and our study pairs two invasive populations that are hypothesized to be from the same source population to compare how each population evolved during and after introduction. Invasive European starlings (Sturnus vulgaris) established populations in both Australia and North America in the 19th century. Here, we compare whole-genome sequences among native and independently introduced European starling populations to determine how demographic processes interact with rapid evolution to generate similar genetic patterns in these recent and replicated invasions. Demographic models indicate that both invasive populations experienced genetic bottlenecks as expected based on invasion history, and we find that specific genomic regions have differentiated even on this short evolutionary timescale. Despite genetic bottlenecks, we suggest that genetic drift alone cannot explain differentiation in at least two of these regions. The demographic boom intrinsic to many invasions as well as potential inversions may have led to high population-specific differentiation, although the patterns of genetic variation are also consistent with the hypothesis that this infamous and highly mobile invader adapted to novel selection (e.g., extrinsic factors). We use targeted sampling of replicated invasions to identify and evaluate support for multiple, interacting evolutionary mechanisms that lead to differentiation during the invasion process.
Data
The mission of UniProt is to support biological research by providing a freely accessible, stable, comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and querying interfaces. UniProt is comprised of four major components, each optimized for different uses: the UniProt Archive, the UniProt Knowledgebase, the UniProt Reference Clusters and the UniProt Metagenomic and Environmental Sequence Database. A key development at UniProt is the provision of complete, reference and representative proteomes. UniProt is updated and distributed every 4 weeks and can be accessed online for searches or download at http://www.uniprot.org.
Article
... The PANTHER database (http://panther.celera.com) was designed as a resource to comprehensively and consistently treat both family and subfamily classification of proteins, focused on metazoans but also covering other organisms. Rationale. ...
Article
Whereas the genome-era technologies have produced the sequence of the complete human genome, the modern post-genome technologies aim at the understanding of mechanisms of processing of genetic information and elucidation of within-species variation. Single nucleotide polymorphisms (SNPs) comprise the majority of polymorphism in the human population. Non-synonymous coding SNPs together with SNPs in regulatory regions are believed to have the highest impact on complex disease etiology, quantitative traits and response to drug treatment. PolyPhen is a computational tool for prediction of putatively functional nsSNPs with application areas such as genetics of complex disease, birth defects, identification of functional mutations in model organisms and evolutionary genetics.
Article
Inferring the spatial expansion dynamics of invading species from molecular data is notoriously difficult due to the complexity of the processes involved. For these demographic scenarios, genetic data obtained from highly variable markers may be profitably combined with specific sampling schemes and information from other sources using a Bayesian approach. The geographic range of the introduced toad Bufo marinus is still expanding in eastern and northern Australia, in each case from isolates established around 1960. A large amount of demographic and historical information is available on both expansion areas. In each area, samples were collected along a transect representing populations of different ages and genotyped at 10 microsatellite loci. Five demographic models of expansion, differing in the dispersal pattern for migrants and founders and in the number of founders, were considered. Because the demographic history is complex, we used an approximate Bayesian method, based on a rejection-regression algorithm, to formally test the relative likelihoods of the five models of expansion and to infer demographic parameters. A stepwise migration-foundation model with founder events was statistically better supported than the other four models in both expansion areas. Posterior distributions supported different dynamics of expansion in the studied areas. Populations in the eastern expansion area have a lower stable effective population size and have been founded by a smaller number of individuals than those in the northern expansion area. Once demographically stabilized, populations exchange a substantial number of effective migrants per generation in both expansion areas, and such exchanges are larger in northern than in eastern Australia. The effective number of migrants appears to be considerably lower than that of founders in both expansion areas. We found our inferences to be relatively robust to various assumptions on marker, demographic, and historical features.
The method presented here is the only robust, model-based method available so far, which allows inferring complex population dynamics over a short time scale. It also provides the basis for investigating the interplay between population dynamics, drift, and selection in invasive species.
Article
Advances in next generation technologies have driven the costs of DNA sequencing down to the point that genotyping-by-sequencing (GBS) is now feasible for high diversity, large genome species. Here, we report a procedure for constructing GBS libraries based on reducing ...
Data
The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information that is essential for modern biological research. UniProt is produced by the UniProt Consortium which consists of groups from the European Bioinformatics Institute, the Protein Information Resource and the Swiss Institute of Bioinformatics. The core activities include manual curation of protein sequences assisted by computational analysis, sequence archiving, a user-friendly UniProt website and the provision of additional value-added information through cross-references to other databases. UniProt is comprised of four major components, each optimized for different uses: the UniProt Archive, the UniProt Knowledgebase, the UniProt Reference Clusters and the UniProt Metagenomic and Environmental Sequence Database. One of the key achievements of the UniProt consortium in 2008 is the completion of the first draft of the complete human proteome in UniProtKB/Swiss-Prot. This manually annotated representation of all currently known human protein-coding genes was made available in UniProt release 14.0 with 20 325 entries. UniProt is updated and distributed every three weeks and can be accessed online for searches or downloaded at www.uniprot.org.