
White TA, Perkins SE, Heckel G, Searle JB. Adaptive evolution during an ongoing range expansion: the invasive bank vole (Myodes glareolus) in Ireland. Mol Ecol 22: 2971-2985

May 2013
Molecular Ecology 22(11)


DOI:10.1111/mec.12343

Source
PubMed

Authors:

Sarah E Perkins

Cardiff University

Jeremy Searle

Cornell University



Change in frequencies of deleterious alleles (identified using PolyPhen-2) with distance from the introduction site. Only loci with significant correlations are shown. The SNP mg8017 is shown with open circles, mg123985 with open squares, and mg134851 with crosses.


Adaptive evolution during an ongoing range expansion: the invasive bank vole (Myodes glareolus) in Ireland

THOMAS A. WHITE,*† SARAH E. PERKINS,‡ GERALD HECKEL†§ and JEREMY B. SEARLE*

*Department of Ecology and Evolutionary Biology, Cornell University, Corson Hall, Ithaca, NY 14853-2701, USA, †Computational and Molecular Population Genetics (CMPG), Institute of Ecology and Evolution, University of Bern, Baltzerstrasse 6, CH-3012, Bern, Switzerland, ‡School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AX, UK, §Swiss Institute of Bioinformatics, Genopode, CH 1015 Lausanne, Switzerland

Abstract

Range expansions are extremely common, but have only recently begun to attract attention in terms of their genetic consequences. As populations expand, demes at the wave front experience strong genetic drift, which is expected to reduce genetic diversity and potentially cause ‘allele surfing’, where alleles may become fixed over a wide geographical area even if their effects are deleterious. Previous simulation models show that range expansions can generate very strong selective gradients on dispersal, reproduction, competition and immunity. To investigate the effects of range expansion on genetic diversity and adaptation, we studied the population genomics of the bank vole (Myodes glareolus) in Ireland. The bank vole was likely introduced in the late 1920s and is expanding its range at a rate of ~2.5 km/year. Using genotyping-by-sequencing, we genotyped 281 bank voles at 5979 SNP loci. Fourteen sample sites were arranged in three transects running from the introduction site to the wave front of the expansion. We found significant declines in genetic diversity along all three transects. However, there was no evidence that sites at the wave front had accumulated more deleterious mutations. We looked for outlier loci with strong correlations between allele frequency and distance from the introduction site, where the direction of correlation was the same in all three transects. Amongst these outliers, we found significant enrichment for genic SNPs, suggesting the action of selection. Candidates for selection included several genes with immunological functions and several genes that could influence behaviour.

Keywords: allele frequency cline, genotyping-by-sequencing, nonmodel, outlier, population genomics, RAD

Received 21 November 2012; accepted 3 April 2013

Correspondence: Thomas A. White, E-mail: tawhite201@gmail.com

Introduction

Many empirical studies of the genetic consequences of species introductions have tended to focus on the introduction event itself (e.g. Tsutsui et al. 2000; Kolbe et al. 2004; Bossdorf et al. 2005) and fail to consider the genetic consequences of the subsequent range expansion, an integral part of successful establishment of any invasive species, in an explicitly spatial context. The genetic consequences of range expansion are not only important for invasive species. Many, if not most, species have recently experienced range expansions (Excoffier et al. 2009); examples include the expansion of species from refugia following glacial retreat or advance (Hewitt 2000), recovery of species after persecution or overexploitation (Lubina & Levin 1988), the current movement of species due to climate change (Parmesan & Yohe 2003), expansions associated with geological events (Marshall et al. 1982), the spread of species with novel adaptations, such as the expansion of anatomically modern humans out of Africa (Fagundes et al. 2007), and the spread of pathogens during disease epidemics (Biek et al. 2007; Velo-Antón et al. 2012). Despite their frequency, researchers have only recently begun to appreciate the importance of range expansions in shaping the current distribution of genetic diversity at both neutral and functional loci (Prugnolle et al. 2005; Handley et al. 2007; Besold et al. 2008; Buckley et al. 2012; Velo-Antón et al. 2012; Waters et al. 2012).

Theoretical studies have shown that range expansions are fundamentally different from purely demographic expansions. As a population expands its range, it undergoes a series of founder events, which can lead to fluctuations in allele frequency and stochastic loss of alleles (Slatkin & Excoffier 2012). Range expansions are generally associated with decreasing allelic richness and heterozygosity with increasing distance along the axis of expansion (Estoup et al. 2004; Heckel et al. 2005; Prugnolle et al. 2005; Handley et al. 2007; Besold et al. 2008; Parisod & Bonvin 2008; Velo-Antón et al. 2012). This reduced genetic diversity and associated inbreeding may negatively impact the fitness of individuals at the expanding range margin. Edmonds et al. (2004) and Klopfstein et al. (2006) demonstrated that neutral mutations arising on the edge of a range expansion can sometimes ‘surf’ on the wave of the advance and reach higher frequencies than would be expected in a population at equilibrium. Klopfstein et al. (2006) suggested that this phenomenon could lead to increased rates of evolution at range margins. However, Travis et al. (2007) have shown with simulation models that deleterious mutations can also surf to high frequencies at expanding range margins, including mutations having a negative effect on reproductive rate and juvenile competitive ability. Despite these theoretical insights, there remain very few empirical studies that have tested their predictions, and the distribution of genetic diversity in expanding populations, and its significance, remains poorly understood.

In addition to strong drift and allele surfing, range expansions may generate very strong selection pressures. Simulation modelling predicts that individuals at the expanding wave front should experience selection for increased dispersal and reproduction (Travis & Dytham 2002). This is due to a combination of spatial sorting (Shine et al. 2011) and natural selection (acting over multiple generations; Travis et al. 2009) favouring individuals at the edge of an expansion. Evolution of dispersal and reproduction during range expansions has now been documented in a number of taxa, including plants (Cwynar & MacDonald 1987; Monty & Mahy 2010), amphibians (Phillips et al. 2006), humans (Moreau et al. 2011) and insects (Simmons & Thomas 2004; Hughes et al. 2007). The process of range expansion can also influence host–parasite interactions. During a range expansion, parasites and pathogens may lag behind their hosts, due to both stochastic loss and low host density at the wave front of the expansion (Phillips et al. 2010). Where trade-offs exist, individuals at the wave front should therefore invest less in intraspecific competition (Burton et al. 2010) and immune defence (Phillips et al. 2010). If such a lag does occur, these traits may also experience relaxed selection at the genic level, for example if specific antigen receptor alleles are no longer required for parasite or pathogen recognition. In longer established populations behind the wave front of the expansion, host densities and parasite burdens are expected gradually to return to baseline levels, so here selection should favour investment in intraspecific competition and immunity over dispersal. To the extent that dispersal, reproduction, competitive ability and immunity are genetically determined, spatial sorting and natural selection should be reflected by allele frequency clines along the axis of expansion at loci influencing these traits (Hancock et al. 2010a).

However, detecting such adaptations at the genetic level is expected to present a number of challenges. Many of the traits in which we might expect to see adaptation are polygenic, and much of the adaptation is predicted to come from standing genetic variation rather than new mutations (Barret & Schluter 2008). Therefore, adaptation is expected to occur via subtle shifts in allele frequencies (Hancock et al. 2010a) rather than hard sweeps (Novembre & Han 2012). Outlier approaches based on $F_{ST}$ values are unlikely to be useful in this case (Hancock et al. 2010a), as selection is unlikely to create large differences in allele frequencies between populations. In addition, $F_{ST}$-based approaches are unable to distinguish allele frequency variation that is related to an underlying environmental variable or gradient (such as distance along the axis of a range expansion) vs. variation that follows a spatially incoherent pattern (Yang et al. 2012). The previous rationale underlying genome scans for selection has been that drift and demographic processes affect the entire genome, and therefore, unusual patterns at particular loci should reflect the action of selection (Zayed & Whitfield 2008). It is now known that ‘allele surfing’ can generate clines in allele frequencies, but this affects loci at random (Excoffier et al. 2009). Therefore, a naïve genome scan may reveal many loci that are putatively under selection but which are actually false positives (Hofer et al. 2009). It is unlikely that any method will be able to overcome this problem completely, but it may be possible to minimize the problem using replication. An allele frequency cline at a locus in one region may be due to drift or selection caused by some underlying environmental variable. However, the direction of drift or surfing is independent between different ‘sectors’ of the expansion (Hallatschek et al. 2007; Excoffier & Ray 2008). Clines in the same direction in multiple regions are therefore less likely to be due to drift.

Here, we report the results of one of the first population genomic studies of an ongoing range expansion. Our study system is the bank vole, Myodes glareolus, in Ireland. The bank vole is a small rodent distributed throughout much of Eurasia from Iberia to central Siberia and from the Mediterranean to Scandinavia, but not recorded in Ireland until 1964 (Claassens & O’Gorman 1965). Previous studies of mtDNA variation and parasite distribution support a single introduction event involving a small number of founders arriving in the late 1920s on the southern shore of the Shannon Estuary (Fairley 1971; Ryan et al. 1996; Stuart et al. 2007). Stuart et al. (2007) place the arrival in 1926 at the deep-water port of Foynes, as this coincides with the importation of heavy earth moving equipment from Germany prior to the construction of the Shannon hydroelectricity scheme. Since its introduction, the vole has occupied approximately one-third of the island of Ireland and is continuing to expand its range at a constant rate of c. 2.5 km/year (White et al. 2012).

Using a genotyping-by-sequencing (GBS) approach, we simultaneously identify and genotype a large panel of SNPs for the bank vole in Ireland. We report changes in genic and nongenic diversity over the course of the range expansion and develop a new approach to identify loci under selection. Importantly, we identify genetic signatures of adaptation to the process of range expansion itself.

Methods

Sampling and DNA extraction

In autumn 2010 and summer 2011, 281 bank voles were sampled from 14 sample sites in Ireland (Table 1). These sites were arranged in three transects running from the site of introduction at Foynes out to the expansion front, to the north, the northeast and the east (Fig. 1). Voles were euthanized by isoflurane overdose followed by cervical dislocation. For each vole, a piece of liver tissue was placed in an Eppendorf tube with 95% ethanol. Genomic DNA was extracted using the DNeasy kit from Qiagen.

Genotyping-by-sequencing (GBS)

Extracted DNA was sent to the Cornell Institute for Genomic Diversity to conduct GBS. GBS (Elshire et al. 2011) is a simple technique for constructing reduced representation libraries for the Illumina sequencing platform and is conceptually similar to RAD sequencing (Hohenlohe et al. 2010). Briefly, DNA from each individual was separately digested using the restriction enzyme PstI (CTGCAG). The fragmented DNA was then ligated to a barcoded adaptor and a common adaptor with appropriate sticky ends. The digestion and ligation were carried out in a 96-well plate. The wells each contained DNA from a different individual and a barcoded adaptor unique to that well. One control well did not contain any DNA. After ligation, the wells were pooled into one Eppendorf tube and cleaned using a Qiagen QIAquick PCR purification kit to form a library. The library was then subjected to a PCR, using long primers that matched the barcoded and common adaptors. The PCR has two functions. One is to perform a size-selection step, as the PCR preferentially amplifies fragments of an ideal length for Illumina sequencing. The second is that the long primers add a length of sequence to the fragments in the library. These sequences bind to the Illumina flow cell and are also used to prime subsequent DNA sequencing reactions. After PCR, the library was cleaned again using a Qiagen QIAquick PCR purification kit. Each library was then diluted and sent for sequencing using single-end 100-bp reads on the Illumina HiSeq 2000 at the Cornell Core Laboratories Center. To assess repeatability of our approach, a second library was made for 24 of the individuals, which was sequenced as before.

Table 1 Sampling information, detailing the names of sample sites, their locations, the transects on which they fall, sample sizes (n), start and end of trapping periods, distance from the introduction site at Foynes and three measures of genetic diversity for all 5979 SNPs: mean expected heterozygosity per locus ($H_E$), mean alleles per locus ($A$) and mean allelic richness per locus ($A_{rich}$)

| Sample site | Latitude | Longitude | Transect | n | Trapping period | Distance from Foynes (km)* | $H_E$ (all SNPs) | $A$ (all SNPs) | $A_{rich}$ (all SNPs) |
|---|---|---|---|---|---|---|---|---|---|
| Foynes | 52.574 | 9.140 | All | 20 | 09–12/07/2011 | 0 | 0.357 | 1.959 | 1.759 |
| Tulla | 52.795 | 8.731 | N | 20 | 27/07–02/08/2011 | 45 | 0.288 | 1.845 | 1.622 |
| Gort | 53.138 | 8.770 | N | 20 | 16–17/10/2010 | 84 | 0.265 | 1.782 | 1.571 |
| Tuam | 53.497 | 8.737 | N | 21 | 12–15/10/2010 | 124 | 0.256 | 1.757 | 1.553 |
| Cloonfad | 53.708 | 8.730 | N | 20 | 14–21/08/2011 | 148 | 0.253 | 1.750 | 1.546 |
| Limerick | 52.660 | 8.451 | NE | 20 | 13–14/07/2011 | 48 | 0.307 | 1.893 | 1.661 |
| Nenagh | 52.861 | 8.266 | NE | 20 | 01/11/2010 | 67 | 0.277 | 1.834 | 1.602 |
| Birr | 53.131 | 7.906 | NE | 20 | 28–30/10/2010 | 106 | 0.254 | 1.755 | 1.547 |
| Ballynahown | 53.360 | 7.871 | NE | 20 | 22–23/08/2011 | 129 | 0.237 | 1.715 | 1.513 |
| Adare | 52.471 | 8.765 | E | 20 | 21/11/2010 | 28 | 0.342 | 1.956 | 1.734 |
| Kilteely | 52.495 | 8.408 | E | 20 | 24–26/07/2011 | 50 | 0.312 | 1.906 | 1.674 |
| Cashel | 52.479 | 7.907 | E | 20 | 02/11/2010 | 87 | 0.301 | 1.878 | 1.650 |
| Windgap | 52.438 | 7.404 | E | 20 | 07–08/08/2011 | 119 | 0.309 | 1.896 | 1.665 |
| New Ross | 52.418 | 7.046 | E | 20 | 10–15/11/2010 | 146 | 0.300 | 1.874 | 1.648 |

*Shortest straight-line distance, except that the shortest land route around the Shannon Estuary was incorporated for transects incorporating this feature.

SNP genotyping and annotation pipeline

Raw sequence files from Illumina were converted into individual genotypes using the UNEAK pipeline, available as part of the TASSEL 3.0 software (Bradbury et al. 2007). Briefly, the UNEAK pipeline keeps good reads with a barcode, cut site and no ‘N’s in the first 64 bp of the sequence after the barcode. Reads are then trimmed to 64 bp after the barcode. Identical reads are clustered into tags, and counts of these tags present in each barcoded individual are stored. Following this, all unique tags are merged, and their counts in the whole sample of individuals are stored. Pairwise alignment of tags is then performed, and tag pairs with 1-bp mismatch are considered as candidate SNPs. With a certain error tolerance rate (here set to 0.03), only reciprocal pairs of tags are retained as SNPs, according to standard protocols of the Cornell Institute for Genomic Diversity. Following SNP identification, counts of each tag (or allele) are output for each locus and each individual. After running UNEAK, individual genotypes were recalled following the approach of Lynch (2009), using a global sequencing error rate of 0.03. The likelihood of each genotype was calculated using a multinomial sampling distribution, and a genotype was called if it had an AIC value at least four lower than the next best genotype. Otherwise, the genotype was coded as ‘missing’. To filter out potential paralogs, we discarded loci with a mean observed heterozygosity >0.75. This cut-off is somewhat arbitrary, but choosing different cut-off values between 0.5 and 1 made little difference to our results (data not shown). After filtering, we had 5979 loci that could be confidently called in at least 80% of individuals. Although allele dropout can affect estimates of genetic variation within and between populations (Gautier et al. 2012), the number of individuals with missing data (which could reflect different levels of allele dropout) had no effect on the patterns of diversity reported here (data not shown).
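The genotype-calling rule and locus filters described above can be illustrated with a minimal Python sketch. It is not the authors' code, nor the exact likelihood of Lynch (2009): it assumes a biallelic locus, a binomial read-sampling model in which a sequencing error always produces the other allele, and hypothetical function names (`call_genotype`, `keep_locus`). Only the error rate (0.03), the AIC difference of four, the heterozygosity cut-off (0.75) and the 80% call-rate threshold are taken from the text.

```python
import math


def call_genotype(n_a, n_b, error_rate=0.03, min_delta_aic=4.0):
    """Call 'AA', 'AB' or 'BB' from read counts, or return None for 'missing'."""
    # Probability that a single read shows allele A under each genotype
    # (simplifying assumption: an error always yields the other allele).
    p_read_a = {"AA": 1.0 - error_rate, "AB": 0.5, "BB": error_rate}
    aic = {}
    for genotype, p in p_read_a.items():
        # Binomial log-likelihood; the binomial coefficient is the same for
        # every genotype, so it is omitted.
        log_lik = n_a * math.log(p) + n_b * math.log(1.0 - p)
        # No parameters are fitted per genotype, so AIC reduces to -2 ln L.
        aic[genotype] = -2.0 * log_lik
    ranked = sorted(aic, key=aic.get)
    best, runner_up = ranked[0], ranked[1]
    if aic[runner_up] - aic[best] >= min_delta_aic:
        return best
    return None


def keep_locus(genotypes, max_het=0.75, min_call_rate=0.8):
    """Apply the call-rate and paralog (heterozygosity) filters to one locus."""
    called = [g for g in genotypes if g is not None]
    if not genotypes or len(called) / len(genotypes) < min_call_rate:
        return False
    observed_het = sum(g == "AB" for g in called) / len(called)
    return observed_het <= max_het
```

Under these assumptions, three identical reads are not quite enough to call a homozygote (the AIC difference is just under four), whereas four identical reads are; loci covered by only a few reads therefore tend to be coded as missing.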

Twenty-four individuals were analysed in two separate GBS runs. Where individuals were assigned genotypes from both runs, the genotype calls from the two runs were compared. This analysis showed that repeatability of genotyping was high (mean, 97.2%; SD, 1.4%). Locus sequences were blasted against the RefSeq mammalian RNA database using BLASTN (Altschul et al. 1997) with parameters: word_size = 11; gapopen = 5; gapextend = 2; penalty = 3; and reward = 2. Sequences were also blasted against the SwissProt and NR databases using BLASTX with default parameters. SwissProt was used preferentially, to facilitate functional annotation using UniProt. Loci were identified as putatively genic if they had an expectation value $e < 1 \times 10^{-5}$ in matches against the RefSeq database or $e < 1 \times 10^{-3}$ in matches against SwissProt/NR databases. BLASTX was used to determine whether the genic SNPs were synonymous or nonsynonymous (NS).

Fig. 1 Location of sample sites in Ireland. CD = Cloonfad, TM = Tuam, GT = Gort, TA = Tulla, BN = Ballynahown, BR = Birr, NH = Nenagh, LK = Limerick, NS = New Ross, WP = Windgap, CL = Cashel, KY = Kilteely, AE = Adare, FS = Foynes. Sites on the northern transect are marked with squares, those on the north-eastern transect are marked with circles, and those on the eastern transect with triangles. Foynes, which is the introduction site and is on all three transects, is marked with a cross. The dashed line shows the approximate range limits of the bank vole in 2011.

Genetic diversity patterns

Mean expected heterozygosity ($H_E$), mean alleles per locus ($A$) and mean allelic richness ($A_{rich}$) were calculated for each population and each locus class [NS SNPs, genic (not NS) SNPs and nongenic SNPs] using the software ARLEQUIN 3.5 (Excoffier & Lischer 2010) and HP-RARE (Kalinowski 2005). Measures of genetic diversity were regressed onto the geographical distance between the sampling locality and the point of introduction (Foynes) using the ‘lm’ function in R 2.15 (R Core Team 2012). As the Shannon Estuary represents a significant barrier to dispersal, we calculated distances as the shortest path by land. To test for differences in slopes and intercepts between the different SNP locus classes, an ANCOVA was performed taking mean diversity ($H_E$, $A$ or $A_{rich}$) as the response variable and locus class, distance and their interaction as the independent variables.
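As a rough illustration of the regression of diversity on distance, the following Python sketch fits an ordinary least-squares line to expected heterozygosity along the northern transect only, using the distances and all-SNP values listed in Table 1. It is a simplified stand-in for the R analysis, not a reproduction of it: the function name `linear_fit` is hypothetical, and neither the other transects nor the ANCOVA across locus classes are covered.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Northern transect from Table 1: Foynes, Tulla, Gort, Tuam, Cloonfad.
distance_km = [0, 45, 84, 124, 148]
expected_het = [0.357, 0.288, 0.265, 0.256, 0.253]

slope, intercept = linear_fit(distance_km, expected_het)
print(f"slope per km: {slope:.6f}, intercept: {intercept:.3f}")
```

A negative slope corresponds to diversity declining with distance from the introduction site.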

Identifying SNP outliers

Two general approaches were used to identify loci potentially under selection relating to range expansion. The first was to calculate the Spearman rank correlation between allele frequency and the geographical distance between the sampling locality and Foynes as the point of introduction. This was done for the three transects separately. We then took the absolute value of the mean correlation coefficients across the three transects. Loci were ranked by mean correlation coefficient, and an empirical P-value was calculated as the rank divided by the number of loci. We then identified potential outlier loci as those with empirical P-values <0.05 and <0.01. Using this approach, the Foynes population appeared as the starting site in all three transects, so correlation coefficients may have been disproportionately influenced by the allele frequency at Foynes. Therefore, we repeated the correlations, excluding Foynes from all three transects. Mean correlation coefficients and P-values were calculated as before. As the Foynes sample does still contain relevant information, we considered outliers to be those loci that appeared in the tails of the distributions of the correlations both with and without Foynes.
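The correlation-based ranking described above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the helper names are hypothetical, the allele frequencies are invented toy values, tied values receive average ranks, and a locus with constant frequency along a transect is assigned a correlation of zero. The transect distances are those listed in Table 1.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant vector counts as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def empirical_p_values(loci):
    """loci maps a locus name to a list of (distances, frequencies) per transect."""
    score = {}
    for name, transects in loci.items():
        rhos = [spearman(dist, freq) for dist, freq in transects]
        # Absolute value of the mean, so opposing clines cancel out.
        score[name] = abs(sum(rhos) / len(rhos))
    ranked = sorted(score, key=score.get, reverse=True)
    return {name: (rank + 1) / len(ranked) for rank, name in enumerate(ranked)}


north = [0, 45, 84, 124, 148]
northeast = [0, 48, 67, 106, 129]
east = [0, 28, 50, 87, 119, 146]

# Invented allele frequencies, for illustration only.
toy_loci = {
    "consistent_cline": [
        (north, [0.10, 0.20, 0.30, 0.40, 0.50]),
        (northeast, [0.10, 0.15, 0.30, 0.35, 0.60]),
        (east, [0.10, 0.12, 0.20, 0.30, 0.35, 0.40]),
    ],
    "opposing_clines": [
        (north, [0.10, 0.20, 0.30, 0.40, 0.50]),
        (northeast, [0.60, 0.35, 0.30, 0.15, 0.10]),
        (east, [0.10, 0.12, 0.20, 0.30, 0.35, 0.40]),
    ],
    "no_cline": [
        (north, [0.30] * 5),
        (northeast, [0.30] * 5),
        (east, [0.30] * 6),
    ],
}
p_values = empirical_p_values(toy_loci)
```

Taking the absolute value after averaging, rather than before, is what rewards loci whose clines run in the same direction in all three transects; repeating the calculation with the first (Foynes) entry dropped from each transect gives the second ranking used to define outliers.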

The second approach to identify outliers was to use the method of Coop et al. (2010), implemented in the software Bayenv. This approach estimates the covariance in allele frequencies between populations from a set of control loci. In our case, this was the set of 5713 nongenic SNPs. For each of the 5979 SNPs, a Bayes factor was then calculated for a model where an environmental variable has a linear effect on allele frequencies compared with a model given by the covariance matrix alone. The environmental variable of interest was the geographical distance from the point of introduction at Foynes. Each locus was binned according to the frequency of allele 1 (arbitrarily defined) over all populations into one of 10 bins with a frequency interval of 0.1. Within each frequency bin, loci were ranked by Bayes factors, and an empirical P-value was calculated as the rank divided by the number of loci in that bin. We then identified potential outlier loci as those with empirical P-values <0.05 and <0.01. Variance–covariance matrices were compared within and between independent runs of the program to ensure convergence.
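The frequency-binned ranking applied to the Bayenv output can be sketched as below. The Bayes factors themselves come from Bayenv and are not computed here; the function name `binned_empirical_p`, the input layout and the example values are hypothetical and serve only to show the binning and within-bin ranking.

```python
def binned_empirical_p(loci, n_bins=10):
    """loci maps a locus name to (overall frequency of allele 1, Bayes factor)."""
    bins = {}
    for name, (freq, bayes_factor) in loci.items():
        # Bins of width 1 / n_bins; a frequency of exactly 1.0 joins the top bin.
        index = min(int(freq * n_bins), n_bins - 1)
        bins.setdefault(index, []).append((bayes_factor, name))
    p_values = {}
    for members in bins.values():
        # Largest Bayes factor gets rank 1 within its frequency bin.
        members.sort(reverse=True)
        for rank, (_, name) in enumerate(members, start=1):
            p_values[name] = rank / len(members)
    return p_values


# Invented values, for illustration only.
example = {
    "locus_a": (0.12, 5.0),
    "locus_b": (0.15, 50.0),
    "locus_c": (0.18, 0.5),
    "locus_d": (0.55, 2.0),
}
binned_p = binned_empirical_p(example)
```

Ranking within bins rather than across all loci means that a locus is compared only with loci of similar overall allele frequency.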

Putative functions and Gene Ontology (GO) Biological Process terms were assigned to outlier loci using the UniProt Knowledgebase [‘The UniProt Consortium (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt)’] and PANTHER v7.2 (Thomas et al. 2008).

Neutral simulations

A modified version of SPLATCHE (Ray et al. 2010) was used to simulate neutral genetic diversity after a range expansion in the bank vole. Ireland was represented as a lattice of 1-km squares. Areas of land were defined as potential bank vole habitat, whereas areas of sea or lakes were defined as unsuitable. Simulated sample sites were arranged according to the same coordinates as our real sample sites and had the same sample sizes. The range expansion began at Foynes and progressed until all sample sites had been colonized. The forward demographic part of the SPLATCHE simulation records for each time step the population sizes in each deme and migration events between demes. Samples of genes were taken from each sample site, and SNP data were simulated using a discrete time coalescent model. For each demographic simulation, 5979 neutral SNP loci were simulated. Allele frequencies were calculated for each sample site, and these were correlated with distance from Foynes using Spearman rank correlation. As with the real genetic data, we calculated these correlations separately for each transect and took the absolute mean correlation across all three transects. We also calculated the correlations with and without the Foynes sample site. For both these approaches, we recorded the number of loci that had higher correlation coefficients than our observed outliers at the 5% and 1% thresholds.

The strength of allele frequency correlations at neutral loci will depend on the amount of genetic drift experienced by the population as it expands, which in turn will depend on the demographic model used. As this is unknown, we performed 1000 different demographic simulations, whose parameters (founding number of individuals, carrying capacity of each deme, growth rate per generation, migration rate, Allee effect severity and Allee effect scale; see Stephens & Sutherland 1999) were drawn from uniform distributions that we had previously found to generate close matches to the observed SNP data (T. A. White, unpublished data).

Deleterious SNPs

The program PolyPhen-2 (Ramensky & Sunyaev 2009) was used to predict the functional impact of each NS SNP on the translated protein. This approach is based on multiple alignments and biochemical and physical characteristics of the amino acid replacements. In cases where one of the alleles at a locus matched to a human or rodent reference genome, this allele was used as the reference, and the effect of changing this to the other allele was assessed using PolyPhen-2. The functional impact of each NS SNP was designated as ‘Benign’, ‘Possibly damaging’ or ‘Probably damaging’; we refer to the latter two classes as potentially deleterious SNPs. If neither allele matched to a reference sequence, the functional impact of the substitution was left unclassified. For SNPs classified as ‘Possibly damaging’ or ‘Probably damaging’, the frequency of the potentially deleterious allele was calculated for each population, and the relationship between these frequencies and geographical distance from Foynes was determined using Spearman rank correlation.

Results

Data quality and coverage

Illumina sequencing of 281 individuals on three lanes resulted in 786 834 622 reads. Of these, 676 763 709 reads contained a unique barcode and cut site remnant and contained no ‘N’s. These data were used in the UNEAK pipeline. UNEAK identified 60 417 biallelic SNP loci. However, many of these had low coverage or were only present in a small number of individuals. Over all of these loci, the mean coverage per locus per individual was 3.3× (max coverage per individual 252.7×, min 0.004×). When loci with more than 20% missing data were excluded, 6398 loci were retained, with a mean coverage of 16.9× (max coverage per individual 252.7×, min 7.2×). Discarding loci with observed heterozygosity >0.75 left 5979 loci, with a mean coverage of 16.56× (max coverage per individual 168.0×, min 7.2×). This last difference in coverage is consistent with the idea that paralogs should have high coverage and also high heterozygosity, as filtering them out reduces the mean coverage but especially the maximum coverage at a locus.

Locus classiﬁcation

Our BLAST approach identified 266 (4.4%) loci as ‘genic’, 245 of which had a match in the RefSeq database and 124 in the SwissProt/NR databases. Results from BLASTX also determined that 30 of the genic SNPs were NS.

Genetic diversity patterns

Genetic diversity declined significantly with distance from Foynes (Table S1 and Fig. 2). This was regardless of whether we used all SNPs, NS SNPs, genic (not NS) SNPs or nongenic SNPs and whether measured as $H_E$, $A$ or $A_{rich}$. The slope of the regressions of $H_E$ and $A_{rich}$ on distance was steeper for NS SNPs than for the other SNP locus classes, and the intercept of the regression was higher (i.e. there was greater diversity for NS SNPs at Foynes). However, ANCOVA revealed no significant effect of SNP locus class on the relationship between distance and diversity or on the levels of diversity at Foynes, regardless of which measure of diversity was used (results not shown). When the three transects are compared, it can be seen that the loss of diversity in the eastern transect appears to be less severe than in the northern and northeastern transects (Fig. 2).

The mean number of alleles per locus is 1.959 at Foynes; in the wave front populations, it is 1.874 at New Ross, 1.750 at Cloonfad and 1.715 at Ballynahown. However, when the wave front populations are pooled, the mean number of alleles is 1.951. So, it appears that the loss of diversity has been somewhat independent in the three transects, as different subsets of alleles have been lost in each.
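The reasoning behind the pooled comparison can be made concrete with a toy Python sketch. The data are invented and the function name `mean_alleles_per_locus` is hypothetical; the point is only that if each wave front population has lost a different allele, pooling them recovers nearly all alleles even though each population on its own shows reduced diversity.

```python
def mean_alleles_per_locus(populations):
    """Mean number of distinct alleles per locus after pooling the populations.

    Each population is a dict mapping a locus name to the set of alleles seen.
    """
    loci = set()
    for population in populations:
        loci.update(population.keys())
    counts = []
    for locus in loci:
        pooled = set()
        for population in populations:
            pooled |= population.get(locus, set())
        counts.append(len(pooled))
    return sum(counts) / len(counts)


# Invented example: each wave front population has fixed a different locus.
north = {"L1": {"A"}, "L2": {"A", "B"}, "L3": {"A", "B"}, "L4": {"A", "B"}}
northeast = {"L1": {"A", "B"}, "L2": {"B"}, "L3": {"A", "B"}, "L4": {"A", "B"}}
east = {"L1": {"A", "B"}, "L2": {"A", "B"}, "L3": {"A"}, "L4": {"A", "B"}}

single = mean_alleles_per_locus([north])
pooled = mean_alleles_per_locus([north, northeast, east])
```

Had the same alleles been lost in all three transects, the pooled value would have stayed close to the single-population values instead.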

Outlier loci

Using the Spearman rank correlation approach (including Foynes in each transect), 21 of the 266 genic SNPs, and 278 of the 5713 nongenic SNPs, had an empirical P-value <0.05. This represented a 1.6-fold enrichment of genic SNPs in the outliers (Fisher’s exact test, one-tailed P = 0.0245). Nine of the 266 genic SNPs, and 51 of the 5713 nongenic SNPs, had an empirical P-value <0.01. This represented a 3.8-fold enrichment of genic SNPs (Fisher’s exact test, one-tailed P = 0.0012).
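The enrichment figures above can be checked with a short Python sketch. It assumes that fold enrichment is the ratio of the outlier proportions in genic and nongenic SNPs, and that the one-tailed Fisher's exact test is the upper tail of the hypergeometric distribution; the function names are hypothetical, and the counts are those reported in the text.

```python
from math import comb


def fold_enrichment(genic_out, genic_total, nongenic_out, nongenic_total):
    """Ratio of the outlier proportion in genic SNPs to that in nongenic SNPs."""
    return (genic_out / genic_total) / (nongenic_out / nongenic_total)


def fisher_one_tailed(genic_out, genic_total, nongenic_out, nongenic_total):
    """Probability of at least genic_out genic SNPs among the outliers.

    Uses the hypergeometric distribution with all margins of the 2 x 2
    table held fixed.
    """
    total = genic_total + nongenic_total
    n_outliers = genic_out + nongenic_out
    upper = min(genic_total, n_outliers)
    numerator = sum(
        comb(genic_total, k) * comb(nongenic_total, n_outliers - k)
        for k in range(genic_out, upper + 1)
    )
    return numerator / comb(total, n_outliers)


fold_5pct = fold_enrichment(21, 266, 278, 5713)
p_5pct = fisher_one_tailed(21, 266, 278, 5713)
fold_1pct = fold_enrichment(9, 266, 51, 5713)
p_1pct = fisher_one_tailed(9, 266, 51, 5713)
print(f"5% threshold: {fold_5pct:.1f}-fold, one-tailed P = {p_5pct:.4f}")
print(f"1% threshold: {fold_1pct:.1f}-fold, one-tailed P = {p_1pct:.4f}")
```

The printed values can be compared with the reported 1.6-fold (P = 0.0245) and 3.8-fold (P = 0.0012) enrichments; small differences would point to a different definition of the test or of fold enrichment than the one assumed here.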

When Foynes was excluded from this analysis, 21 of

the 266 genic SNPs, and 278 of the 5713 nongenic SNPs,

had an empirical P-value <0.05. This represented a

1.6-fold enrichment of genic SNPs in the outliers

(Fisher’s exact test, one-tailed P=0.0245). Seven of the

266 genic SNPs, and 53 of the 5713 nongenic SNPs, had

an empirical P-value <0.01, a 2.8-fold enrichment of

genic SNPs (Fisher’s exact test, one-tailed P=0.0164).

One hundred and sixty-two SNPs lay in the top 5%

of the distribution of correlation coefﬁcients in both

correlations with and without Foynes. Of these, 12 were

genic SNPs, representing a not quite signiﬁcant 1.7-fold

enrichment of genic SNPs (Fisher’s exact test, one-tailed

P=0.0564). Thirty-ﬁve SNPs appeared in the top 1% in

both analyses. Of these, seven were genic, representing

a highly signiﬁcant 6-fold enrichment of genic SNPs

(Fisher’s exact test, one-tailed P=0.0004).
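The enrichment figures above can be reproduced approximately from the reported counts. The sketch below assumes that the one-tailed Fisher's exact test is the upper tail of a hypergeometric distribution on the 2 x 2 table of genic/nongenic by outlier/non-outlier. The function names are hypothetical, and the authors' exact software and settings are not stated in this section.

```python
from math import comb


def fold_enrichment(genic_out, genic_total, nongenic_out, nongenic_total):
    """Ratio of the outlier fraction among genic SNPs to that among nongenic SNPs."""
    return (genic_out / genic_total) / (nongenic_out / nongenic_total)


def fisher_one_tailed(genic_out, genic_total, nongenic_out, nongenic_total):
    """P(at least `genic_out` genic SNPs among the outliers) under random sampling."""
    total = genic_total + nongenic_total
    n_out = genic_out + nongenic_out
    max_genic_out = min(genic_total, n_out)
    favourable = sum(
        comb(genic_total, k) * comb(nongenic_total, n_out - k)
        for k in range(genic_out, max_genic_out + 1)
    )
    return favourable / comb(total, n_out)


# Counts reported in the text for empirical P < 0.01 (Foynes included).
fold = fold_enrichment(9, 266, 51, 5713)
p_value = fisher_one_tailed(9, 266, 51, 5713)
print(f"fold enrichment = {fold:.2f}, one-tailed P = {p_value:.4f}")
```

Exact integer arithmetic is used here to avoid floating-point underflow in the binomial coefficients; a statistics library would normally be the more convenient choice.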

Thus, using our Spearman rank correlation approach,

we ﬁnd a signiﬁcant enrichment of genic SNPs amongst

those SNPs with the strongest correlations between

allele frequency and distance from Foynes, the point of

introduction. This is consistent with adaptation during

the range expansion, as we expect the targets of selec-

tion to be either genes or regulatory regions in close

linkage with genes. The 12 genic loci that are common

to both Spearman rank correlation approaches are listed

in Table 2.
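As a rough illustration of the idea behind the correlation approach, the sketch below computes a Spearman rank correlation between distance and allele frequency in each transect and checks that the sign agrees across transects. It is a minimal sketch, not the authors' implementation: the distances and frequencies are invented, the helper names are hypothetical, and the empirical P-value step is omitted.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (inputs must not be constant)."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def consistent_cline(transects):
    """Per-transect rho values, and whether all of them share the same sign."""
    rhos = [spearman(dist, freq) for dist, freq in transects]
    same_sign = all(r > 0 for r in rhos) or all(r < 0 for r in rhos)
    return rhos, same_sign


# Hypothetical (distance in km, allele frequency) data; each transect starts at the source.
transects = [
    ([0, 30, 60, 90, 120], [0.20, 0.25, 0.40, 0.55, 0.70]),
    ([0, 25, 50, 100, 140], [0.20, 0.30, 0.28, 0.50, 0.65]),
    ([0, 40, 80, 150], [0.20, 0.35, 0.45, 0.60]),
]
rhos, same_sign = consistent_cline(transects)
print(rhos, same_sign)
```

Requiring the same sign in every transect is what distinguishes a replicated cline from a strong correlation that arises by drift in a single direction of expansion.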

Using Bayenv, 293 SNPs were identiﬁed as outliers

with P<0.05. Of these, 13 were genic SNPs. Fifty-four

SNPs were outliers with P<0.01, only one of which

was a genic SNP. There was no enrichment of genic

SNPs in either outlier set identiﬁed by Bayenv.

Forty-two SNPs were identiﬁed as outliers using both

our correlation approach and Bayenv, of which ﬁve

were genic SNPs. This represented a 2.9-fold

enrichment of genic SNPs in the outliers (Fisher’s exact

test, one-tailed P=0.0372).

A total of 20 genic outliers were identiﬁed using

either our correlation-based method or Bayenv. These

are listed in Table 2. Of these, 16 genes were assigned

GO terms under ‘biological processes’, of which four

genes had the GO term ‘immune system process’. In

the mouse genome, there are 24 935 genes that are

assigned biological process GO terms, of which 1421

have the GO term ‘immune system process’ (Eppig

et al. 2012). Therefore, assuming that a similar propor-

tion holds true for the bank vole, in our outliers there is

signiﬁcant enrichment for genes involved in immunity

(Fisher’s exact test, one-tailed P=0.0205).

Neutral simulations

For a range of reasonable demographic models for the

bank vole expansion, we found that, on average, the pro-

portion of simulated loci with more extreme correlation

coefﬁcients than our observed 0.05 and 0.01 thresholds

was 0.041 and 0.008 for correlations including Foynes,

and 0.04 and 0.009 for correlations excluding Foynes. In

our real data, the proportion of loci falling in the 5% tail

of both distributions was 0.027, whilst the proportion fall-

ing in the 1% tail of both distributions was 0.006. In the

simulated neutral data, these proportions were 0.021 and

0.004, respectively. These results suggest that our

observed data may contain more loci with extreme allele

frequency clines than expected under neutrality.

Fig. 2 Decline of genetic diversity with distance from the introduction site of the bank vole in Ireland. Measured as (a) mean expected heterozygosity (H_E), R² = 0.5345; P = 0.003, (b) mean alleles per locus (A), R² = 0.5005; P = 0.005, and (c) mean allelic richness (A_rich), R² = 0.4911; P = 0.003. Sites on the northern transect are marked with squares, those on the northeastern transect are marked with circles, and those on the eastern transect with triangles. Foynes, which is the introduction site and is on all three transects, is marked with a cross. [Figure not reproduced; the x-axis of each panel is distance from introduction site (km).]

Table 2 Outlier SNPs identified using our Spearman rank correlation approach and Bayenv. The column ‘outlier’ gives the method used to identify that SNP as an outlier. The next four columns give the accession numbers of the best matches and associated gene descriptions, in the mammalian RNA RefSeq database and the SwissProt/NR databases. ‘Type’ shows whether the SNP is synonymous (S) or nonsynonymous (NS) or whether it is located in a noncoding region of the gene (–). The final two columns give the functions or processes in which the genes are involved, according to the UniProt Knowledgebase, and Panther GO (Gene Ontology) classifications, respectively.

| SNP | Outlier* | Match in mammalian RNA RefSeq database | Description | Significant match in SwissProt/NR protein database | Description | Type | UniProt knowledgebase function/process | PANTHER GO biological process |
|---|---|---|---|---|---|---|---|---|
| mg8017 | 1,3 | XM_002753731.1 | Hypothetical protein LOC100361418 | P06323.1 (TVA3_MOUSE) | T-cell receptor alpha chain V | NS | Receptor activity | Immune system process |
| mg10984 | 1,3,5 | NM_133638.3 | ADAM metallopeptidase with thrombospondin type 1 motif, 19 (ADAMTS19) | NA | NA | – | Proteolysis | Signal transduction; Cell–cell adhesion; Proteolysis |
| mg39858 | 1,4 | NM_006946.2 | Spectrin, beta, nonerythrocytic 2 (SPTBN2) | NA | NA | – | Actin cytoskeleton organization; axon guidance | Cellular component morphogenesis |
| mg68377 | 1,3 | NM_181652.2 | Peroxiredoxin 5 (PRDX5) | NA | NA | – | Intracellular redox signalling | Immune system process; Oxygen and reactive oxygen species metabolic process |
| mg71009 | 1,3,5 | NM_014899.3 | Rho-related BTB domain containing 3 (RHOBTB3) | NA | NA | – | Retrograde transport, endosome to Golgi; Small GTPase-mediated signal transduction | G-protein coupled receptor protein signalling pathway |
| mg72604 | 1,3,5 | NM_001118890.1 | Glutaredoxin (thioltransferase) (GLRX) | NA | NA | – | Cell redox homoeostasis | Sulphur metabolic process |
| mg81865 | 1,4,5 | NM_017415.2 | Kelch-like 3 (KLHL3) | Q5REP9.1 (KLHL3_PONAB) | Kelch-like protein 3 | S | Protein ubiquitination | Neurological system process; Cellular component morphogenesis |
| mg96770 | 1,3 | NM_001160392.1 | tRNA phosphotransferase 1 (TRPT1) | NA | NA | – | tRNA processing | Nucleobase, nucleoside, nucleotide and nucleic acid metabolic process |
| mg123985 | 1,3 | NM_005956.3 | Methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 1 (MTHFD1) | P11586.3 (C1TC_HUMAN) | C-1-tetrahydrofolate synthase | NS | Folic acid metabolism; neural tube development | Purine base metabolic process; Cellular amino acid biosynthetic process |
| mg8197 | 2,4 | NM_025258.2 | Von Willebrand factor A domain containing 7 (VWA7) | NA | NA | – | Glycoprotein | – |
| mg17560 | 2,4 | NM_001004736.2 | Olfactory receptor, family 5, subfamily K, member 1 (OR5K1) | Q8NHB7.2 (OR5K1_HUMAN) | Olfactory receptor 5K1 | S | Olfaction; sensory transduction | – |
| mg83555 | 2,4,5 | NM_015125.3 | Capicua homolog (Drosophila) (CIC) | NA | NA | – | Central nervous system development | Regulation of transcription from RNA polymerase II promoter |
| mg13786 | 5 | NM_001031749.2 | LY6/PLAUR domain containing 5 (LYPD5) | NA | NA | – | – | – |
| mg24029 | 5 | XM_002752883.1 | Leukotriene A4 hydrolase, transcript variant 2 (LTA4H) | Q6S9C8.3 (LKHA4_CHILA) | Leukotriene A-4 hydrolase | S | Leukotriene biosynthesis; inflammatory response | Immune system process; Fatty acid biosynthetic process; Proteolysis |
| mg26799 | 5 | NM_001005217.1 | FSHD region gene 2 (FRG2) | ABB88900.1 | Oocyte-specific eukaryotic translation initiation factor 4E-like (Eif4e1b) | S | Protein biosynthesis | Translation |
| mg49438 | 5 | XM_001101962.2 | WNT5A wingless-type MMTV integration site family, member 5A | P22726.2 (WNT5B_MOUSE) | Protein Wnt-5b | S | Wnt signalling pathway | G-protein coupled receptor protein signalling pathway; Cell–cell signalling |
| mg57185 | 5 | NM_021226.2 | Rho GTPase activating protein 22 (ARHGAP22) | NA | NA | – | Positive regulation of GTPase activity; signal transduction | – |
| mg59899 | 5 | NM_001012426.1 | Forkhead box P4 (FOXP4) | NA | NA | – | Embryonic foregut morphogenesis; heart development; transcription, DNA dependent | Visual perception; Sensory perception; Cell cycle; Cell surface receptor linked signal transduction; Carbohydrate metabolic process; Regulation of transcription from RNA polymerase II promoter; Cellular component morphogenesis; Segment specification; Anterior/posterior axis specification; Ectoderm development; Mesoderm development; Embryonic development; Nervous system development |

Deleterious mutations

Ten NS SNPs were classed by PolyPhen-2 as ‘Possibly

damaging’ or ‘Probably damaging’ (Table S2, Support-

ing information). Of these loci, two showed signiﬁcant

negative correlations between the frequency of the dele-

terious allele and distance from the introduction site

(Fig. 3). These SNPs, mg8017 and mg134581, were

located in the T-cell receptor alpha V (TVA3) gene,

involved in antigen recognition, and the laminin

subunit alpha 2 (LAMA2) gene, respectively. Defects in

Lama2 are a cause of murine muscular dystrophy (Xu

et al. 1994). One SNP, mg123985, located in the C-1-

tetrahydrofolate synthase (C1TC) gene showed a signiﬁ-

cant positive correlation (Fig. 3). Mutations in this gene

may impair foetal growth in mice (Beaudin et al. 2012).

Discussion

Using the bank vole invasion of Ireland as our study

system, we found evidence for adaptation during the

range expansion, despite an overall loss of genetic

diversity due to strong genetic drift at the wave front.

This suggests that selection pressures during range

expansions may be very strong. This is one of the ﬁrst

studies to provide empirical genomic evidence for

adaptation to the process of range expansion in a wild

population.

Table 2 Continued

| SNP | Outlier* | Match in mammalian RNA RefSeq database | Description | Significant match in SwissProt/NR protein database | Description | Type | UniProt knowledgebase function/process | PANTHER GO biological process |
|---|---|---|---|---|---|---|---|---|
| mg87917 | 5 | XM_002817845.1 | CAP-GLY domain-containing linker protein 2 (CLIP2) | O55156.1 (CLIP2_RAT) | CAP-Gly domain-containing linker protein 2 | S | Control of brain-specific organelle translocations | Intracellular protein transport; Vesicle-mediated transport; Mitosis; Cellular component morphogenesis |
| mg122511 | 5 | NM_006162.3 | Nuclear factor of activated T cells (NFATC1) | NA | NA | – | Transcription regulation | Immune system process; Regulation of transcription from RNA polymerase II promoter; Mesoderm development; Cellular defence response |

*1 = significant correlation of allele frequency with distance with P<0.01 (Foynes included); 2 = significant correlation of allele frequency with distance with P<0.05 (Foynes included); 3 = significant correlation of allele frequency with distance with P<0.01 (Foynes excluded); 4 = significant correlation of allele frequency with distance with P<0.05 (Foynes excluded); 5 = outlier with P<0.05 in Bayenv analysis.

Fig. 3 Change in frequencies of deleterious alleles (identified using PolyPhen-2) with distance from the introduction site. Only loci with significant correlations are shown. The SNP mg8017 is shown with open circles, mg123985 with open squares, and mg134851 with crosses. [Figure not reproduced; x-axis: distance from introduction site (km); y-axis: frequency of deleterious allele.]

We found that the eastern transect shows the least

reduction in genetic diversity, whilst the northern and

northeastern transects show similar patterns of greater

loss (Fig. 2). In the east of the country, there are few

barriers to dispersal, whilst in the north and northeast,

the expanding population would have encountered

substantial barriers to dispersal, including the

River Shannon to the north and unsuitable bog habi-

tat in the northeast. However, diversity in the north-

ern and northeastern transects shows a monotonic

decline, suggesting that the difference between

transects is due to some continuously acting process

and not a one-off founder event, such as might be

caused by crossing a semi-permeable barrier to dis-

persal. These two transects may additionally experi-

ence reduced lateral dispersal along most of their

length, due to the close proximity to the River Shan-

non (Fig. 1). We might expect lateral dispersal to

inﬂuence the amount of genetic diversity lost or

retained in a particular transect, if different alleles are

found in different transects. This appears to be the

case, as when we pooled the three populations at the

wave front of the expansion, the mean number of

alleles was almost as high as at the point of introduc-

tion (Foynes, Fig. 1), showing that different alleles

had been lost (or preserved) in different transects.
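The contrast drawn above between a continuously acting process and a one-off founder event can be sketched with the standard drift expectation, in which each founder event of N diploid individuals retains a fraction 1 - 1/(2N) of expected heterozygosity. This is an illustrative simplification, not the authors' demographic model: it ignores migration and population growth between events, and the function names and parameter values are hypothetical.

```python
def serial_founder_heterozygosity(h0, founders, steps):
    """Expected heterozygosity after 0..steps successive founder events."""
    retained = 1 - 1 / (2 * founders)
    return [h0 * retained ** t for t in range(steps + 1)]


def single_bottleneck_heterozygosity(h0, founders, steps, barrier_step):
    """Expected heterozygosity when drift acts only once, at `barrier_step`."""
    retained = 1 - 1 / (2 * founders)
    return [h0 if t < barrier_step else h0 * retained for t in range(steps + 1)]


# Hypothetical values: starting heterozygosity 0.36, 10 founders per event, 5 steps.
serial = serial_founder_heterozygosity(0.36, 10, 5)
single = single_bottleneck_heterozygosity(0.36, 10, 5, barrier_step=2)
print([round(h, 3) for h in serial])
print([round(h, 3) for h in single])
```

Repeated founder events give a monotonic decline along the transect, whereas a single barrier crossing gives one step with flat diversity on either side.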

We can consider at least three different types of selec-

tion acting on individuals in a range expansion that

could produce consistent allele frequency clines

between transects. One is spatial sorting. Individuals

that are more likely to disperse, or that disperse long

distances, are more likely to be found towards the wave

front of the range expansion. If dispersal strategy has a

genetic component, breeding between highly dispersive

individuals at the wave front may lead to an increase in

dispersal over time in wave front populations, the so-called

“Olympic Village Effect” (Shine et al. 2011). If this

is the case, one would expect alleles contributing to a

highly dispersive phenotype to show a frequency cline,

increasing from the core to the wave front. Traditional

natural selection could generate such a pattern in two

ways. With positive selection at the expansion front,

individuals would disperse to a new habitat without

respect to genotype. In this new habitat, differential sur-

vival and/or fecundity would lead to changes in allele

frequency in the next generation. An alternative to this

model is relaxed selection at the expansion front, which

may be due to reduced density of conspeciﬁcs and

reduced parasite burdens in a deme. As time pro-

gresses, conspeciﬁc density and parasites in the deme

will increase, potentially leading to purifying selection

behind the expansion front. Both traditional selection

models will generate differences in allele frequency

between demes along the expansion axis. These types

of selection may be difﬁcult to separate empirically.

Indeed, they are not mutually exclusive and may act to

reinforce or oppose one another. Whilst we also expect

expanding populations to be under selection due to

external environmental variables, such as climate

(Hancock et al. 2011), we predict that selection due to

range expansion processes will be much stronger, par-

ticularly over such a small scale in Ireland where envi-

ronmental variation is limited.

The major challenge to date in identifying genes

involved in adaptation during range expansion has

been in separating the signals of selection from drift

and allele surﬁng (Hofer et al. 2009). Here, we make use

of replicated transects to identify loci showing signiﬁ-

cant allele frequency clines in the same direction in sev-

eral transects. Of course, this approach may fail to

detect some loci that are under selection, and some out-

liers may continue to reﬂect drift or allele surﬁng rather

than selection. However, our simulation modelling

showed that our data contained more extreme allele fre-

quency clines than expected under a neutral model,

suggesting the action of selection. In addition, the fact

that we observe an enrichment of genic vs. nongenic

SNPs in the outliers, and that this enrichment is stron-

ger when we consider the tail of the distribution with

P<0.01 vs. P<0.05, suggests that a majority of these

loci are good candidates for being under selection (Han-

cock et al. 2011).
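The comparison with neutral simulations mentioned above amounts to taking the correlation coefficient that cuts off a given tail of the observed distribution and asking what fraction of simulated neutral loci exceed it. The sketch below illustrates that logic only. The coefficients are invented, the helper names are hypothetical, and the demographic simulations themselves are not reproduced.

```python
def empirical_threshold(coefficients, tail):
    """Absolute coefficient that cuts off the top `tail` fraction of loci."""
    ordered = sorted((abs(c) for c in coefficients), reverse=True)
    n_tail = max(1, int(len(ordered) * tail))
    return ordered[n_tail - 1]


def tail_proportion(coefficients, threshold):
    """Fraction of loci whose absolute coefficient is at least `threshold`."""
    return sum(abs(c) >= threshold for c in coefficients) / len(coefficients)


# Hypothetical correlation coefficients for observed and simulated neutral loci.
observed = [0.9, -0.8, 0.7, 0.6, -0.5, 0.4, 0.3, -0.2, 0.1, 0.0]
simulated = [0.85, 0.6, -0.5, 0.4, -0.3, 0.3, 0.2, -0.1, 0.1, 0.0]

threshold = empirical_threshold(observed, tail=0.2)
neutral_fraction = tail_proportion(simulated, threshold)
print(threshold, neutral_fraction)
```

A simulated fraction below the nominal tail size, as with the 0.041 and 0.008 reported for the 0.05 and 0.01 thresholds, indicates that the observed data contain more extreme clines than the neutral model produces.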

Our simple outlier detection approach may work

better than Bayenv in this case. This is partly because

Bayenv computes only a Bayes factor for each SNP (com-

paring an environmental selection model to a null

model), meaning that only overall relationships can be

assessed, and information on the direction of relation-

ships in different transects is lost. A second reason is that

in a range expansion much of the population genetic var-

iation lies in the direction of the expansion. By removing

the average effect of the variance–covariance of allele fre-

quencies, Bayenv may also be removing signals of adap-

tation to expansion. Other studies that have used Bayenv

successfully have not considered environmental vari-

ables in parallel with the direction of range expansion

(Eckert et al. 2010; Hancock et al. 2010b, 2011; Chen et al.

2012), and none have considered such a recent range

expansion as the one studied here.

It is predicted that during a range expansion, individ-

uals should experience selection for increased dispersal

(most likely due to spatial sorting; Burton et al. 2010;

Shine et al. 2011) and positive selection for reproduction

early and often at the expansion front (including rapid

growth and maturation; Moreau et al. 2011). They

should also experience relaxed selection on intraspeciﬁc

competition. It is likely that very many genes inﬂuence

these traits (but see Haag et al. (2005) and Matthews &

Butler (2011)), and it is difﬁcult to make predictions

about the classes of genes that should appear as

outliers. For mammals, we might predict that changes

in dispersal, reproduction and competition might be

mediated via behavioural changes, particularly with

regard to how individuals interact with conspeciﬁcs. As

the bank vole in Ireland has experienced strong bottle-

necking and genetic drift during its introduction and

subsequent expansion, we expect strong linkage dis-

equilibrium between different regions of the genome.

Therefore, the outliers we identify may not be the tar-

gets of selection themselves, but merely linked to

regions under selection. Nevertheless, it is interesting

that many of the outliers we identiﬁed may be involved

in sensory perception and neural development (mg39858, mg81865, mg123985, mg17560, mg83555, mg59899, mg87917; see Table 2), although no GO terms

relating to these functions had more than one outlier

assigned to them. Other interesting outliers include

mg10984 [encoding ADAMTS19, a gene involved in sex-

ual differentiation and expressed predominantly in the

foetal ovary (Menke & Page 2002)] and mg26799

(encoding EIF4E1B, an oocyte-specific translation initiation

factor). These genes are potential candidates for

adaptation relating to differential investment in repro-

duction at the expansion front.

It should be easier to assign a mechanistic basis to

outlier genes involved in the immune response. At the

expansion front, we might expect both an increased

need to invest in other traits and a reduced need

to invest in immunity if parasites lag behind their hosts.

Here, hosts should divert fewer resources to

maintaining their immune systems (White & Perkins

2012). Reduced investments may be targeted at particu-

lar aspects of the immune response, due to different

development and use costs (Lee 2006). For example, if

selection at the expansion front favours rapid growth,

trade-offs may lead to reduced investment in immune

components with particularly high development costs

(van der Most et al. 2011), such as induced cell-medi-

ated and antibody responses (Tschirren et al. 2003). The

distribution of parasites (helminths and ectoparasites)

changes markedly along the axis of the bank vole range

expansion (S. E. Perkins, unpublished data), and so it is

interesting to see differential selection pressure on

immune response reﬂected at the genetic level. As a

class, we found that immune system genes were signiﬁ-

cantly enriched amongst the outliers. Genic outliers

involved in immunity are mg8017 (a T-cell receptor),

mg68377 (PRDX5, involved in peroxisome signalling),

mg24029 (LKHA4, involved in the inﬂammatory

response) and mg122511 (NFATC1, which plays a role

in the inducible expression of cytokine genes in T cells,

especially in the induction of IL-2 and IL-4 gene tran-

scription; Table 2). Of course, there is a need to validate

outliers and candidate loci through functional assays,

association studies and quantitative genetic dissection.

This will not only ﬁlter out potential false positive out-

liers, but will also lead to a greater understanding of

the mechanistic underpinnings of adaptation to range

expansion.

In this study, ten NS SNPs were identiﬁed where one

of the alleles was predicted to have a damaging effect

on the ﬁnal protein product. Two of these damaging

alleles had a signiﬁcant decline in frequency with dis-

tance from the point of introduction, whilst one had a

signiﬁcant increase in frequency. There is therefore no

evidence for extensive ﬁxation or frequency increase in

deleterious alleles during the range expansion, although

this may be due to the small sample of NS SNPs with

predicted functional effects. Lohmueller et al. (2008)

found that non-African human populations had signiﬁ-

cantly more deleterious mutations than Africans, a pat-

tern they interpreted as being due to founder events,

genetic drift and allele surﬁng as humans moved out of

Africa. In a simulation study, Travis et al. (2007) found

that deleterious alleles arising at the edge of an expan-

sion were much more likely to persist than if they had

arisen in a stationary population. However, this result

depended strongly on the growth (r), carrying capacity

and dispersal (m) parameters used in the simulations. A

high r value increases the chances that a deleterious

mutation will surf, whilst at higher m values, deleterious

alleles were less likely, and beneficial mutations

more likely, to surf. For small mammals, the costs of

dispersal are such that they should only disperse as far

as the nearest suitable unoccupied space. Emigration

may largely be driven by positive density-dependent

dispersal and agonistic behaviour from conspeciﬁcs

(Matthysen 2005; Hahne et al. 2011; Le Galliard et al.

2012), which is supported by our previous analysis of

the bank vole range expansion in Ireland (White et al.

2012). Positive density-dependent dispersal should tend

to reduce the rate of range expansion and minimize the

effect of genetic drift in demes at the expansion front.

Moreover, the simulation results of Travis et al. (2007)

were based on novel mutations arising near the expan-

sion front and did not consider standing variation. In

an introduced population such as the one considered

here, standing deleterious alleles may have increased in

frequency at the introduction site due to drift during

the initial bottleneck. Thereafter, selection by spatial

sorting may result in a kind of ‘spatial purging’. One

might expect that mutations having negative effects on

reproduction or dispersal might tend to be left behind

during a range expansion. Indeed, Travis et al. (2007)

found that mutations with a negative impact on fertility

were much less likely to surf than those with negative

effects on survival. The difference in deleterious allele

frequency between the expansion front and older estab-

lished populations may in general be less pronounced

than suggested by the ﬁndings of Lohmueller et al.

(2008).

This study used a genome-wide approach to track

changes in genetic diversity across a well-characterized

range expansion. Using both functional and neutral loci,

we found that the introduced bank vole population in

Ireland has lost a substantial proportion of its diversity

during the expansion. Due to changes in diversity along

the axis of expansion and the potential for allele surf-

ing, traditional outlier approaches to detect loci under

selection are likely to return many false positives. Here,

we introduced a new test to detect loci under direc-

tional selection during the expansion. Using a correla-

tion-based approach, we identiﬁed a number of genes

under selection during the range expansion. It appears

that the bank vole has been able to respond adaptively

to the range expansion in spite of the general loss of

genetic diversity. However, there is no evidence that

populations at the expansion front carry more deleteri-

ous mutations than those at the range core, and this

may be because spatial purging is also important in

removing deleterious alleles as the population expands

its range. This is of relevance to many other species

expanding their ranges, for example due to climate

change, as it suggests that ﬁtness does not necessarily

decline towards the wave front of the expansion.

The bank vole in Ireland represents an excellent sys-

tem with which to test hypotheses associated with

range expansions. The range is continuing to expand

without any human interference, and the history of the

expansion has been well characterized demographically

(White et al. 2012). In Ireland, the bank vole is expand-

ing into a landscape with relatively minor environmen-

tal perturbations, as shown by the consistent and

similar declines in genetic diversity across all three

transects. As the bank vole is amenable to laboratory

breeding and manipulation, the system also offers the

possibility to study the mechanics of an invasion/range

expansion of a small mammal experimentally.

To date, much work in population genetics and

genomics has used analytical models developed for

populations at approximate equilibrium. As many, if

not most, species have undergone recent range expan-

sions, we believe that it is of general relevance to con-

sider whether range expansions could have inﬂuenced

the genetic variation seen in any particular study sys-

tem, and use statistical models and simulations appro-

priate to such cases.

Acknowledgements

This research was supported by a Marie Curie FP7-PEOPLE-

2009-IOF and a Marie Curie FP7-PEOPLE-2009-IEF within the

7th European Community Framework Programme. TW was

also supported by a Heredity Fieldwork Grant from The Genet-

ics Society and a Percy Sladen Memorial Fund Grant from the

Linnean Society. GH acknowledges support from Swiss

National Science Foundation grant 31003A_127377/1. Colin

Lawton, Michael Field-May, Sam Grathoff, Libby Nixon, Nia

Thomas and Sophie Watson assisted in the collection of speci-

mens. The authors would like to thank Rob Elshire, Sharon

Mitchell and Charlotte Acharya in the Buckler lab at Cornell for

help with genotyping-by-sequencing, Rodrigo Vega for help

with laboratory work, Robert Bukowski at the Cornell Compu-

tational Biology Service Unit for bioinformatics advice and

Laurent Excofﬁer for access to computing facilities. The editor

and reviewers provided helpful comments and suggestions.

References

Altschul SF, Madden TL, Schäffer AA et al. (1997) Gapped

BLAST and PSI-BLAST: a new generation of protein data-

base search programs. Nucleic Acids Research,25, 3389–3402.

Barret RD, Schluter D (2008) Adaptation from standing varia-

tion. Trends in Ecology and Evolution,23,38–44.

Beaudin AE, Perry CA, Stabler SP, Allen RH, Stover PJ (2012)

Maternal Mthfd1 disruption impairs fetal growth but does

not cause neural tube defects in mice. American Journal of

Clinical Nutrition,95, 882–891.

Besold J, Schmitt T, Tammaru T, Cassel-Lundhagen A (2008)

Strong genetic impoverishment from the centre of distribu-

tion in southern Europe to peripheral Baltic and isolated

Scandinavian populations of the pearly heath butterﬂy. Jour-

nal of Biogeography,35, 2090–2101.

Biek R, Henderson JC, Waller LA, Rupprecht CE, Real LA

(2007) A high-resolution genetic signature of demo-

graphic and spatial expansion in epizootic rabies virus.

Proceedings of the National Academy of Sciences USA,104,

7993–7998.

Bossdorf O, Auge H, Lafuma L et al. (2005) Phenotypic and

genetic differentiation between native and introduced plant

populations. Oecologia,144,1–11.

Bradbury PJ, Zhang Z, Kroon DE et al. (2007) TASSEL: soft-

ware for association mapping of complex traits in diverse

samples. Bioinformatics,23, 2633–2635.

Buckley J, Butlin RK, Bridle JR (2012) Evidence for evolution-

ary change associated with the recent range expansion of the

British butterﬂy, Aricia agestis, in response to climate change.

Molecular Ecology,21, 267–280.

Burton OJ, Phillips BL, Travis JMJ (2010) Trade-offs and the

evolution of life-histories during range expansion. Ecology

Letters,13, 1210–1220.

Chen J, Källman T, Ma X et al. (2012) Disentangling the roles

of history and local selection in shaping clinal variation of

allele frequencies and gene expression in Norway Spruce

(Picea abies). Genetics,191, 865–881.

Claassens AJM, O’Gorman F (1965) The bank vole Clethrionomys glareolus

Schreber – a mammal new to Ireland. Nature,205,923–924.

Coop G, Witonsky D, Di Rienzo A, Pritchard JK (2010) Using

environmental correlations to identify loci underlying local

adaptation. Genetics,185, 1411–1423.

Cwynar LC, MacDonald GM (1987) Geographical variation of

lodgepole pine in relation to population history. American

Naturalist,129, 463–469.

Eckert AJ, Bower AD, González-Martínez SC et al. (2010) Back

to nature: ecological genomics of loblolly pine (Pinus taeda,

Pinaceae). Molecular Ecology,19, 3789–3805.

Edmonds CA, Lillie AS, Cavalli-Sforza LL (2004) Mutations

arising in the wave front of an expanding population. Pro-

ceedings of the National Academy of Sciences USA,101, 975–979.

Elshire RJ, Glaubitz JC, Sun Q et al. (2011) A robust, simple

genotyping-by-sequencing (GBS) approach for high diversity

species. PLoS ONE,6, e19379.

Eppig JT, Blake JA, Bult CJ, Kadin JA, Richardson JE; the

Mouse Genome Database Group (2012) The Mouse Genome

Database (MGD): comprehensive resource for genetics and

genomics of the laboratory mouse. Nucleic Acids Research,40,

D881–D886.

Estoup A, Beaumont M, Sennedot F, Moritz C, Cornuet JM

(2004) Genetic analysis of complex demographic scenarios:

spatially expanding populations of the cane toad, Bufo marinus.

Evolution,58, 2021–2036.

Excofﬁer L, Lischer HEL (2010) Arlequin suite ver. 3.5: a new ser-

ies of programs to perform population genetics analyses under

Linux and Windows. Molecular Ecology Resources,10, 564–567.

Excofﬁer L, Ray N (2008) Surﬁng during population expan-

sions promotes genetic revolutions and structuration. Trends

in Ecology and Evolution,23, 347–351.

Excofﬁer L, Foll M, Petit RJ (2009) Genetic consequences of

range expansions. Annual Review of Ecology, Evolution, and

Systematics,40, 481–501.

Fagundes NJR, Ray N, Beaumont M et al. (2007) Statistical eval-

uation of alternative models of human evolution. Proceedings

of the National Academy of Sciences USA,104, 17614–17619.

Fairley JS (1971) Malareus penicilliger mustelae: a flea new to Ireland. Entomologist’s Monthly Magazine, 107, 44.

Gautier M, Gharbi K, Cezard T et al. (2013) The effect of RAD allele dropout on the estimation of genetic variation within and between populations. Molecular Ecology, 22, 3165–3178.

Haag CR, Saastamoinen M, Marden JH, Hanski I (2005) A candidate locus for variation in dispersal rate in a butterfly metapopulation. Proceedings of the Royal Society of London. Series B, Biological Sciences, 272, 2449–2456.

Hahne J, Jenkins T, Halle S, Heckel G (2011) Establishment success and resulting fitness consequences for vole dispersers. Oikos, 120, 95–105.

Hallatschek O, Hersen P, Ramanathan S, Nelson DR (2007) Genetic drift at expanding frontiers promotes gene segregation. Proceedings of the National Academy of Sciences USA, 104, 19926–19930.

Hancock AM, Alkorta-Aranburu G, Witonsky DB, Di Rienzo A (2010a) Adaptations to new environments in humans: the role of subtle allele frequency shifts. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 365, 2459–2468.

Hancock AM, Witonsky DB, Ehler E et al. (2010b) Human adaptations to diet, subsistence, and ecoregion are due to subtle shifts in allele frequency. Proceedings of the National Academy of Sciences USA, 107, 8924–8930.

Hancock AM, Witonsky DB, Alkorta-Aranburu G et al. (2011) Adaptations to climate-mediated selective pressures in humans. PLoS Genetics, 7, e1001375.

Handley LJL, Manica A, Goudet J, Balloux F (2007) Going the distance: human population genetics in a clinal world. Trends in Genetics, 23, 432–439.

Heckel G, Burri R, Fink S, Desmet J-F, Excoffier L (2005) Genetic structure and colonization processes in European populations of the common vole Microtus arvalis. Evolution, 59, 2231–2242.

Hewitt G (2000) The genetic legacy of the Quaternary ice ages. Nature, 405, 907–913.

Hofer T, Ray N, Wegmann D, Excoffier L (2009) Large allele frequency differences between human continental groups are more likely to have occurred by drift during range expansions than by selection. Annals of Human Genetics, 73, 95–108.

Hohenlohe PA, Bassham S, Etter PD et al. (2010) Population genomics of parallel adaptation in threespine stickleback using sequenced RAD tags. PLoS Genetics, 6, e1000862.

Hughes CL, Dytham C, Hill JK (2007) Modelling and analysing evolution of dispersal in populations at expanding range boundaries. Ecological Entomology, 32, 437–445.

Kalinowski ST (2005) HP-RARE 1.0: a computer program for performing rarefaction on measures of allelic richness. Molecular Ecology Notes, 5, 187–189.

Klopfstein S, Currat M, Excoffier L (2006) The fate of mutations surfing on the wave of a range expansion. Molecular Biology and Evolution, 23, 482–490.

Kolbe JJ, Glor RE, Rodríguez Schettino L et al. (2004) Genetic variation increases during biological invasion by a Cuban lizard. Nature, 431, 177–181.

Le Galliard J-F, Rémy A, Ims RA, Lambin X (2012) Patterns and processes of dispersal behaviour in arvicoline rodents. Molecular Ecology, 21, 505–523.

Lee KA (2006) Linking immune defenses and life history at the levels of the individual and the species. Integrative and Comparative Biology, 46, 1000–1015.

Lohmueller KE, Indap AR, Schmidt S et al. (2008) Proportionally more deleterious genetic variation in European than in African populations. Nature, 451, 994–997.

Lubina JA, Levin SA (1988) The spread of a reinvading species: range expansion in the California sea otter. American Naturalist, 131, 526–543.

Lynch M (2009) Estimation of allele frequencies from high-coverage genome-sequencing projects. Genetics, 182, 295–301.

Marshall LG, Webb SD, Sepkoski JJ, Raup DM (1982) Mammalian evolution and the great American interchange. Science, 215, 1351–1357.

Matthews LJ, Butler PM (2011) Novelty-seeking DRD4 polymorphisms are associated with human migration distance out-of-Africa after controlling for neutral population gene structure. American Journal of Physical Anthropology, 145, 382–389.

Matthysen E (2005) Density-dependent dispersal in birds and mammals. Ecography, 28, 403–416.

Menke DB, Page DC (2002) Sexually dimorphic gene expression in the developing mouse gonad. Gene Expression Patterns, 2, 359–367.

Monty A, Mahy G (2010) Evolution of dispersal traits along an invasion route in the wind-dispersed Senecio inaequidens (Asteraceae). Oikos, 119, 1563–1570.

Moreau C, Bherer C, Vezina H et al. (2011) Deep human genealogies reveal a selective advantage to be on an expanding wave front. Science, 334, 1148–1150.

van der Most PJ, de Jong B, Parmentier HK, Verhulst S (2011) Trade-off between growth and immune function: a meta-analysis of selection experiments. Functional Ecology, 25, 74–80.

Novembre J, Han E (2012) Human population structure and the adaptive response to pathogen-induced selection pressures. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 367, 878–886.

Parisod C, Bonvin G (2008) Fine-scale genetic structure and marginal processes in an expanding population of Biscutella laevigata L. (Brassicaceae). Heredity, 101, 536–542.

Parmesan C, Yohe G (2003) A globally coherent fingerprint of climate change impacts across natural systems. Nature, 421, 37–42.

Phillips BL, Brown GP, Webb JK, Shine R (2006) Invasion and the evolution of speed in toads. Nature, 439, 803.

Phillips BL, Kelehear C, Pizzatto L et al. (2010) Parasites and pathogens lag behind their host during periods of host range advance. Ecology, 91, 872–881.

Prugnolle F, Manica A, Charpentier M et al. (2005) Pathogen-driven selection and worldwide HLA class I diversity. Current Biology, 15, 1022–1027.

R Core Team (2012) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0, http://www.R-project.org/.

Ramensky VE, Sunyaev SR (2009) Computational analysis of human genome polymorphism. Molecular Biology, 43, 260–268.

Ray N, Currat M, Foll M, Excoffier L (2010) SPLATCHE2: a spatially-explicit simulation framework for complex demography, genetic admixture and recombination. Bioinformatics, 26, 2993–2994.

Ryan A, Duke E, Fairley JS (1996) Mitochondrial DNA in bank voles Clethrionomys glareolus in Ireland: evidence for a small founder population and localized founder effects. Acta Theriologica, 41, 45–50.

Shine R, Brown GP, Phillips BL (2011) An evolutionary process that assembles phenotypes through space rather than through time. Proceedings of the National Academy of Sciences USA, 108, 5708–5711.

Simmons AD, Thomas CD (2004) Changes in dispersal during species’ range expansions. American Naturalist, 164, 378–395.

Slatkin M, Excoffier L (2012) Serial founder effects during range expansion: a spatial analog of genetic drift. Genetics, 191, 171–181.

Stephens PA, Sutherland WJ (1999) Consequences of the Allee effect for behaviour, ecology and conservation. Trends in Ecology and Evolution, 14, 401–405.

Stuart P, Mirmin L, Cross TF et al. (2007) The origin of Irish bank voles Clethrionomys glareolus assessed by mitochondrial DNA analysis. Irish Naturalists’ Journal, 28, 440–446.

Thomas PD, Campbell MJ, Kejariwal A et al. (2008) PANTHER: a library of protein families and subfamilies indexed by function. Genome Research, 13, 2129–2141.

Travis JMJ, Dytham C (2002) Dispersal evolution during invasions. Evolutionary Ecology Research, 4, 1119–1129.

Travis JMJ, Munkemuller T, Burton OJ et al. (2007) Deleterious mutations can surf to high densities on the wave front of an expanding population. Molecular Biology and Evolution, 24, 2334–2343.

Travis JMJ, Mustin K, Benton TG, Dytham C (2009) Accelerating invasion rates result from the evolution of density-dependent dispersal. Journal of Theoretical Biology, 259, 151–158.

Tschirren B, Fitze PS, Richner H (2003) Sexual dimorphism in susceptibility to parasites and cell-mediated immunity in great tit nestlings. Journal of Animal Ecology, 72, 839–845.

Tsutsui ND, Suarez AV, Holway DA, Case TJ (2000) Reduced genetic variation and the success of an invasive species. Proceedings of the National Academy of Sciences USA, 97, 5948–5953.

UniProt Consortium (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Research, 40, D71–D75.

Velo-Antón G, Rodríguez D, Savage AE et al. (2012) Amphibian-killing fungus loses genetic diversity as it spreads across the New World. Biological Conservation, 146, 213–218.

Waters JM, Fraser CI, Hewitt GM (2012) Founder takes all: density-dependent processes structure biodiversity. Trends in Ecology and Evolution, 28, 78–85.

White TA, Perkins SE (2012) The ecoimmunology of invasive species. Functional Ecology, 26, 1313–1323.

White TA, Lundy MG, Montgomery WI et al. (2012) Range expansion in an invasive small mammal: influence of life-history and habitat quality. Biological Invasions, 14, 2203–2215.

White TA, Perkins SE, Heckel G, Searle JB (2013) Data from: adaptive evolution during an ongoing range expansion: the invasive bank vole (Myodes glareolus) in Ireland. Dryad Digital Repository. doi:10.5061/dryad.fb782.

Xu H, Wu XR, Wewer UM, Engvall E (1994) Murine muscular dystrophy caused by a mutation in the laminin alpha 2 (Lama2) gene. Nature Genetics, 8, 297–302.

Yang W-Y, Novembre J, Eskin E, Halperin E (2012) A model-based approach for analysis of spatial structure in genetic data. Nature Genetics, 44, 725–731.

Zayed A, Whitfield CW (2008) A genome-wide signature of positive selection in ancient and recent invasive expansions of the honey bee Apis mellifera. Proceedings of the National Academy of Sciences USA, 105, 3421–3426.

T.A.W., G.H. and J.B.S. designed and planned the study. T.A.W. and S.E.P. carried out the fieldwork in Ireland. T.A.W. carried out the analyses. T.A.W., S.E.P., G.H. and J.B.S. wrote the manuscript.

Data accessibility

Genotype data are available via Dryad doi:10.5061/dryad.fb782 (White et al. 2013). Illumina reads are available from the Sequence Read Archive accession SRP020629.

Supporting information

Additional supporting information may be found in the online version of this article.

Table S1 Results of linear regression of genetic diversity on distance from the introduction site at Foynes.

Table S2 Potentially deleterious alleles identified by PolyPhen-2 and correlations with distance from the introduction site.

Reduced genetic variation and the success of an invasive species

Article

Full-text available

May 2000

Reorganizing the protein space at the Universal Protein Resource (UniProt)The UniProt ConsortiumNucleic Acids Res201140D1D71D75324512022102590

Data

Full-text available

Nov 2012

Philippe Le Mercier

The mission of UniProt is to support biological research by providing a freely accessible, stable, comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and querying interfaces. UniProt is comprised of four major components, each optimized for different uses: the UniProt Archive, the UniProt Knowledgebase, the UniProt Reference Clusters and the UniProt Metagenomic and Environmental Sequence Database. A key development at UniProt is the provision of complete, reference and representative proteomes. UniProt is updated and distributed every 4 weeks and can be accessed online for searches or download at http://www.uniprot.org.

R: A Language and Environment for Statistical Computing

Book

Jan 2017

Core R Team

Reorganizing the protein space at the Universal Protein Resource (UniProt)

Article

Jan 2012

Philippe Le Mercier

PANTHER: a library of protein families and subfamilies indexed by function

Article

Jan 2003

... The PANTHER database (http:// panther .celera.com) was designed as a resource to comprehensively and consistently treat both family and subfamily classification of proteins, focused on metazoans but also covering other organisms. Rationale. ...

Computational analysis of human genome polymorphism

Article

Mar 2009
Mol Biol

Whereas the genome-era technologies have produced the sequence of complete human genome, the modern post-genome technologies aim at the understanding of mechanisms of processing of genetic information and elucidation of within-species variation. Single nucleotide polymorphisms (SNPs) comprise the majority of polymorphism in the human population. Non-synonymous coding SNPs together with SNPs in regulatory regions are believed to have the highest impact on complex disease etiology, quantitative traits and response to drug treatment. PolyPhen is a computational tool for prediction of putatively functional nsSNPs with application areas such as genetics of complex disease, birth defects, identification of functional mutations in model organisms and evolutionary genetics.

Genetic analysis of complex demographic scenarios: Spatially expanding populations of the cane toad, Bufo marinus

Article

Sep 2004
EVOLUTION

Inferring the spatial expansion dynamics of invading species from molecular data is notoriously difficult due to the complexity of the processes involved. For these demographic scenarios, genetic data obtained from highly variable markers may be profitably combined with specific sampling schemes and information from other sources using a Bayesian approach. The geographic range of the introduced toad Bufo marinus is still expanding in eastern and northern Australia, in each case from isolates established around 1960. A large amount of demographic and historical information is available on both expansion areas. In each area, samples were collected along a transect representing populations of different ages and genotyped at 10 microsatellite loci. Five demographic models of expansion, differing in the dispersal pattern for migrants and founders and in the number of founders, were considered. Because the demographic history is complex, we used an approximate Bayesian method, based on a rejection-regression algorithm, to formally test the relative likelihoods of the five models of expansion and to infer demographic parameters. A stepwise migration-foundation model with founder events was statistically better supported than the other four models in both expansion areas. Posterior distributions supported different dynamics of expansion in the studied areas. Populations in the eastern expansion area have a lower stable effective population size and have been founded by a smaller number of individuals than those in the northern expansion area. Once demographically stabilized, populations exchange a substantial number of effective migrants per generation in both expansion areas, and such exchanges are larger in northern than in eastern Australia. The effective number of migrants appears to be considerably lower than that of founders in both expansion areas. We found our inferences to be relatively robust to various assumptions on marker, demographic, and historical features.
The method presented here is the only robust, model-based method available so far, which allows inferring complex population dynamics over a short time scale. It also provides the basis for investigating the interplay between population dynamics, drift, and selection in invasive species.

R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria

Technical Report

Jan 2012

R Core Team

A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species

Article

Jan 2011
PLOS ONE

Advances in next generation technologies have driven the costs of DNA sequencing down to the point that genotyping-by-sequencing (GBS) is now feasible for high diversity, large genome species. Here, we report a procedure for constructing GBS libraries based on reducing ...

The Universal Protein Resource (UniProt) 2009

Data

Nov 2009
NUCLEIC ACIDS RES

Philippe Le Mercier

The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information that is essential for modern biological research. UniProt is produced by the UniProt Consortium which consists of groups from the European Bioinformatics Institute, the Protein Information Resource and the Swiss Institute of Bioinformatics. The core activities include manual curation of protein sequences assisted by computational analysis, sequence archiving, a user-friendly UniProt website and the provision of additional value-added information through cross-references to other databases. UniProt is comprised of four major components, each optimized for different uses: the UniProt Archive, the UniProt Knowledgebase, the UniProt Reference Clusters and the UniProt Metagenomic and Environmental Sequence Database. One of the key achievements of the UniProt consortium in 2008 is the completion of the first draft of the complete human proteome in UniProtKB/Swiss-Prot. This manually annotated representation of all currently known human protein-coding genes was made available in UniProt release 14.0 with 20 325 entries. UniProt is updated and distributed every three weeks and can be accessed online for searches or downloaded at www.uniprot.org.
