
An Ancestral Balanced Inversion Polymorphism Confers Global Adaptation

Abstract

Since the pioneering work of Dobzhansky in the 1930s and 1940s, many chromosomal inversions have been identified but how they contribute to adaptation remains poorly understood. In Drosophila melanogaster, the widespread inversion polymorphism In(3R)Payne underpins latitudinal clines in fitness traits on multiple continents. Here, we use single-individual whole-genome sequencing, transcriptomics and published sequencing data to study the population genomics of this inversion on four continents: in its ancestral African range and in derived populations in Europe, North America, and Australia. Our results confirm that this inversion originated in sub-Saharan Africa and subsequently became cosmopolitan; we observe marked monophyletic divergence of inverted and non-inverted karyotypes, with some substructure among inverted chromosomes between continents. Despite divergent evolution of this inversion since its out-of-Africa migration, derived non-African populations exhibit similar patterns of long-range linkage disequilibrium between the inversion breakpoints and major peaks of divergence in its center, consistent with balancing selection and suggesting that the inversion harbors alleles that are maintained by selection on several continents. Using RNA-seq we identify overlap between inversion-linked SNPs and loci that are differentially expressed between inverted and non-inverted chromosomes. Expression levels are higher for inverted chromosomes at low temperature, suggesting loss of buffering or compensatory plasticity and consistent with higher inversion frequency in warm climates. Our results suggest that this ancestrally tropical balanced polymorphism spread around the world and became latitudinally assorted along similar but independent climatic gradients, always being frequent in subtropical/tropical areas but rare or absent in temperate climates.
© The Author(s) 2023. Published by Oxford University Press on behalf of Society for Molecular Biology and
Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License
(https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in
any medium, provided the original work is properly cited.
Martin Kapun1,2,3,4,*, Esra Durmaz Mitchell1,2,5, Tadeusz J. Kawecki1, Paul Schmidt6, and Thomas Flatt1,2,*

1 Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
2 Department of Biology, University of Fribourg, Fribourg, Switzerland
3 Division of Cell and Developmental Biology, Medical University of Vienna, Vienna, Austria
4 Natural History Museum Vienna, A-1010 Vienna, Austria
5 Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
6 Department of Biology, University of Pennsylvania, Philadelphia, USA

ORCID IDs:
MK: 0000-0002-3810-0504
ED: 0000-0002-4345-2264
TJK: 0000-0002-9244-1991
PS: 0000-0002-8076-6705
TF: 0000-0002-5990-1503

*Co-corresponding authors: Martin.Kapun@nhm-wien.ac.at; thomas.flatt@unifr.ch

Running title: Genomics of a Balanced Inversion Polymorphism

ACCEPTED MANUSCRIPT
Downloaded from https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msad118/7176213 by guest on 24 May 2023
Keywords: inversion, balanced polymorphism, balancing selection, clines, adaptation

Introduction

Chromosomal inversions are structural mutations that cause the gene order of a chromosomal segment to be reversed (Sturtevant 1917, 1919, 1921). Because inversions suppress crossing over (but not gene conversion events) in the heterozygous state, they can create an effective barrier to genetic exchange (‘gene flux’) between inverted and non-inverted (‘standard’) chromosomes (Rozas and Aguadé 1994; Navarro et al. 1997; Griffiths et al. 2000; Schaeffer and Anderson 2005; Kirkpatrick 2010; Charlesworth 2016; Crown et al. 2018; Korunes and Noor 2019; Kapun and Flatt 2019; Durmaz et al. 2020). This pervasive effect of inversions on patterns of recombination can have major evolutionary consequences. For example, inversions can contribute to (i) speciation by allowing mutations involved in reproductive isolation to accumulate; (ii) genetic divergence between the sexes by accumulating on sex chromosomes; and (iii) adaptation by capturing beneficial alleles at multiple loci and binding them together (Dobzhansky 1948, 1949, 1950; Charlesworth and Charlesworth 1973; Rieseberg 2001; Noor et al. 2001; Navarro and Barton 2003; Kirkpatrick and Barton 2006; Hoffmann and Rieseberg 2008; Kirkpatrick 2010; Charlesworth 2016; Fuller et al. 2016, 2017; Charlesworth and Barton 2018; Wellenreuther and Bernatchez 2018; Faria et al. 2019; Fuller et al. 2019; Kapun and Flatt 2019; Durmaz et al. 2020; Charlesworth and Flatt 2021; Mackintosh et al. 2022).

Since the discovery of inversions in the early 20th century by Sturtevant (1917, 1919, 1921), their role in adaptation has attracted great interest among evolutionary geneticists (Dobzhansky 1948, 1950; Krimbas and Powell 1992; Hoffmann et al. 2004; Kirkpatrick and Barton 2006; Hoffmann and Rieseberg 2008; Kirkpatrick 2010; Guerrero et al. 2012; Wellenreuther and Bernatchez 2018; Faria et al. 2019; Kapun and Flatt 2019). For instance, theory suggests that linked selection can cause the spread of an initially rare inversion when it captures a locally adaptive haplotype, protects it from recombination load and/or maladaptive gene flow from neighboring populations, and then ‘hitchhikes’ with it to high frequency; alternatively, a new inversion might be favored by direct positive selection when the breakpoints of the inversion fortuitously induce a beneficial mutation (Charlesworth and Charlesworth 1973; Charlesworth 1974; Kirkpatrick and Barton 2006; Kirkpatrick 2010; Guerrero et al. 2012; Charlesworth and Barton 2018; Kapun and Flatt 2019; Durmaz et al. 2020; Mackintosh et al. 2022). Indeed, beginning with Dobzhansky’s seminal observations in Drosophila pseudoobscura (Dobzhansky 1943, 1947, 1948, 1950; Wright and Dobzhansky 1946), many inversion polymorphisms subject to spatially and/or temporally varying selection have been identified, from plants to humans (Krimbas and Powell 1992; Hoffmann et al. 2004; Stefansson et al. 2005; Hoffmann and Rieseberg 2008; Lowry and Willis 2010; Kapun et al. 2016a; Wellenreuther and Bernatchez 2018; Faria et al. 2019; Kapun and Flatt 2019; Machado et al. 2021; Lange et al. 2022).

Despite over 100 years of research, however, many fundamental questions about the adaptive role of inversions remain unresolved (Kirkpatrick and Barton 2006; Kirkpatrick and Kern 2012; Kapun and Flatt 2019). Does adaptive divergence between inverted and standard chromosomes accumulate after an initially rare inversion has become established in a population? For instance, if an inversion has direct fitness consequences because it causes a deletion or gene expression changes near the breakpoints, we might expect that adaptive divergence between the chromosomal arrangements postdates the initial establishment of the inversion. Alternatively, do adaptive haplotypes predate the mutational origin of an inversion and then get captured by it (Kirkpatrick and Barton 2006; Kirkpatrick and Kern 2012; Guerrero et al. 2012; Charlesworth and Barton 2018; Schaal et al. 2022; Mackintosh et al. 2022)? What forms of balancing selection maintain inversion polymorphisms (Faria et al. 2019; Kapun and Flatt 2019)? And what are the genic targets of selection carried by adaptive inversions?

A promising, tractable model for tackling some of these major questions is the vinegar fly Drosophila melanogaster: it harbors several apparently balanced inversion polymorphisms that form parallel latitudinal clines on multiple continents (Mettler et al. 1977; Knibb et al. 1981; Knibb 1982, 1983; Lemeunier and Aulard 1992; Fabian et al. 2012; Kapun et al. 2014, 2016a, 2020; Kapun and Flatt 2019). The best studied inversion in this species is In(3R)Payne, an 8-Mb paracentric inversion that spans roughly one third of the right arm of the third chromosome (3R; encompassing ~1200 genes) and whose frequency varies latitudinally on several continents, most prominently along the North American and Australian east coasts (Ashburner and Lemeunier 1976; Mettler et al. 1977; Knibb et al. 1981; Knibb 1982, 1986; Lemeunier and Aulard 1992; Sezgin et al. 2004; Fabian et al. 2012; Rane et al. 2015; Kapun et al. 2014, 2016a, 2020; Kapun and Flatt 2019). The 3R Payne inversion originated in sub-Saharan Africa >120 kya (Corbett-Detig and Hartl 2012); it thus predates the out-of-Africa expansion of D. melanogaster ~4-19 kya and its subsequent colonization of other continents (Lachaise et al. 1988; David and Capy 1988; Li and Stephan 2006; Keller 2007; Kapopoulou et al. 2018, 2020; Arguello et al. 2019; Sprengelmeyer et al. 2020). Several lines of genetic and phenotypic evidence, including patterns of latitudinal clinality, suggest that this chromosomal polymorphism is adaptive (Rako et al. 2006; Kennington et al. 2006, 2007; Fabian et al. 2012; Rane et al. 2015; Kapun et al. 2014, 2016a, 2016b; Durmaz et al. 2018; Kapun and Flatt 2019; Kapun et al. 2020).

The evolutionary history of this adaptive inversion raises several interesting questions. Given its parallel clinal distribution on multiple continents, being frequent (~40-50% or higher) in subtropical and tropical climates but rare or absent in high-latitude, temperate areas around the world (Kapun and Flatt 2019), did this inversion adapt independently and hence convergently to similar climatic gradients on several continents? Under such a scenario of local adaptation, the allelic content of the inversion might vary among different geographical areas (Dobzhansky 1949; Schaeffer et al. 2003; Kirkpatrick and Barton 2006). Alternatively, selection might act uniformly across a broad geographic range: if so, did the inversion capture a pre-existing adaptive haplotype in its ancestral range and then invade the rest of the world, with climatic selection favoring parallel but non-convergent spatial assortment of this polymorphism on multiple continents? With appropriate data we might be able to distinguish between these scenarios. And, given its effects on multiple fitness traits, what are likely genic targets of selection harbored by the 3R Payne inversion?

Here we address these fundamental questions by investigating the evolutionary genomics of the 3R Payne inversion polymorphism on four continents: in its ancestral range in Africa and in derived populations in Europe, North America, and Australia. First, we seek to elucidate the adaptive genetic basis of this inversion by combining new phased sequencing data for 3R Payne inverted and standard karyotypes isolated from North American populations in Florida and Maine with published sequencing data from the African ancestral range as well as from Europe and Australia. We use these data to investigate patterns of phylogeography, nucleotide variability, linkage disequilibrium, karyotypic divergence, and allele sharing across populations. Second, to identify potential targets of selection spanned by the inversion, we combine FST outlier analyses with transcriptomic analysis of karyotypes from a derived Florida population; because 3R Payne has been implicated in climate adaptation, we performed RNA-sequencing across two developmental temperatures.

We discuss our results in the light of theoretical predictions about expected patterns of variation and divergence of inversions (Navarro et al. 2000; Guerrero et al. 2012) and balancing selection (Zeng et al. 2021) and with regard to two commonly invoked models for adaptive inversions, Dobzhansky’s epistatic coadaptation model (Dobzhansky 1948, 1949, 1950, 1951; Charlesworth and Charlesworth 1973; Charlesworth 1974; Charlesworth and Flatt 2021) and Kirkpatrick’s and Barton’s model of ‘local adaptation’ (i.e., local selection in the face of maladaptive gene flow; Kirkpatrick and Barton 2006; Charlesworth and Barton 2018; Mackintosh et al. 2022). Under both models, a possible consequence is that the same inversion is highly locally adapted and thus contains distinct sets of adaptive alleles in different populations (Dobzhansky 1948, 1950; Prakash and Lewontin 1968, 1971; Kirkpatrick and Barton 2006).

Consistent with either model, our results suggest that In(3R)Payne captured adaptive alleles in the ancestral African range that predate the origin of the inversion. Yet, contrary to the above-mentioned corollary, we find relatively weak differentiation among inverted chromosomes across continents. These results indicate that the adaptive allelic content of the inversion might be ancestral and shared among populations: selection appears to have favored the spatial assortment of this ancestral polymorphism on multiple continents in a parallel fashion, resulting in qualitatively identical latitudinal clines and mediating ‘global’ (species-wide) adaptation.

Results and Discussion

Table S1 (Supplementary Material online) gives an overview of the genomic data analyzed and indicates which data subsets were used in the different analyses presented below. Table S2 (Supplementary Material online) provides information

3R Payne is of monophyletic African origin and shows weak out-of-Africa divergence

Given that 3R Payne is a cosmopolitan adaptive polymorphism (cf. Kapun and Flatt 2019) of sub-Saharan African evolutionary origin (Corbett-Detig and Hartl 2012), we first sought to study its phylogeography. For example, major divergence of inverted chromosomes among continents could indicate that the inversion adapted independently (i.e., convergently) to similar conditions on different continents.

Divergence-based age estimation suggests that 3R Payne originated ~146,000 years ago; polymorphism-based estimation indicates a median estimate of ~129,000 years (5% confidence limit [CL]: ~80 kya, 95% CL: ~196 kya; Corbett-Detig and Hartl 2012). Taking the latter estimate and assuming ~15 generations per year (Pool 2015), this inversion is thus at least ~1.95 × 10⁶ generations old, i.e., roughly twice the ancestral effective population size Ne (~1.0 × 10⁶ to 1.5 × 10⁶; Kreitman 1983; Matzkin et al. 2005; Shapiro et al. 2007; Campos et al. 2017; Kapopoulou et al. 2018). The polymorphism is therefore probably sufficiently old for homogenizing flux between inverted and standard karyotypes to have occurred via gene conversion or double crossovers: flux rates Φ have been estimated to be ~10⁻⁴ to 10⁻⁵ for the central regions of D. melanogaster inversions (Payne 1924; Chovnick 1973; Navarro et al. 1997, 2000).

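The comparison between inversion age and effective population size can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: it uses the values quoted in the text (the ~129,000-year median estimate, ~15 generations per year, and an Ne range of 1.0 × 10⁶ to 1.5 × 10⁶), and the function and variable names are hypothetical rather than part of any analysis pipeline.

```python
# Values quoted in the text; names are illustrative assumptions.
AGE_YEARS = 129_000            # polymorphism-based median age estimate
GENERATIONS_PER_YEAR = 15      # assumed number of generations per year
NE_RANGE = (1.0e6, 1.5e6)      # range given for the ancestral Ne


def age_in_generations(age_years, generations_per_year):
    """Convert an age in years into an age in generations."""
    return age_years * generations_per_year


age_gen = age_in_generations(AGE_YEARS, GENERATIONS_PER_YEAR)
# Age expressed in units of Ne, for the lower and upper end of the Ne range.
age_in_ne_units = tuple(age_gen / ne for ne in NE_RANGE)

print(f"age: {age_gen:.3g} generations")
print("age / Ne:", ", ".join(f"{ratio:.2f}" for ratio in age_in_ne_units))
```

With these inputs the age is ~1.94 × 10⁶ generations, which corresponds to about 1.9 Ne at the lower end and about 1.3 Ne at the upper end of the quoted Ne range.
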
The age of 3R Payne is relevant because for sufficiently old inversions (age >> Ne generations) that have captured an adaptive haplotype we might expect major peaks of divergence between inverted and standard chromosomes in the center of the inversion, which are due to the interplay of homogenizing flux and divergent selection opposing recombination (Guerrero et al. 2012; also see below). Consistent with this expectation, we have previously found major peaks of divergence in the center of In(3R)Payne in North American samples (Kapun et al. 2016a). In further support of a selective role, latitudinal frequency clines of 3R Payne in Europe and North America deviate from neutral expectations (Kapun et al. 2016a, 2020), and inverted and standard karyotypes differ in their effects on several major fitness traits including body size, cold shock survival and lifespan (Rako et al. 2006; Kapun et al. 2016b; Durmaz et al. 2018).

To study the phylogeography of 3R Payne, we investigated phylogenetic relationships among karyotypes using sequencing data from 485 strains across four continents, including data from the ancestral African range (Siavonga, Zambia; Pool et al. 2012; Lack et al. 2015, 2016; Sprengelmeyer et al. 2020) and from several derived populations in Europe (n=3), North America (n=3) and Australia (n=2) (fig. 1A; supplementary table S1, Supplementary Material online). Our analyses complement those of Corbett-Detig and Hartl (2012) who had examined the phylogenetic history of In(3R)Payne and other inversions using several African populations and single populations from Europe (France) and North America (North Carolina, USA). Based on the average number of pairwise nucleotide differences per site (nucleotide diversity π; Nei and Li 1979) in 100 kb non-overlapping windows, we constructed a neighbor-joining haplotype network of inverted and standard chromosomes using the Neighbor-Net method (Bryant and Moulton 2004; fig. 1B). In contrast to neighbor-joining trees, Neighbor-Net allows one to represent conflicting signals in the data, for example due to recombination.

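To make the windowed statistic concrete, the following Python sketch computes π (average number of pairwise differences per site) in non-overlapping windows. It is a minimal illustration and not the pipeline used in this study: it assumes aligned haplotypes given as equal-length strings without missing data, and the function names and the toy data are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average number of pairwise differences per site (pi).

    Assumes aligned, equal-length strings with no missing data.
    """
    n = len(haplotypes)
    length = len(haplotypes[0]) if haplotypes else 0
    if n < 2 or length == 0:
        raise ValueError("need at least two non-empty haplotypes")
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")
    total_differences = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_differences / (n_pairs * length)


def windowed_pi(haplotypes, window_size):
    """Pi in consecutive non-overlapping windows; a trailing partial window is dropped."""
    length = len(haplotypes[0])
    return [
        nucleotide_diversity([h[start:start + window_size] for h in haplotypes])
        for start in range(0, length - window_size + 1, window_size)
    ]


# Toy example: four haplotypes of 8 sites, analysed in 4-site windows
# (the text refers to 100 kb windows on real chromosomes).
toy = ["AAAAAAAA", "AAAATAAA", "AAAATAAC", "ACAAAAAA"]
pi_per_window = windowed_pi(toy, window_size=4)
print(pi_per_window)
```

For real data, missing genotypes and unequal numbers of callable sites per window would need to be handled, for example by dividing by the number of sites that were actually compared.
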
Inverted karyotypes clustered monophyletically within Africa, irrespective of their worldwide sampling location (fig. 1B; table 1), confirming the finding that 3R Payne arose in sub-Saharan Africa (Corbett-Detig and Hartl 2012). This differs markedly from the pattern observed when analyzing a network based on a random set of third-chromosome SNPs at a distance >200 kb from In(3R)Payne (and from the second major inversion on chromosome 3, In(3L)P; see Materials and Methods): here, the network structure mainly reflects geography, not 3R Payne karyotype (fig. 1C; table 1). However, there is nonetheless a weak signal of clustering of inverted chromosomes in this analysis, suggesting that the effect of In(3R)Payne on genetic variation might go well beyond its breakpoints (cf. Corbett-Detig and Hartl 2012).

In addition to the dominant signal of monophyletic divergence between inverted and standard karyotypes, we also found a weaker signal of geographical substructure within the inverted and standard clusters of chromosomes, indicating some divergence within karyotypes among continents (fig. 1B; table 1). The observation of substructure within the inverted karyotype bears on the question of whether 3R Payne inverted chromosomes might be locally adapted. Under both the ‘local adaptation’ model and the epistatic coadaptation model mentioned above, loci within the inverted karyotype may be differentiated among populations if gene flow among populations disrupts locally adapted haplotypes and generates maladaptive genotypes (Prakash and Lewontin 1968, 1971; Schaeffer et al. 2003). The fact that inverted 3R Payne chromosomes exhibit some divergence among continents is consistent with this expectation (but see results and discussion below).

Patterns of variation are consistent with a balanced inversion polymorphism

According to coalescent models by Navarro et al. (2000) (also cf. Zeng et al. 2021), a newly arisen inversion subject to balancing selection eliminates substantial amounts of variation across a large chromosomal segment via a partial selective sweep as it increases in frequency; during the subsequent slow process of convergence to mutation-drift-flux equilibrium, variation at the breakpoints is greatly reduced as compared to the central region of the inversion where variation is restored. This is because the rate of gene flux in the form of crossing over is very low in regions close to the breakpoints and hence the effect of the partial sweep is greater. [Generally, pairing in heterokaryotypes is strongly reduced at the breakpoints, with recombination rates being very low (<<10⁻⁴; Navarro et al. 1997, 2000); for an inversion heterokaryotype in D. subobscura, Rozas and Aguadé (1994) estimated a value of 10⁻⁷ near the breakpoints.] By contrast, old inversions that have reached mutation-drift-flux equilibrium can exhibit greater variation at the breakpoints as compared to the inversion body (Navarro et al. 2000; cf. Wallace et al. 2013; Charlesworth 2023). This is because, over time, genetic differences between inverted and standard karyotypes become homogenized by gene flux, but this effect is much stronger in the central regions of the inversion than at the breakpoints where flux is effectively suppressed. At least 10⁷ generations are required to reach mutation-drift-flux equilibrium (Navarro et al. 2000). We thus sought to examine π inside and outside of the inverted region among inverted and standard 3R chromosomes and compare our data to the expectations of Navarro et al. (2000) and Zeng et al. (2021).

Nucleotide variability on 3R was markedly higher in the African population sample from Zambia as compared to the samples from derived populations, consistent with the out-of-Africa bottleneck (Li and Stephan 2006; Lack et al. 2016; Arguello et al. 2019; Kapopoulou et al. 2020; Kapun et al. 2020, 2021) (fig. 2; supplementary fig. S1, Supplementary Material online). Inside the inverted region of African chromosomes, π was higher in standard relative to inverted chromosomes, but not different between arrangement types in derived populations (fig. 2; supplementary fig. S1, Supplementary Material online; see below).

In both Africa and derived populations, π was markedly reduced in the breakpoint regions as compared to the inversion body, resulting in a dome-shaped pattern (supplementary fig. S1, Supplementary Material online). This dome-shaped pattern agrees qualitatively well with the predictions of Navarro et al. (2000) for an inversion that is subject to balancing selection and that might not have reached equilibrium (i.e., inversion age < 10⁷ generations). On the other hand, consistent with equilibrium (long-term) balancing selection, π was higher for African standard chromosomes inside as compared to outside the inverted region (fig. 2; also see supplementary fig. S1B, Supplementary Material online). Assuming that the frequency of the inversion is substantially lower than that of the standard arrangement, such a pattern might be expected under an equilibrium model of balancing selection with recombination (Zeng et al. 2021). Under such a scenario, the presence of the inversion would increase diversity due to the accumulation of new mutations that distinguish inverted and standard chromosomes; the coalescent time would be somewhat increased for standard alleles, due to the partial population subdivision created by the inversion, while the coalescent time would be reduced for inversion alleles (cf. Zeng et al. 2021). Recent calculations by Charlesworth (2023), which are based on our data in fig. S1B, are consistent with 3R Payne representing a balanced polymorphism which has reached mutation-drift-recombination equilibrium with respect to neutral or nearly neutral variants (see Navarro et al. 2000); this also suggests that 3R Payne might be older than previously estimated (see above).

The absence of clear differences in π for non-African 3R chromosomes may be due to the interplay of sufficient time for gene flux having homogenized variation between karyotypes (already in Africa), selection, and the out-of-Africa bottleneck. The fact that levels of variation in derived populations are very similar between standard and inverted chromosomes could imply that a substantial number of individuals carrying 3R Payne migrated out of Africa during the range expansion.

We next examined patterns of Tajima’s D to test for departures of the site frequency spectrum from an equilibrium standard neutral model (Tajima 1989). Relative to the standard neutral expectation (D = 0), positive values of D indicate an excess of intermediate-frequency variants and might be consistent with a bottleneck or balancing selection; by contrast, negative values of D indicate an excess of rare variants, which may result from a recent population expansion, from purifying selection, or from the affected genomic region having recovered variation after a selective sweep (Innan and Stephan 2000; Wallace et al. 2013; Fijarczyk and Babik 2015). Figure 2 shows average D, estimated separately for the two karyotypes; supplementary fig. S2 (Supplementary Material online) displays D along the chromosome, separately for each karyotype as well as for the pooled sample of inverted and standard chromosomes.

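The sign convention described above can be illustrated with a short Python sketch of Tajima's D, which contrasts the mean number of pairwise differences with the number of segregating sites scaled by a sample-size constant (Tajima 1989). This is an illustrative implementation under simplifying assumptions (aligned haplotypes as equal-length strings, no missing data, at least four sequences); the function name and the toy data are hypothetical, and it is not the software used for the analyses reported here.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for aligned haplotypes without missing data.

    Returns None if there are no segregating sites. Requires n >= 4,
    because the variance term is zero for smaller samples.
    """
    n = len(haplotypes)
    if n < 4:
        raise ValueError("Tajima's D needs at least four sequences")
    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    if seg_sites == 0:
        return None
    # Mean number of pairwise differences (summed over sites, not per site).
    n_pairs = n * (n - 1) / 2
    mean_pairwise = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    ) / n_pairs
    # Sample-size constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise - seg_sites / a1) / sqrt(variance)


# Toy data: only singletons (excess of rare variants) versus two equally
# frequent, fully divergent haplotype classes (intermediate frequencies).
rare_variants = ["TAAAAA", "ATAAAA", "AATAAA", "AAATAA", "AAAATA", "AAAAAT"]
two_classes = ["AAAAAA"] * 3 + ["TTTTTT"] * 3
d_rare = tajimas_d(rare_variants)
d_intermediate = tajimas_d(two_classes)
print(d_rare < 0, d_intermediate > 0)
```

In the toy data, the singleton-only sample yields a negative D and the sample with two equally frequent haplotype classes yields a positive D, matching the interpretation given in the text.
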
Average levels of D were positive and significantly higher for inverted as compared to standard chromosomes within populations on all continents, both inside and outside the inverted region (fig. 2; also see supplementary fig. S2, Supplementary Material online). Inverted chromosomes therefore harbor a greater proportion of intermediate-frequency variants than standard chromosomes. Positive D values for inverted chromosomes could arise from a bottleneck affecting the inversion within the ancestral range prior to the range expansion. But this seems unlikely as such a bottleneck should have genome-wide effects beyond the inversion; yet the positive values of D in the inverted region deviate markedly from the average value of D 0 on chromosome 3R and the genome-wide average of D for populations in Africa, Europe and North America (Kapun et al. 2020). Also, given that a new inversion is initially genetically invariant, one would expect more low-frequency variants as the inversion accumulates new mutations. Other possibilities might involve an incomplete sweep, or even a balanced polymorphism, among inverted chromosomes; the latter could account for the relatively high diversity of inverted chromosomes. Finally, associative overdominance (AOD), reflecting reduced recombination experienced by the inversion overall, could be involved; AOD might generate a pattern of pseudo-overdominance (Frydenberg 1963; Sved 1968; Ohta 1971; Zhao and Charlesworth 2016; Charlesworth and Charlesworth 2018; Becher et al. 2020; Gilbert et al. 2020; Berdan et al. 2021; Waller 2021; Charlesworth and Jensen 2021; Charlesworth 2022). However, under AOD, low-recombination regions still exhibit a skew towards low-frequency variants (Becher et al. 2020), so that this scenario seems improbable.

In the pooled sample of inverted and standard chromosomes (supplementary fig. S2, Supplementary Material online), we did not find evidence for positive D values consistent with balancing selection. Thus, our analyses of D are somewhat difficult to interpret; a complication with interpreting Tajima’s D is that it can be strongly influenced by sample size, the number of segregating sites, and by demography.

Nevertheless, several lines of evidence strongly support the notion that In(3R)Payne represents a balanced polymorphism, including our analyses of nucleotide variability above. For example, consistent with some form of balancing selection, In(3R)Payne segregates at intermediate frequencies in subtropical/tropical populations around the world: the inversion attains an average frequency of ~45% in subtropical southeastern North America and ~60% in tropical Australian populations (Lemeunier and Aulard 1992; Rako et al. 2006; Rane et al. 2015; Kapun et al. 2016; see meta-analysis in Kapun and Flatt 2019). In Afrotropical populations, the average inversion frequency is ~10-13%, with the highest value (~64%) in tropical Ivory Coast (Aulard et al. 2002; Kapun and Flatt 2019). Temperate, high-latitude populations, by contrast, are fixed for the standard arrangement (Lemeunier and Aulard 1992; Kapun et al. 2016; Kapun and Flatt 2019; Kapun et al. 2020). These frequency clines, presumably in the face of sufficient gene flow to homogenize arrangement frequencies, suggest that 3R Payne represents a balanced polymorphism driven by selection in or across heterogeneous environments (Levene 1953).

The fact that different low-latitude populations exhibit different intermediate inversion frequencies is consistent with epistatic coadaptation: under such a model, there exist multiple equilibria and quasi-equilibria for the frequency of the inversion, and the frequency which it will ultimately attain will depend on the history, the initial conditions of the population, and/or the local environment (Charlesworth 1974; also see Dobzhansky and Pavlovsky 1957; Lewontin 1974). Although the model of Charlesworth (1974) assumes constant fitness values, it leads to apparent frequency-dependent selection. Interestingly, Nassar et al. (1973) found that negative frequency-dependent viability selection operates on In(3R)Payne under crowded larval conditions, giving further credence to a scenario of balancing selection.

Some studies have reported that In(3R)Payne can locally reach near-fixation or fixation in some Australian populations (Knibb et al. 1981; Anderson et al. 2005; Umina et al. 2005), an observation that seems to be at odds with a balanced polymorphism. However, the sample size in the study of Knibb et al. (1981) was extremely low. Moreover, drift can cause the fixation of one variant and loss of the alternative variant despite balancing selection (Robertson 1962; Ewens and Thomson 1970). Also, the selective factors favoring the polymorphism might be environmentally sensitive, so that balancing selection could break down in some locations. Overall, the data available to date indicate that In(3R)Payne segregates at intermediate frequencies in the majority of low-latitude populations around the world and that fixation of the inversion is rare (Kapun and Flatt 2019); this pattern and our above results are thus broadly consistent with balancing selection and/or spatially varying selection (Levene 1953) maintaining this polymorphism.
Patterns of LD are compatible with linked selection maintaining the inversion
Next, we examined patterns of linkage disequilibrium (LD). Three aspects of LD can be distinguished: (i) LD among markers without reference to karyotype; (ii) LD between a marker and inverted vs. standard arrangements; and (iii) LD between markers within inverted or within standard chromosomes. Because inversions strongly reduce the products of recombination in heterozygous state, heterokaryotypes (or pools of inverted and standard chromosomes) should exhibit increased LD as compared to homokaryotypes (aspect ii); for sufficiently old inversions that evolve neutrally we might expect that LD decreases towards the center of the inversion due to gene flux between the karyotypes, even though such a pattern is difficult to distinguish from direct positive selection at the breakpoints (Navarro et al. 1997; Guerrero et al. 2012). Within the class of inverted chromosomes (aspect iii), LD can be higher than within standard chromosomes because of a smaller Ne of the inversion.
LD between the 3R Payne inversion and marker loci has been previously studied by Kojima et al. (1970), Langley et al. (1974), Voelker et al. (1978), Sezgin et al. (2004) and Kennington et al. (2006); such LD between the inversion and marker loci might be due to hitchhiking of a neutral variant initially associated with the inversion by chance (Ishii and Charlesworth 1977) or due to subsequent new mutations that differentiate the karyotypes. More recently, Rane et al. (2015) examined LD associated with In(3R)Payne in Australian samples using RAD-sequencing data. Here, we sought to use phased genomic data to compare patterns of LD in the region spanned by 3R Payne in African, European, North American and Australian samples with single nucleotide resolution (fig. 3; supplementary fig. S3, Supplementary Material online).
As compared to standard chromosomes, inverted arrangements showed significantly higher levels of short-range LD (r²) within the region spanned by In(3R)Payne and a slower decay of LD with physical distance (fig. 3A). A plausible explanation is that this pattern is due to drift within the two semi-isolated subpopulations of inverted vs. standard chromosomes, with inverted chromosomes exhibiting both lower recombination and lower Ne. The pattern of decay was similar for inverted chromosomes across continents, except for the Portuguese sample, perhaps due to the rather small number of sampled chromosomes and the overall lower frequency of 3R Payne in Europe (Kapun et al. 2020). By contrast, the pattern of decay for standard chromosomes differed markedly among continents: while in the African sample LD leveled off to r² < 0.1 within a few hundred base pairs (bp), the decay of LD in standard arrangements from North America (Florida) and Australia (Queensland) closely resembled that of inverted chromosomes (fig. 3A), probably reflecting bottlenecks in the derived samples. The patterns for the derived populations were generally less clear than for the Zambian population, presumably due to the out-of-Africa bottleneck.
Next, we examined long-range LD (fig. 3B; supplementary fig. S3, Supplementary Material online). We first analyzed LD within each karyotype. For both standard and inverted arrangements, LD levels did not exceed r² = 0.1 beyond distances of a few kbp, revealing long-range linkage equilibrium within karyotypes. In marked contrast, when jointly analyzing the pool of inverted and standard karyotypes from Florida (fig. 3B), we observed strong long-range LD within the inverted region, involving SNPs several million bp away from each other and suggesting that major associations among loci are driven by hetero- not homokaryotypes. These patterns were similar for the other continents, with major LD between but not within karyotypes (supplementary fig. S3, Supplementary Material online). Likewise, no strong LD was seen within Australian karyotypes; this is contrary to Rane et al. (2015) and likely due to a misclassification of karyotypes in that study (see below; Materials and Methods; supplementary fig. S4, Supplementary Material online).
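The contrast between pooled and within-karyotype LD described above can be illustrated with a minimal Python sketch. It computes r² between two biallelic SNPs from phased haplotypes coded as 0/1. The function name and the toy haplotypes are hypothetical and are not taken from our data; they only show how two SNPs that are nearly fixed for alternative alleles between arrangements produce elevated r² in the pooled sample while r² within a karyotype stays low.

```python
def r_squared(hap_a, hap_b):
    """r^2 between two biallelic SNPs given phased haplotypes coded 0/1."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # coefficient of disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:                            # at least one SNP is monomorphic
        return float("nan")
    return d * d / denom

# Hypothetical toy data: 8 inverted and 8 standard haplotypes, two distant SNPs.
inv_snp1, inv_snp2 = [1] * 7 + [0], [0] + [1] * 7
std_snp1, std_snp2 = [0] * 7 + [1], [1] + [0] * 7

within_inverted = r_squared(inv_snp1, inv_snp2)
pooled = r_squared(inv_snp1 + std_snp1, inv_snp2 + std_snp2)
print(f"within inverted: {within_inverted:.3f}, pooled: {pooled:.3f}")
```

In this toy example the pooled value exceeds the within-karyotype value by roughly an order of magnitude, which mirrors aspect (ii) versus aspect (iii) of LD as distinguished above.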
Notably, in European, North American and Australian samples we found large clusters of SNPs in the center of the inversion that are in strong LD with each other and the proximal and distal breakpoints, interspersed by regions of low or no LD (fig. 3B; supplementary fig. S3, Supplementary Material online). For Australia, our data agree well with those of Kennington et al. (1997) who found LD among marker loci within and near In(3R)Payne and between these loci and the inversion itself, including marked associations in the center of the inversion. In the African sample, we also observed LD between the breakpoints and center regions of elevated LD, yet these central clusters of high LD were much less prominent than in the derived populations (supplementary fig. S3, Supplementary Material online). These patterns of long-range LD almost certainly reflect the strong divergence between inverted and standard arrangements (cf. Zeng et al. 2021); the clearer patterns seen for non-African populations might be due to lower diversity which tends to sharpen up divergence patterns (Nordborg et al. 1996).
Associations between an inversion and loci within the inverted region can have several causes that are difficult to distinguish (Strobeck 1983; Navarro et al. 1996): the inversion might have become associated with neutral alleles when it formed (Ishii and Charlesworth 1977; Nei and Li 1980), or it might be linked to neutral loci subject to drift (Nei and Li 1975; Strobeck 1983); or selection might maintain associations between selected loci spanned by the inversion and the inversion itself despite flux between arrangements (see above; also see Prakash and Lewontin 1968; Charlesworth and Charlesworth 1973; Charlesworth 1974; Ishii and Charlesworth 1977; Schaeffer et al. 2003; Guerrero et al. 2012; Fuller et al. 2017). The extent of such associations depends on the flux rate, the effective number of inverted and standard chromosomes, and the inversion age (Ishii and Charlesworth 1977; Nei and Li 1980). Theory suggests that the half-life of decay of an association between a neutral locus and an inversion is on the order of the reciprocal of the flux rate in heterokaryotypes (Ishii and Charlesworth 1977; Nei and Li 1980). Selection can retard this decay considerably, but only when the neutral locus is very closely linked to one of the selected loci involved in maintaining the polymorphism (Ishii and Charlesworth 1977).
How do our data compare to these predictions? Assuming a gene flux rate Φ of ~10⁻⁵ in the center of the inversion (Chovnick 1973), the timescale for the decay of the association would be on the order of ~10⁵ generations (~7000-10,000 years, assuming 10-15 generations per year). Given that In(3R)Payne is at least ~129k years old and globally quite frequent (Kapun and Flatt 2019: average global frequency ~15%, based on 530 samples from 34 independent studies spanning >50 years of data), and given that Ne is large (~10⁶), gene flux should have had ample opportunity to break down strong LD associated with this inversion. Our data are thus consistent with the selective maintenance of the center peaks inside the inversion. On the other hand, D. melanogaster has undergone an expansion from southern-central Africa and a major out-of-Africa bottleneck; began to spread from the Middle East into Europe ~1800 years ago; and colonized North America and Australia ~100-150 years ago (Hoffmann and Weeks 2007; Keller 2007; Sprengelmeyer et al. 2020): demographic events such as bottlenecks, drift and/or founder effects can therefore not be ruled out as having influenced LD associated with In(3R)Payne.
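The order-of-magnitude argument above can be retraced with a short Python sketch. It takes the decay timescale as the reciprocal of the flux rate, converts generations to years, and compares the result with the minimum inversion age given in the text. The function and variable names are hypothetical, and the calculation is only as rough as the assumptions it encodes (Φ of ~10⁻⁵, 10-15 generations per year, an age of at least ~129,000 years).

```python
def decay_timescale(flux_rate, generations_per_year):
    """Return the approximate decay timescale as (generations, years)."""
    generations = 1.0 / flux_rate             # on the order of 1 / flux rate
    return generations, generations / generations_per_year

FLUX_RATE = 1e-5                 # gene flux in the inversion center (from the text)
INVERSION_AGE_YEARS = 129_000    # minimum age estimate quoted in the text

gens, years_fast = decay_timescale(FLUX_RATE, 15)   # 15 generations per year
_, years_slow = decay_timescale(FLUX_RATE, 10)      # 10 generations per year

# Number of decay timescales that have elapsed since the inversion arose.
elapsed_min = INVERSION_AGE_YEARS / years_slow
elapsed_max = INVERSION_AGE_YEARS / years_fast
print(f"timescale: ~{gens:.0f} generations, ~{years_fast:.0f}-{years_slow:.0f} years")
print(f"elapsed timescales since origin: ~{elapsed_min:.0f}-{elapsed_max:.0f}")
```

Under these assumptions the inversion is more than ten decay timescales old, which is why neutral associations in the inversion center would be expected to have eroded.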
Major peaks of divergence inside the inversion are shared across continents
To study chromosome-wide patterns of differentiation as a function of 3R Payne karyotype we used FST, a normalized measure of pairwise allele frequency differentiation (Weir and Cockerham 1984). In the context of karyotypic differentiation, it would be more accurate to call this quantity FAT, a measure of variation between allelic classes at a polymorphic locus (Charlesworth et al. 1997). We were particularly interested in determining whether there might be major peaks of divergence between standard and inverted chromosomes in the center of the inverted region, away from the breakpoints. For sufficiently old inversions, and assuming the existence of targets of selection within the inversion, coalescent theory predicts that selection might lead to peaks inside the inversion body, which are centered on the adaptive loci and selectively maintained in the face of homogenizing flux between standard and inverted chromosomes (Guerrero et al. 2012). This pattern is not unique to inversions: any form of balancing selection will lead to a peak of divergence and LD around the target of selection at equilibrium (Hudson and Kaplan 1988; Kaplan et al. 1988; Guerrero et al. 2012; Zeng et al. 2021). For old inversions, this leads to a characteristic pattern of divergence between the karyotypes (Guerrero et al. 2012; Kirkpatrick 2017): the pattern of divergence resembles the cables of a suspension bridge with peaks of divergence both at the breakpoints (where recombination is greatly reduced) and in the center of the inversion (where selection opposes recombination). Such center peaks of divergence could arise from either the Kirkpatrick-Barton model or the epistatic coadaptation mechanism (Guerrero et al. 2012; Charlesworth and Barton 2018; Kapun and Flatt 2019; Durmaz et al. 2020; Charlesworth and Flatt 2021); sweeps within inverted or standard chromosomes could also generate such peaks. We have previously found such peaks in pool-sequencing data for North American samples (Kapun et al. 2016a), and Rane et al. (2015) had examined such peaks in Australian data using RAD-sequencing.
Here, we sought to revisit these results and to extend them to African and European samples. In addition, we aimed to assess the contribution of 3R Payne to divergence across latitudinal clines in Europe, North America and Australia (Kolaczkowski et al. 2011; Fabian et al. 2012; Rane et al. 2015; Kapun et al. 2016a, 2020). To this end, we studied the effects on divergence of ‘karyotype’ (‘K’, comparing inverted vs. standard arrangements within the same low-latitude populations), ‘geography’ (‘G’, comparing standard chromosomes between low- and high-latitude populations) and ‘geography plus karyotype’ (‘G+K’, comparing low-latitude inverted chromosomes with high-latitude standard chromosomes) (see Materials and Methods). Figure 4 shows patterns of FST for these effects as a function of position on 3R, including estimates of LD between SNPs in the region spanned by the inversion and the inversion itself. Inspection of these patterns revealed several interesting findings.
First, we observed marked divergence in the region spanned by the inversion between inverted and standard chromosomes on all four continents (effect of ‘K’), with pronounced peaks in the breakpoint regions (fig. 4, black lines). For derived populations, where 3R Payne exhibits latitudinal clines on different continents, this divergence is similar when contrasting inverted and standard chromosomes from within the same low-latitude populations (effect of ‘K’) and when comparing low-latitude inverted with high-latitude standard arrangements (effect of ‘G+K’, comparing karyotypes between the cline ‘ends’; fig. 4, light grey lines). By contrast, divergence is low when comparing standard chromosomes between low- and high-latitude populations in Europe, North America and Australia (effect of ‘G’; fig. 4, dark grey lines). This is also quantified for derived populations in fig. 5 and table 2, for both the region inside and outside of In(3R)Payne.
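The logic of the three contrasts (‘K’, ‘G’ and ‘G+K’) can be sketched in Python for a single SNP. Note the swap: the sketch uses a simple two-group FST based on expected heterozygosities with equal group weights, not the Weir and Cockerham (1984) estimator used for our analyses, and the allele frequencies are hypothetical values chosen only to show how a strongly inversion-associated SNP behaves under each contrast.

```python
def fst(p1, p2):
    """Simple two-group FST, (HT - HS) / HT, from allele frequencies."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                        # total heterozygosity
    if h_t == 0:                                         # monomorphic in both groups
        return 0.0
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2    # mean within-group
    return (h_t - h_s) / h_t

# Hypothetical allele frequencies at one inversion-associated SNP.
freq = {
    "low_lat_inverted": 0.95,
    "low_lat_standard": 0.10,
    "high_lat_standard": 0.05,
}

contrasts = {
    "K": fst(freq["low_lat_inverted"], freq["low_lat_standard"]),
    "G": fst(freq["low_lat_standard"], freq["high_lat_standard"]),
    "G+K": fst(freq["low_lat_inverted"], freq["high_lat_standard"]),
}
for effect, value in contrasts.items():
    print(f"{effect}: FST = {value:.3f}")
```

For such a SNP, ‘K’ and ‘G+K’ yield similarly high values whereas ‘G’ is close to zero, which is the qualitative pattern described for the inverted region.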
These results indicate that 3R Payne karyotype is the major determinant of divergence on chromosome arm 3R in all populations examined, and that clinal divergence in non-African populations is predominantly caused by the divergence between inversion karyotypes, not by geography; geographic differentiation inside the inverted region is much weaker than karyotypic differentiation, despite very large geographical distances (~2600-3900 km) between the ‘endpoints’ of the clines on different continents (Kapun et al. 2016a). By contrast, outside the inverted region, patterns of divergence are consistent with isolation by distance. These results agree well with previous pool-sequencing analyses of In(3R)Payne in North America and Europe (Kapun et al. 2016a, 2020). However, for Australia our findings differ from those of Rane et al. (2015) who found no major effect of karyotype on divergence in the Queensland low-latitude population sample. Our reclassification of karyotypes in this dataset suggests that this previously reported pattern was due to a partial misassignment of karyotypes. Using our new classification, we found major karyotypic divergence in the Queensland sample (fig. 4; fig. 5; supplementary fig. S4, Supplementary Material online; table 2), which is fully consistent with our analyses of European and North American karyotypes and our analyses in fig. 1.
Second, coarse-grained patterns of karyotypic divergence and LD, especially for derived populations, are highly congruent across continents, including Australia (fig. 4; fig. 5; supplementary fig. S4, Supplementary Material online). The parallel divergence due to In(3R)Payne is underscored by strong correlations between levels of FST with respect to karyotype across continents, including Africa (supplementary fig. S5, Supplementary Material online). This might reflect that most SNPs are neutral and in LD with the inversion; on the other hand, it is also consistent with parallel clinal adaptation to similar environmental gradients around the world. Together with our phylogenetic analysis, this speaks against a scenario of ‘strict’ local adaptation whereby the same inversion is genetically differently adapted to distinct local conditions; under such a scenario one might expect larger geographical differentiation among inverted chromosomes (Dobzhansky 1951; Prakash and Lewontin 1968, 1971; Schaeffer et al. 2003; Kirkpatrick and Barton 2006; see below and supplementary fig. S6, Supplementary Material online).
Third, on a fine-grained scale, we observed major peaks of divergence in the inversion center that are shared among all non-African populations (fig. 4; supplementary fig. S4, Supplementary Material online). Most prominently, there is a massive central peak of divergence of ~200-300 kb length (position on 3R: ~20.9 - 21.2 Mbp) that is common to derived populations in Europe, North America and Australia, with alleles in this peak being in strong LD with the breakpoints (fig. 4; also see fig. 3). These shared peaks, as well as the consistency of LD structure among populations on several continents, are consistent with the idea that linked selection maintains non-random associations between the center peaks and the breakpoints despite homogenizing flux between arrangements (see above; Charlesworth 1974; Guerrero et al. 2012; cf. Prakash and Lewontin 1968, 1971; Lewontin 1974). However, the history of these derived populations is not independent, and bottleneck events (or a strong selective sweep within inverted chromosomes) cannot be ruled out as alternative explanations.
Fourth, although these major center peaks seem to be absent in the African sample (top panel in fig. 4), preliminary results by Brian Charlesworth (personal communication) suggest that the observed FST between karyotypes of ~0.1 in the African sample for sites away from the breakpoints agrees qualitatively well with expected neutral divergence between karyotypes (FST = 0.13), assuming an equilibrium balanced polymorphism under an island model of population structure (subdivision with neutral FST = 0.05; inversion frequency = 0.1; rate of exchange = 10⁻⁶ per site per generation). The pattern in the Zambian sample might thus be compatible with 3R Payne representing a long-term balanced polymorphism (also see discussion of π above; see discussion in Charlesworth 2023). This prompted us to take a closer look at inversion-associated alleles within the ancestral African sample.
The inversion appears to have captured adaptive alleles in its ancestral range
Several models of adaptive inversion evolution posit that a new inversion might capture a pre-existing adaptive haplotype, i.e., a set of selected loci that are in loose LD (Dobzhansky 1949, 1950, 1951; Charlesworth and Charlesworth 1973; Charlesworth 1974; Kirkpatrick and Barton 2006; Charlesworth and Barton 2018; Charlesworth and Flatt 2021; Schaal et al. 2022; also cf. Kimura 1956); alternatively, adaptive divergence between inverted and standard arrangements might have accumulated after the inversion was established. In the former case, we might expect that standard chromosomes in the ancestral African range still carry some of the presumably adaptive, pre-existing alleles that were captured by the inversion when it first arose (Kirkpatrick 2017).
Consistent with differentiation among karyotypes not being the result of (continent-specific) local adaptation but having arisen prior to the out-of-Africa migration, we failed to observe elevated divergence within the inversion body among inverted chromosomes from different continents (supplementary fig. S6, Supplementary Material online).
To further explore this idea, we quantified the frequency of inversion-specific alleles, defined as SNPs with FST ≥ 0.9 between inverted and standard chromosomes in the North American sample from Florida, among African (and for comparison also among European) standard and inverted chromosomes (fig. 6).
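The two steps of this analysis (defining ‘inversion-specific’ SNPs in one sample, then tracking the frequency of the corresponding alleles in another) can be sketched as follows. This is a minimal illustration rather than our pipeline: the SNP identifiers, FST values and allele frequencies are hypothetical, and only the FST ≥ 0.9 threshold is taken from the text.

```python
FST_THRESHOLD = 0.9  # threshold for 'inversion-specific' SNPs (from the text)

def inversion_specific(fst_by_snp, threshold=FST_THRESHOLD):
    """Return the sorted SNP ids whose karyotypic FST meets the threshold."""
    return sorted(snp for snp, value in fst_by_snp.items() if value >= threshold)

def mean_frequency(freq_by_snp, snps):
    """Mean frequency of the inversion-associated allele across the given SNPs."""
    return sum(freq_by_snp[snp] for snp in snps) / len(snps)

# Hypothetical per-SNP FST between inverted and standard chromosomes (Florida).
florida_fst = {"snp1": 0.95, "snp2": 0.40, "snp3": 0.92, "snp4": 0.10}

# Hypothetical frequencies of the same alleles among African chromosomes.
africa_inverted = {"snp1": 0.80, "snp2": 0.50, "snp3": 0.60, "snp4": 0.20}
africa_standard = {"snp1": 0.30, "snp2": 0.50, "snp3": 0.30, "snp4": 0.20}

candidates = inversion_specific(florida_fst)
difference = (mean_frequency(africa_inverted, candidates)
              - mean_frequency(africa_standard, candidates))
print(candidates, f"frequency difference: {difference:.2f}")
```

Alleles that are polymorphic on both African arrangements but enriched on inverted chromosomes would show up as an intermediate, positive frequency difference in such a comparison.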
This analysis revealed that alleles that are highly ‘inversion-specific’ outside of Africa are polymorphic among both African standard and African inverted chromosomes (fig. 6A), possibly indicating that they represent ancestral alleles that have been captured by the inversion. The major enrichment of ‘inversion-specific’ alleles among African inverted relative to standard chromosomes (frequency difference of ~45% between inverted and standard karyotypes; fig. 6A) might also be consistent with the inversion having captured these alleles before the out-of-Africa expansion. If so, this would speak against a scenario whereby the inversion spread to some appreciable frequency by drift and then gained adaptive variants via influx from the subpopulation of standard chromosomes by recombination or through new mutations, with the inversion driven to high frequency by hitchhiking.
Repeating the analysis in fig. 6A by using highly inversion-specific alleles (FST ≥ 0.9) as defined based on the Zambian population (instead of defining them, as above, based on the Florida sample) also revealed major frequency differentiation between African inverted and standard alleles in derived populations, consistent with the notion that African alleles underpin the divergence of In(3R)Payne karyotypes in derived populations (supplementary fig. S7, Supplementary Material online).
Figure 6B shows the distribution of ‘inversion-specific’ SNPs (as defined using the Florida sample) in the African sample with a frequency difference ≥ 50% between standard and inverted chromosomes: the resulting pattern delineates the breakpoints clearly, indicating that divergence in the African sample is driven by suppressed recombination at the breakpoints. Also note the two ‘mini-peak’ regions away from the breakpoints (a larger one at ~19 Mbp; and a smaller range of peaks at ~21 Mbp) where flux is expected to be much higher than at the breakpoints: the locations of these mini-peaks correspond well with those of the major central peaks seen in European, North American and (for the second peak region) Australian samples (fig. 4). Because levels of diversity are very similar between standard and inverted chromosomes in the derived populations, it seems improbable that these peaks are due to very low Ne of inverted chromosomes leaving Africa. Nonetheless, we cannot rule out that these peaks might have become more pronounced during the range expansion, potentially due to the out-of-Africa bottleneck and/or drift, perhaps in addition to selection.
Genetic divergence between inversion karyotypes is shared among continents
Because patterns of karyotypic divergence and LD looked very similar across continents (fig. 4), especially for derived populations, we were interested in quantifying the geographical overlap in the number of inversion-associated candidate genes and SNPs (fig. 7; candidates defined by SNPs with FST ≥ 0.9 between inverted and standard karyotypes; see Materials and Methods). Overall, we observed significant sharing of candidate genes and SNPs across continents (fig. 7). However, the inclusion of the Australian data resulted in very low levels of sharing, perhaps because this dataset is based on reduced-representation RAD-sequencing with low resolution; we therefore excluded the Australian data from the analysis (fig. 7). Independent of whether the Australian data were excluded or not (not shown), we identified major overlap of candidates between Europe and North America (fig. 7), probably because of the demographic and genetic similarity of populations on these continents. Importantly, when excluding the Australian data, we found a highly significant overlap of 174 candidate genes and 34 SNPs that are shared between Africa, Europe and North America (fig. 7A, fig. 7B); these loci might thus underlie the shared pattern of karyotypic divergence across continents (supplementary table S3; supplementary table S4; Supplementary Material online; see below).
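How the significance of such sharing can be judged is sketched below for the simplest case of two candidate sets. This is a deliberate simplification: it is a two-set hypergeometric tail probability written with the Python standard library, not the multi-set procedure behind fig. 7, and the function name and the example numbers are hypothetical.

```python
from math import comb

def overlap_pvalue(universe, size_a, size_b, overlap):
    """P(X >= overlap) if set B were drawn at random from the gene universe.

    universe: number of genes considered in total
    size_a, size_b: sizes of the two candidate sets
    overlap: observed number of shared candidates
    """
    max_shared = min(size_a, size_b)
    total = comb(universe, size_b)            # all ways to draw set B
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_shared + 1)
    )
    return tail / total

# Hypothetical example: 1000 genes, two candidate sets of 50, sharing 10 genes.
p_value = overlap_pvalue(universe=1000, size_a=50, size_b=50, overlap=10)
print(f"p = {p_value:.2e}")
```

With an expected overlap of only 2.5 genes under random sampling, sharing 10 genes is highly unlikely by chance in this toy setting; the same reasoning extends to intersections of more than two sets.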
When examining the putative functional effects of these candidate SNPs, we found a significant deficiency of SNPs in the intergenic region of African, European and North American samples and a significant enrichment in the 2 kb upstream region of genes in European and North American samples (see supplementary fig. S8; supplementary table S4; Supplementary Material online). These findings suggest that several candidate SNPs might influence gene regulation and gene expression patterns.
The inversion affects gene expression in a temperature-dependent manner
A handful of studies have examined clinal differences in gene expression in Australian and North American D. melanogaster (Levine et al. 2011; Chen et al. 2012; Zhao et al. 2015; Juneja et al. 2016; Clemson et al. 2018), but whether 3R Payne contributes to these patterns remains unknown. To investigate the effects of this inversion polymorphism on differential expression (DE), and to complement our genomic analyses, we analyzed RNA-Seq data from adult female whole-body transcriptomes of different In(3R)Payne karyotypes. Our dataset consisted of 9 biological replicates each for Florida inverted (FI), Florida standard (FS) and Maine standard (MS) homokaryotypes (isochromosomal lines), with each group reared at two developmental temperatures (18°C, 25°C) prior to RNA extraction (3 karyotypes × 2 temperatures × 9 replicates = 54 samples in total). Because the 3R Payne inversion is involved in climate adaptation (Kapun et al. 2016a; Kapun and Flatt 2019), this design allowed us to examine whether developmental temperature interacts with karyotype and/or geographic origin in affecting expression (supplementary table S5, Supplementary Material online; see Materials and Methods).
Approximately 60% of all analyzed genes genome-wide (n=9724) showed significant DE in response to temperature (n=5841; Benjamini-Hochberg [BH]-corrected p < 0.05; supplementary table S5, Supplementary Material online), in agreement with previous work reporting high levels of expression plasticity across different rearing temperatures (Chen et al. 2015). Inversion karyotype had a much weaker effect: only 0.49% of all genes were differentially expressed between karyotypes (Florida inverted vs. standard; n=46; BH-corrected p < 0.05) and 0.45% in response to the effect of karyotype plus geography (Florida inverted vs. Maine standard; n=44; BH-corrected p < 0.05) (supplementary table S5, Supplementary Material online).
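For readers unfamiliar with the BH correction applied to these tests, the adjustment can be sketched in plain Python. The function name and the example p-values are hypothetical; the sketch implements the standard Benjamini-Hochberg step-up adjustment and is not a description of the software used for our DE analysis.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, significant)
```

A gene is then called differentially expressed when its adjusted p-value falls below 0.05, which is the criterion behind the gene counts reported above.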
Interestingly, we failed to identify any DE in response to the effects of geography alone (Florida standard vs. Maine standard; BH-corrected p > 0.05; see supplementary table S5, Supplementary Material online); the effects of karyotype plus geography thus seem to be driven by karyotypic differences, not geography. We were thus interested in comparing our transcriptomic candidates to the RNA-seq data of Zhao et al. (2015) who had examined DE between populations from Panama (low latitude) and Maine (high latitude) at two growth temperatures (21°C, 29°C). This analysis revealed significant overlaps between the effects of karyotype plus geography in our data and differentially expressed genes (DEG) identified by Zhao et al. (2015) (SuperExactTest; p < 0.05; supplementary table S6; also see supplementary table S3, Supplementary Material online); overlaps between DEG found by Zhao et al. (2015) and the effects of karyotype in our data were marginally non-significant. These results, together with the analyses of Zhao et al. (2015), suggest that In(3R)Payne makes a major contribution to latitudinal differentiation in gene expression patterns.
We also found pervasive interactions between inversion karyotype and growth temperature: temperature had a major influence on both the magnitude of DE and the number of DEG between karyotypes (supplementary table S5, Supplementary Material online). While 648 genes were differentially expressed between inverted and standard arrangement females that had developed at 18°C, we did not find any gene exhibiting significant DE between karyotypes at 25°C (fig. 8A; supplementary table S5, Supplementary Material online). This suggests that variants associated with the inverted arrangement might be more sensitive to lower temperatures, maybe due to a loss of buffering or because of increased compensatory plasticity at 18°C (cf. Huang et al. 2022), lending further support to the role of 3R Payne in climate adaptation (also see Pool et al. 2016).
Similar to other D. melanogaster inversions (In(2L)t, In(3R)Mo, In(3R)K; see Lavington and Kern 2017; Said et al. 2018), In(3R)Payne karyotype affected DE across the entire genome (at 18°C; see fig. 8B; supplementary table S5, Supplementary Material online). These ‘non-local’ effects on DE suggest that the 3R Payne inversion exerts major trans-acting regulatory effects (cf. Said et al. 2018), which is also consistent with the significant enrichment of DEG for gene ontology (GO) terms related to regulation of gene expression (fig. 8C). Despite these genome-wide transcriptional effects, DEG were enriched within the region spanned by the inversion (108 and 540 genes in- and outside 3R Payne, respectively; Fisher’s exact test (FET), p < 0.001). By contrast, we failed to find enrichment for effects of temperature (459 and 5382 genes in- and outside 3R Payne, respectively; FET, p = 0.75). Beyond DEG involved in regulating expression, GO analysis revealed that the inversion polymorphism also affects the expression of genes involved in growth, development, and reproduction (fig. 8C), as might be expected given the multifarious effects of 3R Payne on fitness traits such as body size, survival upon starvation, cold shock survival and lifespan (Rako et al. 2006; Kapun et al. 2016b; Durmaz et al. 2018).
Since inversions can have a large impact on the expression of genes in the breakpoint regions (Lavington and Kern 2017; Said et al. 2018), we also asked whether the 108 DEG within the inverted region might be enriched in the breakpoint regions (breakpoints plus a region of up to ± 2 Mb proximal and distal to each breakpoint): there was no evidence for an uneven distribution of DEG as compared to expectations based on non-candidate genes (FET; p = 0.83). Given that 3R Payne affects DE inside the inversion body as well as across the entire genome, variants in the breakpoints cannot fully account for the transcriptional effects of the inversion. These results agree well with the conjecture that inversions such as In(3R)Payne affect gene expression as a consequence of linked allelic variation maintained by selection for suppressed recombination (see Said et al. 2018).
To identify links between allelic variation and DEG with respect to karyotype, we compared genomic and transcriptomic candidates (supplementary table S3, supplementary table S7, Supplementary Material online). We first quantified the amount of overlap between karyotypic DE at 18°C (FIFS18 = Florida inverted vs. Florida standard reared at 18°C) and gene-wise FST without applying significance thresholds since arbitrary thresholds might constrain the ability to identify overlaps. Using rank-rank hypergeometric overlap tests (RRHO; Cahill et al. 2018) applied to all genes ranked either by FST or by DE, we found that only genes with high FST values exhibited highly significant overlap with strongly differentially expressed genes (fig. 9A). This analysis identified a core set of 86 overlapping genes (see top right corner of fig. 9A) which are all located within In(3R)Payne or in close proximity to it (fig. 8B; supplementary table S3, supplementary table S7; Supplementary Material online). Similar results were obtained when repeating the analysis with the data based on DE between karyotypes irrespective of rearing temperature (FIFS = Florida inverted vs. Florida standard; see supplementary fig. S9, Supplementary Material online). This provides further evidence that allelic variation inside the genomic region spanned by the inversion has a major functional impact on patterns of gene expression (cf. Said et al. 2018).
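The threshold-free logic of a rank-rank hypergeometric overlap analysis can be illustrated with a minimal Python sketch. This is a simplified illustration of the general idea only, not the implementation of Cahill et al. (2018) and not the code used in this study; the function names (`hypergeom_tail`, `rrho_map`) and the six toy genes are hypothetical. For every pair of rank cut-offs (i, j), the overlap between the top i genes of one ranking and the top j genes of the other is scored with a one-sided hypergeometric tail probability.

```python
from math import comb, log10


def hypergeom_tail(k, n_a, n_b, n_total):
    """P(X >= k) for the overlap X of two random gene sets of sizes n_a and n_b."""
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, i) * comb(n_total - n_a, n_b - i) for i in range(k, upper + 1))
    return tail / comb(n_total, n_b)


def rrho_map(ranked_a, ranked_b, step=1):
    """Return {(i, j): -log10(p)} for the overlap between the top i genes of
    ranked_a and the top j genes of ranked_b (both ordered strongest first)."""
    if set(ranked_a) != set(ranked_b):
        raise ValueError("both rankings must contain the same genes")
    n = len(ranked_a)
    scores = {}
    for i in range(step, n + 1, step):
        top_a = set(ranked_a[:i])
        for j in range(step, n + 1, step):
            k = len(top_a.intersection(ranked_b[:j]))
            p = hypergeom_tail(k, i, j, n)
            scores[(i, j)] = -log10(p) if p > 0 else float("inf")
    return scores


# Toy example with six hypothetical genes, ranked by two hypothetical statistics.
by_fst = ["g1", "g2", "g3", "g4", "g5", "g6"]
by_de = ["g2", "g1", "g3", "g6", "g5", "g4"]
scores = rrho_map(by_fst, by_de)
best = max(scores, key=scores.get)  # cut-off pair with the strongest overlap signal
```

In a real analysis the grid of cut-offs would typically be coarser (a larger `step`), and the resulting map of scores corresponds conceptually to a heat map of the kind shown in fig. 9A.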
We subsequently focused on 470 candidate SNPs on 3R with FST ≥ 0.9 between inverted and standard arrangements in Florida and 317 transcriptomic candidates with significant DE between karyotypes at 18°C (FIFS18). Comparison of these two groups of candidates revealed a significant overlap, comprising 74 genes (SuperExactTest; expected overlap: 37 genes; p < 0.001; fig. 9B; supplementary table S3, supplementary table S7, Supplementary Material online). Similarly, when considering 55 candidates exhibiting DE with respect to karyotype irrespective of developmental temperature (FIFS), we found a significant overlap of 19 genes (SuperExactTest; expected overlap: 4 genes; p < 0.001; supplementary fig. S9; supplementary table S3, supplementary table S7, Supplementary Material online). Although neither the 74 nor the 19 overlapping candidate loci were enriched for GO terms, several of them have well-known biological functions (supplementary table S3, Supplementary Material online; also see gene information on FlyBase at http://flybase.org/).
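For two gene sets, an overlap test of this kind can be understood as a hypergeometric tail probability: the expected overlap is the product of the two set sizes divided by the size of the shared background, and the p-value is the probability of observing at least as many shared genes by chance. The sketch below illustrates this reasoning; it is not the SuperExactTest code itself, the function names are hypothetical, and the numbers in the example are invented for illustration because the background set sizes are not stated in the text above.

```python
from math import comb


def expected_overlap(n_a, n_b, n_universe):
    """Mean overlap of two random gene sets drawn from a shared background."""
    return n_a * n_b / n_universe


def overlap_p_value(observed, n_a, n_b, n_universe):
    """One-sided P(overlap >= observed) under the hypergeometric null."""
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_universe - n_a, n_b - k)
        for k in range(observed, upper + 1)
    )
    return tail / comb(n_universe, n_b)


# Hypothetical numbers for illustration only (not taken from the study).
background = 1000
genomic, transcriptomic, shared = 100, 80, 20
print(expected_overlap(genomic, transcriptomic, background))  # 8.0
print(overlap_p_value(shared, genomic, transcriptomic, background))
```

Exact integer arithmetic via `math.comb` avoids rounding problems in the tail sum; for genome-scale backgrounds a statistics library would usually be faster.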
A comprehensive database of In(3R)Payne-associated candidate loci, based on our genomic, transcriptomic and overlap analyses, is provided in supplementary table S3 (Supplementary Material online). In addition to listing many novel candidates, this dataset contains and corroborates numerous genes previously associated with latitudinal clinality and/or with In(3R)Payne (Hoffmann and Weeks 2007; Kolaczkowski et al. 2011; Fabian et al. 2012; Chen et al. 2012; Zhao et al. 2015; Kapun et al. 2016a, 2016b, 2020). These candidates include several loci with established mutant effects on fitness-related traits (e.g., size, reproduction, lifespan, stress resistance; cf. Kapun et al. 2016b; Durmaz et al. 2018; Kapun and Flatt 2019). Our database of candidates associated with 3R Payne thus provides rich grounds for future work aiming to dissect the genetic architecture of this balanced inversion polymorphism.
Conclusions
Here we have sought to refine our understanding of the adaptive nature of a common cosmopolitan chromosomal inversion polymorphism in D. melanogaster, In(3R)Payne, on four continents: in its ancestral African range and in derived populations in Europe, North America and Australia. Based on our population genomic and transcriptomic analyses, we offer the following conclusions and conjectures:
(1) Our data confirm that the 3R Payne polymorphism is monophyletic, consistent with a single mutational origin in Africa at least ~129 kya (see Corbett-Detig and Hartl 2012). Despite some genetic (geographical) divergence both within inverted and standard chromosomes among continents, inverted arrangements always cluster together, independently of their geographical provenance, and the same is true for standard arrangements.
(2) Phylogenetic analysis and patterns of divergence and LD support a scenario whereby differentiation between inverted and standard chromosomes worldwide is due to ancestral variants that differentiate the two karyotypes. This interpretation is supported by (i) significant sharing among continents of loci that are strongly differentiated between the karyotypes, both in the breakpoint regions and the center of the inversion, and (ii) an absence of pronounced genetic divergence among inverted chromosomes from different continents. Analyses, within the African population sample, of inversion-specific alleles that are nearly or completely fixed in non-African populations suggest that the inversion captured adaptive alleles in its ancestral range before it migrated out of Africa and became cosmopolitan.
(3) Patterns of nucleotide variability, genetic divergence and LD are consistent with (potentially long-term) balancing selection maintaining the inversion polymorphism (cf. Charlesworth 2023), but the exact type of balancing selection remains to be elucidated. Given its intermediate frequency in low-latitude populations and its absence in high-latitude locales around the world, 3R Payne appears to have spread out of its ancestral tropical range and then become assorted by spatially varying selection in a parallel manner, causing the formation of similar clines on several continents (Kapun and Flatt 2019). This scenario is consistent with theoretical expectations suggesting that inversion frequencies can be maintained by balancing selection at local equilibria that change clinally (Faria et al. 2019); this could promote inversion polymorphism across large geographical scales and lead to parallel, stable large-scale clines (Westram et al. 2022).
(4) Our results and previous work (cf. Kapun and Flatt 2019) suggest that 3R Payne is involved in parallel or ‘global’ (species-wide; cf. Booker et al. 2021) adaptation to similar latitudinal gradients around the world. It is noteworthy in this context that in Ethiopia 3R Payne occurs at much lower frequency in a cold high-altitude locale as compared to a warm low-altitude habitat (Pool et al. 2017). Similarly, Aulard et al. (2002) found negative (albeit non-significant) correlations between In(3R)Payne frequency and altitude in African populations. Because 3R Payne typically does not fix under warm conditions but appears to be selected against under cool conditions, it is an intriguing possibility that the loci captured by the inversion are subject to some form of balancing (e.g., negative frequency-dependent) selection independent of temperature yet happen to render it less tolerant to cool temperatures.
(5) RNA-seq analyses of inverted and standard chromosomes in a sample from North America (Florida) reveal pronounced effects of inversion karyotype on gene expression that depend on developmental temperature: expression levels are higher for inverted chromosomes at low temperature, perhaps due to a loss of buffering or compensatory plasticity (cf. Huang et al. 2022) and consistent with the notion that 3R Payne is susceptible to cool conditions (see above; cf. Kapun et al. 2016a; Pool et al. 2017; Kapun and Flatt 2019).
(6) Although the inversion body is enriched for differentially expressed genes, the 3R Payne inversion has pervasive genome-wide effects on gene expression, consistent with trans-acting regulatory effects. Functional effects of this inversion are thus unlikely to be explained by lesions at the breakpoints alone. Together with analyses of divergence and LD, our results support the idea that 3R Payne maintains non-random associations among adaptive loci (Said et al. 2018). Yet, whether the linked loci are subject to epistatic balancing selection or to another selective mechanism is an open question. Likewise, the precise identity of the adaptive loci associated with the inversion remains unknown; our database of candidate loci might serve as a fruitful starting point for addressing this important question in future work.
Materials and Methods
Fly strains and their maintenance
To investigate karyotype-specific patterns of genetic variation and differentiation between karyotypes we established isofemale lines from populations in North America (Homestead, Florida, and Bowdoin, Maine, collected by Paul Schmidt) (see supplementary table S1, Supplementary Material online). Lines were maintained under standard laboratory conditions (25°C; 60% relative humidity, 12h:12h light:dark cycle). We karyotyped these lines for six major chromosomal inversion polymorphisms (In(2L)t, In(2R)NS, In(3L)P, In(3R)C, In(3R)Mo, In(3R)Payne; see Lemeunier and Aulard 1992) using codominant PCR markers following the approach of Corbett-Detig et al. (2012). By combining PCR-based karyotyping and experimental crosses we generated lines that were isochromosomal for the third chromosome as described in Kapun et al. (2016b). In brief, we crossed wild-type males to females carrying a compound (second and third chromosome) balancer (SM6B; TM6B; Bloomington Drosophila Stock Center [BDSC] stock #5687) in an ebony (e1) mutant background. F1 offspring, which were heterozygous for the wild-type chromosome and the balancer, were visually selected based on the dominant tubby (Tb1) marker mutant phenotype and backcrossed to the balancer strain to amplify the wild-type chromosomes. Using PCR markers (Matzkin et al. 2005; Corbett-Detig et al. 2012) we determined the karyotype status of successfully isolated lines with respect to In(3R)Payne (see Kapun et al. 2016b for details). Whenever possible we selected against the balancer to establish isochromosomal lines. However, for a subset of standard and inverted chromosomes we failed to obtain homokaryons, likely due to recessive homozygous lethal alleles segregating among these chromosomes; this is not surprising given that typically ~30-55% of wild third chromosomes are homozygous lethal (Simmons and Crow 1977; Mukai and Nagano 1983; our unpublished observations). In these cases, isolated wild-type chromosomes were propagated as heterozygotes over the balancer chromosome. During the propagation of the compound balancer line used for isolating wild third chromosomes, we occasionally observed that the visual marker of the second chromosome balancer (SM6B; DuoxCy) segregated independently of the visual marker of the third chromosome balancer (TM6B; Tb1). Since we had not consistently selected for both visual markers during the isolation process, we could not rule out that the wild-type second chromosomes might occasionally have recombined with those of the lab strain. We therefore excluded sequencing data from all chromosomal arms other than 3L and 3R from downstream genomic analyses.
Sample preparation for DNA re-sequencing
We generated single-individual (phased) sequencing data using a subset of North American lines from Florida and Maine (see supplementary table S1, Supplementary Material online) to investigate patterns of linkage disequilibrium (LD) and haplotype structure with respect to karyotype and geographic origin. To obtain phased haploid sequencing data, we employed a ‘hemiclone’ approach, as described in Kapun et al. (2014) (supplementary fig. S10; Supplementary Material online). To this end, we crossed females from isochromosomal lines, or from strains with wild third chromosomes maintained over the balancer, to males of a highly inbred, inversion-free isofemale reference strain from Nigeria (line NG9 from the Drosophila Population Genomics Project [DPGP]; Lack et al. 2015). For each cross, we sequenced a single F1 hemiclonal male (supplementary fig. S10; Supplementary Material online). To bioinformatically discriminate between wild-type alleles and alleles segregating in the paternal NG9 reference (i.e., ‘bioinformatic phasing of alleles’) we pool-sequenced all reference strain males used for the crosses as a single pool (also see below).
DNA extraction and library preparation
For each of the DNA libraries, we jointly homogenized whole tissue by bead beating (Zirconia beads; 1.2 mm; 2 x 30 min at 6500 rpm) and incubated the homogenate in lysis buffer (100 mM Tris-Cl, 100 mM EDTA, 1% SDS, 1 mg/ml Proteinase K) for 30 min at 56°C and 30 min at 70°C. The lysate was treated with RNAse A (3 mg per 250 μl aliquot) at 37°C for 30 min prior to adding 39 μl of 8 M potassium acetate, followed by another incubation step for 30 min on ice to precipitate protein. After centrifugation at 14,000 rpm for 15 minutes, we mixed the supernatant with one volume of phenol-chloroform-isoamyl alcohol (ratio 25:24:1). The aqueous phase was further washed with 0.75 volumes of chloroform prior to precipitation of DNA by adding 3 volumes of ice-cold 100% ethanol. After incubation at 4°C for 2h, followed by centrifugation at 14,000 rpm for 15 minutes, we washed the pellet with 70% ethanol, dried it at room temperature and then resuspended the DNA in 50 μl of TE buffer. gDNA libraries of each sample were sheared using a Covaris instrument (Duty cycle 10%, intensity 5, cycles/burst 200, time 50s) and prepared for paired-end sequencing at the Lausanne Genomic Technologies Facility (GTF), using the Illumina TruSeq Nano Library preparation kit (Illumina, San Diego, USA). Samples were sequenced on a HiSeq 2000 Illumina Sequencer to generate 100 bp paired-end reads.
Differential gene expression assays with RNA-seq
Given that In(3R)Payne is involved in climate adaptation (e.g., Kapun et al. 2016a, 2020; Kapun and Flatt 2019), and given that some of the inversion’s phenotypic effects depend on growth temperature (Durmaz et al. 2018), we sought to examine the effects of 3R Payne karyotype and/or of developmental temperature on gene expression and to identify potential candidate transcripts / genes associated with In(3R)Payne. To do so, we performed RNA-seq assays on isochromosomal lines carrying either the inverted or standard arrangement from Florida, where the inversion is polymorphic, and on lines carrying the standard arrangement from Maine, where the inversion is absent (see Kapun et al. 2016b for details of sampling locations; see Durmaz et al. 2018 for details of isochromosomal lines; also see supplementary table S1, Supplementary Material online). Each of these three groups was replicated 9-fold and exposed to two developmental temperatures (18°C, 25°C; see below) to account for interactions between genotype (karyotype) and environment (temperature). Prior to sampling for transcriptomic assays, flies were kept under common garden conditions (~21°C, ~50% relative humidity; 10h:14h L:D) for three generations. The experimental generation was reared at two growth temperatures during development until 5-7 days of adulthood (18°C vs. 25°C, 12h:12h L:D, 60% relative humidity, on a cornmeal/sugar/yeast/agar diet). Total RNA was extracted from 5-7 day-old snap-frozen adult females from each isochromosomal line, with each sample being prepared from 5 individuals (3 groups x 2 temperatures x 9 isochromosomal lines = 54 samples) using the MagMAX-96 Total RNA Isolation Kit (ThermoFisher Scientific, Waltham, MA, USA) on a MagMAX Express Magnetic Particle Processor (ThermoFisher Scientific, Waltham, MA, USA), following the manufacturer’s protocol. Prior to library preparation RNA quality was measured using Fragment Analyzer (Advanced Analytical) analysis. Single-end 101 bp long reads were sequenced on an Illumina HiSeq 2000 sequencer, following library preparation using the TruSeq stranded library preparation kit. For details of bioinformatic analyses of these RNA-seq data see below.
Bioinformatics pipeline for genomic analyses
The bioinformatics pipeline used for our population genomic analyses (see below for details), including scripts, is available here: https://github.com/capoony/In3RPayne_PopGenomics.
Mapping pipeline
FASTQ reads from DNA and RNA sequencing data were examined for sequencing quality with FASTQC (v.0.10.1; http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and then trimmed and filtered with cutadapt (v.1.8.3; Martin 2011) to remove sequencing adapters and low-quality bases (retaining bases with quality ≥ 18 and reads with length ≥ 75 bp). For DNA sequencing data, we only retained read pairs for which both reads fulfilled our quality criteria after trimming; these were mapped with bbmap (v.0.7.15; Li 2013) using default parameters against a compound reference genome consisting of the genomes of D. melanogaster (v.6.12) and common pro- and eukaryotic symbionts of Drosophila, including Saccharomyces cerevisiae (GCF_000146045.2), Wolbachia pipientis (NC_002978.6), Pseudomonas entomophila (NC_008027.1), Commensalibacter intestini (NZ_AGFR00000000.1), Acetobacter pomorum (NZ_AEUP00000000.1), Gluconobacter morbifer (NZ_AGQV00000000.1), Providencia burhodogranariea (NZ_AKKL00000000.1), Providencia alcalifaciens (NZ_AKKM01000049.1), Providencia rettgeri (NZ_AJSB00000000.1), Enterococcus faecalis (NC_004668.1), Lactobacillus brevis (NC_008497.1), and Lactobacillus plantarum (NC_004567.2), to avoid paralogous mapping. We filtered mapped reads for a mapping quality ≥ 20, used Picard (v.2.17.6; http://picard.sourceforge.net) to remove duplicate reads, and re-aligned sequences flanking insertions-deletions (indels) with the Genome Analysis Toolkit, GATK (v3.4-46; McKenna et al. 2010).
Variant calling in DNA sequencing data
We combined mapped reads in BAM file format from each of the sequenced F1 hybrid individuals and from the sequenced pool of sires into a single mpileup file using samtools mpileup (v.1.3; Li et al. 2009) with per-base alignment quality (BAQ) computation disabled (parameter -B). Next, we reconstructed the identity of the maternal wild-type allelic state (‘bioinformatic phasing of alleles’) by contrasting polymorphisms present in the F1 individuals with the reference alleles from the sires based on the bioinformatics pipeline described in Kapun et al. (2014). We only considered positions that were homozygous in the reference pool (minor allele frequency < 10%) and retained wild-type alleles with a minimum count of 10 across all sequenced F1 individuals. To avoid false positives, we excluded alleles whose counts fell outside the limits of a 90% binomial confidence interval based on an expected frequency of 50% at a heterozygous site in a given diploid F1 library. We further excluded positions with either (1) coverage < 15, to reduce false negatives due to large sampling errors, or (2) coverage above the 95th coverage percentile for the corresponding sample and chromosome, to avoid false positives due to paralogous mapping. For positions with more than one wild-type allele, we only considered the most frequent allele.
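The per-site logic of these filters can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the pipeline of Kapun et al. (2014): the function names (`binomial_interval`, `phase_maternal_allele`) are hypothetical, the interval is taken as the central quantile-based range of counts, the total read depth at the site is used as the number of binomial trials, and the paternal allele is assumed to be already known from the sire pool.

```python
from math import comb


def binomial_interval(n, p=0.5, level=0.90):
    """Central interval (lo, hi) of counts for Binomial(n, p), based on quantiles.
    Intended for moderate n (read depths); very large n would overflow floats."""
    alpha = 1.0 - level
    cdf, lo, hi = 0.0, None, n
    for k in range(n + 1):
        cdf += comb(n, k) * p**k * (1 - p) ** (n - k)
        if lo is None and cdf >= alpha / 2:
            lo = k  # lower quantile
        if cdf >= 1 - alpha / 2:
            hi = k  # upper quantile
            break
    return lo, hi


def phase_maternal_allele(f1_counts, paternal_allele, min_cov=15, max_cov=None):
    """Infer the maternal (wild-type) allele at one site of one F1 library.

    f1_counts: dict base -> read count in the F1 individual.
    paternal_allele: base fixed in the sequenced pool of sires.
    Returns the inferred base, or None if the site fails a filter.
    """
    depth = sum(f1_counts.values())
    if depth < min_cov or (max_cov is not None and depth > max_cov):
        return None  # coverage too low or suspiciously high
    others = {b: c for b, c in f1_counts.items() if b != paternal_allele and c > 0}
    if not others:
        return paternal_allele  # F1 appears homozygous: maternal equals paternal
    base, count = max(others.items(), key=lambda item: item[1])  # most frequent
    lo, hi = binomial_interval(depth)
    return base if lo <= count <= hi else None  # reject counts far from 50%
```

For a site covered by 20 reads, `binomial_interval(20)` returns `(6, 14)`, so a non-paternal allele supported by only 2 reads would be rejected as a likely sequencing error, whereas one supported by 10 reads would be accepted as the maternal allele. The `max_cov` argument stands in for the sample- and chromosome-specific 95th coverage percentile, which would have to be computed separately.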
Sequencing data availability
All newly generated sequencing data are available under NCBI BioProject ID PRJNA928565 (http://www.ncbi.nlm.nih.gov/bioproject/928565).
Additional sequencing data from other continents
To complement the above-mentioned sequencing data, we used previously published sequencing data from D. melanogaster lines from Africa, Europe and Australia, all with known inversion karyotype (see supplementary table S1, Supplementary Material online):
(i) African data. The African strains were collected in Siavonga (Zambia) and sequenced as haploid embryos to obtain fully phased sequences; they were bioinformatically karyotyped for various inversions as part of the DPGP (Drosophila Population Genomics Project) resource (see Lack et al. 2015, 2016). We focused on 21 lines that are known to carry In(3R)Payne and randomly selected an equal number of strains with standard arrangement third chromosomes. Consensus sequence files were downloaded from the DGN (Drosophila Genome Nexus) website (http://www.johnpool.net/genomes.html), filtered for polymorphic sites and merged into a single VCF file using custom-made software. Genomic coordinates from D. melanogaster reference v.5 were converted to v.6.
(ii) European data. We used phased sequencing data from wild-type strains collected in Póvoa de Varzim in Portugal (Kapun et al. 2014; Franssen et al. 2015). In addition to 7 strains carrying In(3R)Payne, we randomly picked an equal number of strains with standard arrangement on the third chromosome and obtained the genomic data from Dryad (http://doi.org/10.5061/dryad.403b2). In addition to this European low-latitude sample from Portugal, we integrated into our analyses phased sequencing data from 14 non-inverted strains from a high-latitude population in Umeå (Sweden), which were sequenced as haploid embryos (Kapopoulou et al. 2020), similar to the African samples mentioned above.
(iii) Australian data. Sequence data for the Australian continent were obtained from Rane et al. (2015), who had investigated population samples that approximate the endpoints of the latitudinal cline along the Australian east coast and sequenced these samples with reduced-representation RAD-tag sequencing. All 55 strains from Innisfail (tropical Queensland) (19 strains carrying 3R Payne plus 18 carrying the standard arrangement) and Yering Station (temperate Victoria) (18 standard lines) had been screened by the authors for 3R Payne using PCR markers (Rane et al. 2015). We obtained a VCF file containing high-confidence SNPs for all lines from Dryad (https://doi.org/10.5061/dryad.5q0m8) and converted genomic coordinates from D. melanogaster reference v.5 to v.6 prior to downstream analyses. Note that this dataset does not include genomic information for the first 12 million bp on chromosome arm 3R.
Re-analysis of DNA sequences of Australian isochromosomal lines
Rane et al. (2015) reported patterns of genetic differentiation and LD with respect to In(3R)Payne in Australia that deviate from observations based on single-individual sequencing data of African flies (Corbett-Detig and Hartl 2012) and pool-seq data from North American flies (Kapun et al. 2016a). In contrast to the studies of Corbett-Detig and Hartl (2012) and Kapun et al. (2016a), Rane et al. (2015) found that Australian flies from tropical Queensland (where the polymorphism is segregating) do not exhibit elevated genetic differentiation between inverted and standard karyotypes within the genomic region spanned by In(3R)Payne. Yet, these authors found a pattern of strong, highly localized divergence between flies from Queensland and temperate flies from Victoria (where the inversion is very rare or absent), irrespective of In(3R)Payne karyotype. To explore why the patterns in the Australian data might differ from those observed in Africa and North America, we compared the 55 Australian libraries (Queensland: 19 inverted karyotypes, 18 standard karyotypes; Victoria: 18 standard karyotypes) to high-confidence sequencing data from 42 isogenic lines from Siavonga (Zambia, Africa; see above) which had previously been characterized for the presence or absence of In(3R)Payne (Lack et al. 2015, 2016). Our goal was to use these data to determine whether Australian and/or African lines cluster according to their In(3R)Payne karyotype and/or their geographic origin. Because 3R Payne is of monophyletic African origin (Corbett-Detig and Hartl 2012, and analyses herein), we expected to find marked clustering of inverted chromosomes of African and Australian origin. Since gene flux due to double crossovers is strongly suppressed or absent between the karyotypes in the breakpoint regions, we focused on SNPs that were polymorphic in both Africa and Australia and located within 200,000 bp around the proximal and distal breakpoints (240 and 262 SNPs, respectively). We used custom-made software to combine and convert the allelic data from African and Australian lines to the NEXUS file format and calculated unrooted phylogenetic networks based on the Neighbor-Net inference method (see below). Since the karyotype-specific clustering of African strains and Australian lines from Queensland was inconsistent when using the karyotype classification of Rane et al. (2015) (supplementary fig. S4, Supplementary Material online), we used a panel of highly diagnostic, experimentally validated marker SNPs for In(3R)Payne (see Kapun et al. 2014 for details; also cf. Kapun et al. 2016a, 2020) to bioinformatically determine the karyotype status of the sequenced lines. Four of the 19 inversion-specific SNPs had sufficient coverage in the RAD-Seq data of most lines reported by Rane et al. (2015), thus allowing us to re-classify the karyotypes of the Australian lines. Notably, our new classification of karyotypes was highly consistent with the results of the clustering analysis of African samples. We therefore decided to use this new karyotype classification for all downstream analyses of the Australian data. Our analysis using inversion-specific marker SNPs also indicated that several Australian lines were not fixed for either the inverted or standard karyotype but appeared to be heterokaryotypic (supplementary table S8, Supplementary Material online). We thus excluded all apparently heterokaryotypic and/or ambiguous strains and only retained unambiguous homokaryotypes for downstream analyses.
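The marker-based re-classification described above can be illustrated with a short Python sketch. It is a hypothetical simplification, not the software used in this study: the marker positions and alleles below are invented placeholders, the function name `classify_karyotype` and the `min_markers` threshold are assumptions, and the decision rules (any marker showing both alleles indicates a heterokaryotype; conflicting or too few markers are treated as ambiguous) are one plausible way to formalize the text.

```python
def classify_karyotype(genotypes, markers, min_markers=2):
    """Classify one line from diagnostic marker SNPs.

    genotypes: dict position -> set of alleles observed in the line
               (positions without coverage are simply absent).
    markers:   dict position -> (inverted_allele, standard_allele).
    """
    inv = std = het = 0
    for pos, (inv_allele, std_allele) in markers.items():
        alleles = genotypes.get(pos)
        if not alleles:
            continue  # marker not covered in this line
        has_inv = inv_allele in alleles
        has_std = std_allele in alleles
        if has_inv and has_std:
            het += 1
        elif has_inv:
            inv += 1
        elif has_std:
            std += 1
    if inv + std + het < min_markers:
        return "ambiguous"  # too few informative markers
    if het:
        return "heterokaryotypic"
    if inv and std:
        return "ambiguous"  # markers disagree with each other
    return "inverted" if inv else "standard"


# Hypothetical marker panel (positions and alleles are placeholders).
MARKERS = {101: ("A", "G"), 202: ("T", "C"), 303: ("G", "A"), 404: ("C", "T")}
print(classify_karyotype({101: {"A"}, 202: {"T"}, 303: {"G"}}, MARKERS))  # inverted
```

Lines returned as "heterokaryotypic" or "ambiguous" would correspond to those excluded from downstream analyses, with only unambiguous homokaryotypes retained.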
Phylogenetic relationships among In(3R)Payne karyotypes
Phylogenetic relationships among a total of 450 D. melanogaster strains from Africa, Europe, North America and Australia were analyzed with respect to In(3R)Payne based on a compilation of sequencing data from the above-mentioned sources (see supplementary table S1, Supplementary Material online). First, to investigate phylogenetic relationships among strains within the region spanned by In(3R)Payne, we analyzed 3766 SNPs located within the breakpoints of In(3R)Payne. Secondly, we reconstructed genome-wide patterns of phylogenetic relationships among these samples independent of chromosomal inversions by focusing on 4849 SNPs that were randomly drawn from the left and right arms of the third chromosome at 200 kb distance from the breakpoints of In(3L)P and In(3R)Payne. Using custom-made software, we combined and converted allelic data from all lines to the NEXUS file format and calculated unrooted phylogenetic networks based on the Neighbor-Net inference method (Bryant and Moulton 2004) with SplitsTree (v.4.14.6; Huson 1998), using the Jukes-Cantor model for computing genetic distances. Importantly, unlike the Neighbor-Joining (NJ) method, Neighbor-Net can represent conflicting signals in the data, e.g., due to recombination (Bryant and Moulton 2004).
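For readers unfamiliar with the Jukes-Cantor model, the distance between two aligned sequences follows from the proportion p of differing sites as d = -(3/4) ln(1 - 4p/3). The following Python sketch shows this calculation for two strings of SNP alleles. It is an illustration only and does not reproduce the internals of SplitsTree; in particular, the function name and the handling of missing data (sites coded as "N" are skipped pairwise) are assumptions.

```python
from math import log


def jukes_cantor_distance(seq_a, seq_b, missing="N"):
    """Jukes-Cantor distance between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for a, b in zip(seq_a, seq_b):
        if a == missing or b == missing:
            continue  # skip sites with missing data in either sequence
        compared += 1
        diffs += a != b
    if compared == 0:
        raise ValueError("no sites available for comparison")
    p = diffs / compared  # observed proportion of differing sites
    if p >= 0.75:
        return float("inf")  # distance is undefined at saturation
    return -0.75 * log(1 - 4 * p / 3)


print(jukes_cantor_distance("AAAAAAAAAA", "AAAAAAAAAT"))
```

The correction matters most for divergent sequences: for small p the distance is close to p itself, whereas it grows without bound as p approaches 0.75.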
Population genetic analyses
Analysis of nucleotide diversity and Tajima’s D
5
We quantified genetic variation using the software packages vcftools (v.0.1.16) to obtain SNP-
6
wise estimates of nucleotide diversity π and Tajima’s D in samples with phased sequencing
7
data. Because vcftools provides window-wise estimates based on total window size but does
8
not account for positions in a given window that do not fulfill the same quality criteria as the
9
polymorphic sites, we obtained average values in 100 kb non-overlapping windows using
10
custom-made software. We first generated mask files in which positions at which more than 50%
of individuals did not fulfil the heuristic quality criteria (based on minimum and maximum coverage,
as defined above) were flagged with a ‘0’, whereas all remaining positions were flagged
with a ‘1’. We then calculated window-wise averages of π and Tajima’s D separately for inverted
vs. standard chromosomes using the information in the mask files for population samples with
phased sequencing data. To test for differences in genetic variation with respect to (1)
geography, (2) karyotype and (3) genomic region we analyzed samples from Siavonga (Zambia,
Africa), Póvoa de Varzim (Portugal, Europe), Homestead (Florida, USA) and Innisfail
(Queensland, Australia). We considered all window-wise averages of π and Tajima’s D between
positions 3R:16,432,209 and 3R:24,744,010 as being located ‘inside’ In(3R)Payne. To define a
representative ‘control’ region ‘outside’ of In(3R)Payne we chose a random sample of equal
size, composed of average estimates of π from 3L and 3R located outside the interval ranging
from 3R:14,232,209 to 3R:26,744,010. To account for potential long-range effects of
In(3R)Payne, we extended the actual length of the inversion by 2 Mb on both ends. Using R we
performed a three-way analysis of variance (ANOVA) of the form y_i = O + K + G + O × K + O × G
+ G × K + O × G × K + ε_i, where y_i is the continuous dependent variable (π or Tajima’s D) in the ith
sample, O denotes the categorical factor ‘origin’ with four levels (Africa, Europe, North America,
Australia), K represents the factor ‘karyotype’ with two levels (inverted, non-inverted) and G
stands for the factor ‘genomic region’ with two levels (inside vs. outside of the inversion), followed
by all possible interactions, and where ε_i represents the error term. Based on the coefficients
estimated from this model, we calculated planned contrasts using the R package emmeans to
test for significant differences in genetic variation between karyotypes inside the inverted region.
To search for a potential signature of balancing selection (as indicated by positive D values) we
also estimated Tajima’s D for pooled samples of inverted and standard chromosomes (see
supplementary fig. S2, Supplementary Material online). We note that estimating D for pools
consisting of equal numbers of inverted vs. standard chromosomes (50:50 ratio; see
supplementary fig. S2, Supplementary Material online) vs. estimating D using pools with the
numbers of inverted vs. standard chromosomes being proportional to their population
frequencies (not shown) did not make a qualitative difference.
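
The mask-based calculation described above can be illustrated with a minimal Python sketch. It is not the custom-made software used in this study: the function names, the 0/1 allele coding and the toy data are assumptions made for illustration, and Tajima's D follows the standard formula of Tajima (1989) for at least four chromosomes. Positions flagged '0' in the mask are skipped, so they contribute neither to π nor to the number of callable sites.

```python
from math import sqrt


def site_diversity(alleles):
    """Mean pairwise difference at one biallelic site (alleles coded 0/1)."""
    n = len(alleles)
    k = sum(alleles)
    return 2.0 * k * (n - k) / (n * (n - 1))


def tajimas_d(n, seg_sites, pi_sum):
    """Tajima's D from n chromosomes, S segregating sites and summed pi."""
    if seg_sites == 0 or n < 4:
        return None  # D is undefined (zero variance or no variation)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi_sum - seg_sites / a1) / sqrt(variance)


def window_stats(haplotypes, mask, start, end):
    """Per-site pi and Tajima's D for positions start..end-1 (0-based),
    using only positions flagged '1' in the mask string."""
    n = len(haplotypes)
    callable_sites = 0
    seg_sites = 0
    pi_sum = 0.0
    for pos in range(start, end):
        if mask[pos] != "1":
            continue  # position failed the coverage-based quality criteria
        callable_sites += 1
        diversity = site_diversity([hap[pos] for hap in haplotypes])
        if diversity > 0:
            seg_sites += 1
            pi_sum += diversity
    if callable_sites == 0:
        return None, None
    return pi_sum / callable_sites, tajimas_d(n, seg_sites, pi_sum)


# Toy data (hypothetical): four phased chromosomes, ten sites, position 4 masked.
haplotypes = [
    [0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 1, 0, 0, 0],
]
mask = "1111011111"
pi, d = window_stats(haplotypes, mask, 0, 10)
```

In this toy window both unmasked segregating sites are at intermediate frequency, so D is positive, which is the direction expected under balancing selection.
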
Analysis of linkage disequilibrium (LD)
We estimated LD within and among karyotypes from low-latitude populations in Siavonga
(Zambia, Africa), Póvoa de Varzim (Portugal, Europe), Homestead (Florida, USA) and Innisfail
(Queensland, Australia) for which phased sequencing data were available. Squared allele
frequency correlations (r²) (Hill and Robertson 1968) were calculated among pairs of 5000
randomly drawn SNPs on 3R and between all polymorphic SNPs and In(3R)Payne using
custom-made software as described in Kapun et al. (2014). Since the r² statistic can be affected
by large variance due to rare alleles, and because this might confound analyses of LD patterns
(Hedrick 1987), we restricted analyses to SNPs with minor allele frequencies ≥ 0.1. To compare
the decay of LD with physical distance we focused on 5000 SNPs located within the region
spanned by In(3R)Payne and restricted analyses to pairwise r² among SNPs within 100 kb
distance. Following the approach in Remington et al. (2001) and Marroni et al. (2011), we used
our LD estimates to fit the following equation from Hill and Weir (1988):
\[ E(r^2) = \left[ \frac{10 + C}{(2 + C)(11 + C)} \right] \left[ 1 + \frac{(3 + C)(12 + 12C + C^2)}{n(2 + C)(11 + C)} \right] \]
which allows modeling the expected reduction of r² with physical distance. Here, E(r²) is the
expected value of r², n represents the sample size, and C is the product of the
population-scaled recombination rate (ρ = 4N_e r) and the distance in base pairs, which we estimated using
nonlinear regression in R. For each population sample we first used the R function nls to
fit a single model by nonlinear least squares across both karyotypes. Subsequently, we fitted
the same model but additionally accounted for karyotype as a grouping factor using the function
nlsList in the R package nlme (Pinheiro et al. 2021). To infer significant variation in the decay of
LD as a function of In(3R)Payne karyotype we tested for differences in the goodness-of-fit of the
two nested models by analysis of deviance, using the function anova_nlslist in the R package
nlshelper (Duursma 2017).
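
To make the LD decay model concrete, the following Python sketch computes r² for a pair of phased biallelic SNPs and evaluates the Hill and Weir (1988) expectation in the form commonly used after Remington et al. (2001). It is an illustration under stated assumptions, not the analysis code of this study: the function names are hypothetical, alleles are coded 0/1, and the simple log-spaced grid search in `fit_rho` merely stands in for the nonlinear least-squares fit performed with nls in R.

```python
def r_squared(snp_a, snp_b):
    """r^2 between two biallelic SNPs given phased alleles coded 0/1."""
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    p_ab = sum(1 for a, b in zip(snp_a, snp_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None  # at least one SNP is monomorphic in this sample
    return (p_ab - p_a * p_b) ** 2 / denom


def expected_r2(distance_bp, rho, n):
    """Expected r^2 at a given distance; C = rho * distance, n = sample size."""
    c = rho * distance_bp
    decay = (10 + c) / ((2 + c) * (11 + c))
    correction = 1 + (3 + c) * (12 + 12 * c + c * c) / (n * (2 + c) * (11 + c))
    return decay * correction


def fit_rho(distances, r2_values, n, low=1e-6, high=1.0, steps=2000):
    """Least-squares estimate of rho on a log-spaced grid (stand-in for nls)."""
    best_rho, best_sse = None, float("inf")
    for i in range(steps):
        rho = low * (high / low) ** (i / (steps - 1))
        sse = sum((obs - expected_r2(dist, rho, n)) ** 2
                  for dist, obs in zip(distances, r2_values))
        if sse < best_sse:
            best_rho, best_sse = rho, sse
    return best_rho, best_sse


# Toy check (hypothetical values): recover a known rho from noise-free data.
distances = [100, 500, 1000, 5000, 10000, 50000, 100000]
true_rho, sample_size = 0.01, 40
simulated = [expected_r2(dist, true_rho, sample_size) for dist in distances]
estimated_rho, residual = fit_rho(distances, simulated, sample_size)
```

Fitting the function once to all pairs and once per karyotype, then comparing the residual sums of squares, mirrors the nested-model comparison described above.
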
Analysis of genetic differentiation
To quantify the amount of genetic differentiation between samples for single SNPs and for
averages in 100 kb non-overlapping windows, we estimated pairwise FST for every SNP based
on the method of Weir and Cockerham (1984), using vcftools (v.0.1.16). We first investigated
how In(3R)Payne affects genetic differentiation within and among populations. We focused on
population samples from the endpoints of latitudinal gradients in Europe, North America and
Australia for which the low-latitude populations harbored In(3R)Payne at appreciable
frequencies and for which the inversion was absent in high-latitude populations. Our LD
analyses mentioned above revealed elevated LD inside inverted karyotypes, indicating that
adjacent SNPs do not evolve independently. We thus compared average FST in 100 kb
non-overlapping windows within the inversion breakpoints (‘inside’) to a similarly sized ‘outside’ set
of average FST values that were randomly chosen from the third chromosome at least 2 Mb away
from the breakpoints of In(3L)P and In(3R)Payne, as defined above. For the regions defined as
‘inside’ and ‘outside’ the inversion, and for each continent separately, we tested for differences
in average pairwise FST values using the following comparisons as input data: (1) samples from
the same low-latitude population with different karyotypes (factor level: ‘Karyotype’; e.g., Florida
inverted vs. Florida standard [FIFS]), (2) samples with standard arrangement from different
populations at the endpoints of a given continental latitudinal gradient (factor level: ‘Geography’;
e.g., Florida standard vs. Maine standard [FSMS]) and (3) samples with different karyotypes
from different geographical populations within the same continent (factor level: ‘Karyotype +
Geography’; e.g., Florida inverted vs. Maine standard [FIMS]; i.e., pairwise FST estimates for
which the effects of karyotype and geography might be confounded). To analyze these data we
used one-way ANOVA of the form y_i = C + ε_i, where y_i represents pairwise FST in the ith genomic
window, C is the categorical factor ‘pairwise comparison’ with three levels (‘Karyotype’,
‘Geography’, ‘Karyotype + Geography’) and ε_i represents the error term. To determine which of
the three levels of C differ from each other we performed Tukey’s HSD post-hoc tests. The
above-mentioned between-karyotype FST estimates (i.e., estimates of FAT, Charlesworth et al.
1997) were obtained for pools consisting of equal numbers of inverted vs. standard
chromosomes (50:50 ratio); we note that estimating karyotypic divergence using pools with the
numbers of inverted vs. standard chromosomes being proportional to their population
frequencies did not make a qualitative difference (not shown). In addition to FST, we also
estimated genetic differentiation between karyotypes using KST (Hudson et al. 1992; also cf. Nei
1973 and Charlesworth 1998); these analyses yielded patterns qualitatively identical to those
for FST (results not shown). Finally, we also examined whether FST was elevated within the
breakpoints of In(3R)Payne when comparing inverted chromosomes between continents. We
focused on SNPs shared across populations and calculated FST in 100 kb non-overlapping
windows in all pairwise combinations across inverted chromosomes from Zambia, Portugal,
Florida (USA) and Queensland (Australia) using vcftools (v.0.1.16). FST values were plotted
against their genomic positions for all comparisons in R using the ggplot2 package.
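
The window-based comparison can be sketched as follows. This Python example averages per-SNP FST values in 100 kb non-overlapping windows and labels windows on 3R as ‘inside’ the inversion, ‘outside’ (more than 2 Mb beyond either breakpoint) or ‘excluded’. The breakpoint coordinates and the 2 Mb buffer are taken from the text; everything else (function names, the rule that an ‘inside’ window must lie fully between the breakpoints, the fixed random seed, the toy data) is an assumption for illustration. Windows on 3L would need the analogous treatment around In(3L)P, whose coordinates are not restated here, so the sketch leaves them unclassified.

```python
import math
import random
from collections import defaultdict

INV_START, INV_END = 16_432_209, 24_744_010  # In(3R)Payne breakpoints (from the text)
BUFFER = 2_000_000                           # 2 Mb buffer on both ends
WINDOW = 100_000                             # non-overlapping window size


def window_means(snps, window=WINDOW):
    """Average per-SNP FST per window; snps holds (chrom, pos, fst) tuples."""
    totals = defaultdict(lambda: [0.0, 0])
    for chrom, pos, fst in snps:
        if math.isnan(fst):
            continue  # skip sites without a defined estimate
        key = (chrom, (pos - 1) // window)
        totals[key][0] += fst
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def classify_window(chrom, index, window=WINDOW):
    """Label a window relative to In(3R)Payne (only 3R is handled here)."""
    if chrom != "3R":
        return "unclassified"
    start = index * window + 1
    end = (index + 1) * window
    if start >= INV_START and end <= INV_END:
        return "inside"
    if end < INV_START - BUFFER or start > INV_END + BUFFER:
        return "outside"
    return "excluded"


def matched_outside_sample(means, seed=1):
    """Return inside values and an equally sized random draw of outside values."""
    inside = [v for (c, i), v in means.items() if classify_window(c, i) == "inside"]
    outside = [v for (c, i), v in means.items() if classify_window(c, i) == "outside"]
    rng = random.Random(seed)
    return inside, rng.sample(outside, min(len(inside), len(outside)))


# Toy per-SNP estimates (hypothetical values).
snps = [
    ("3R", 16_500_001, 0.2),
    ("3R", 16_500_050, 0.4),
    ("3R", 1_000_000, 0.1),
    ("3R", 15_000_500, 0.3),
    ("3R", 2_000_000, float("nan")),
]
means = window_means(snps)
inside_values, outside_values = matched_outside_sample(means)
```

The two resulting lists correspond to the ‘inside’ and ‘outside’ sets whose window-wise averages enter the ANOVA.
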
Identification of candidate genes and SNPs associated with In(3R)Payne
To identify candidate genes and SNPs in the region spanned by In(3R)Payne, we focused
again on the samples from Siavonga (Zambia, Africa), Póvoa de Varzim (Portugal, Europe),
Homestead (Florida, USA) and Innisfail (Queensland, Australia) and isolated SNP positions
that exhibited FST ≥ 0.9 between In(3R)Payne inverted and standard individuals within a given
population; we considered a gene a candidate locus if at least one candidate SNP with FST ≥
0.9 was located inside the gene or within 2 kb of its 3’ or 5’ end, since these
regions harbor regulatory elements (Down et al. 2007; Nègre et al. 2011). Long genes have a
higher probability of harboring candidate SNPs by chance, and this might result in a bias towards
gene ontology (GO) classes that are enriched for long genes. To account for this potential bias,
we used Gowinda (Kofler and Schlötterer 2012) to test for overrepresentation of GO
terms associated with karyotype-specific SNP candidates. Gowinda first generates an empirical
null distribution of gene abundance in a given GO category based on a set of randomly chosen
SNPs of equal size as the candidate set; Gowinda then estimates the significance of
overrepresentation for each GO category and accounts for multiple testing by using
Benjamini-Hochberg (BH) correction of p-values (Benjamini and Hochberg 1995). Next, we examined the
extent to which candidate SNPs and genes are shared among continents. Using the R package
SuperExactTest (Wang et al. 2015), which allows assessing the significance of intersections
among multiple sets of similar data, we tested for overlaps between the sets of candidate SNPs
and genes. Since SuperExactTest computes expected intersections based on the size of the
statistical background population from which all sets are sampled, we only included SNPs that
were polymorphic in all four datasets. Because the Australian samples were sequenced with
RAD-Seq, only a limited number of SNPs could be recovered when comparing across all four
populations (fig. 7). We thus excluded the Australian samples and performed genome-wide
comparisons among the three remaining populations / continents. Based on the annotations
assigned to each SNP with SnpEff (Cingolani et al. 2012), we tested for enrichment of
candidates with Fisher’s exact test (FET) using a custom-made Python script. We focused on
the eight most common SNP categories (i.e., intergenic_region, upstream_gene_variant,
5_prime_UTR_variant, intron_variant, synonymous_variant, missense_variant,
3_prime_UTR_variant and downstream_gene_variant). To build contingency tables for
category-specific FETs, SNPs were classified as candidate vs. non-candidate and as belonging
to a given category or not. To account for multiple testing, p-values were Bonferroni-corrected.
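
The category-specific tests can be illustrated with a short Python sketch. It is not the custom-made script used here: the function names and the toy data are hypothetical, and the two-sided Fisher's exact test is implemented directly from the hypergeometric distribution, which is only practical for small counts. Each SNP is classified as candidate vs. non-candidate and as belonging to the focal category or not, and the resulting p-values are Bonferroni-corrected for the number of categories tested.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def category_enrichment(snps, categories):
    """snps holds (category, is_candidate) pairs; returns raw and
    Bonferroni-corrected p-values per category."""
    results = {}
    for focal in categories:
        cand_in = sum(1 for cat, cand in snps if cand and cat == focal)
        cand_out = sum(1 for cat, cand in snps if cand and cat != focal)
        rest_in = sum(1 for cat, cand in snps if not cand and cat == focal)
        rest_out = sum(1 for cat, cand in snps if not cand and cat != focal)
        p = fisher_exact_two_sided(cand_in, cand_out, rest_in, rest_out)
        results[focal] = (p, min(1.0, p * len(categories)))
    return results


# Toy annotation table (hypothetical counts).
snps = ([("missense_variant", True)] * 3 + [("intron_variant", True)] * 1
        + [("missense_variant", False)] * 1 + [("intron_variant", False)] * 3)
results = category_enrichment(snps, ["missense_variant", "intron_variant"])
```

For genome-scale SNP counts a library implementation of the test would be preferable to this direct enumeration.
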
RNA-seq data analysis
Prior to mapping, we trimmed and filtered raw reads for a base quality ≥ 18 and read lengths ≥
75 bp using cutadapt, as explained above. Next, we used kallisto (v.0.44.0; Bray et al. 2016) for
pseudo-alignments of each library against the D. melanogaster transcriptome (v. 6.17, obtained
from http://flybase.org/), using the following parameters: -l 101 (average fragment length = 101
bp); -s 10 (standard deviation of fragment length = 10); -b 100 (number of bootstrapped
samples = 100); --rf-strand (reads are strand-specific, with the first read being reversed);
--single (reads are single-ended). We focused on gene-specific expression patterns and summed
up all transcript-specific read counts for each gene using custom-made software following the
approach in Soneson et al. (2015). We first transformed the raw absolute read counts to relative
counts per million (CPM) and normalized data using the ‘trimmed mean of M-values’ (TMM)
approach implemented in the R package edgeR (v.3.20.9) (Robinson et al. 2010, 2011). Lowly
expressed or non-expressed genes were excluded from downstream analyses by removing
genes with ≤ 2 CPM in ≥ 9 samples from the dataset. To
identify differentially expressed genes affected by karyotype, developmental temperature or
both, we fitted linear models to the expression data using the R package limma (Smyth 2005;
Ritchie et al. 2015). Using the R function model.matrix we set up a
design matrix of the form: ~0 + G, where 0 indicates that the model is fitted without intercept and
where G is a grouping factor with six levels (FI-18, FI-25, FS-18, FS-25, MS-18, MS-25) based
on the geographic origin and karyotype of the samples (i.e., Florida inverted [FI], Florida
standard [FS], Maine standard [MS]) and the two rearing temperatures (18°C, 25°C). To
account for potential line effects, we included replicate line identity as a random effect nested in
karyotype. Using the voom function of limma, precision weights were calculated for
log2-CPM-transformed read counts to account for the relationship between mean and variance in
RNA-seq data when fitting linear models to expression data (Law et al. 2014). We used the eBayes
function to improve the accuracy of gene-wise variance estimates by empirical Bayes
moderation, which integrates information on variation across all genes in the dataset. Based on
the parameter estimates for each of the six levels of the grouping factor G, we calculated
contrasts to test for the effects of karyotype, geography, and karyotype + geography, averaging
across both temperatures. In addition, we calculated contrasts for the two developmental
temperatures separately. We also employed contrasts to examine interactions between
temperature and karyotype, geography and karyotype + geography. To account for multiple
testing, we used BH p-value correction and only considered genes with an adjusted p < 0.05 to be
differentially expressed. For each of the candidate gene lists obtained from differential
expression analysis with limma, we tested for enrichment of specific GO categories using the R
package topGO (Alexa and Rahnenführer 2009). After correcting significant sets of GO terms for
hierarchical clustering using GO-Module (Yang et al. 2011), we visualized the remaining set of
GO terms with REVIGO (Supek et al. 2011) and Cytoscape (Shannon et al. 2003). Enrichment
of candidates according to their position relative to In(3R)Payne was tested by creating
contingency tables based on candidate and non-candidate genes located either inside or outside
the region spanned by the breakpoints of In(3R)Payne and using Fisher’s exact tests in R. We
further compared our candidates to the candidates reported by Zhao et al. (2015), who had
identified genes that exhibit differential expression between a low-latitude population in Panama
and a high-latitude population in Maine at two rearing temperatures (21°C, 29°C), and tested for
significant overlaps using SuperExactTest (Wang et al. 2015).
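
Two of the steps above, the expression filter and the BH correction, are simple enough to sketch in Python. The example below is illustrative only: the function names and toy values are hypothetical, library sizes are taken as plain column sums (the TMM normalization performed with edgeR is not reproduced), and the filter thresholds default to the values given in the text (≤ 2 CPM in ≥ 9 samples).

```python
def counts_per_million(counts):
    """Convert raw counts (gene -> list of per-sample counts) to CPM.
    Library sizes are column sums; TMM scaling is deliberately omitted."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[s] for row in counts.values()) for s in range(n_samples)]
    return {gene: [1e6 * row[s] / lib_sizes[s] for s in range(n_samples)]
            for gene, row in counts.items()}


def filter_low_expression(cpm, max_cpm=2.0, min_low_samples=9):
    """Drop genes with CPM <= max_cpm in at least min_low_samples samples."""
    return {gene: row for gene, row in cpm.items()
            if sum(1 for value in row if value <= max_cpm) < min_low_samples}


def benjamini_hochberg(p_values):
    """BH-adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data (hypothetical): two genes, two samples, four p-values.
cpm = counts_per_million({"geneA": [10, 20], "geneB": [0, 0]})
kept = filter_low_expression(cpm, max_cpm=2.0, min_low_samples=2)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p < 0.05 for p in adjusted]
```

Genes are called differentially expressed on the adjusted values, not on the raw p-values.
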
Overlap between genomic and transcriptomic candidates
To refine our set of candidate loci associated with In(3R)Payne we compared genomic and
transcriptomic candidates. Genomic plus transcriptomic data were only available from North
American populations (i.e., inverted and standard karyotypes from Florida; standard karyotypes
from Maine). We tested for significant overlaps between FST-based and differentially expressed
candidate genes using SuperExactTest (Wang et al. 2015); as the background set for these
analyses we only considered third chromosome genes identified in both datasets. Since the
significance of overlaps across sets can be confounded by the choice of significance thresholds
in the individual datasets, we also employed a comparison based on rank-rank hypergeometric
overlaps of ranked gene lists using the R package RRHO (Plaisier et al. 2010). RNA-seq
candidates were ranked based on adjusted p-values, whereas genomic candidates were ranked
based on the average FST of the top 10% most highly differentiated SNPs located within 2 kb
proximity of a given candidate gene. RRHO tests for significant overlaps between gene lists are
based on hypergeometric tests calculated while sliding across all possible thresholds in the two
ranked lists. Besides a visual representation of changes in significance with decreasing rank,
RRHO allows defining an ‘optimally’ overlapping gene set. We used this ‘optimal’ set to test for
enrichment of GO categories using topGO in the R package limma.
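
The idea behind the rank-rank comparison can be shown with a simplified Python sketch. It is not the RRHO package: it only tests for over-representation among the top-ranked genes, uses an exact hypergeometric upper tail that is practical for short lists only, and all names and toy data are hypothetical. For every pair of rank thresholds (i, j) it computes the probability of observing at least the observed overlap between the top i genes of one list and the top j genes of the other, and it reports the threshold pair with the strongest signal as the ‘optimal’ overlap.

```python
from math import comb, log10


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric draw without replacement."""
    total = comb(population, draws)
    hits = sum(comb(successes, x) * comb(population - successes, draws - x)
               for x in range(k, min(successes, draws) + 1))
    return hits / total


def rrho_best_overlap(ranked_a, ranked_b, step=1):
    """Slide over all threshold pairs; return (score, i, j, overlap set)
    for the pair with the largest -log10 p-value."""
    if set(ranked_a) != set(ranked_b):
        raise ValueError("both lists must rank the same genes")
    n_genes = len(ranked_a)
    best = (0.0, 0, 0, set())
    for i in range(step, n_genes + 1, step):
        top_a = set(ranked_a[:i])
        for j in range(step, n_genes + 1, step):
            overlap = top_a & set(ranked_b[:j])
            p = hypergeom_upper_tail(len(overlap), n_genes, i, j)
            score = -log10(p) if p > 0 else float("inf")
            if score > best[0]:
                best = (score, i, j, overlap)
    return best


# Toy ranking (hypothetical): the top three genes agree, the rest are reversed.
genomic_rank = ["A", "B", "C", "D", "E", "F", "G", "H"]
expression_rank = ["A", "B", "C", "H", "G", "F", "E", "D"]
score, i_best, j_best, optimal_set = rrho_best_overlap(genomic_rank, expression_rank)
```

Because every threshold pair is evaluated, the result does not depend on a single, arbitrarily chosen significance cutoff in either dataset.
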
Acknowledgements
We are indebted to Brian Charlesworth for helpful comments on previous versions of the
manuscript and for sharing unpublished results; to Mark Kirkpatrick for suggesting the analysis
in figure 6; and to two anonymous reviewers for their valuable feedback. We also acknowledge
the Lausanne Genomic Technologies Facility (GTF) and the Vital-IT Bioinformatics Facility at
the University of Lausanne for sequencing and bioinformatics support. Our research was funded
by the Swiss National Science Foundation (SNSF grants 31003A-182262, PP00P3_165836,
PP00P3_133641/1 to TF), the Austrian Science Fund (FWF grant P32275 to MK), the
Department of Ecology and Evolution at the University of Lausanne, and the Department of
Biology at the University of Fribourg. TF also received support from a Mercator Fellowship of the
German Research Foundation (DFG), held as an EvoPAD Visiting Professor at the Institute for
Evolution and Biodiversity, University of Münster (Germany).
Author contributions
Contributions following CRediT (Contributor Roles Taxonomy; https://casrai.org/credit):
MK: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation,
Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing –
review & editing; ED: Formal Analysis, Investigation, Methodology, Software, Validation,
Visualization, Writing – original draft, Writing – review & editing; TJK: Funding acquisition,
Writing – review & editing; PS: Resources, Writing – review & editing; TF: Conceptualization,
Funding acquisition, Investigation, Project administration, Resources, Supervision, Validation,
Writing – original draft, Writing – review & editing.
References
Alexa A, Rahnenführer J. 2009. Gene set enrichment analysis with topGO. Bioconductor
Improv. 27
(http://bioconductor.riken.jp/packages/3.0/bioc/vignettes/topGO/inst/doc/topGO.pdf)
Anderson AR, Hoffmann AA, Mckechnie SW, Umina PA, Weeks AR. 2005. The latitudinal cline
in the In(3R)Payne inversion polymorphism has shifted in the last 20 years in Australian
Drosophila melanogaster populations. Mol Ecol. 14:851-858.
Arguello JR, Laurent S, Clark AG. 2019. Demographic History of the Human Commensal
Drosophila melanogaster. Genome Biol Evol. 11:844-854.
Ashburner M, Lemeunier F. 1976. Relationships within the melanogaster Species Subgroup of
the Genus Drosophila (Sophophora). I. Inversion Polymorphisms in Drosophila melanogaster
and Drosophila simulans. Proc Roy Soc London B. 193:137-157.
Aulard S, David JR, Lemeunier F. 2002. Chromosomal inversion polymorphism in Afrotropical
populations of Drosophila melanogaster. Genet Res. 79:49-63.
Becher H, Jackson BC, Charlesworth B. 2020. Patterns of genetic variability in genomic regions
with low rates of recombination. Curr Biol. 30:94-100.e103.
Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful
approach to multiple testing. J Roy Stat Soc B. 57:289-300.
Berdan EL, Blanckaert A, Butlin RK, Bank C. 2021. Deleterious mutation accumulation and the
long-term fate of chromosomal inversions. PLoS Genet. 17:e1009411.
Booker TR, Yeaman S, Whitlock MC. 2021. Global adaptation complicates the interpretation of
genome scans for local adaptation. Evol Lett. 5:4-15.
Bray NL, Pimentel H, Melsted P, Pachter L. 2016. Near-optimal probabilistic RNA-seq
quantification. Nature Biotech. 34:525-527.
Bryant D, Moulton V. 2004. Neighbor-Net: An agglomerative method for the construction of
phylogenetic networks. Mol Biol Evol. 21:255-265.
Cahill KM, Huo Z, Tseng GC, Logan RW, Seney ML. 2018. Improved identification of
concordant and discordant gene expression signatures using an updated rank-rank
hypergeometric overlap approach. Sci Rep. 8:9588.
Campos JL, Zhao L, Charlesworth B. 2017. Estimating the parameters of background selection
and selective sweeps in Drosophila in the presence of gene conversion. Proc Natl Acad Sci
USA. 114:E4762-E4771.
Charlesworth B. 1974. Inversion polymorphism in a two-locus genetic system. Genet Res.
23:259-280.
Charlesworth B. 1998. Measures of divergence between populations and the effect of forces
that reduce variability. Mol Biol Evol. 15:538-543.
Charlesworth B. 2022. The effects of weak selection on neutral diversity at linked sites.
Genetics 221:iyac027.
Charlesworth B. 2023. The effects of inversion polymorphisms on patterns of neutral genetic
diversity. bioRxiv: https://doi.org/10.1101/2023.02.23.529778.
Charlesworth B, Barton NH. 2018. The Spread of an Inversion with Migration and Selection.
Genetics 208:377-382.
Charlesworth B, Charlesworth D. 1973. Selection of new inversions in multi-locus genetic
systems. Genet Res. 21:167-183.
Charlesworth B, Charlesworth D. 2018. Neutral variation in the context of selection. Mol Biol
Evol. 35:1359-1361.
Charlesworth B, Flatt T. 2021. On the fixation or non-fixation of inversions under epistatic
selection. Mol Ecol. 30:3896-3897.
Charlesworth B, Jensen JD. 2021. Effects of Selection at Linked Sites on Patterns of Genetic
Variability. Ann Rev Ecol Evol Syst. 52:177-197.
Charlesworth B, Nordborg M, Charlesworth D. 1997. The effects of local selection, balanced
polymorphism and background selection on equilibrium patterns of genetic diversity in
subdivided populations. Genet Res. 70:155-174.
Charlesworth D. 2016. The status of supergenes in the 21st century: recombination suppression
in Batesian mimicry and sex chromosomes and other complex adaptations. Evol Appl.
9:74-90.
Chen J, Nolte V, Schlötterer C. 2015. Temperature-Related Reaction Norms of Gene
Expression: Regulatory Architecture and Functional Implications. Mol Biol Evol.
32:2393-2402.
Chen Y, Lee SF, Blanc E, Reuter C, Wertheim B, Martinez-Diaz P, Hoffmann AA, Partridge L.
2012. Genome-Wide Transcription Analysis of Clinal Genetic Variation in Drosophila. PLoS
ONE 7:e34620.
Chovnick A. 1973. Gene conversion and transfer of genetic information within the inverted
region of inversion heterozygotes. Genetics 75:123-131.
Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. 2012.
A program for annotating and predicting the effects of single nucleotide polymorphisms,
SnpEff. Fly 6:80-92.
Clemson AS, Sgrò CM, Telonis-Scott M. 2016. Thermal plasticity in Drosophila melanogaster
populations from eastern Australia: quantitative traits to transcripts. J Evol Biol. 29:2447-2463.
Corbett-Detig RB, Hartl DL. 2012. Population Genomics of Inversion Polymorphisms in
Drosophila melanogaster. PLoS Genet. 8:e1003056.
Corbett-Detig RB, Cardeno C, Langley CH. 2012. Sequence-Based Detection and Breakpoint
Assembly of Polymorphic Inversions. Genetics 192:131-137.
Crown KN, Miller DE, Sekelsky J, Hawley RS. 2018. Local Inversion Heterozygosity Alters
Recombination throughout the Genome. Curr Biol. 28:2984-2990.
David JR, Capy P. 1988. Genetic variation of Drosophila melanogaster natural populations.
Trends Genet. 4:106-111.
Dobzhansky T. 1943. Genetics of Natural Populations. IX. Temporal Changes in the
Composition of Populations of Drosophila pseudoobscura. Genetics 28:162-186.
Dobzhansky T. 1947. Genetics of Natural Populations. XIV. A Response of Certain Gene
Arrangements in the Third Chromosome of Drosophila pseudoobscura to Natural Selection.
Genetics 32:142-160.
Dobzhansky T. 1948. Genetics of Natural Populations. XVIII. Experiments on Chromosomes of
Drosophila pseudoobscura from Different Geographic Regions. Genetics 33:588-602.
Dobzhansky T. 1949. Observations and Experiments on Natural Selection in Drosophila. Proc
8th Intl Cong Genet., republished in Hereditas 35:210-224.
Dobzhansky T. 1950. Genetics of Natural Populations. XIX. Origin of Heterosis through Natural
Selection in Populations of Drosophila pseudoobscura. Genetics 35:288-302.
Dobzhansky T. 1951. Genetics and the Origin of Species, 3rd edition. New York: Columbia
University Press.
Dobzhansky T, Pavlovsky O. 1957. An experimental study of interaction between genetic drift
and natural selection. Evolution 11:311-319.
Down TA, Bergman CM, Su J, Hubbard TJP. 2007. Large-Scale Discovery of Promoter Motifs in
Drosophila melanogaster. PLoS Comput Biol. 3:e7.
Durmaz E, Benson C, Kapun M, Schmidt P, Flatt T. 2018. An inversion supergene in Drosophila
underpins latitudinal clines in survival traits. J Evol Biol. 31:1354-1364.
Durmaz E, Kerdaffrec E, Katsianis G, Kapun M, Flatt T. 2020. How Selection Acts on
Chromosomal Inversions. eLS 1(2) 2020 (doi.org/10.1002/9780470015902.a0028745).
Duursma K. 2017. nlshelper: Convenient Functions for Non-Linear Regression. R package
version 0.2, https://CRAN.R-project.org/package=nlshelper.
Ewens WJ, Thomson G. 1970. Heterozygote selective advantage. Ann Hum Genet. 33:365-376.
Fabian DK, Kapun M, Nolte V, Kofler R, Schmidt PS, Schlötterer C, Flatt T. 2012. Genome-wide
patterns of latitudinal differentiation among populations of Drosophila melanogaster from
North America. Mol Ecol. 21:4748-4769.
Faria R, Johannesson K, Butlin RK, Westram AM. 2019. Evolving inversions. Trends Ecol Evol.
34:239-248.
Fijarczyk A, Babik W. 2015. Detecting balancing selection in genomes: limits and prospects. Mol
Ecol. 24:3529-3545.
Franssen SU, Nolte V, Tobler R, Schlötterer C. 2015. Patterns of Linkage Disequilibrium and
Long Range Hitchhiking in Evolving Experimental Drosophila melanogaster Populations. Mol
Biol Evol. 32:495-509.
Frydenberg O. 1963. Population studies of a lethal mutant in Drosophila melanogaster. I.
Behaviour in populations with discrete generations. Hereditas 50:89-116.
Fuller ZL, Haynes GD, Richards S, Schaeffer SW. 2016. Genomics of natural populations: How
differentially expressed genes shape the evolution of chromosomal inversions in Drosophila
pseudoobscura. Genetics 204:287-301.
Fuller ZL, Haynes GD, Richards S, Schaeffer SW. 2017. Genomics of natural populations:
Evolutionary forces that establish and maintain gene arrangements in Drosophila
pseudoobscura. Mol Ecol. 26:6539-6562.
Fuller ZL, Koury SA, Phadnis N, Schaeffer SW. 2019. How chromosomal rearrangements
shape adaptation and speciation: Case studies in Drosophila pseudoobscura and its sibling
species Drosophila persimilis. Mol Ecol. 28:1283-1301.
Gilbert KJ, Pouyet F, Excoffier L, Peischl S. 2020. Transition from background selection to
associative overdominance promotes diversity in regions of low recombination. Curr Biol.
30:101-107.e103.
Guerrero RF, Rousset F, Kirkpatrick M. 2012. Coalescent patterns for chromosomal inversions
in divergent populations. Phil Trans Roy Soc London B. 367:430-438.
Hedrick P. 1987. Gametic disequilibrium measures: proceed with caution. Genetics
117:331-341.
Hill W, Robertson A. 1968. Linkage disequilibrium in finite populations. Theor Appl Genet.
38:226-231.
Hill W, Weir B. 1988. Variances and covariances of squared linkage disequilibria in finite
populations. Theor Pop Biol. 33:54-78.
Hoffmann AA, Rieseberg LH. 2008. Revisiting the Impact of Inversions in Evolution: From
Population Genetic Markers to Drivers of Adaptive Shifts and Speciation? Ann Rev Ecol Evol
Syst. 39:21-42.
Hoffmann AA, Weeks AR. 2007. Climatic selection on genes and traits after a 100-year-old
invasion: a critical look at the temperate-tropical clines in Drosophila melanogaster from
eastern Australia. Genetica 129:133-147.
Hoffmann AA, Sgrò CM, Weeks AR. 2004. Chromosomal inversion polymorphisms and
adaptation. Trends Ecol Evol. 19:482-488.
Huang Y, Lack JB, Hoppel GT, Pool JE. 2022. Gene regulatory evolution in cold-adapted fly
populations neutralizes plasticity and may undermine genetic canalization. Genome Biol Evol.
14:evac050.
Hudson RR, Kaplan NL. 1988. The coalescent process in models with selection and
recombination. Genetics 120:831-840.
Hudson RR, Boos DD, Kaplan NL. 1992. A statistical test for detecting geographic subdivision.
Mol Biol Evol. 9:138-151.
Huson DH. 1998. SplitsTree: analyzing and visualizing evolutionary data. Bioinformatics
14:68-73.
Innan H, Stephan W. 2000. The Coalescent in an Exponentially Growing Metapopulation and Its
Application to Arabidopsis thaliana. Genetics 155:2015-2019.
Ishii K, Charlesworth B. 1977. Associations between allozyme loci and gene arrangements due
to hitch-hiking effects of new inversions. Genet Res. 30:93-106.
Juneja P, Quinn A, Jiggins FM. 2016. Latitudinal clines in gene expression and cis-regulatory
element variation in Drosophila melanogaster. BMC Genom. 17:981.
Kaplan NL, Darden T, Hudson RR. 1988. The coalescent process in models with selection.
Genetics 120:819-820.
Kapopoulou A, Pfeifer SP, Jensen JD, Laurent S. 2018. The Demographic History of African
Drosophila melanogaster. Genome Biol Evol. 10:2338-2342.
Kapopoulou A, Kapun M, Pieper B, Pavlidis P, Wilches R, Duchen P, Stephan W, Laurent S.
2020. Demographic analyses of a new sample of haploid genomes from a Swedish
population of Drosophila melanogaster. Sci Rep. 10:22415.
Kapun M, Flatt T. 2019. The adaptive significance of chromosomal inversion polymorphisms in
Drosophila melanogaster. Mol Ecol. 28:1263-1282.
Kapun M, Fabian DK, Goudet J, Flatt T. 2016a. Genomic Evidence for Adaptive Inversion
Clines in Drosophila melanogaster. Mol Biol Evol. 33:1317-1336.
Kapun M, Schmidt C, Durmaz E, Schmidt PS, Flatt T. 2016b. Parallel effects of the inversion
In(3R)Payne on body size across the North American and Australian clines in Drosophila
melanogaster. J Evol Biol. 29:1059-1072.
Kapun M, van Schalkwyk H, McAllister B, Flatt T, Schlötterer C. 2014. Inference of
chromosomal inversion dynamics from Pool-seq data in natural and laboratory populations of
Drosophila melanogaster. Mol Ecol. 23:1813-1827.
Kapun M, Barrón MG, Staubach F, Obbard DJ, Wiberg RAW, Vieira J, Goubert C, Rota-Stabelli
O, Kankare M, Bogaerts-Márquez M, et al. 2020. Genomic Analysis of European Drosophila
melanogaster Populations Reveals Longitudinal Structure, Continent-Wide Selection, and
Previously Unknown DNA Viruses. Mol Biol Evol. 37:2661-2678.
Kapun M, Nunez JCB, Bogaerts-Marquez M, Murga-Moreno J, Paris M, Outten J,
Coronado-Zamora M, Tern C, Rota-Stabelli O, Garcia Guerreiro MPP, et al. 2021. Drosophila Evolution
over Space and Time (DEST) - A New Population Genomics Resource. Mol Biol Evol.
38:5782-5805.
Keller A. 2007. Drosophila melanogaster's history as a human commensal. Curr Biol.
17:R77-R81.
Kennington WJ, Partridge L, Hoffmann AA. 2006. Patterns of diversity and linkage
disequilibrium within the cosmopolitan inversion In(3R)Payne in Drosophila melanogaster are
indicative of coadaptation. Genetics 172:1655-1663.
Kennington WJ, Hoffmann AA, Partridge L. 2007. Mapping regions within cosmopolitan
inversion In(3R)Payne associated with natural variation in body size in Drosophila
melanogaster. Genetics 177:549-556.
Kimura M. 1956. A model of a genetic system which leads to closer linkage by natural selection.
Evolution 10:278-287.
Kirkpatrick M. 2010. How and Why Chromosome Inversions Evolve. PLoS Biol. 8:e1000501.
Kirkpatrick M. 2017. The Evolution of Genome Structure by Natural and Sexual Selection. J
Hered. 1:3-11.
Kirkpatrick M, Barton N. 2006. Chromosome Inversions, Local Adaptation and Speciation.
Genetics 173:419-434.
Kirkpatrick M, Kern A. 2012. Where's the Money? Inversions, Genes, and the Hunt for Genomic
Targets of Selection. Genetics 190:1153-1155.
Knibb WR. 1982. Chromosome inversion polymorphisms in Drosophila melanogaster. II.
Geographic clines and climatic associations in Australasia, North America and Asia. Genetica
58:213-221.
Knibb WR. 1983. Chromosome inversion polymorphisms in Drosophila melanogaster. III.
Gametic disequilibria and the contributions of inversion clines to the Adh and Gpdh clines in
Australasia. Genetica 61:139-146.
Knibb WR. 1986. Temporal variation of Drosophila melanogaster Adh allele frequencies,
inversion frequencies, and population sizes. Genetica 71:175-190.
Knibb WR, Oakeshott JG, Gibson JB. 1981. Chromosome inversion polymorphisms in
Drosophila melanogaster. I. Latitudinal clines and associations between inversions in
Australasian populations. Genetics 98:833-847.
Kofler R, Schlötterer C. 2012. Gowinda: unbiased analysis of gene set enrichment for genome-
1
wide association studies. Bioinformatics 28:2084-2085.
2
Kojima K-i, Gillespie J, Tobari YN. 1970. A profile of Drosophila species' enzymes assayed by
3
electrophoresis. I. Number of alleles, heterozygosities, and linkage disequilibrium in glucose-
4
metabolizing systems and some other enzymes. Biochem Genet 4:627-637.
5
Kolaczkowski B, Kern AD, Holloway AK, Begun DJ. 2011. Genomic differentiation between
6
temperate and tropical Australian populations of Drosophila melanogaster. Genetics 187:245-
7
260.
8
Korunes KL, Noor MAF. 2019. Pervasive gene conversion in chromosomal inversion
9
heterozygotes. Mol Ecol. 28:1302-1315.
10
Kreitman M. 1983. Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila
11
melanogaster. Nature 304:412-417.
12
Krimbas CB, Powell JR (editors). 1992. Drosophila Inversion Polymorphism. Boca Raton, FL:
13
CRC Press.
14
Lachaise D, Cariou M-L, David JR, Lemeunier F, Tsacas L, Ashburner M. 1988. Historical
15
biogeography of the Drosophila melanogaster species subgroup. Evol Biol. 22:159-225.
16
Lack JB, Cardeno CM, Crepeau MW, Taylor W, Corbett-Detig RB, Stevens KA, Langley CH,
17
Pool JE. 2015. The Drosophila genome nexus: a population genomic resource of 623
18
Drosophila melanogaster genomes, including 197 from a single ancestral range population.
19
Genetics 199:1229-1241.
20
Lack JB, Lange JD, Tang AD, Corbett-Detig RB, Pool JE. 2016. A Thousand Fly Genomes: An
21
Expanded Drosophila Genome Nexus. Mol Biol Evol. 33:3308-3313.
22
Lange JD, Bastide H, Lack JB, Pool JE. 2022. A Population Genomic Assessment of Three
23
Decades of Evolution in a Natural Drosophila Population. Mol Biol Evol 39: msab368.
24
Lavington E, Kern AD. 2017. The Effect of Common Inversion Polymorphisms In(2L)t and
25
In(3R)Mo on Patterns of Transcriptional Variation in Drosophila melanogaster. G3 7:3659-
26
3668.
27
Law CW, Chen Y, Shi W, Smyth GK. 2014. voom: Precision weights unlock linear model
28
analysis tools for RNA-seq read counts. Genome Biol. 15:R29.
29
Lee C-R, Wang B, Mojica JP, Mandáková T, Prasad KVSK, Goicoechea JL, Perera N, Hellsten
30
U, Hundley HN, Johnson J, et al. 2017. Young inversion with multiple linked QTLs under
31
selection in a hybrid zone. Nature Ecol Evol. 1:0119.
32
ACCEPTED MANUSCRIPT
Downloaded from https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msad118/7176213 by guest on 24 May 2023
45
Lemeunier F, Aulard S. 1992. Inversion polymorphism in Drosophila melanogaster. In: Krimbas
1
CB, Powell JR, editors. Drosophila Inversion Polymorphism. Boca Raton, Florida: CRC
2
Press. p. 339-405.
3
Levene H. 1953. Genetic equilibrium when more than one ecological niche is available. Am Nat.
4
87:311-313.
5
Levine MT, Eckert ML, Begun DJ. 2011. Whole-Genome Expression Plasticity across Tropical
6
and Temperate Drosophila melanogaster Populations from Eastern Australia. Mol Biol Evol.
7
28:249-256.
8
Lewontin RC. 1974. The Genetic Basis of Evolutionary Change. New York, NY: Columbia
9
University Press.
10
Li H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM.
11
arXiv preprint, arXiv:1303.3997 (https://arxiv.org/abs/1303.3997).
12
Li H, Stephan W. 2006. Inferring the Demographic History and Rate of Adaptive Substitution in
13
Drosophila. PLoS Genet. 2:e166.
14
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R,
15
Genome Project Data Processing S. 2009. The Sequence Alignment/Map format and
16
SAMtools. Bioinformatics 25:2078-2079.
17
Lowry DB, Willis JH. 2010. A Widespread Chromosomal Inversion Polymorphism Contributes to
18
a Major Life-History Transition, Local Adaptation, and Reproductive Isolation. PLoS Biol.
19
8:e1000500.
20
Machado HE, Bergland AO, Taylor R, Tilk S, Behrman E, Dyer K, Fabian DK, Flatt T, Gonzàlez
21
J, Karasov TL, et al.. 2021. Broad geographic sampling reveals the shared basis and
22
environmental correlates of seasonal adaptation in Drosophila. eLife 10:e67577.
23
Mackintosh CJ, Scott MF, Reuter M, Pomiankowski A. 2022. The establishment of locally
24
adaptive inversions in structured populations. bioRxiv: 2022.2012.2005.519181.
25
Marroni F, Pinosio S, Zaina G, Fogolari F, Felice N, Cattonaro F, Morgante M. 2011. Nucleotide
26
diversity and linkage disequilibrium in Populus nigra cinnamyl alcohol dehydrogenase
27
(CAD4) gene. Tree Genet Genomes 7:1011-1023.
28
Martin M. 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads.
29
EMBnet.journal 17:10-12.
30
Matzkin LM, Merritt TJS, Zhu C-T, Eanes WF. 2005. The Structure and Population Genetics of
31
the Breakpoints Associated With the Cosmopolitan Chromosomal Inversion In(3R)Payne in
32
Drosophila melanogaster. Genetics 170:1143-1152.
33
ACCEPTED MANUSCRIPT
Downloaded from https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msad118/7176213 by guest on 24 May 2023
46
Mayr E. 1963. Animal Species and Evolution. Cambridge, MA: Belknap Press of Harvard
1
University Press.
2
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler
3
D, Gabriel S, Daly M, et al. 2010. The Genome Analysis Toolkit: A MapReduce framework for
4
analyzing next-generation DNA sequencing data. Genome Res. 20:1297-1303.
5
Mettler LE, Voelker RA, Mukai T. 1977. Inversion clines in populations of Drosophila
6
melanogaster. Genetics 87:169-176.
7
Mukai T, Nagano S. 1983. The genetic structure of natural populations of Drosophila
8
melanogaster. XVI. Excess of additive genetic variance of viability. Genetics 105:115-134.
9
Murali T, Pacifico S, Yu J, Guest S, Roberts G, Finley R. 2010. DroID 2011: a comprehensive,
10
integrated resource for protein, transcription factor, RNA and gene interactions for
11
Drosophila. Nucl Acids Res. 39:D736D743.
12
Nassar R, Muhs HJ, Cook RD. 1973. Frequency-dependent selection at the Payne inversion in
13
Drosophila melanogaster. Evolution 27:558-564.
14
Navarro A, Barton NH. 2003. Accumulating postzygotic isolation genes in parapatry: a new twist
15
on chromosomal speciation. Evolution 57:447-459.
16
Navarro A, Barbadilla A, Ruiz A. 2000. Effect of inversion polymorphism on the neutral
17
nucleotide variability of linked chromosomal regions in Drosophila. Genetics 155:685-698.
18
Navarro A, Betran E, Barbadilla A, Ruiz A. 1997. Recombination and gene flux caused by gene
19
conversion and crossing over in inversion heterokaryotypes. Genetics 146:695 - 709.
20
Navarro A, Betrán E, Zapata C, Ruiz A. 1996. Dynamics of gametic disequilibria between loci
21
linked to chromosome inversions: the recombination-redistributing effect of inversions. Genet
22
Res. 67:67-76.
23
Nègre N, Brown CD, Ma L, Bristow CA, Miller SW, Wagner U, Kheradpour P, Eaton ML, Loriaux
24
P, Sealfon R, et al. 2011. A Cis-Regulatory Map of the Drosophila Genome. Nature 471:527
25
531.
26
Nei M. 1973. Analysis of Gene Diversity in Subdivided Populations. Proc Natl Acad Sci USA
27
70: 3321-3323.
28
Nei M, Li W-H. 1975. Probability of identical monomorphism in related species. Genet Res.
29
26:31-43.
30
Nei M, Li W-H. 1979. Mathematical model for studying genetic variation in terms of restriction
31
endonucleases. Proc Natl Acad Sci USA 76:5269-5273.
32
Nei M, Li W-H. 1980. Non-random association between electromorphs and inversion
33
chromosomes in finite populations. Genet Res. 35:65-83.
34
ACCEPTED MANUSCRIPT
Downloaded from https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msad118/7176213 by guest on 24 May 2023
47
Noor MAF, Grams KL, Bertucci LA, Reiland J. 2001. Chromosomal inversions and the
1
reproductive isolation of species. Proc Natl Acad Sci USA 98:12084-12088.
2
Nordborg M, Charlesworth B, Charlesworth D. 1996. Increased levels of polymorphism
3
surrounding selectively maintained sites in highly selfing species. Proc Roy Soc London B
4
263: 1033-1039.
5
Ohta T. 1971. Associative overdominance caused by linked detrimental mutations. Genet Res.
6
18: 277-286.
7
Payne F. 1924. Crossover modifiers in the third chromosome of Drosophila melanogaster.
8
Genetics 9: 327-342.
9
Pinheiro J, Bates D, DebRoy S, Sarkar D, R Core Team. 2021. nlme: Linear and Nonlinear
10
Mixed Effects Models. R package v.3.1-152, https://CRAN.R-project.org/package=nlme.
11
Plaisier SB, Taschereau R, Wong JA, Graeber TG. 2010. Rankrank hypergeometric overlap:
12
identification of statistically significant overlap between gene-expression signatures. Nucl
13
Acids Res. 38:e169.
14
Pool JE. 2015. The Mosaic Ancestry of the Drosophila Genetic Reference Panel and the D.
15
melanogaster Reference Genome Reveals a Network of Epistatic Fitness Interactions. Mol
16
Biol Evol. 32:3236-3251.
17
Pool JE, Braun DT, Lack JB. 2017. Parallel Evolution of Cold Tolerance within Drosophila
18
melanogaster. Mol Biol Evol. 34:349-360.
19
Pool JE, Corbett-Detig RB, Sugino RP, Stevens KA, Cardeno CM, Crepeau MW, Duchen P,
20
Emerson JJ, Saelao P, Begun DJ, et al. 2012. Population Genomics of Sub-Saharan
21
Drosophila melanogaster: African Diversity and Non-African Admixture. PLoS Genet.
22
8:e1003080.
23
Prakash S, Lewontin RC. 1968. A molecular approach to the study of genic heterozygosity in
24
natural populations. III. Direct evidence of coadaptation in gene arrangements of Drosophila.
25
Proc Natl Acad Sci USA 59:398-405.
26
Prakash S, Lewontin R. 1971. A molecular approach to the study of genic heterozygosity in
27
natural populations. V. Further direct evidence of coadaptation in inversions of Drosophila.
28
Genetics 69:405.
29
Prakash S, Merritt RB. 1972. Direct evidence of genic differentiation between sex ratio and
30
standard gene arrangements of X chromosome in Drosophila pseudoobscura. Genetics
31
72:169-175.
32
R Core Team. 2013. R: A language and environment for statistical computing. R-project.org
33
(http://www.R-project.org)
34
ACCEPTED MANUSCRIPT
Downloaded from https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msad118/7176213 by guest on 24 May 2023
48
Rako L, Anderson AR, Sgro CM, Stocker AJ, Hoffmann AA. 2006. The association between
1
inversion In(3R)Payne and clinally varying traits in Drosophila melanogaster. Genetica
2
128:373-384.
3
Rane RV, Rako L, Kapun M, Lee SF, Hoffmann AA. 2015. Genomic evidence for role of
4
inversion 3RP of Drosophila melanogaster in facilitating climate change adaptation. Mol Ecol.
5
24:24232432.
6
Reinhardt JA, Kolaczkowski B, Jones CD, Begun DJ, Kern AD. 2014. Parallel Geographic
7
Variation in Drosophila melanogaster. Genetics 197:361-373.
8
Remington DL, Thornsberry JM, Matsuoka Y, Wilson LM, Whitt SR, Doebley J, Kresovich S,
9
Goodman MM, Buckler ES. 2001. Structure of linkage disequilibrium and phenotypic
10
associations in the maize genome. Proc Natl Acad Sci USA 98:11479-11484.
11
Rieseberg LH. 2001. Chromosomal rearrangements and speciation. Trends Ecol Evol. 16:351-
12
358.
13
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. 2015. limma powers
14
differential expression analyses for RNA-sequencing and microarray studies. Nucl Acids Res.
15
43:e47.
16
Robertson A. 1962. Selection for heterozygotes in small populations. Genetics 47:1291-1300.
17
Robinson M, McCarthy D, Chen Y, Smyth G. 2011. edgeR: differential expression analysis of
18
digital gene expression data. User guide
19
(http://bioconductor.org/packages/release/bioc/html/edgeR.html).
20
Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential
21
expression analysis of digital gene expression data. Bioinformatics 26:139-140.
22
Rozas J, Aguadé M. 1994. Gene conversion is involved in the transfer of genetic information
23
between naturally occurring inversions of Drosophila. Proc Natl Acad Sci USA 91:11517-
24
11521.
25
Said I, Byrne A, Serrano V, Cardeno C, Vollmers C, Corbett-Detig R. 2018. Linked genetic
26
variation and not genome structure causes widespread differential expression associated
27
with chromosomal inversions. Proc Natl Acad Sci USA 115:5492-5497.
28
Schaal SM, Haller BC, Lotterhos KE. 2022. Inversion invasions: when the genetic basis of local
29
adaptation is concentrated within inversions in the face of gene flow. Phil Trans Roy Soc B
30
377: 20210200.
31
Schaeffer SW. 2002. Molecular population genetics of sequence length diversity in the Adh
32
region of Drosophila pseudoobscura. Genet Res. 80: 163-175.
33
34
ACCEPTED MANUSCRIPT
Downloaded from https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msad118/7176213 by guest on 24 May 2023
49
Schaeffer SW, Anderson WW. 2005. Mechanisms of genetic exchange within the chromosomal
1
inversions of Drosophila pseudoobscura. Genetics 171:1729-1739.
2
Schaeffer SW, Goetting-Minesky MP, Kovacevic M, Peoples JR, Graybill JL, Miller JM, Kim K,
3
Nelson JG, Anderson WW. 2003. Evolutionary genomics of inversions in Drosophila
4
pseudoobscura: evidence for epistasis. Proc Natl Acad Sci USA 100:8319-8324.
5
Schlötterer C, Tobler R, Kofler R, Nolte V. 2014. Sequencing pools of individuals mining
6
genome-wide polymorphism data without big funding. Nat Rev Genet. 15:749-763.
7
Sezgin E, Duvernell DD, Matzkin LM, Duan Y, Zhu C-T, Verrelli BC, Eanes WF. 2004. Single-
8
Locus Latitudinal Clines and Their Relationship to Temperate Adaptation in Metabolic Genes
9
and Derived Alleles in Drosophila melanogaster. Genetics 168:923-931.
10
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B,
11
Ideker T. 2003. Cytoscape: A Software Environment for Integrated Models of Biomolecular
12
Interaction Networks. Genome Res. 13:2498-2504.
13
Shapiro JA, Huang W, Zhang C, Hubisz MJ, Lu J, Turissini DA, Fang S, Wang H-Y, Hudson
14
RR, Nielsen R, et al. 2007. Adaptive genic evolution in the Drosophila genomes. Proc Natl
15
Acad Sci USA 104:2271-2276.
16
Simmons MJ, Crow JF. 1977. Mutations Affecting Fitness in Drosophila Populations. Ann Rev
17
Genet. 11:49-78.
18
Smyth GK. 2005. limma: Linear Models for Microarray Data. In: Gentleman R, Carey VJ, Huber
19
W, Irizarry RA, Dudoit S, editors. Bioinformatics and Computational Biology Solutions Using
20
R and Bioconductor. New York, NY: Springer New York. p. 397-420.
21
Soneson C, Love MI, Robinson MD. 2015. Differential analyses for RNA-seq: transcript-level
22
estimates improve gene-level inferences. F1000 Res. 4:1521.
23
Sprengelmeyer QD, Mansourian S, Lange JD, Matute DR, Cooper BS, Jirle EV, Stensmyr MC,
24
Pool JE. 2020. Recurrent Collection of Drosophila melanogaster from Wild African
25
Environments and Genomic Insights into Species History. Mol Biol Evol. 37:627-638.
26
Stefansson H, Helgason A, Thorleifsson G, Steinthorsdottir V, Masson G, Barnard J, Baker A,
27
Jonasdottir A, Ingason A, Gudnadottir VG, et al. 2005. A common inversion under selection
28
in Europeans. Nat Genet. 37:129-137.
29
Strobeck C. 1983. Expected linkage disequilibrium for a neutral locus linked to a chromosomal
30
arrangement. Genetics 103:545-555.
31
Sturtevant AH. 1917. Genetic Factors Affecting the Strength of Linkage in Drosophila. Proc Natl
32
Acad Sci USA 3: 555-558.
33
ACCEPTED MANUSCRIPT
Downloaded from https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msad118/7176213 by guest on 24 May 2023
50
Sturtevant AH. 1919. Inherited linkage variations in the second chromosome. Contributions to
1
the genetics of Drosophila melanogaster. III. Washington DC: Carnegie Institute 305-341.
2
Sturtevant AH. 1921. A Case of Rearrangement of Genes in Drosophila. Proc Natl Acad Sci
3
USA 7:235-237.
4
Supek F, Bošnjak M, Škunca N, Šmuc T. 2011. REVIGO Summarizes and Visualizes Long Lists
5
of Gene Ontology Terms. PLoS ONE 6:e21800.
6
Sved JA. 1968. The stability of linked systems of loci with a small population size. Genetics 59:
7
543-563.
8
Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA
9
polymorphism. Genetics 123:585-595.
10
Umina PA, Weeks AR, Kearney MR, McKechnie SW, Hoffmann AA. 2005. A Rapid Shift in a
11
Classic Clinal Pattern in Drosophila Reflecting Climate Change. Science 308:691-693.
12
Voelker RA, Cockerham CC, Johnson FM, Schaffer HE, Mukai T, Mettler LE. 1978. Inversions
13
fail to account for allozyme clines. Genetics 88:515-527.
14
Wallace AG, Detweiler D, Schaeffer SW. 2013. Molecular Population Genetics of Inversion
15
Breakpoint Regions in Drosophila pseudoobscura. G3 3:1151-1163.
16
Waller DM. 2021. Addressing Darwin's dilemma: Can pseudo-overdominance explain persistent
17
inbreeding depression and load? Evolution 75: 779-793.
18
Wang M, Zhao Y, Zhang B. 2015. Efficient Test and Visualization of Multi-Set Intersections. Sci
19
Rep. 5:16923.
20
Weir B, Cockerham CC. 1984. Estimating F-Statistics for the Analysis of Population Structure.
21
Evolution 38:13581370.
22
Wellenreuther M, Bernatchez L. 2018. Eco-evolutionary genomics of chromosomal inversions.
23
Trends Ecol Evol. 33:427-440.
24
Westram AM, Faria R, Johannesson K, Butlin R, Barton N. 2022. Inversions and parallel
25
evolution. Phil Trans Roy Soc B 377: 20210203.
26
Wright S, Dobzhansky T. 1946. Genetics of Natural Populations. XII. Experimental reproduction
27
of some of the changes caused by natural selection in certain populations of Drosophila
28
pseudoobscura. Genetics 31:125-156.
29
Yang X, Li J, Lee Y, Lussier YA. 2011. GO-Module: functional synthesis and improved
30
interpretation of Gene Ontology patterns. Bioinformatics 27:1444-1446.
31
Yu J, Pacifico S, Liu G, Finley RL. 2008. DroID: the Drosophila Interactions Database, a
32
comprehensive resource for annotated gene and protein interactions. BMC Genom. 9:461.
33
ACCEPTED MANUSCRIPT
Downloaded from https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msad118/7176213 by guest on 24 May 2023
51
Zeng K, Charlesworth B, Hobolth A. 2021. Studying models of balancing selection using phase-
1
type theory. Genetics 218:iyab055.
2
Zhao L, Charlesworth B. 2016. Resolving the conflict between associative overdominance and
3
background selection. Genetics 203:1315-1334.
4
Zhao L, Wit J, Svetec N, Begun DJ. 2015. Parallel Gene Expression Differences between Low
5
and High Latitude Populations of Drosophila melanogaster and D. simulans. PLoS Genet
6
11:e1005184.
7
ACCEPTED MANUSCRIPT
Downloaded from https://academic.oup.com/mbe/advance-article/doi/10.1093/molbev/msad118/7176213 by guest on 24 May 2023
52
Tables
Table 1. Effects on patterns of genetic variation. (A) F-values from a three-way ANOVA testing for differences in π and Tajima’s D with respect to geographic origin, In(3R)Payne karyotype, and genomic position relative to the inversion (inside vs. outside). (B) Planned contrasts based on estimated coefficients from ANOVA, testing for differences in π and Tajima’s D between inverted and standard chromosomes with respect to geography and genomic position (inside vs. outside), using the emmeans package in R. * p < 0.05; ** p < 0.01; *** p < 0.001. Also see fig. 1; see Materials and Methods for further details.

| (A) Factor | π: ANOVA F-value | D: ANOVA F-value |
| --- | --- | --- |
| Origin | F3,1312 = 505.27 *** | F3,1952 = 1809.03 *** |
| Karyotype | F1,1312 = 14.72 *** | F1,1952 = 546.38 *** |
| Genomic position | F1,1312 = 21.61 *** | F1,1952 = 178.14 *** |
| Origin x Karyotype | F3,1312 = 9.74 *** | F3,1952 = 16.68 *** |
| Origin x Genomic position | F3,1312 = 3.28 * | F3,1952 = 12.93 *** |
| Karyotype x Genomic position | F1,1312 = 5.16 * | F1,1952 = 169.35 *** |
| Origin x Karyotype x Genomic position | F3,1312 = 8.1 *** | F3,1952 = 24.12 *** |

| (B) Geographic origin | Genomic position | π: t-value | D: t-value |
| --- | --- | --- | --- |
| Africa (Zambia) | inside | -8.237 *** | 20.362 *** |
| Africa (Zambia) | outside | -0.857 | 3.341 |
| Europe (Portugal) | inside | 0.037 | 6.593 *** |
| Europe (Portugal) | outside | -1.841 | 6.320 *** |
| North America (Florida) | inside | -0.469 | 10.979 *** |
| North America (Florida) | outside | -0.495 | -0.051 |
| Australia (Queensland) | inside | 0.032 | 13.527 *** |
| Australia (Queensland) | outside | 0.978 | 5.044 *** |

| Origin | Position | ANOVA F-value | Tukey’s HSD: Karyotype vs. Geography | Tukey’s HSD: Karyotype + Geography vs. Geography | Tukey’s HSD: Karyotype + Geography vs. Karyotype |
| --- | --- | --- | --- | --- | --- |
| Europe | Inside | F2,366 = 543.85 *** | 0.33*** | 0.34*** | 0.01 |
| Europe | Outside | F2,780 = 8.4722 ** | 0.01 | 0.02*** | 0.01 |
| North America | Inside | F2,366 = 221.02 *** | 0.21*** | 0.23*** | 0.02 |
| North America | Outside | F2,792 = 119.6 *** | -0.03*** | 0.02*** | 0.05*** |
| Australia | Inside | F2,366 = 170.35 *** | 0.14*** | 0.2*** | 0.06*** |
| Australia | Outside | F2,255 = 77.524 *** | -0.07*** | -0.02*** | 0.05*** |

Figure legends

Fig. 1. Distribution of samples and phylogenetic relationships among In(3R)Payne karyotypes across four continents. (A) Geographic origin of the samples used in this study. The color code indicates the continent where flies were sampled (Africa, Europe, North America, Australia). The outline of the circles indicates whether the samples contain chromosomes with In(3R)Payne (in cyan) and/or with the standard arrangement (in black); the size of the circles indicates whether samples were used only for phylogenetic reconstruction (small circles) or in addition also for karyotype-specific genomic analyses (large circles). (B) Haplotype network constructed from 3766 SNPs within the breakpoints of In(3R)Payne; cyan edges represent samples with In(3R)Payne, whereas black edges represent samples with the standard arrangement. (C) Haplotype network based on 4849 randomly drawn SNPs at a distance of >200 kb from In(3L)P and In(3R)Payne (see Materials and Methods). See table 1 for statistical analyses. Note that several haplotypes from Florida cluster with the NG9 reference strain (see fig. 1B, fig. 1C). This may be an artifact of our bioinformatic method for haplotype reconstruction (see Materials and Methods); we therefore excluded these samples from downstream analyses.

Fig. 2. Patterns of nucleotide variability (π) and Tajima’s D in the region spanned by In(3R)Payne. (A) Average values of nucleotide variability π, calculated in 100 kb non-overlapping windows, with respect to geographic origin and genomic position relative to In(3R)Payne, separately for inverted and standard arrangement chromosomes. (B) Average values for Tajima’s D, calculated in 100 kb non-overlapping windows, separately for the two arrangement types. See table 1 for details of ANOVA results for π and Tajima’s D; asterisks (***, p < 0.001) represent significant p-values from planned contrasts. Also see supplementary fig. 1 and supplementary fig. 2 (Supplementary Material online).

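As a rough illustration of the two summary statistics plotted in Fig. 2, the sketch below computes the mean number of pairwise differences (which gives π once divided by the window length) and Tajima's D following the formula of Tajima (1989). The function names and the 0/1 haplotype-matrix input are assumptions made for this example; they are not taken from the authors' pipeline.

```python
from math import sqrt


def mean_pairwise_differences(haplotypes):
    """Mean number of differences between all pairs of haplotypes.

    haplotypes: list of equal-length lists of 0/1 alleles (one list per chromosome).
    """
    n = len(haplotypes)
    n_pairs = n * (n - 1) / 2
    total = 0.0
    for site in range(len(haplotypes[0])):
        derived = sum(h[site] for h in haplotypes)
        # derived * (n - derived) pairs differ at this site
        total += derived * (n - derived) / n_pairs
    return total


def count_segregating_sites(haplotypes):
    """Number of sites that are polymorphic in the sample."""
    n = len(haplotypes)
    return sum(
        1
        for site in range(len(haplotypes[0]))
        if 0 < sum(h[site] for h in haplotypes) < n
    )


def tajimas_d(n, seg_sites, k):
    """Tajima's D from sample size n, segregating sites S and mean pairwise differences k.

    Returns NaN when D is undefined (no segregating sites or n < 4).
    """
    if seg_sites == 0 or n < 4:
        return float("nan")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)
```

For a windowed analysis, these functions would be applied to the SNPs falling into each 100 kb window, separately for each karyotype and population sample.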
Fig. 3. Patterns of short- and long-range linkage disequilibrium (LD) in the region spanned by In(3R)Payne. (A) Distribution and decay of LD as estimated by r² within 100 kb distance, based on 5000 randomly drawn SNPs inside the region spanned by In(3R)Payne for standard (grey) and inverted (cyan) chromosomes from different geographic samples. Significant p-values in the top right corners of the plots indicate differences in the decay of LD among karyotypes as inferred from analyses of deviance applied to non-linear regression models (see Materials and Methods for details). (B) Triangular heat maps with estimates of r² for 5000 SNPs randomly drawn from chromosomal arm 3R in samples from North America (Florida). We restricted our analyses to subsamples of 5000 SNPs for computational reasons: n = 5000 SNPs implies n(n-1)/2 = 12,497,500 pairwise comparisons, with much larger numbers becoming computationally prohibitive. In the upper triangle, r² was estimated jointly for inverted and standard chromosomes (see fig. S3, Supplementary Material online, for similar plots for the other continents). The bottom left and right plots show separate r² estimates for inverted and standard chromosomes, respectively.

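The pair count quoted in the Fig. 3 legend and the r² statistic itself can be sketched as follows. This is a minimal example that assumes phased, biallelic SNPs coded as 0/1 vectors across chromosomes; the function names are hypothetical and the code is not the authors' implementation.

```python
from itertools import combinations


def n_pairs(n_snps):
    """Number of unordered SNP pairs, n(n-1)/2."""
    return n_snps * (n_snps - 1) // 2


def r_squared(snp_a, snp_b):
    """r^2 between two biallelic SNPs given as equal-length 0/1 allele vectors."""
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    p_ab = sum(1 for x, y in zip(snp_a, snp_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # undefined when either SNP is monomorphic
    return d * d / denom


def pairwise_r2(snps):
    """r^2 for every unordered pair of SNPs; returns {(i, j): r2} with i < j."""
    return {
        (i, j): r_squared(snps[i], snps[j])
        for i, j in combinations(range(len(snps)), 2)
    }
```

Because the number of pairs grows quadratically with the number of SNPs, doubling the subsample to 10,000 SNPs would roughly quadruple the number of comparisons.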
Fig. 4. Chromosome-wide patterns of genetic differentiation (FST) due to In(3R)Payne karyotype and/or the effects of geography. Line plots show the distribution of FST in 100 kb non-overlapping windows along chromosome arm 3R. For the non-clinal African population sample from Zambia (top panel), the line plot shows FST between inverted and standard chromosomes within the Siavonga population sample. For the non-African populations from Europe, North America and Australia, which are all situated along latitudinal gradients, the different lines depict the effects of ‘karyotype’ (black line; pairwise differences between standard and inverted chromosomes from within a given low-latitude population), ‘geography’ (dark grey line; pairwise comparisons between standard arrangement chromosomes from low- vs. high-latitude populations, i.e. from the cline ‘ends’); and of ‘geography plus karyotype’ (‘G+K’; light grey line; pairwise comparisons of inverted and standard chromosomes between the endpoints of a given cline). Heat maps beneath each line plot show r² between each SNP and In(3R)Payne. Note that genomic information for the first 12 million bp is not available for the Australian data.

Fig. 5. Genetic differentiation (FST) as a function of In(3R)Payne karyotype and/or geography. Bar plots show average values of FST in 100 kb non-overlapping windows in different genomic regions relative to In(3R)Payne (inside vs. outside the inverted region) for the three non-African continents (Europe, North America, Australia) which include clinal (low- vs. high-latitude) population samples. The different bars represent pairwise FST comparisons for (i) geographic differentiation (‘G’, comparing standard arrangement chromosomes from populations at the endpoints of clines), (ii) karyotypic differentiation (‘K’, comparing inverted and standard arrangement chromosomes sampled from within the same low-latitude populations), and (iii) geographic plus karyotypic differentiation (‘G+K’, comparing inverted chromosomes from low-latitude populations with standard chromosomes from high-latitude populations). See Materials and Methods for details; also see fig. 4 and table 2 for statistical analyses.

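Each of the ‘G’, ‘K’ and ‘G+K’ comparisons is a pairwise FST between two groups of chromosomes. The sketch below shows one simple way such a windowed value could be computed from allele frequencies, using a Nei-style estimator FST = (HT - HS) / HT with equal group weights. The choice of estimator and the function name are assumptions for illustration only; this section does not state which estimator the authors used.

```python
def window_fst(freqs_a, freqs_b):
    """Pairwise FST for one window, as a ratio of averages over SNPs.

    freqs_a, freqs_b: allele frequencies of the same SNPs in group A and group B.
    Uses HS (mean within-group heterozygosity) and HT (heterozygosity of the
    pooled, equally weighted groups). Returns NaN if no SNP is polymorphic.
    """
    hs_sum = 0.0
    ht_sum = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        h_a = 2 * p_a * (1 - p_a)
        h_b = 2 * p_b * (1 - p_b)
        p_mean = (p_a + p_b) / 2
        hs_sum += (h_a + h_b) / 2
        ht_sum += 2 * p_mean * (1 - p_mean)
    if ht_sum == 0:
        return float("nan")
    return (ht_sum - hs_sum) / ht_sum
```

Under this scheme, a ‘K’ comparison would pass frequencies from inverted and standard chromosomes of the same low-latitude sample, whereas a ‘G’ comparison would pass standard chromosomes from the two cline endpoints.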
Fig. 6. African origin of inversion-specific alleles. Panel A shows median allele frequencies of inversion-specific alleles from North America (FST ≥ 0.9) in inverted and standard arrangement chromosomes in population samples from Africa, Europe and North America. Panel B shows that highly differentiated SNPs in the Zambian population (exhibiting a frequency difference ≥ 0.5 between standard and inverted chromosomes) are mostly clustered around the inversion breakpoints, with some smaller clusters (‘mini-peaks’) of SNPs also visible around positions ~19 Mbp and ~21 Mbp. Also see supplementary fig. S7 (Supplementary Material online). The analyses above are based on 1786 SNPs in total, and 277 SNPs in the Zambia sample.

Fig. 7. Overlap of In(3R)Payne-associated candidate genes and SNPs among continents. Bar plots show the counts of overlapping candidate genes (A) or SNPs (B) in Africa, North America and Europe, as indicated by black dots underneath each bar plot. The total number of candidates for each dataset is shown on the right side of the black dots. The color of the bars corresponds to the significance of the overlaps, as inferred by the R package SuperExactTest and as indicated by the color gradient in the legend. The grey overlays at the bottom of the bars indicate the amount of expected overlap. Also see supplementary table S3 and supplementary table S4 (Supplementary Material online); also see Materials and Methods.

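For the special case of two candidate sets drawn from a common background, the significance of an overlap and the expected overlap shown in Fig. 7 can be illustrated with a hypergeometric tail probability. The sketch below is an independent illustration of that idea and not the SuperExactTest implementation; the function names are hypothetical, and any background or set sizes passed to it would have to come from the actual data.

```python
from math import comb


def expected_overlap(n_background, size_a, size_b):
    """Expected number of shared items if both sets were drawn at random."""
    return size_a * size_b / n_background


def overlap_pvalue(n_background, size_a, size_b, n_overlap):
    """P(overlap >= n_overlap) under the hypergeometric null.

    n_background: number of items in the common background (e.g. all tested genes)
    size_a, size_b: sizes of the two candidate sets
    n_overlap: observed number of shared candidates
    """
    total = comb(n_background, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(n_background - size_a, size_b - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

Intersections of three or more sets, as in the figure, require the multi-set extension of this calculation.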
Fig. 8. Transcriptomic analyses of In(3R)Payne karyotypes. (A) Q-Q plot comparing p-values from differential expression (DE) analyses with limma voom between North American 3R Payne inverted and standard chromosomes at 18°C and 25°C, respectively. Since karyotype-specific DE was much stronger at 18°C, we focused on this dataset for downstream analyses. (B) Manhattan plot depicting -log10(p)-values for each gene relative to its average genomic position. Candidate genes with karyotype-specific expression irrespective of temperature are highlighted in red; those showing karyotype-specific expression at 18°C only in blue; and those that are candidates in both datasets are highlighted in purple. Candidates of both datasets are enriched within the region spanned by 3R Payne. (C) Significant gene ontology (GO) terms based on differentially expressed genes among karyotypes at 18°C with Benjamini-Hochberg-adjusted p-values < 0.05. Also see supplementary table S5 (Supplementary Material online) and Materials and Methods for further details.

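The Benjamini-Hochberg adjustment mentioned in the Fig. 8 legend can be sketched in a few lines. This is a generic illustration of the standard step-up procedure with a hypothetical function name; it is not code from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg-adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

GO terms would then be reported as significant when their adjusted value falls below the 0.05 threshold.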
Fig. 9. Overlap between genomic and transcriptomic candidate loci associated with In(3R)Payne. (A) Summary of results of rank-rank hypergeometric overlaps (RRHO). The dark red area indicates the highly significant overlap between genomic and transcriptomic candidates. A core set of 86 candidate loci located in the top right corner of the heatmap in (A) is tightly clustered inside the inversion or in close proximity to it (inversion highlighted in cyan). (B) Significant overlap between genomic candidates (based on candidate SNPs with FST ≥ 0.9 between karyotypes in Florida; in light yellow) and transcriptomic candidates (based on significant differential expression between inverted and standard karyotypes from Florida reared at 18°C = FIFS18; in light blue); p-value estimated using SuperExactTest in R. Also see supplementary figure S7, supplementary table S3 and supplementary table S7 (Supplementary Material online).

... Inversions are a particularly important form of structural variation because they can be large (multiple Mb in genomic extent) and can strongly suppress recombination between inverted and ancestral karyotypes (17). These characteristics make them potent evolutionary modifiers that can facilitate local or clinal adaptive processes (18)(19)(20), often have strong phenotypic effects (21,22), and may capture a large fraction of the standing genetic variation in some species (23,24). ...
... Since selective forces on inversions may change over time, the reservoir of variation accumulated within type II inversions may eventually become a target for positive selection. This idea is supported by a recent global analysis (20) of the In(3R)Payne inversion in Drosophila melanogaster which occurs as a balanced polymorphism in its ancestral African population but has now formed sharp latitudinal clines underlying climate adaptation in North America (37) and Australia (38). ...
The future survival of coral reefs in the Anthropocene depends on the capacity of corals to adapt as oceans warm and extreme weather events become more frequent. Targeted interventions designed to assist evolutionary processes in corals require a comprehensive understanding of the distribution and structure of standing variation, however, efforts to map genomic variation in corals have so far focussed almost exclusively on SNPs, overlooking structural variants that have been shown to drive adaptive processes in other taxa. Here we show that the reef-building coral, Acropora kenti (syn. tenuis) harbors at least five large, highly polymorphic structural variants, all of which exhibit signatures of strongly suppressed recombination in heterokaryotypes, a feature commonly associated with chromosomal inversions. Based on their high minor allele frequency, uniform distribution across habitats, and elevated genetic load, we propose that these inversions in A. kenti are likely to be under balancing selection. An excess of SNPs with high impact on protein coding genes within these loci elevates their importance both as potential targets for adaptive selection and as contributors to genetic decline if coral populations become fragmented or inbred in future.
... While local adaptation, gene flow, and associative overdominance may all contribute to geographic variation in inversion frequency among D. melanogaster populations, it is not clear that these processes are collectively sufficient to explain observed patterns. Alternatively, the allele frequencies and diversity patterns of at least some of these inversions might be primarily shaped by some form of balancing selection acting on inversion-linked variation, in which the balanced frequency is dependent on the environment (Kapun et al. 2023). Balancing selection on inversions has received increasing attention in recent literature (Wellenreuther and Bernatchez 2018; Faria et al. 2019). ...
Chromosomal inversion polymorphisms can be common, but the causes of their persistence are often unclear. We propose a model for the maintenance of inversion polymorphism, which requires that some variants contribute antagonistically to two phenotypes, one of which has negative frequency-dependent fitness. These conditions yield a form of frequency-dependent disruptive selection, favoring two predominant haplotypes segregating alleles that favor opposing antagonistic phenotypes. An inversion associated with one haplotype can reduce the fitness load incurred by generating recombinant offspring, reinforcing its linkage to the haplotype and enabling both haplotypes to accumulate more antagonistic variants than expected otherwise. We apply a purpose-built forward simulator to examine these dynamics under a tradeoff between viability and male display. These simulations indeed generated long haplotypes where alleles of opposing effects occupied alternate chromosomal arrangements. Antagonism increases with time, and can ultimately yield karyotypes at predictable frequencies, and notable genotype frequency differences between sexes and between developmental stages. To test whether this model may contribute to abundant inversion polymorphism in Drosophila melanogaster, we tracked inversion frequencies in laboratory crosses to test whether they influence male reproductive success or survival. We find that two of the four tested inversions show significant evidence for the tradeoff examined, with In(3R)K favoring survival and In(3L)Ok favoring male reproduction. Additionally, all inversions show survival differences between sexes, and paternal success depends on maternal genotype. Based on this work, we expect that balancing selection on antagonistically pleiotropic traits may provide a significant and underappreciated contribution to the maintenance of natural inversion polymorphism.
Chromosomal inversions are structural mutations that can play a prominent role in adaptation and speciation. Inversions segregating across species boundaries (trans-species inversions) are often taken as evidence for ancient balancing selection or adaptive introgression, but can also be due to incomplete lineage sorting. Using whole-genome resequencing data from 18 populations of 11 recognized munia species in the genus Lonchura (N = 176 individuals), we identify four large para- and pericentric inversions ranging in size from 4 to 20 Mb. All four inversions cosegregate across multiple species and predate the numerous speciation events associated with the rapid radiation of this clade across the prehistoric Sahul (Australia, New Guinea) and Bismarck Archipelago. Using coalescent theory, we infer that trans-specificity is improbable for neutrally segregating variation despite substantial incomplete lineage sorting characterizing this young radiation. Instead, the maintenance of all three autosomal inversions (chr1, chr5, and chr6) is best explained by selection acting along ecogeographic clines not observed for the collinear parts of the genome. In addition, the sex chromosome inversion largely aligns with species boundaries and shows signatures of repeated positive selection for both alleles. This study provides evidence for trans-species inversion polymorphisms involved in both adaptation and speciation. It further highlights the importance of informing selection inference using a null model of neutral evolution derived from the collinear part of the genome.
Inversions restrict recombination when heterozygous with standard arrangements, but often have few noticeable phenotypic effects. Nevertheless, there are several examples of inversions that can be maintained polymorphic by strong selection under laboratory conditions. A long-standing model for the source of such selection is divergence between arrangements with respect to recessive or partially recessive deleterious mutations, resulting in a selective advantage to heterokaryotypic individuals over homokaryotypes. This paper uses a combination of analytical and numerical methods to investigate this model, for the simple case of an autosomal inversion with multiple independent nucleotide sites subject to mildly deleterious mutations. A complete lack of recombination in heterokaryotypes is assumed, as well as constancy of the frequency of the inversion over space and time. It is shown that a significantly higher mutational load will develop for the less frequent arrangement. A selective advantage to heterokaryotypes is only expected when the two alternative arrangements are nearly equal in frequency, so that their mutational loads are very similar in size. The effects of some Drosophila pseudoobscura polymorphic inversions on fitness traits seem to be too large to be explained by this process, although it may contribute to some of the observed effects. Several population genomic statistics can provide evidence for signatures of a reduced efficacy of selection associated with the rarer of two arrangements, but there is currently little published data that are relevant to the theoretical predictions.
Fluctuations in the strength and direction of natural selection through time are a ubiquitous feature of life on Earth. One evolutionary outcome of such fluctuations is adaptive tracking, wherein populations rapidly adapt from standing genetic variation. In certain circumstances, adaptive tracking can lead to the long-term maintenance of functional polymorphism despite allele frequency change due to selection. Although adaptive tracking is likely a common process, we still have a limited understanding of aspects of its genetic architecture and its strength relative to other evolutionary forces such as drift. Drosophila melanogaster living in temperate regions evolve to track seasonal fluctuations and are an excellent system to tackle these gaps in knowledge. By sequencing orchard populations collected across multiple years, we characterized the genomic signal of seasonal demography and identified that the cosmopolitan inversion In(2L)t facilitates seasonal adaptive tracking and shows molecular footprints of selection. A meta-analysis of phenotypic studies shows that seasonal loci within In(2L)t are associated with behavior, life history, physiology, and morphological traits. We identify candidate loci and experimentally link them to phenotype. Our work contributes to our general understanding of fluctuating selection and highlights the evolutionary outcome and dynamics of contemporary selection on inversions.
Inversions are structural mutations that reverse the sequence of a chromosome segment and reduce the effective rate of recombination in the heterozygous state. They play a major role in adaptation, as well as in other evolutionary processes such as speciation. Although inversions have been studied since the 1920s, they remain difficult to investigate because the reduced recombination conferred by them strengthens the effects of drift and hitchhiking, which in turn can obscure signatures of selection. Nonetheless, numerous inversions have been found to be under selection. Given recent advances in population genetic theory and empirical study, here we review how different mechanisms of selection affect the evolution of inversions. A key difference between inversions and other mutations, such as single nucleotide variants, is that the fitness of an inversion may be affected by a larger number of frequently interacting processes. This considerably complicates the analysis of the causes underlying the evolution of inversions. We discuss the extent to which these mechanisms can be disentangled, and by which approach.
Abstract
Inversions often play key roles in adaptation and speciation, but the processes that direct their evolution are obscured by the characteristic that makes them so unique (reduced recombination between arrangements). In this review, we examine how different mechanisms can impact inversion evolution, weaving together both theoretical and empirical studies. We emphasize that most patterns are overdetermined (i.e. can be caused by multiple processes), but we highlight new technologies that provide a path forward towards disentangling these mechanisms.
In all species, new chromosomal inversions are constantly being formed by spontaneous rearrangement and then stochastically eliminated from natural populations. In Drosophila, when new chromosomal inversions overlap with a pre-existing inversion in the population, their rate of elimination becomes a function of the relative size, position, and linkage phase of the gene rearrangements. These altered dynamics result from complex meiotic behavior wherein overlapping inversions generate asymmetric dyads that cause both meiotic drive/drag and segmental aneuploidy. In this context, patterns in rare inversion polymorphisms of a natural population can be modeled from the fundamental genetic processes of forming asymmetric dyads via crossing-over in meiosis I and preferential segregation from asymmetric dyads in meiosis II. Here, a mathematical model of crossover-dependent female meiotic drive is developed and parameterized with published experimental data from Drosophila melanogaster laboratory constructs. This mechanism is demonstrated to favor smaller, distal inversions and accelerate the elimination of larger, proximal inversions. Simulated sampling experiments indicate that the paracentric inversions directly observed in natural population surveys of Drosophila melanogaster are a biased subset that both maximizes meiotic drive and minimizes the frequency of lethal zygotes caused by this cytogenetic mechanism. Incorporating this form of selection into a population genetic model accurately predicts the shift in relative size, position, and linkage phase for rare inversions found in this species. The model and analysis presented here suggest that this weak form of female meiotic drive is an important process influencing the genomic distribution of rare inversion polymorphisms.
The strong reduction in the frequency of recombination in heterozygotes for an inversion and a standard gene arrangement causes the arrangements to become partially isolated genetically, resulting in sequence divergence between them and changes in the levels of neutral variability at nucleotide sites within each arrangement class. Previous theoretical studies on the effects of inversions on neutral variability have either assumed that the population is panmictic or that it is divided into two populations subject to divergent selection. Here, the theory is extended to a model of an arbitrary number of demes connected by migration, using a finite island model with the inversion present at the same frequency in all demes. Recursion relations for mean pairwise coalescent times are used to obtain simple approximate expressions for diversity and divergence statistics for an inversion polymorphism at equilibrium under recombination and drift, and for the approach to equilibrium following the sweep of an inversion to a stable intermediate frequency. The effects of an inversion polymorphism on patterns of linkage disequilibrium are also examined. The reduction in effective recombination rate caused by population subdivision can have significant effects on these statistics. The theoretical results are discussed in relation to population genomic data on inversion polymorphisms, with an emphasis on Drosophila melanogaster. Methods are proposed for testing whether or not inversions are close to recombination-drift equilibrium, and for estimating the rate of recombinational exchange in heterozygotes for inversions; difficulties involved in estimating the ages of inversions are also discussed.
The strong reduction in the frequency of recombination in heterozygotes for an inversion and a standard gene arrangement causes inversion and standard haplotypes to become partially isolated, resulting in sequence divergence between them and in changes in the levels of neutral variability at nucleotide sites within each arrangement class, as is also the case for other types of balanced polymorphisms. Previous theoretical studies have either assumed that the population is panmictic or that it is divided into two populations subject to divergent selection. Here, the theory is extended to a model of an arbitrary number of demes connected by migration, using a finite island model. Recursion relations for mean pairwise coalescent times are used to obtain simple approximate expressions for diversity and divergence statistics relevant to an inversion polymorphism at equilibrium under recombination and drift, and for the approach to equilibrium following the sweep of an inversion to a stable intermediate frequency. The effects of an inversion polymorphism on patterns of linkage disequilibrium are also examined. The reduction in effective recombination rate caused by population subdivision can have significant effects on these statistics, and hence on estimates of the ages of inversions. The theoretical results are discussed in relation to population genomic data on inversion polymorphisms, with an emphasis on Drosophila melanogaster. Methods are proposed for testing whether or not inversions are close to recombination-drift equilibrium, and for estimating the rate of recombinational exchange in heterozygotes for inversions; difficulties involved in estimating the ages of inversions are also discussed.
Across many species where inversions have been implicated in local adaptation, genomes often evolve to contain multiple, large inversions that arise early in divergence. Why this occurs has yet to be resolved. To address this gap, we built forward-time simulations in which inversions have flexible characteristics and can invade a metapopulation undergoing spatially divergent selection for a highly polygenic trait. In our simulations, inversions typically arose early in divergence, captured standing genetic variation upon mutation, and then accumulated many small-effect loci over time. Under special conditions, inversions could also arise late in adaptation and capture locally adapted alleles. Polygenic inversions behaved similarly to a single supergene of large effect and were detectable by genome scans. Our results show that characteristics of adaptive inversions found in empirical studies (e.g. multiple large, old inversions that are FST outliers, sometimes overlapping with other inversions) are consistent with a highly polygenic architecture, and inversions do not need to contain any large-effect genes to play an important role in local adaptation. By combining a population and quantitative genetic framework, our results give a deeper understanding of the specific conditions needed for inversions to be involved in adaptation when the genetic architecture is polygenic. This article is part of the theme issue ‘Genomic architecture of supergenes: causes and evolutionary consequences’.
Local adaptation leads to differences between populations within a species. In many systems, similar environmental contrasts occur repeatedly, sometimes driving parallel phenotypic evolution. Understanding the genomic basis of local adaptation and parallel evolution is a major goal of evolutionary genomics. It is now known that by preventing the break-up of favourable combinations of alleles across multiple loci, genetic architectures that reduce recombination, like chromosomal inversions, can make an important contribution to local adaptation. However, little is known about whether inversions also contribute disproportionately to parallel evolution. Our aim here is to highlight this knowledge gap, to showcase existing studies, and to illustrate the differences between genomic architectures with and without inversions using simple models. We predict that by generating stronger effective selection, inversions can sometimes speed up the parallel adaptive process or enable parallel adaptation where it would be impossible otherwise, but this is highly dependent on the spatial setting. We highlight that further empirical work is needed, in particular to cover a broader taxonomic range and to understand the relative importance of inversions compared to genomic regions without inversions. This article is part of the theme issue ‘Genomic architecture of supergenes: causes and evolutionary consequences’.
The relationships between adaptive evolution, phenotypic plasticity, and canalization remain incompletely understood. Theoretical and empirical studies have made conflicting arguments on whether adaptive evolution may enhance or oppose the plastic response. Gene regulatory traits offer excellent potential to study the relationship between plasticity and adaptation, and they can now be studied at the transcriptomic level. Here we take advantage of three closely-related pairs of natural populations of Drosophila melanogaster from contrasting thermal environments that reflect three separate instances of cold tolerance evolution. We measure the transcriptome-wide plasticity in gene expression levels and alternative splicing (intron usage) between warm and cold laboratory environments. We find that suspected adaptive changes in both gene expression and alternative splicing tend to neutralize the ancestral plastic response. Further, we investigate the hypothesis that adaptive evolution can lead to decanalization of selected gene regulatory traits. We find strong evidence that suspected adaptive gene expression (but not splicing) changes in cold-adapted populations are more vulnerable to the genetic perturbation of inbreeding than putatively neutral changes. We find some evidence that these patterns may reflect a loss of genetic canalization accompanying adaptation, although other processes including hitchhiking recessive deleterious variants may contribute as well. Our findings augment our understanding of genetic and environmental effects on gene regulation in the context of adaptive evolution.
The effects of selection on variability at linked sites have an important influence on levels and patterns of within-population variation across the genome. Most theoretical models of these effects have assumed that selection is sufficiently strong that allele frequency changes at the loci concerned are largely deterministic. These models have led to the conclusion that directional selection for selectively favorable mutations, or against recurrent deleterious mutations, reduces nucleotide site diversity at linked neutral sites. Recent work has shown, however, that fixations of weakly selected mutations, accompanied by significant stochastic changes in allele frequencies, can sometimes cause higher diversity at linked sites when compared with the effects of fixations of neutral mutations. The present paper extends this work by deriving approximate expressions for the mean conditional times to fixation and loss of mutations subject to selection, and analysing the conditions under which selection increases rather than reduces these times. Simulations are used to examine the relations between diversity at a neutral site and the fixation and loss times of mutations at a linked site that is subject to selection. It is shown that the long-term level of neutral diversity can be increased over the purely neutral value by recurrent fixations and losses of linked, weakly selected dominant or partially dominant favorable mutations, or linked recessive or partially recessive deleterious mutations. The results are used to examine the conditions under which associative overdominance, as opposed to background selection, is likely to operate.
Population genetics seeks to illuminate the forces shaping genetic variation, often based on a single snapshot of genomic variation. However, utilizing multiple sampling times to study changes in allele frequencies can help clarify the relative roles of neutral and non-neutral forces on short time scales. This study compares whole-genome sequence variation of recently collected natural population samples of Drosophila melanogaster against a collection made approximately 35 years prior from the same locality—encompassing roughly 500 generations of evolution. The allele frequency changes between these time points would suggest a relatively small local effective population size on the order of 10,000, significantly smaller than the global effective population size of the species. Some loci display stronger allele frequency changes than would be expected anywhere in the genome under neutrality—most notably the tandem paralogs Cyp6a17 and Cyp6a23, which are impacted by structural variation associated with resistance to pyrethroid insecticides. We find a genome-wide excess of outliers for high genetic differentiation between old and new samples, but a larger number of adaptation targets may have affected SNP-level differentiation versus window differentiation. We also find evidence for strengthening latitudinal allele frequency clines: northern-associated alleles have increased in frequency by an average of nearly 2.5% at SNPs previously identified as clinal outliers, but no such pattern is observed at random SNPs. This project underscores the scientific potential of using multiple sampling time points to investigate how evolution operates in natural populations, by quantifying how genetic variation has changed over ecologically relevant timescales.
Drosophila melanogaster is a leading model in population genetics and genomics, and a growing number of whole-genome datasets from natural populations of this species have been published over the last years. A major challenge is the integration of disparate datasets, often generated using different sequencing technologies and bioinformatic pipelines, which hampers our ability to address questions about the evolution of this species. Here we address these issues by developing a bioinformatics pipeline that maps pooled sequencing (Pool-Seq) reads from D. melanogaster to a hologenome consisting of fly and symbiont genomes and estimates allele frequencies using either a heuristic (PoolSNP) or a probabilistic variant caller (SNAPE-pooled). We use this pipeline to generate the largest data repository of genomic data available for D. melanogaster to date, encompassing 271 previously published and unpublished population samples from over 100 locations in > 20 countries on four continents. Several of these locations have been sampled at different seasons across multiple years. This dataset, which we call Drosophila Evolution over Space and Time (DEST), is coupled with sampling and environmental meta-data. A web-based genome browser and web portal provide easy access to the SNP dataset. We further provide guidelines on how to use Pool-Seq data for model-based demographic inference. Our aim is to provide this scalable platform as a community resource which can be easily extended via future efforts for an even more extensive cosmopolitan dataset. Our resource will enable population geneticists to analyze spatio-temporal genetic patterns and evolutionary dynamics of D. melanogaster populations in unprecedented detail.
Several recent publications have stated that epistatic fitness interactions cause the fixation of inversions that suppress recombination among the loci involved. Under this type of selection, however, the suppression of recombination in an inversion heterozygote can create a form of heterozygote advantage, which prevents the inversion from becoming fixed by selection. This process has been explicitly modelled by previous workers.
To advance our understanding of adaptation to temporally varying selection pressures, we identified signatures of seasonal adaptation occurring in parallel among Drosophila melanogaster populations. Specifically, we estimated allele frequencies genome-wide from flies sampled early and late in the growing season from 20 widely dispersed populations. We identified parallel seasonal allele frequency shifts across North America and Europe, demonstrating that seasonal adaptation is a general phenomenon of temperate fly populations. Seasonally fluctuating polymorphisms are enriched in large chromosomal inversions and we find a broad concordance between seasonal and spatial allele frequency change. The direction of allele frequency change at seasonally variable polymorphisms can be predicted by weather conditions in the weeks prior to sampling, linking the environment and the genomic response to selection. Our results suggest that fluctuating selection is an important evolutionary force affecting patterns of genetic variation in Drosophila.
Patterns of variation and evolution at a given site in a genome can be strongly influenced by the effects of selection at genetically linked sites. In particular, the recombination rates of genomic regions correlate with their amount of within-population genetic variability, the degree to which the frequency distributions of DNA sequence variants differ from their neutral expectations, and the levels of adaptation of their functional components. We review the major population genetic processes that are thought to lead to these patterns, focusing on their effects on patterns of variability: selective sweeps, background selection, associative overdominance, and Hill–Robertson interference among deleterious mutations. We emphasize the difficulties in distinguishing among the footprints of these processes and disentangling them from the effects of purely demographic factors such as population size changes. We also discuss how interactions between selective and demographic processes can significantly affect patterns of variability within genomes.