Genetics and Slavic languages (Brill Encyclopedia of Slavic Languages 2020)

Encyclopedia of Slavic Languages and Linguistics Online
Genetics and Slavic Languages
(6,648 words)
West, East, and South Slavs occupy a vast region within
Europe, but culturally they form a group of closely related
peoples speaking languages that diverged only recently
(the Slavic language group). It is believed that Slavic
languages had spread in the early medieval period (after
500 CE) due to the dispersion of their bearers from
east-central Europe (see Slavic migrations). The genetic profile
of today’s Slavs, however, reveals patterns that do not
follow linguistic ones. In particular, there is a substantial
closeness between West and East Slavs and Baltic speakers,
whereas South Slavs are clearly differentiated and
genetically akin to other non-Slavic Balkan peoples. This
suggests that the distribution of Slavic languages was
chiey due to the cultural assimilation (via language shift)
of pre-Slavic people rather than to the replacement of
autochthonous populations by Slavs. On the other hand, a
high number of long genomic segments shared between
East Europeans including Slavic and non-Slavic peoples
suggests that there were migrations across East Europe in the medieval period, i.e., a demic
component – an actual movement of people – in the establishment of the Slavic community.
Collectively, genetic evidence suggests that the genesis of Slavs has been a complicated process
with a predominant cultural component.
Groups of people have lengthy histories of migrating, interacting with one another, growing or
declining in size, adapting to different environments, and being exposed to diseases. All of these
events have left traces in their genomes. Today, the availability of new analytical tools from the
discipline of evolutionary genetics makes it easier to study the demographic history of a
population via comparisons between the genetic structure of modern and ancient humans.
Coupled with other approaches to the study of the human past, evolutionary genetics should in the
future greatly improve our understanding of the genetic and cultural diversity of today’s human
communities.
Slavic-speaking populations occupy nearly half of the European continent and constitute
around a third of its population, thus making them one of the most numerous and widespread
language groups in today’s Europe (Simons and Fennig 2018). Moreover, owing to economic and
political circumstances beginning in the late medieval period, Russian, an East Slavic language,
is spoken today far beyond the historical area of its development in the central parts of European
Russia: in the Volga and Ural regions, the North Caucasus, Siberia, and the Far East. Linguistically, the
Slavs are divided into groups of West and East Slavs, who occupy east-central Europe, and South
Slavs, located in the eastern Alps and the Balkan Peninsula. Geographically, the South Slavs are
separated from other Slavs by the Carpathian Mountains and are linguistically distinct from non-
Slavic Hungarians, Austrians, and Romanians.
The historical establishment of the Slavic community has been a focus of historical, linguistic,
and archaeological studies. Historical and linguistic evidence are in agreement that the Slavic
expansion took place during the early medieval period, presumably from east-central Europe.
While it is hard to estimate how significant this expansion was numerically, it eventually led to a
reshaping of the linguistic landscape of Europe (the Slavicization of Europe). Slavic languages
spread eastward to territories occupied by Baltic, Turkic, and Uralic speakers; westward, where
Slavs interacted with speakers of Germanic languages; and in the Balkans, where they
encountered a diverse ethnic background (Barford 2001; Curta 2001; Heather 2010).
Archaeological data, however, are more ambiguous: it is still debated whether the spread of the
Slavs can be linked to a particular east and central European culture of the Iron Age and early
medieval period (Curta 2001; Sedov 1979; 2002).
At least two diferent models underlie the spread of Slavic languages: one involves actual
movements of people and a substantial replacement of the autochthonous population (the
prevalence of a demic component); the second involves the spread of Slavic languages without
large-scale movements of Slavic language bearers (the prevalence of a cultural component).
The genetic structure of today’s Slavs is well characterized at intra-Slavic and intra-European
scales. Major aspects of their demographic history, including a lengthy pre-Slavic period, have
been reconstructed. This genetic evidence allows testing both models: in extreme cases, demic
difusion would lead to Slavic speakers being genetically more similar to one another than to
non-Slavic neighboring populations, whereas cultural spread would not afect genetic structure,
and Slavs would not be expected to difer genetically from their non-Slavic neighbors.
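The contrast between the two extreme cases can be expressed as a comparison of average genetic distances. The sketch below is only illustrative: the population labels, the distance values, and the function names are hypothetical and are not taken from any of the studies cited here, and a real analysis would rely on measured distances and formal statistical tests.

```python
from itertools import combinations, product
from statistics import mean


def mean_within(dist, group):
    """Average distance over all pairs of populations inside one group."""
    return mean(dist[frozenset(pair)] for pair in combinations(group, 2))


def mean_between(dist, group_a, group_b):
    """Average distance over all pairs with one population from each group."""
    return mean(dist[frozenset(pair)] for pair in product(group_a, group_b))


def closer_model(dist, slavic, neighbors):
    """Say which extreme model the distances resemble more.

    'demic': Slavic speakers are closer to one another than to neighbors.
    'cultural': Slavic speakers are no closer to one another than to neighbors.
    """
    within = mean_within(dist, slavic)
    between = mean_between(dist, slavic, neighbors)
    return "demic" if within < between else "cultural"


# Hypothetical toy distances (arbitrary units), invented for illustration only.
slavic = ["S_north", "S_south"]
neighbors = ["N_north", "N_south"]
toy_dist = {
    frozenset(("S_north", "S_south")): 0.8,
    frozenset(("S_north", "N_north")): 0.1,
    frozenset(("S_south", "N_south")): 0.1,
    frozenset(("S_north", "N_south")): 0.8,
    frozenset(("S_south", "N_north")): 0.8,
}

print(closer_model(toy_dist, slavic, neighbors))
```

With these invented values the two Slavic-speaking toy populations are farther from each other than from their immediate neighbors, so the function returns "cultural"; real populations fall between the two extremes.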
This article provides an overview of the genetic structure and genetic history of Slavic-speaking
populations. It begins with a discussion of evidence from variation at uniparental genetic loci
(mitochondrial DNA and the nonrecombining region of the Y chromosome), followed by biparental
human genome variation, and it closes with a synopsis of available data from ancient human
DNA studies. Demographic models that are compatible with the observed genetic structuring of
today’s Slavs are also discussed.
Uniparental genetic diversity of today’s Slavs illuminates pre-Slavic history
Mitochondrial DNA (mtDNA) and the nonrecombining region of the Y chromosome (NRY)
stand out from the rest of the human genome: each is transmitted exclusively via females and
males respectively, thus carrying information on only one genealogical line. Neither locus
recombines, i.e., they do not exchange their genomic material with other parts of the genome,
hence they are inherited as a single locus from a parent to an offspring. Changes in the DNA
sequences occur and accumulate further only due to mutation processes. These peculiarities
allow researchers to classify all worldwide types of mtDNAs and NRYs into groups called
haplogroups, and, applying a phylogenetic approach, to reconstruct their hierarchical order and
estimate their age, represented in phylogenetic trees (van Oven and Kayser 2009). Topology and
age estimates of the tree branches (haplogroups), coupled with their frequencies, will reflect
matrilineal (mtDNA) and patrilineal (NRY) histories of human populations, respectively, e.g.,
migration routes, admixture, and population dynamics. It is important to remember that
mtDNA and NRY represent only a tiny portion of the human genome, hence they explain a small
part of genetic variation. The genetic history of a population cannot be drawn solely from the
variation of either of the loci separately, and information from the whole human genome (22
pairs of autosomes, sex chromosomes, and mtDNA) should be used.
Traditionally, mtDNA and NRY population studies included frequency-based analyses of
haplogroups (haplogroup status was established typically from a restricted number of
informative nucleotide positions of mtDNA and NRY, respectively) and haplotype-based
analyses (information from complete sequences of mtDNA, or short tandem repeats of NRY,
were used). The mtDNA pool of the European region includes haplogroups N1, W, X, JT, R0
(including R0a, H, and V), and U, which are broadly classified as the Western Eurasian type
(Underhill and Kivisild 2007). The frequencies of these haplogroups are comparable among
geographically remote European populations, and regional differences in the matrilineal gene
pool emerge only at a higher level of phylogenetic analysis involving complete mtDNA
sequences. NRY in Europe is represented by haplogroups E3b, G, I1, I2, J2, N3, K2, R1b, and R1a
(Underhill and Kivisild 2007). Unlike mtDNA, the structure of the NRY gene pool of European
populations is well recognized even at a low level of phylogenetic resolution (Rosser et al. 2000).
The matrilineal genetic variation of Slavs
Mitochondrial DNA composition in contemporary Slavs is represented by haplogroups that
belong to the West Eurasian type – almost 90% of diversity falls to R0 (including H and V), U, T, J,
N1, and their sub-haplogroups. The most frequent is haplogroup H, comprising around 40%, and
is represented by the variety of its subtypes, with the most frequent being H1b and H2a (Loogväli
et al. 2004). The second major maternal lineage is haplogroup U with predominant U5a, U5b,
and U4 (making up to 20%), followed by haplogroups J and T (each making up around 10%;
Kushniarevich et al. 2015). The mtDNA pool of all Slavs also harbors haplogroups – typically not
more than 10% in total – which are classified as East Eurasian (A, C, D, G, F, Z; Malyarchuk et al.
2008c) or African (L; Malyarchuk et al. 2008a). The presence of those mtDNA haplogroups in
minor amounts is characteristic for European populations in general.
At the inter-Slavic level of mtDNA diversity, East, West, and South Slavs demonstrate only slight
variation along the south–north line. East Slavs and, specifically, Russians from the northern
region of European Russia, are the most differentiated from the rest of the Slavs (Morozova et al.
2012). At the intra-European level of mtDNA variation, Slavs overlap with their
non-Slavic-speaking geographic neighbors from central, east, and southeast Europe and are differentiated
from western Europeans and from the populations of the Volga region (Morozova et al. 2012;
Kushniarevich et al. 2015).
Studies that used complete mtDNA sequences have revealed some region-specific haplogroups
or those that are “enriched” in the Slavic-speaking populations (Davidovic et al. 2017; Malyarchuk
et al. 2010; 2008b; Mielnik-Sikorska et al. 2013; Šarac et al. 2014). However, the age estimates of
those haplogroups in West and East Slavs, e.g., H5a1, U4a2, U5a2a, and U5a2b1, suggest their
presence in east-central Europe long before the Slavic expansion according to historical
information.
Hence, the structuring of the mtDNA pool of modern-day Slavs does not support the view that
Slavic-speaking populations share a matrilineal origin that would differentiate them from
non-Slavic central-east and southeast Europeans. However, given the low level of mtDNA pool
differentiation across east Europe in general, mtDNA alone might not be sufficiently informative
on recent demographic history.
The patrilineal genetic diversity of Slavs
Variation of the NRY types in Slavs is represented by the array of east-central and southeast
European haplogroups among which two – R1a-Z282 and I2a-P37 – predominate (Rosser et al.
2000; Rootsi et al. 2004; Underhill et al. 2015). In addition to those two basal components, the
patrilineal gene pool of East Slavs harbors haplogroup N3-Tat at considerable frequencies, while
this haplogroup is almost absent in the genomes of West and South Slavs (Balanovsky et al. 2008;
Ilumäe et al. 2016; Kushniarevich et al. 2015; 2013). West and South Slavs in turn incorporate
substantial frequencies of haplogroups R1b-M269 and E3-M35, respectively (Pericić et al. 2005;
Rębała et al. 2013). Along with those primary components, the gene pool of all Slavic speakers
bears also haplogroups that are less frequent but typical for the European region in general: G-
M201, I1-M253, J-M172, T-M70, and Q-M242 (Balanovsky et al. 2008; Kushniarevich et al. 2015;
2013; Pericić et al. 2005; Rębała et al. 2013).
The paternal gene pool of modern-day Slavs reveals a pronounced internal structure that largely
follows the north–south axis of their geographic origin. Most East Slavs, occupying the vast
region within the East European Plain (Russians from central-southern regions of European
Russia, Belarusians, and Ukrainians), form a group of populations that are very close to one
another in their NRY variation (Balanovsky et al. 2008; Kushniarevich et al. 2015). Notable is the
diferentiation of today’s Russian-speaking peoples living in the northern region of east Europe.
The lower frequency of haplogroup R1a-SRY1532 and the higher share of haplogroup N3-Tat in
their genomes bring them closer to their Finnic-speaking neighbors (Balanovsky et al. 2008;
Ilumäe et al. 2016). Compared to the three East Slavic-speaking populations, the West Slavs
demonstrate a higher level of internal differentiation. In particular, whereas Poles and, to a lesser
extent, Slovaks, remain close to the East Slavic group, Czechs are shifted toward Western
Europeans due to the higher incidence of haplogroup R1b-M269 and lower R1a-M17 (Rębała et al.
2013; Woźniak et al. 2010). The Sorbs, a West Slavic-speaking community, although residing
beyond the western fringe of the current “Slavic area,” in the territory of present-day Germany,
demonstrate an extremely high frequency of the haplogroup R1a-M17, which brings them close
to Poles and East Slavs (Rębała et al. 2013). South Slavs are found on the “southern” tip of the NRY
variation of Slavic speakers. There is an intragroup differentiation between populations of the
western (Slovenes, Croats, and Bosnians) and eastern (Macedonians and Bulgarians) regions of
the Balkan Peninsula, with Serbians placed in between (Pericić et al. 2005; Kushniarevich et al.
2015).
In most cases, the NRY variation between Slavic and non-Slavic speakers within the same
geographic regions is in gradient mode, with no sharp boundaries, for example, between
Hungarians and Romanians and Slavs to the north and south, and between East Slavs and their
Baltic-speaking neighbors, the Latvians and Lithuanians (Kushniarevich et al. 2015). Within the
Balkan region, non-Slavic-speaking Albanians are found to be close to populations of the
southern and eastern parts of the peninsula – Macedonians, Greeks, and Bulgarians – mirroring
their geographic position (Ferri et al. 2010; Kovacevic et al. 2014; Pericić et al. 2005). However, a
clear diferentiation in the NRY composition that coincides with the linguistic border exists
between modern-day Poles and Germans (Kayser et al. 2005). It has been suggested that both
the early stages of NRY shaping in medieval Slavs (Rębała et al. 2013) and the post-World War II
resettlement of people (Kayser et al. 2005) have contributed to this differentiation.
Altogether, the NRY variation of Slavs forms a gradient that follows to a large extent their
geographic position. Within this genetic gradient, East Slavs together with modern-day Poles and
Sorbs unite in a close group, whereas the rest of the West and South Slavs are more differentiated
and intermingled with their non-Slavic surroundings. The geographic distributions of the major
eastern European NRY haplogroups (R1a-Z282, I2a-P37) overlap with the area occupied by the
present-day Slavs to a great extent, and it might be tempting to consider both haplogroups as
Slavic-specic patrilineal lineages. Although the Slavic expansion could have contributed to the
spread of the NRY haplogroup R1a-Z282 toward the south, however, available age estimates
suggest that the current diversity within it predates the Slavic expansion (Karmin et al. 2015).
This agrees with haplogroup R1a-Z282 being a part of a pre-Slavic substrate of eastern Europe;
additional full NRY sequences of this haplogroup of east Europeans of various linguistic
aliations are needed to resolve its relation to the historic Slavic migration.
Genome-wide variation reveals both ancient and recent layers of the
demographic history of Slavs
The human genome consists of around three billion nucleotides, millions of which are used
nowadays in evolutionary studies due to advancements in technologies of whole genome
genotyping and next-generation sequencing (Pagani et al. 2016). Methods based on the analysis
of single nucleotide positions’ frequencies (e.g., principal component analysis, model-based
cluster analysis; Alexander et al. 2009; Patterson et al. 2006) are informative about the deep
history of human populations, while analyses of long segments of the genome are sensitive to
more recent demographic events (Browning and Browning 2012; Hellenthal et al. 2014; Ralph
and Coop 2013).
The analysis of whole genome variation among contemporary European populations revealed
that the genetic structuring within Europe mirrors its geography to a very great extent
(Novembre et al. 2008). Slavic-speaking populations, unsurprisingly, occupy the eastern edge of
this European genomic variation (Novembre et al. 2008). Intra-Slavic genomic structuring is
similar to that observed in the NRY: Slavs form several groups that are distributed along the
north–south line of their geographic origin. Most of the East Slavs (Ukrainians, Belarusians, and
Russians from southern central regions) along with West Slavic Poles and Sorbs group together,
whereas Russians from northern regions stand apart from the group but in genetic vicinity to
their Finnic-speaking geographic neighbors, Veps, Karelians, and Finns (Khrunin et al. 2013;
Kushniarevich et al. 2015). The South Slavs form a group with an internal genetic continuum
along a northwest–southeast line with Slovenes, Croatians, and Bosnians found on the western,
and Macedonians and Bulgarians on its eastern, periphery, with Serbs and Montenegrins being
located in between (Kovacevic et al. 2014; Kushniarevich et al. 2015). West Slavic speakers,
Czechs and Slovaks, stand in between the two groups, which is in agreement with their
geographic position (Kushniarevich et al. 2015).
At the European level of genomic variation, Slavic-speaking populations are intermingled with
their non-Slavic geographic neighbors, being thus incorporated into genetic clines in Europe.
East Slavs are close in their genetic profile to Baltic-speaking Latvians and Lithuanians and
Finnic-speaking Estonians, as well as to Mordvins (Mordvinian speakers of the Finno-Ugric
language group; Kushniarevich et al. 2015). Slovaks and, to a greater extent, Czechs (West Slavs)
share similar patterns of genomic variation with western Europeans, while the group of South
Slavs does so with Hungarians and Romanians in the north and Greeks and Albanians toward
the south (Kushniarevich et al. 2015). Kovacevic et al. revealed that genetic similarity crosses the
linguistic and religious borders in the western Balkan region (Kovacevic et al. 2014). The study
has shown that Albanians from Kosovo – Kosovars – although linguistically belonging to a
separate Indo-European branch (Albanian), are an inherent part of the western Balkan genetic
continuum, where they are found on its eastern fringe and overlap with Macedonians and
Greeks (Kovacevic et al. 2014).
A model-based cluster analysis dissects a genome of an individual as a composition of k
ancestral components (Alexander et al. 2009). When applied to modern Slavs in the context of
worldwide variation, two major ancestral components – one spread in northeast Europe and
declining toward south Europe, and the other one spread in southern Europe/Caucasus/Middle
East and reducing northward – were shown to be basal to genomes of central-east and southeast
Europeans including all Slavs (Kushniarevich et al. 2015). In addition to these, two minor
ancestral components contribute differentially to modern-day Slavic genomes. East Slavs and
predominantly Russians from central and northern regions bear a component that is widely
spread from Siberia to the populations of the Volga region and south Baltics (Kushniarevich et al.
2015). In turn, South Slavs and specifically those close to the eastern region of the Balkan
Peninsula (Macedonians and Bulgarians) have minute amounts of the component spread in
South Asia/Caucasus/Near and Middle East (Kushniarevich et al. 2015).
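The idea of describing an individual genome as a composition of k ancestral components can be illustrated for a single genomic position. The sketch below assumes the binomial likelihood commonly used in model-based clustering; the function names and any values passed to them are hypothetical, and actual software estimates the ancestry proportions and the component allele frequencies jointly across many thousands of positions.

```python
from math import comb, log


def expected_allele_freq(q, p):
    """Allele frequency expected for one individual at one position.

    q: ancestry proportions of the individual in each of k components
       (assumed to be non-negative and to sum to 1).
    p: frequency of the counted allele in each of the k components.
    """
    return sum(qk * pk for qk, pk in zip(q, p))


def genotype_log_likelihood(genotype, q, p):
    """Log-probability of observing 0, 1, or 2 copies of the allele.

    Assumes the two allele copies are drawn independently with the
    mixture frequency, which must lie strictly between 0 and 1.
    """
    f = expected_allele_freq(q, p)
    prob = comb(2, genotype) * f**genotype * (1 - f) ** (2 - genotype)
    return log(prob)
```

Summing such log-likelihoods over positions and individuals gives the quantity that a clustering program would maximize when it assigns each genome its shares of the k components.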
Thus, the genetic structure of today’s Slavs, as revealed from the frequency spectrum of
thousands of single genomic positions, is an inherent part of a structure of east-central and
southeast Europeans irrespective of the language that they speak. A clear differentiation within
Slavs (e.g., between East and South), on the one hand, and a similarity between Slavs and their
non-Slavic immediate neighbors (e.g., East Slavs and Baltic speakers; South Slavs and
Romanians), on the other, do not support the presence of a pronounced shared “Slavic-specific”
genomic component.
Because Slavs as a linguistic community developed mostly during historical time, insight into the
younger (i.e., more recent) layer of the human genome may help to address the problem of the
Slavic genesis. Recent genetic history can be approached through an analysis of genomic
segments that are shared between two individuals through a common ancestor in the past.
Those segments are called segments identical by descent (ibd). The length of an ibd segment is
an approximation of its age – as a rule, longer segments correspond to younger ages and vice
versa; the number of the segments reflects the extent of relatedness among populations
(Browning and Browning 2012).
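The inverse relationship between segment length and age can be made concrete with a common rule of thumb, given here as an assumption rather than as the method of the cited studies: a segment inherited from a common ancestor g generations back has passed through about 2g meioses, so its expected length is roughly 100/(2g) centimorgans (cM). The function names and the default generation time of 30 years are illustrative choices.

```python
def expected_ibd_length_cm(generations):
    """Rough expected length (cM) of a segment shared via an ancestor
    who lived the given number of generations ago (about 2g meioses)."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / (2 * generations)


def approx_generations(length_cm):
    """Crude inversion of the rule of thumb: generations back to the
    common ancestor suggested by a segment of the given length."""
    if length_cm <= 0:
        raise ValueError("length_cm must be positive")
    return 100.0 / (2 * length_cm)


def approx_age_years(length_cm, years_per_generation=30):
    """Convert the generation estimate to years; the generation time
    is an assumed value, not one taken from the cited studies."""
    return approx_generations(length_cm) * years_per_generation
```

Under these assumptions, a segment of about 1 cM corresponds to roughly 50 generations, or about 1,500 years at 30 years per generation. Individual segments scatter widely around such averages, so only the length distribution of many segments is informative.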
In a study by Ralph and Coop (2013), the distribution of such ibd segments among different
European populations was analyzed. The authors found that Europeans are connected to one
another through numerous ancestors in the past. They demonstrated that Eastern Europeans,
both Slavic and non-Slavic, stand out among others by sharing a particularly high number of ibd
segments, and these segments were dated to the period of 1000–2000 years ago (Ralph and Coop
2013). It has been suggested that such a high level of recent genetic relatedness within eastern
Europe “is consistent with the idea that these populations derive a substantial proportion of
their ancestry from various groups that expanded during the ‘migration period’ from the fourth
through ninth centuries” (Ralph and Coop 2013).
Another study also used genomic segments to genetically characterize and date interactions
(admixture) among Eurasian populations during historical times (Hellenthal et al. 2014). Using a
novel approach, the authors have shown that the genomes of Eastern Europeans, both Slavic and
non-Slavic, can be modeled as a mixture of several genetic sources that are similar (although not
identical) to today’s populations of (i) northeast Asia (e.g., Oroqen, Mongol, and Yakut), (ii) the
Caucasus and the Middle East (including Greek), and (iii) northwest Europe. Interestingly,
admixture events related to the latter, northwest European (or a range of East Europeans in the
original analysis), source might correspond to the spread of Slavs across western Europe in the
early medieval period (Hellenthal et al. 2014).
In a study by Kushniarevich et al. (2015), the distribution of ibd segments was used to address
specifically the genetic heritage of the medieval expansion of Slavs, as suggested in historical
sources, presumably from east-central Europe to the Balkan region. All Slavic speakers were
divided into two groups, East-West Slavs and South Slavs. The authors estimated the number of
ibd segments shared within Slavs (i.e., between the two groups) and compared it to the number
of ibd segments shared between each group of Slavs and the surrounding populations.
Consistent with previous observations, it has been shown that populations across a vast region
in east-central and southeast Europe, mostly Slavic speakers, albeit not exclusively – Baltic
speakers, Finnic speakers of northeast Russia, Hungarians, and Romanians – bear a high number
of shared genomic segments. This pattern of high relatedness declines sharply on the western,
eastern, and southeastern edges of the “Slavic” area (Kushniarevich et al. 2015).
Altogether, the distribution of long genomic segments in eastern Europe, where Slavs predominate
today but are not the only linguistic group, is compatible with actual movements of people
across this region, presumably within historical time.
The Slavs: Spreading language, moving genes?
The genetic structuring and evolutionary history of Slavic-speaking populations allow one to
review critically the demic or cultural models presumed to underlie the spread of the Slavic
languages.
The NRY and genome-wide variation suggest that the genomes of modern Slavs have developed
within two geographically and genetically distinct substrates (Kushniarevich et al. 2015). The
east-central European substrate, uniting the majority of West and East Slavs, as well as Baltic-
speaking Latvians and Lithuanians and Finnic-speaking Estonians, can be characterized as
homogeneous over a large area. Its feature is the predominance of the NRY haplogroups R1a-
M458/M558, the presence of haplogroup N3-Tat, and a large share of the northeast European
ancestral component as seen in genome-wide variation. The other, the southeast European
genetic substrate, is conned to the Balkan region, is more diverse, and is basal to South Slavs
and non-Slavic Balkan people. The major features of this substrate are the high frequency of NRY
haplogroup I2a-P37, the large share of NRY haplogroup E-M35, and the prevalence of a south
European/Middle Eastern ancestral component detected from genome-wide variation
(Kushniarevich et al. 2015). Importantly, populations within each substrate are genetically closer
to one another irrespective of the language that they speak than are the Slavs in the two
substrates.
Studies of ancient human DNA samples from prehistoric Europe suggest several drastic
turnovers in its genetic structure, e.g., the people who brought farming to Europe were
genetically diferent from European hunter-gatherers (Lazaridis et al. 2014; Mathieson et al.
2018). However, it is likely that the basis of the European genetic structure as we see it today had
formed already around the very dynamic Bronze Age and has not changed substantially in the
subsequent period: the modern European genetic landscape mirrors that of Bronze Age Europe
to a great extent (Allentoft et al. 2015). In other words, the modern-day Slavic community
displays the genetic structure (seen in both NRY and frequency-based analyses of genome-wide
data) that has apparently existed in Europe for more than three thousand years. If so, the
cultural assimilation (language shift) of autochthonous people without a substantial
replacement would be the basic model behind the observed genetic structure of modern Slavs.
Indeed, if a vast territory that spans from today’s Poland in the west to European Russia in the
east can be considered a “core Slavic area,” the pre-Slavic genetic layers are very well
distinguished in the genomes of Russians from the northern region of European Russia
(Balanovsky et al. 2008; Khrunin et al. 2013; Kushniarevich et al. 2015; Morozova et al. 2012), as
well as on the western fringes of this area, as seen in the genetic proximity of Czechs to their
German-speaking neighbors (Kushniarevich et al. 2015; Rębała et al. 2013; Woźniak et al. 2010).
In comparison, methods based on ibd segments and haplotype analyses capture more recent
demographic events (roughly 1st mill. CE and later) and provide evidence compatible with
certain movements of people during the early medieval period. The high number of genomic
segments shared among Eastern Europeans, including the groups of West-East and South Slavs,
suggests that the region may have experienced detectable migrations in historical time,
presumably during the 1st millennium CE (Hellenthal et al. 2014; Kushniarevich et al. 2015; Ralph
and Coop 2013). Similarly, NRY haplogroup R1a-Z282, which predominates in Poles and most of
East Slavs, occurs “at low but informative frequencies” in the populations of the Balkan region,
which might indicate male-mediated gene flow from central Europe toward the south associated
with Slavic dispersion (Underhill et al. 2015). Preliminary age estimates of genomic segments
(centered on the 1st mill. CE; Hellenthal et al. 2014; Ralph and Coop 2013) overlap with both
historical and linguistic data which assume a nonhomogeneous Slavic-speaking population
(Sedov 1995; Sussex and Cubberley 2006) and the breakup of the Proto-Slavic language
(Kushniarevich et al. 2015) in the early 1st millennium CE, followed by a rapid Slavicization of
eastern Europe with further language radiation in the mid-1st millennium CE. These genetic
observations, although requiring further investigation, are an indication of a layer of shared
ancestry in genomes of Slavs that stems from a very recent, supposedly medieval, migration
period.
Thus, the pattern of genetic stratification seen in today’s “Slavic” area most likely took shape
around the Bronze Age, more than three thousand years ago and long before the assumed Slavic
expansion. Cultural assimilation (language shift) of pre-Slavic people, rather than
resettlement of Slavs over vast regions replacing autochthonous people, seems to be a primary
model for the distribution of Slavic languages. However, some medieval migrations, which have
likely left their traces in long genomic segments, could also have contributed to the Slavicization
of Europe. As we outline below, ancient human DNA studies of medieval Europe might fill some
gaps in the genetic history of Slavs.
Ancient human DNA variation and the Slavs
In addition to the analysis of genetic variability of living humans, DNA from human remains
from time periods spanning from several hundreds to thousands of years ago can also be used to
study the human past. Improvements in ancient DNA (aDNA) extraction, advances in sequencing
technologies, and the development of analytical machinery have together revolutionized the
study of human prehistory all over the world, including Europe, and have added important
details missing from the study of more recent historical times (Skoglund and Mathieson 2018).
Most of the aDNA research done to date in Europe has focused on its lengthiest, prehistoric,
period (Skoglund and Mathieson 2018). It has been shown that much of the genetic structure of
modern Europeans can be modeled as a mixture of three key ancestral populations – Mesolithic
hunter-gatherers, farmers from the Near East, and Bronze Age nomads. The geographic spread of
those ancestral populations contributed to the west–east and south–north genetic
diferentiation seen in today’s Europeans (e.g., Bronze Age ancestry is more pronounced in east-
central Europeans; northeastern Europeans bear a substantial portion of a northern Eurasian
component in their genomes). It has been shown that the genetic structure of Europeans took
shape around the end of the Bronze Age and has not changed substantially in the subsequent
period, but that the European population started to grow rapidly around this time (Allentoft et
al. 2015; Veeramah et al. 2018).
Subsequent movements and mixture of people during the Iron Age, antiquity, and medieval
times took place when population density was already high, so the genetic impact of these
demographic events is poorly detectable by the available methods and data.
Medieval Europe, including the Europe of the Great Migration period (6th–10th cc. CE), has only just
started to draw attention in aDNA studies (Amorim et al. 2018; Juras et al. 2014; Veeramah et
al. 2018). It is important to remember, however, that the reconstruction of the genetic history of
a group of people related linguistically using aDNA presents difficulties that are hard to
overcome. One of the problems is that the linguistic affiliation of people whose remains are
analyzed cannot be established in most cases. The genetics of people from any particular
archaeological culture is often linked with a certain linguistic group indirectly guided by
information from written source – a methodology that leads to substantial uncertainties.
Another problem with the reconstruction of the early history of Slavs using aDNA is their long-
lasting burial tradition that favored cremation. It is likely that Slavic peoples cremated their dead
up until the 9th and 10th centuries CE, hence leaving no material to be analyzed genetically.
Keeping these cautionary notes in mind, aDNA studies relevant to the history of the Slavs are
outlined below.
Juras et al. (2014) have analyzed the mtDNA diversity of two sample sets from the territory of
modern Poland from the Roman Iron Age (200 BCE–500 CE) and the Medieval Age (1000–1400
CE). This was the rst study that addressed the two time points that pre- and postdate the
migration period in east-central Europe and covered the territory close to or overlapping with
the supposed origin of the dispersal of the Slavs. It has been demonstrated that matrilineal pools
of people that lived nearly one thousand years apart from each other belong to the West
Eurasian type. Comparison of mtDNA from the Roman Iron Age, the Medieval Age, modern
Poles, and other Europeans has revealed a number of identical haplotypes. This observation led
the authors to conclude that there is continuity in the matrilineal gene pool in the territory of
modern Poland for at least two thousand years, which agrees with the autochthonous theory of
Slavic origins (Juras et al. 2014), although the genetic pattern may also serve as evidence of the
survival of a pre-Slavic matrilineal substrate in the Slavicized people on the territory of modern
Poland.
Like the medieval Poles, the medieval populations (9th–12th cc. CE) on the territory of
contemporary Slovakia and the proto-Bulgars from the modern Bulgarian region (8th–10th cc.
CE) bear typical West Eurasian mtDNA and demonstrate similarity with modern-day
populations from the respective territories (Csákyová et al. 2016; Nesheva et al. 2015).
However, such genetic continuity between medieval and modern populations of central Europe
is not the sole demographic scenario observed in this region. It has been demonstrated, for
example, that the medieval population of Hungary was genetically diverse and included inter
alia East Eurasian mtDNA, which is virtually absent in today’s Hungarians (Tömöry et al. 2007).
Studies of medieval Hungary show that frequency-based analyses of ancient mtDNA are
informative in detecting female-mediated gene flow (which is a proxy for population migration)
from outside Europe – as in the case of medieval Hungarian conquerors – when source and
receiving populations originate from remote and genetically differentiated regions (in this case,
Asia vs. Europe).
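The logic of such a frequency-based comparison can be illustrated with a minimal sketch. It assumes a deliberately simplified assignment of mtDNA haplogroups to an East Eurasian set and uses hypothetical toy samples; the function name, the haplogroup set, and the sample lists are illustrative assumptions and do not reproduce the data or the methods of the studies cited above.

```python
from collections import Counter

# Simplified, illustrative assignment of mtDNA haplogroups to a broad
# East Eurasian set (an assumption for this sketch, not a full phylogeny).
EAST_EURASIAN = {"A", "B", "C", "D", "F", "G"}


def east_eurasian_share(haplogroups):
    """Return the fraction of samples carrying an East Eurasian haplogroup."""
    counts = Counter(haplogroups)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no samples")
    east = sum(n for hg, n in counts.items() if hg in EAST_EURASIAN)
    return east / total


# Hypothetical toy samples, NOT data from any cited study.
ancient_sample = ["H", "U", "D", "C", "T", "H", "B", "J", "U", "H"]
modern_sample = ["H", "H", "U", "T", "J", "K", "H", "U", "V", "H"]

ancient_share = east_eurasian_share(ancient_sample)
modern_share = east_eurasian_share(modern_sample)
print(f"ancient: {ancient_share:.2f}, modern: {modern_share:.2f}")
```

In this toy case the ancient sample shows a clearly higher East Eurasian share than the modern one. A contrast of this kind is visible only because the two source regions carry largely different haplogroups; between populations with similar haplogroup pools the shares would barely differ.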
However, migrations within Europe during the medieval period might be masked by the overall
uniformity of mtDNA and hence require information from full mitogenomes as well as whole
genomes. Indeed, Amorim et al. (2018) used whole-genome information of human remains from
cemeteries associated with the Germanic-speaking Lombards of the 6th–7th centuries CE and
showed that their genetic proles support movements from central to southern Europe. Further
genetic evidence of the dynamism of the early medieval period in Europe comes from a study by
Veeramah et al. (2018). The authors analyzed whole genomes of people who lived in the 5th–6th
centuries CE in the territory of modern Bavaria (south Germany) and revealed that the females
were highly diverse and some of them were of southeast European origin. Although there is no
paleogenomic study dedicated to Slavs based on whole-genome variation, the above examples
demonstrate the great potential of aDNA for our understanding of the demography of recent
periods in Europe.
To sum up, the available data on the mtDNA diversity of medieval Slavs from central Europe
suggest genetic continuity over at least one thousand years and are not consistent with massive
movements of people. However, as argued above, mtDNA alone is not sufficient to study the
demography of medieval central-east Europe, and whole-genome information from skeletal
remains of Slavic-associated sites is needed to complete the story of the genesis of Slavs.
Concluding remarks
Modern-day Slavic-speaking populations occupy a vast territory in east-central and southeast
Europe and are one of the most numerous language groups in Europe. The genetic structuring of
Slavic populations mirrors to a great extent their current geographic location, revealing a
continuum with most West and East Slavs at its northern pole and South Slavs at its
southern pole. High genetic affinity has been demonstrated for the Slavic
populations residing today within the vast Eastern European Plain: Sorbs, Poles, Belarusians,
Ukrainians, and Russians from the central-southern regions of western Russia. In contrast, the
South Slavs demonstrate a relatively higher genetic heterogeneity. The genetic structuring
observed in today’s Slavs likely predates the historical Slavic expansion. The genomes of the
modern Slavs have evolved primarily in situ within two different genetic backgrounds – the
central-east European substrate for most of West and East Slavs, and the southeast European
substrate for the South Slavs. Some genetic data suggest that the spread of Slavic
languages and the establishment of the Slavic community might have involved movements of
people between east-central and southeast Europe during the medieval period. However,
cultural assimilation (language shift) of autochthonous people rather than their replacement by
expanding Slavs seems to have played a key role in the expansion of Slavic languages. Whole-
genome sequences of modern Slavs and of skeletal remains from Slavic-associated
archaeological sites will be needed to gain insight into all layers of the genetic history
of the Slavs.
Alena Kushniarevich
Alexei Kassian
Bibliography
Alexander, David H., et al. 2009. Fast model-based estimation of ancestry in unrelated
individuals. Genome research 19, 1655–1664. <https://doi.org/10.1101/gr.094052.109>
Allentoft, Morten E., et al. 2015. Population genomics of Bronze Age Eurasia. Nature 522, 167–172.
<https://doi.org/10.1038/nature14507>
Amorim, Carlos Eduardo G., et al. 2018. Understanding 6th-century barbarian social organization
and migration through paleogenomics. Nature communications 9, 3547.
<https://doi.org/10.1038/s41467-018-06024-4>
Balanovsky, Oleg, et al. 2008. Two sources of the Russian patrilineal heritage in their Eurasian
context. American journal of human genetics 82, 236–250.
<https://doi.org/10.1016/j.ajhg.2007.09.019>
Barford, Paul M. 2001. The early Slavs: Culture and society in early medieval Eastern Europe. Ithaca.
Browning, Sharon R., and Brian L. Browning. 2012. Identity by descent between distant relatives:
Detection and applications. Annual review of genetics 46, 617–633.
<https://doi.org/10.1146/annurev-genet-110711-155534>
Csákyová, Veronika, et al. 2016. Maternal genetic composition of a medieval population from a
Hungarian-Slavic contact zone in Central Europe. PLoS ONE 11, e0151206.
<https://doi.org/10.1371/journal.pone.0151206>
Curta, Florin. 2001. The making of the Slavs: History and archaeology of the Lower Danube Region,
c. 500–700. Cambridge UK.
Davidovic, Slobodan, et al. 2017. Mitochondrial super-haplogroup U diversity in Serbians. Annals
of human biology 44, 408–418. <https://doi.org/10.1080/03014460.2017.1287954>
Ferri, Gianmarco, et al. 2010. Y-STR variation in Albanian populations: Implications on the match
probabilities and the genetic legacy of the minority claiming an Egyptian descent. International
journal of legal medicine 124/5, 363–370. <https://doi.org/10.1007/s00414-010-0432-x>
Heather, Peter. 2010. Empires and Barbarians: The fall of Rome and the birth of Europe. Oxford.
Hellenthal, Garrett, et al. 2014. A genetic atlas of human admixture history. Science 343, 747–751.
<https://doi.org/10.1126/science.1243518>
Ilumäe, Anne-Mai, et al. 2016. Human Y chromosome haplogroup N: A non-trivial time-resolved
phylogeography that cuts across language families. American journal of human genetics 99, 163–
173. <https://doi.org/10.1016/j.ajhg.2016.05.025>
Juras, Anna, et al. 2014. Ancient DNA reveals matrilineal continuity in present-day Poland over
the last two millennia. PLoS ONE 9, e110839. <https://doi.org/10.1371/journal.pone.0110839>
Karachanak-Yankova, Sena, et al. 2017. The uniparental genetic landscape of modern Slavic-
speaking populations. Advances in anthropology 07, 318–332.
<https://doi.org/10.4236/aa.2017.74018>
Karmin, Monika, et al. 2015. A recent bottleneck of Y chromosome diversity coincides with a
global change in culture. Genome research 25, 459–466. <https://doi.org/10.1101/gr.186684.114>
Kayser, Manfred, et al. 2005. Significant genetic differentiation between Poland and Germany
follows present-day political borders, as revealed by Y-chromosome analysis. Human genetics 117,
428–443. <https://doi.org/10.1007/s00439-005-1333-9>
Khrunin, Andrey V., et al. 2013. A genome-wide analysis of populations from European Russia
reveals a new pole of genetic diversity in Northern Europe. PLoS ONE 8.
<https://doi.org/10.1371/journal.pone.0058552>
Kovacevic, Lejla, et al. 2014. Standing at the gateway to Europe: The genetic structure of Western
Balkan populations based on autosomal and haploid markers. PLoS ONE 9, e105090.
<https://doi.org/10.1371/journal.pone.0105090>
Kushniarevich, Alena, et al. 2013. Uniparental genetic heritage of Belarusians: Encounter of rare
middle eastern matrilineages with a Central European mitochondrial DNA Pool. PLoS ONE 8,
e66499. <https://doi.org/10.1371/journal.pone.0066499>
Kushniarevich, Alena, et al. 2015. Genetic heritage of the Balto-Slavic speaking populations:
synthesis of autosomal, mitochondrial and Y-chromosomal data. PLoS ONE 10, e0135820.
<https://doi.org/10.1371/journal.pone.0135820>
Lazaridis, Iosif, et al. 2014. Ancient human genomes suggest three ancestral populations for
present-day Europeans. Nature 513, 409–413. <https://doi.org/10.1038/nature13673>
Loogväli, Eva-Liis, et al. 2004. Disuniting uniformity: A pied cladistic canvas of mtDNA
haplogroup H in Eurasia. Molecular biology and evolution 21, 2012–2021.
<https://doi.org/10.1093/molbev/msh209>
Malyarchuk, Boris A., et al. 2008a. Reconstructing the phylogeny of African mitochondrial DNA
lineages in Slavs. European journal of human genetics 16, 1091–1096.
<https://doi.org/10.1038/ejhg.2008.70>
Malyarchuk, Boris A., et al. 2008b. Mitochondrial DNA phylogeny in Eastern and Western
Slavs. Molecular biology and evolution 25, 1651–1658. <https://doi.org/10.1093/molbev/msn114>
Malyarchuk, Boris A., et al. 2008c. On the origin of Mongoloid component in the mitochondrial
gene pool of Slavs. Genetika 44, 401–406.
Malyarchuk, Boris, et al. 2010. The peopling of Europe from the mitochondrial haplogroup U5
perspective. PLoS ONE 5, e10285. <https://doi.org/10.1371/journal.pone.0010285>
Mathieson, Iain, et al. 2018. The genomic history of southeastern Europe. Nature 555, 197–203.
<https://doi.org/10.1038/nature25778>
Mielnik-Sikorska, et al. 2013. The history of Slavs inferred from complete mitochondrial genome
sequences. PLoS ONE 8, e54360. <https://doi.org/10.1371/journal.pone.0054360>
Morozova, Irina, et al. 2012. Russian ethnic history inferred from mitochondrial DNA
diversity. American journal of physical anthropology 147, 341–351.
<https://doi.org/10.1002/ajpa.21649>
Nesheva, Desislava Valentinova, et al. 2015. Mitochondrial DNA suggests a Western Eurasian
origin for Ancient (Proto-) Bulgarians. Human biology 87, 19–28.
Novembre, John, et al. 2008. Genes mirror geography within Europe. Nature 456, 98–101.
<https://doi.org/10.1038/nature07331>
Oven, M. van, and M. Kayser. 2009. Updated comprehensive phylogenetic tree of global human
mitochondrial DNA variation. Hum mutat 30/2, e386–e394. <http://www.phylotree.org>
(08.04.2020). <https://doi.org/10.1002/humu.20921>
Pagani, Luca, et al. 2016. Genomic analyses inform on migration events during the peopling of
Eurasia. Nature 538, 238–242. <https://doi.org/10.1038/nature19792>
Patterson, Nick, et al. 2006. Population structure and eigenanalysis. PLoS Genetics 2, e190.
<https://doi.org/10.1371/journal.pgen.0020190>
Peričić, Marijana, et al. 2005. High-resolution phylogenetic analysis of southeastern Europe
traces major episodes of paternal gene flow among Slavic populations. Molecular biology and
evolution 22, 1964–1975. <https://doi.org/10.1093/molbev/msi185>
Ralph, Peter, and Graham Coop. 2013. The geography of recent genetic ancestry across
Europe. PLoS Biology 11, e1001555. <https://doi.org/10.1371/journal.pbio.1001555>
Rębała, Krzysztof, et al. 2013. Contemporary paternal genetic landscape of Polish and German
populations: From early medieval Slavic expansion to post-World War II resettlements. European
journal of human genetics 21, 415–422. <https://doi.org/10.1038/ejhg.2012.190>
Rootsi, Siiri, et al. 2004. Phylogeography of Y-chromosome haplogroup I reveals distinct domains
of prehistoric gene flow in Europe. American journal of human genetics 75, 128–137.
<https://doi.org/10.1086/422196>
Rosser, Zoë H., et al. 2000. Y-chromosomal diversity in Europe is clinal and influenced primarily
by geography, rather than by language. American journal of human genetics 67, 1526–1543.
<https://doi.org/10.1086/316890>
Šarac, Jelena, et al. 2014. Maternal genetic heritage of Southeastern Europe reveals a new
Croatian isolate and a novel, local sub-branching in the X2 haplogroup. Annals of human
genetics 78, 178–194. <https://doi.org/10.1111/ahg.12056>
Sedov, Valentin V. 1979. Proisxoždenie i rannjaja istorija slavjan. Moscow.
Sedov, Valentin V. 1995. Slavjane v rannem srednevekov´e. Moscow.
Sedov, Valentin V. 2002. Slavjane: Istoriko-arxeologičeskoe issledovanie. Moscow.
Simons, Gary F., and Charles D. Fennig (eds.). 2018. Ethnologue: Languages of the world. Dallas.
<http://www.ethnologue.com> (20.04.2020).
Skoglund, Pontus, and Iain Mathieson. 2018. Ancient genomics of modern humans: The first
decade. Annual review of genomics and human genetics 19, 381–404.
<https://doi.org/10.1146/annurev-genom-083117-021749>
Sussex, Roland, and Paul Cubberley. 2006. The Slavic languages. Cambridge UK.
Tömöry, Gyöngyvér, et al. 2007. Comparison of maternal lineage and biogeographic analyses of
ancient and modern Hungarian populations. American journal of physical anthropology 134, 354–
368. <https://doi.org/10.1002/ajpa.20677>
Underhill, Peter A., and Toomas Kivisild. 2007. Use of Y chromosome and mitochondrial DNA
population structure in tracing human migrations. Annual review of genetics 41, 539–564.
<https://doi.org/10.1146/annurev.genet.41.110306.130407>
Underhill, Peter A., et al. 2015. The phylogenetic and geographic structure of Y-chromosome
haplogroup R1a. European journal of human genetics 23, 124–131.
<https://doi.org/10.1038/ejhg.2014.50>
Veeramah, Krishna R., et al. 2018. Population genomic analysis of elongated skulls reveals
extensive female-biased immigration in Early Medieval Bavaria. PNAS 115, 3494–3499.
<https://doi.org/10.1073/pnas.1719880115>
Woźniak, Marcin, et al. 2010. Similarities and distinctions in Y chromosome gene pool of Western
Slavs. American journal of physical anthropology 142, 540–548.
<https://doi.org/10.1002/ajpa.21253>
Cite this page
Kushniarevich, Alena and Kassian, Alexei, “Genetics and Slavic Languages”, in: Encyclopedia of Slavic Languages and Linguistics Online, Editor-in-Chief
Marc L. Greenberg. Consulted online on 05 June 2020 <http://dx.doi.org/10.1163/2589-6229_ESLO_COM_032367>
First published online: 2020