Chad Genetic Diversity Reveals an African History Marked by Multiple Holocene Eurasian Migrations
Marc Haber,1,* Massimo Mezzavilla,1,2 Anders Bergström,1 Javier Prado-Martinez,1 Pille Hallast,1,3 Riyadh Saif-Ali,4 Molham Al-Habori,4 George Dedoussis,5 Eleftheria Zeggini,1 Jason Blue-Smith,6,10 R. Spencer Wells,7 Yali Xue,1 Pierre A. Zalloua,8,9 and Chris Tyler-Smith1,*
Understanding human genetic diversity in Africa is important for interpreting the evolution of all humans, yet vast regions in Africa, such as Chad, remain genetically poorly investigated. Here, we use genotype data from 480 samples from Chad, the Near East, and southern Europe, as well as whole-genome sequencing from 19 of them, to show that many populations today derive their genomes from ancient African-Eurasian admixtures. We found evidence of early Eurasian backflow to Africa in people speaking the unclassified isolate Laal language in southern Chad and estimate from linkage-disequilibrium decay that this occurred 4,750–7,200 years ago. It brought to Africa a Y chromosome lineage (R1b-V88) whose closest relatives are widespread in present-day Eurasia; we estimate from sequence data that the Chad R1b-V88 Y chromosomes coalesced 5,700–7,300 years ago. This migration could thus have originated among Near Eastern farmers during the African Humid Period. We also found that the previously documented Eurasian backflow into Africa, which occurred ~3,000 years ago and was thought to be mostly limited to East Africa, had a more westward impact affecting populations in northern Chad, such as the Toubou, who have 20%–30% Eurasian ancestry today. We observed a decline in heterozygosity in admixed Africans and found that the Eurasian admixture can bias inferences on their coalescent history and confound genetic signals from adaptation and archaic introgression.
Introduction
African genetic diversity is still incompletely understood, and vast regions in Africa remain genetically undocumented. Chad, for example, makes up ~5% of Africa’s surface area, and its central location, connecting sub-Saharan Africa with North and East Africa, positions it to play an important role as a crossroad or barrier to human migrations. However, Chad has been little studied at a whole-genome level, and its position within African genetic diversity is not well known. With 200 ethnic groups and more than 120 indigenous languages and dialects, Chad has extensive ethnolinguistic diversity.[1] It has been suggested that this diversity can be attributed to Lake Chad, which has attracted human populations to its fertile surroundings since prehistoric times, especially after the progressive desiccation of the Sahara starting ~7,000 years ago (ya).[2,3]
Important questions about Africa’s ethnic diversity are the relationships among the different groups and the relationships between cultural groups and existing genetic structures. In the present study, we analyzed four Chadian populations with different ethnicities, languages, and modes of subsistence. Our samples are likely to capture recent genetic signals of migration and mixing and also have the potential to show ancestral genomic relationships that are shared among Chadians and other populations.
An additional major question relates to the prehistoric Eurasian migrations to Africa: what was the extent of these migrations, how have they affected African genetic diversity, and what present-day populations harbor genetic signals from the ancient migrating Eurasians? We have previously reported evidence of gene flow from the Near East to East Africa ~3,000 ya, as well as subsequent selection in Ethiopians on non-African-derived alleles related to light skin pigmentation.[4] A recent attempt to quantify the extent of such backflow into Africa more generally, by using ancient DNA (aDNA), suggested that the impact of the Eurasian migration was mostly limited to East Africa.[5] However, previous studies using mitochondrial DNA and the Y chromosome in populations from the Chad Basin found some with an East African[6] or Mediterranean and Eurasian influence,[7,8] and analysis based on genome-wide data[9] found a non-African component (suggested to be from East Africa) in central Sahelian populations. Thus, studying diverse Chadian populations on a whole-genome level presents an opportunity to shed more light on the history of African-Eurasian mixtures, including whether or not selection after admixture is a widespread phenomenon in Africa and how the historical events in Chad are related to events that have occurred elsewhere in Africa and the Near East.
In this work, we present a genetic dataset of 480 Chadian, Near Eastern, and European individuals genotyped at 2.5 million SNPs, in addition to high-coverage whole-genome sequences from 19 of these individuals. From Chad, we studied (1) the Toubou, who are nomads from northern Chad and speak a Nilo-Saharan language; (2) the Sara, who are a sedentary population from southern Chad and also speak a Nilo-Saharan language; (3) the Laal speakers, a population of just ~750 individuals who speak an unclassified language isolate and live in southern Chad; and (4) an urban population from the capital city of N’Djamena. In addition to the Chadians, we included Greek, Lebanese, and Yemeni samples whose location and history suggest that they might be informative about early African-Eurasian migrations. We used this dataset to advance our understanding of human genetic diversity in Africa and neighboring regions by focusing on population migration and mixing and how the admixture process has shaped present-day genetic variation.
1 Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; 2 Institute for Maternal and Child Health, IRCCS Burlo Garofolo, University of Trieste, 34137 Trieste, Italy; 3 Institute of Molecular and Cell Biology, University of Tartu, Tartu 51010, Estonia; 4 Department of Biochemistry and Molecular Biology, Faculty of Medicine and Health Sciences, Sana’a University, Sana’a 19065, Yemen; 5 Department of Nutrition and Dietetics, Harokopio University Athens, Athens 17671, Greece; 6 National Geographic Society, Washington, DC 20036, USA; 7 Department of Integrative Biology, University of Texas at Austin, Austin, TX 78712, USA; 8 Lebanese American University, Chouran, Beirut 1102 2801, Lebanon; 9 Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
10 Present address: Karius, Inc., 1505A Adams Drive, Menlo Park, CA 94025, USA
*Correspondence: mh25@sanger.ac.uk (M.H.), cts@sanger.ac.uk (C.T.-S.)
http://dx.doi.org/10.1016/j.ajhg.2016.10.012
The American Journal of Human Genetics 99, 1–9, December 1, 2016
© 2016 The Author(s). This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Please cite this article in press as: Haber et al., Chad Genetic Diversity Reveals an African History Marked by Multiple Holocene Eurasian Migrations, The American Journal of Human Genetics (2016), http://dx.doi.org/10.1016/j.ajhg.2016.10.012
Subjects and Methods
Samples and Data
Samples were collected from Chad (238), Lebanon (126), Greece (96), and Yemen (20) (Figure 1A); details can be found in Table S1. All samples (except for those from Greece) were genotyped with the Illumina HumanOmni2.5-8 BeadChip, which covers ~2.5 million SNPs. Greek genotype information for the 2.5 million sites was extracted from sequence data (E.Z., unpublished data) and merged with array data from other populations. In addition, 19 samples (Chad [11], Greece [4], and Lebanon [4]) were whole-genome sequenced at >30× depth with Illumina HiSeq X Ten or HiSeq 2500 technology. Genotyping and sequencing were completed at the Wellcome Trust Sanger Institute. Informed consent was obtained from the studied subjects, and the use of the samples in genetic studies was approved by the Human Materials and Data Management Committee at the Wellcome Trust Sanger Institute (approval numbers 09/056 and 14/072) and by the institutional review board (number SMPZ121307-02) of the Lebanese American University.
The genotyping data were merged with data from the African Genome Variation Project,[10] the 1000 Genomes Project,[11] and Pagani et al.,[12] resulting in a combined dataset of ~1.1 million SNPs in 2,453 samples. Analyses including ancient genomes involved merging the panel described above with the Haak et al. dataset,[13] resulting in ~90,000 SNPs in common. Comparative whole-genome sequences were obtained from Pagani et al.[12] and Complete Genomics.[14]
Genotype data were processed with PLINK:[15] the SNP genotype success rate required was set to 99%, whereas SNPs with a minor allele frequency <0.001 or Hardy-Weinberg p value <0.000001 were removed. Genotypes from sequence data were called with SAMtools v.1.2[16] and BCFtools v.1.2 with the command "samtools mpileup -q 20 -Q 20 -C 50 | bcftools call -c -V indels". Concordance with array genotypes had a rate of 0.999. Phasing was carried out with SHAPEIT[17] with 1000 Genomes Project phase 3 haplotypes[18] as a reference panel.
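The two simplest genotype filters described above (call rate and minor allele frequency) can be sketched as follows. This is an illustration only, not PLINK's implementation: the genotype encoding (0, 1, or 2 copies of the alternate allele, `None` for a missing call) and all function names are assumptions, and the Hardy-Weinberg filter is omitted for brevity.

```python
def call_rate(genotypes):
    """Fraction of non-missing genotype calls for one SNP."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele among called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid call
    return min(alt_freq, 1.0 - alt_freq)


def passes_qc(genotypes, min_call_rate=0.99, min_maf=0.001):
    """Keep a SNP with a call rate of at least 99% and a MAF of at least 0.001."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)
```

Under these thresholds a monomorphic SNP, or one with more than 1% missing calls, would be dropped.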
Population Structure and Gene Flow
Principal components were computed with EIGENSOFT v.4.2.[19] Effective population size and rates of gene flow were inferred by the multiple sequentially Markovian coalescent (MSMC) approach[20] with four high-coverage phased genomes from each population. We assumed a generation time of 30 years and a mutation rate of 1.25 × 10⁻⁸ mutations per nucleotide per generation. Admixture masks to identify African and Eurasian segments within mixed high-coverage genomes were generated with PCAdmix[21] including two ancestral populations based on the 1000 Genomes Project YRI (Yoruba in Ibadan, Nigeria) and CEU (Utah residents with northern and western European ancestry from the CEPH collection) populations. 1 cM windows with a posterior probability of >0.9 for the most likely ancestral state were collected and used for creating African and Eurasian masks.
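To make the role of the generation time and mutation rate concrete, the sketch below converts mutation-scaled coalescent output into years and effective population sizes. It assumes the commonly documented MSMC convention, in which time is reported in units of the per-generation mutation rate and population size is the inverse of the scaled coalescence rate divided by twice the mutation rate; the function names and example inputs are hypothetical.

```python
MU = 1.25e-8   # mutations per nucleotide per generation (value given in the text)
GEN_TIME = 30  # years per generation (value given in the text)


def scaled_time_to_years(t_scaled, mu=MU, gen_time=GEN_TIME):
    """Convert mutation-scaled time to years: generations = t / mu."""
    return t_scaled / mu * gen_time


def scaled_rate_to_ne(coalescence_rate, mu=MU):
    """Convert a scaled coalescence rate (lambda) to effective population size."""
    return 1.0 / (2.0 * mu * coalescence_rate)


# Hypothetical inputs: a scaled time of 2.5e-5 and a scaled rate of 4000.
years = scaled_time_to_years(2.5e-5)
ne = scaled_rate_to_ne(4000.0)
```

Because both conversions are linear in the assumed rates, a different mutation rate or generation time rescales every date and size estimate by the same factor without changing the shape of the curves.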
Figure 1. Population Locations and Genetic Structure
(A) The map shows the location of newly genotyped or sequenced populations.
(B) PCA of worldwide populations shows that Near Easterners and East Africans are intermediate to Eurasians and sub-Saharan Africans
on PC1. Chad populations are close to sub-Saharan Africans and have some samples drawn toward Ethiopians.
(C) Magnification of the African PCA shows different affinities of the Chad populations to other Africans: the Toubou cluster close to
Ethiopians, whereas the Sara and Laal speakers are close to the Yoruba. The mixed samples from N’Djamena, the capital, are intermediate
to the Toubou, Sara, and Laal speakers.
Phylogenetic analysis of whole Y chromosome sequences was carried out as described in Bergström et al.[22] Internal node ages were estimated with the rho-statistic[23] and converted to units of years by application of a Y chromosome mutation rate of 0.76 × 10⁻⁹ (95% confidence interval [CI] = 0.67 × 10⁻⁹ to 0.86 × 10⁻⁹) mutations per site per year.[24] Additionally, Y chromosome haplogroups were defined to the highest resolution possible with 636 SNPs from the array data that overlapped International Society of Genetic Genealogy (ISOGG) markers.
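The rho-based age conversion can be illustrated with a minimal sketch. Rho is taken here as the average number of mutations separating a node from its descendant tips, and the age follows from dividing by the expected number of mutations per year across the analyzed sites. The number of callable sites is not stated in this section, so the value used below is purely hypothetical, as are the mutation counts and function name.

```python
def rho_age_years(mutations_to_tips, callable_sites,
                  rate_per_site_per_year=0.76e-9):
    """Node age in years from the average node-to-tip mutation count (rho)."""
    rho = sum(mutations_to_tips) / len(mutations_to_tips)
    mutations_per_year = rate_per_site_per_year * callable_sites
    return rho / mutations_per_year


# Hypothetical example: three tips and 10 Mb of callable Y sequence (assumed).
tips = [50, 48, 52]
point_estimate = rho_age_years(tips, 10_000_000)
older_bound = rho_age_years(tips, 10_000_000, rate_per_site_per_year=0.67e-9)
younger_bound = rho_age_years(tips, 10_000_000, rate_per_site_per_year=0.86e-9)
```

Applying the lower and upper bounds of the mutation rate, as in the last two lines, shows how rate uncertainty alone widens the age interval; it does not capture sampling uncertainty in rho.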
Admixture Analysis
Population-mixture signals and proportions were tested with qp3Pop, qpDstat, and qpF4Ratio from the ADMIXTOOLS package.[25] Admixture proportions were additionally estimated with ADMIXTURE v.1.3.0.[26] ALDER[27] and MALDER[28] were used to date the time of admixture with all pairs of African-Eurasian populations as references. Significant results with a p value <0.01 were collected and plotted.
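The idea behind LD-based dating can be sketched as follows: admixture-induced LD is expected to decay roughly exponentially with genetic distance, at a rate equal to the number of generations since mixture. The code below is a simplified illustration, not ALDER's or MALDER's fitting procedure: it ignores the constant offset term and any weighting, uses a plain log-linear least-squares fit, and assumes a 30-year generation time for the conversion to years (the generation time used for the admixture dates is not stated in this section).

```python
import math


def fit_decay_generations(distances_morgans, weighted_ld):
    """Estimate n in LD(d) = A * exp(-n * d) by regressing log(LD) on d."""
    logs = [math.log(v) for v in weighted_ld]
    count = len(distances_morgans)
    mean_d = sum(distances_morgans) / count
    mean_l = sum(logs) / count
    sxy = sum((d - mean_d) * (l - mean_l)
              for d, l in zip(distances_morgans, logs))
    sxx = sum((d - mean_d) ** 2 for d in distances_morgans)
    return -sxy / sxx  # the negative slope is the number of generations


def generations_to_years(n_generations, gen_time=30):
    """Convert generations to years under an assumed generation time."""
    return n_generations * gen_time


# Synthetic curve: admixture 100 generations ago, distances of 0.5 to 10 cM.
distances = [0.005 * i for i in range(1, 21)]
curve = [0.01 * math.exp(-100 * d) for d in distances]
estimated_generations = fit_decay_generations(distances, curve)
```

On noise-free synthetic data the fit recovers the simulated date; real curves would need the offset term and standard errors, for example from a jackknife over chromosomes.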
Measure of Heterozygosity and Simulations
Heterozygosity on a per-individual basis was estimated with VCFtools v.0.1.13[29] for ~2.17 Gb of the uniquely mappable genome.[11] Heterozygosity was also estimated after correction for recent inbreeding via the removal of long runs of homozygosity (>2 Mb). We investigated the effect of gene flow on the observed heterozygosity by using individual-based forward-time simulations implemented in simuPOP v.1.1.7.[30]
Selection after Admixture
Evidence for positive selection was tested with the population branch statistic (PBS)[31] with correction for the long-term effective population size.[32] We constructed a tree with the Toubou population branching from the Laal speakers and the Chinese Han as an outgroup. We collected values above the 95th percentile of the PBS distribution and looked for variants previously reported under putative selection in Europeans.
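As a rough illustration of the statistic, the sketch below computes the standard form of the PBS from three pairwise F_ST values, using the log transform T = -ln(1 - F_ST) as a branch length. It does not include the correction for long-term effective population size mentioned above, and the argument names (target, sister, outgroup) and example F_ST values are hypothetical.

```python
import math


def branch_length(fst):
    """Transform F_ST into an approximately additive divergence time."""
    return -math.log(1.0 - fst)


def pbs(fst_target_sister, fst_target_outgroup, fst_sister_outgroup):
    """Length of the branch leading to the target population."""
    t_ts = branch_length(fst_target_sister)
    t_to = branch_length(fst_target_outgroup)
    t_so = branch_length(fst_sister_outgroup)
    return (t_ts + t_to - t_so) / 2.0


# Hypothetical per-SNP F_ST values for (Toubou, Laal), (Toubou, Han), (Laal, Han).
example_pbs = pbs(0.2, 0.3, 0.3)
```

A SNP stands out when the target branch is long relative to the genome-wide distribution, which is why the analysis keeps only values in the upper tail.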
Results
Genetic Structure in Chad Indicates a Complex Admixture History
We performed an initial exploration of our dataset by using principal-component analysis (PCA).[19] The first component (PC1) captured the genetic differentiation between Africans and Eurasians (Figure 1B). Populations such as the Near Easterners and North and East Africans fell between the Europeans and sub-Saharan Africans. The Chadian groups lay near the sub-Saharan Africans: the Sara and Laal speakers clustered tightly with sub-Saharan Africans, such as the Yoruba, whereas the Toubou were somewhat more distant and appeared drawn toward East Africans, such as the Ethiopians. Samples collected from the capital of Chad, N’Djamena, appeared in a central position between the Toubou cluster and the Sara and Laal cluster (Figure 1C). Many individuals from N’Djamena have not reported their ethnicity or have reported a mixed ethnic origin. Therefore, recent mixture could be responsible for their position on the PCA.
We further investigated the genetic variation in Chad by estimating changes in the effective population size (Ne) over time via the MSMC approach.[20] Eurasians and Africans diverged around 60,000–80,000 ya and subsequently had different patterns of population-size changes: in particular, compared with Africans, the Eurasian population experienced a sharp decrease in size ~60,000 ya.[20] We observed this expected pattern in most populations in our dataset (Figure 2), but a few stood out: (1) Egyptians had a population bottleneck that was much more pronounced than that of other Africans but not as sharp as that of Eurasian populations; and (2) the Toubou and Ethiopians shared a very similar pattern during the bottleneck: they were close to other Africans but had a somewhat sharper decrease in population size (Figure 2). We would not expect such different fluctuations in population sizes at 60,000 ya in populations who shared a common origin during this period. For example, all Eurasians trace their origin to a population who exited Africa ~60,000 ya, and this is reflected in indistinguishable Ne patterns during this period,[20,33] which we also observed in the CEU, Greeks, and Lebanese (Figure 2), as expected. A shared pattern of Ne in ancient times was also observed in the Sara, Laal speakers, and other Africans, such as the Yoruba. We suggest that the deviation from the expected Ne pattern in the Toubou is related to extensive admixture history with Eurasians, like the Eurasian admixture seen in Ethiopians, and we explore this possibility directly with admixture tests below.
Figure 2. Population-Size Estimates from Whole-Genome Sequences
Population size was inferred by MSMC analysis with four haplotypes from each population. Eurasian populations had a distinctive bottleneck at the time of their exodus from Africa ~60,000 ya. Compared to other Africans, admixed Africans (from a Eurasian gene flow), such as Egyptians, Ethiopians, and the Toubou, also showed a decline in population size during the same period.
Multiple Eurasian Admixtures in Africa after 6,000 ya
We have previously reported massive gene flow ~3,000 ya from Eurasians to Ethiopian populations.[4] Here, we reassess the presence of Eurasian ancestry in Africa by using f3 statistics[25] in the form of f3(X; Eurasian, Yoruba), where a negative value with a Z score < −4 indicates that X is a mixture of Africans and Eurasians. We found, as expected, that most Ethiopians are a mixture of Africans and Eurasians. An exception is the Gumuz population, where f3(Gumuz; Eurasian, Yoruba) is always positive. The Gumuz language belongs to the Nilo-Saharan family, which could have isolated the Gumuz from the Afro-Asiatic-speaking Ethiopians. However, we found that the Toubou in Chad, who also speak a Nilo-Saharan language, are a mixture of Africans and Eurasians, making f3(Toubou; Eurasian, Yoruba) always significantly negative. This suggests that the impact of Eurasian migrations today extends beyond East Africa and the Afro-Asiatic-speaking populations. We did not detect significant (Z score < −4) Eurasian admixture in the Sara (Nilo-Saharan language family) or the Laal speakers (unclassified language) with the use of f3 statistics (lowest Z score for the Sara was > −2.9; for the Laal speakers, Z scores were all positive). However, this statistic loses sensitivity with small mixture proportions and post-admixture drift,[27] so positive values from the f3 statistics do not necessarily reflect a complete absence of admixture. We thus further tested for admixture by using ALDER and MALDER, which assess admixture-induced linkage disequilibrium (LD) and can detect small mixture proportions from a substantially diverged reference possibly missed by the f3 statistic. ALDER detected admixture in the Toubou, Sara, and Laal speakers (Table S2). MALDER, which has the potential to determine whether or not the admixture LD in the population is best represented as the result of one or multiple mixtures, showed that two mixture events had occurred in the Toubou (Figure 3A; Table S3). The first event occurred 2,850–3,500 ya (Z score = 11), a time close to the date of mixture in East Africans 2,500–2,700 ya (Z score = 26). The second mixture event occurred much more recently at 170–260 ya (Z score = 5). In southern Chad, we detected mixture events that were more ancient than those in the north. Mixture occurred 3,900–4,800 ya (Z score = 10) in the Sara and 4,750–7,200 ya (Z score = 5) in the Laal speakers (Figure 3A). These time estimates overlap, and we interpret them as signals from the same admixture event, whose time in the distant past was estimated more reliably in the Laal speakers because they carry more Eurasian ancestry (1.25%–4.5%) than the Sara (0.3%–2%) (see estimates of admixture proportions below), even though the Sara have smaller standard errors because of their larger sample size. In particular, we suggest that the Eurasian mixture event in the Sara and Laal speakers is independent of the mixture event in East Africans and the Toubou for two reasons: (1) admixture LD showed that the events in southern Chad preceded the events in East Africa by 2,000–4,500 years, and (2) we found in Chad a Eurasian Y chromosome lineage (Y haplogroup R1b-V88) that had penetrated all Chadian populations examined but was absent or rare from the Ethiopians examined (Table S4; Figure S1). From whole Y chromosome sequences (Figure S2), we estimate that the Chadian R1b-V88 chromosomes sampled emerged 5,700–7,300 ya (Figure 3B), a time comparable to the Laal speaker admixture dates (4,750–7,200 ya) estimated from genome-wide LD-decay patterns.
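The logic of the three-population test can be sketched in a few lines. The code computes a plain f3 as the average of (x − a)(x − b) over SNPs and a Z score from a delete-one-block jackknife. This is a simplified illustration under stated assumptions, not the qp3Pop implementation: it uses raw allele frequencies without the finite-sample correction, equal-size contiguous blocks, and hypothetical function names and data.

```python
import math


def f3(x, a, b):
    """Mean of (x - a) * (x - b) over SNPs; inputs are allele frequencies."""
    return sum((xi - ai) * (xi - bi) for xi, ai, bi in zip(x, a, b)) / len(x)


def f3_z_score(x, a, b, n_blocks=10):
    """Z score from a delete-one-block jackknife over contiguous SNP blocks."""
    size = len(x) // n_blocks
    used = size * n_blocks  # drop leftover SNPs so that blocks are equal
    x, a, b = x[:used], a[:used], b[:used]
    estimate = f3(x, a, b)
    leave_one_out = []
    for k in range(n_blocks):
        lo, hi = k * size, (k + 1) * size
        leave_one_out.append(
            f3(x[:lo] + x[hi:], a[:lo] + a[hi:], b[:lo] + b[hi:]))
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (v - mean_loo) ** 2 for v in leave_one_out)
    return estimate / math.sqrt(variance)  # fails if all blocks are identical


# Toy data: X is an exact 50:50 mixture of A and B, so f3 must be negative.
a = [(i % 7) / 10 for i in range(100)]
b = [((i * 3) % 11) / 12 for i in range(100)]
x = [(ai + bi) / 2 for ai, bi in zip(a, b)]
```

In this toy case every per-SNP term equals −0.25 (a − b)², which shows why a negative f3 is evidence that the target lies between the two reference populations, and also why drift in X after admixture (which adds a positive term) can mask the signal.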
Figure 3. Timing of the Eurasian Admixture in Africa
(A) Crosses represent significant admixture events in the history of the Toubou, Amhara, Sara, and Laal speakers. Time of admixture is estimated from LD by ALDER with all pairs of African-Eurasian populations in our dataset as references. MALDER extends ALDER inference by detecting multiple mixture events, such as in the case of the Toubou population (shown here in green lozenges).
(B) A maximum-likelihood tree shows the males belonging to haplogroup R1b in the 1000 Genomes Project and the R1b males in our dataset. The number of samples is shown on each branch tip. We estimate that the Chadian R1b emerged 5,700–7,300 ya, whereas most European R1b haplogroups emerged 7,300–9,400 ya. The African and Eurasian lineages coalesced 17,900–23,000 ya.
(C) Putative sources and times of admixture of the Eurasian ancestry in Chad and East Africa.
The Sources of Eurasian Backflow into Chad and East Africa Are Correlated
Previous studies have suggested that the Eurasian backflow into East Africa came from a population related to early Neolithic farmers.[5] We wanted to know whether the Eurasian ancestry we found in the Toubou, which we attribute to a mixture close in time to the date of mixture in East Africans, can be traced to the same source populations that influenced Ethiopia. We performed the tests f3(Toubou; Yoruba, X) and f3(Amhara; Yoruba, X), where X is a present-day non-sub-Saharan African population in our dataset and is related to one that contributed ancestry to the Toubou and Amhara (Z score < −4) (Table S5). We then looked at the correlation of the f3 statistic values between the two tests (Figure 4A). We found that the Eurasian source populations for the Amhara and Toubou were highly correlated (r = 0.98; 95% CI = 0.98–0.99; p value < 2.2 × 10⁻¹⁶) and that the most significant result was for present-day Sardinians. Exceptions to this correlation were the North African populations (Tunisians, Mozabite, Algerians, and Saharawi), who appeared to have contributed more ancestry to the Toubou than to the Amhara. We repeated the tests by using published ancient genomes (Table S6) and also found a high correlation of the Eurasian sources for the Amhara and Toubou (r = 0.98; 95% CI = 0.97–0.99; p value < 2.2 × 10⁻¹⁶); early Neolithic farmers were the most significant contributors, as reported previously[5] (Figure 4B). When we substituted the Amhara with other Ethiopians (Wolayta and Oromo), we found similar results (data not shown). In a parallel comparison, we checked whether the sources of the African ancestry in different Near Eastern populations were also correlated. We tested f3(Lebanese; British, X) and f3(Yemeni; British, X) and found a lower correlation of the f3 values (r = 0.62; 95% CI = 0.32–0.80), suggesting a more complicated history of gene flow from genetically different Africans to different populations in the Near East.
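The comparison of source populations rests on a correlation between two lists of f3 values, one per target. The sketch below computes a Pearson correlation and an approximate 95% confidence interval via the Fisher z-transformation. The section does not state how the reported intervals were obtained, so the Fisher approach is an assumption, and the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for r from n paired observations."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)
```

In use, `xs` would hold f3(Toubou; Yoruba, X) and `ys` would hold f3(Amhara; Yoruba, X) for the same ordered set of populations X; populations that fall off the shared trend, such as the North African groups noted above, would appear as outliers from the fitted line.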
We next quantified the proportion of African-Eurasian mixture in the study populations by using two methods: (1) ADMIXTURE[26] supervised with K = 2 and the British and Yoruba as ancestral populations and (2) the f4 ratio α = f4(British, chimp; X, Yoruba)/f4(British, chimp; early Neolithic farmer, Yoruba), where X is one of the populations in our dataset (Figure S3). The results from the two tests were highly correlated (r = 0.998; 95% CI = 0.996–0.999; p value < 2.2 × 10⁻¹⁶). Eurasian ancestry was estimated at 26%–30% in the Toubou, 0.3%–2% in the Sara, and 1.2%–4.5% in the Laal speakers. Eurasian ancestry in Ethiopians ranged from 11%–12% in the Gumuz to 53%–57% in the Amhara. African ancestry in the Near East ranged from 7%–14% (Yemen) to 0.7%–5% (Lebanese Christians).
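The f4-ratio estimate above can be illustrated with a minimal sketch that mirrors the stated formula, using f4(A, B; C, D) as the average of (a − b)(c − d) over SNPs. This is not the qpF4Ratio implementation; the allele frequencies below are made up, and the function and variable names are hypothetical.

```python
def f4(a, b, c, d):
    """Mean of (a - b) * (c - d) over SNPs; inputs are allele frequencies."""
    return sum((ai - bi) * (ci - di)
               for ai, bi, ci, di in zip(a, b, c, d)) / len(a)


def f4_ratio(british, chimp, x, farmer, yoruba):
    """alpha = f4(British, chimp; X, Yoruba) / f4(British, chimp; farmer, Yoruba)."""
    return f4(british, chimp, x, yoruba) / f4(british, chimp, farmer, yoruba)


# Toy frequencies (made up). X is built as 30% farmer-like and 70% Yoruba-like.
british = [0.9, 0.2, 0.6, 0.4]
chimp = [1.0, 0.0, 0.0, 1.0]
farmer = [0.8, 0.3, 0.7, 0.5]
yoruba = [0.2, 0.6, 0.1, 0.9]
x = [0.3 * f + 0.7 * y for f, y in zip(farmer, yoruba)]
alpha = f4_ratio(british, chimp, x, farmer, yoruba)
```

When X is an exact linear mixture, as in this toy case, the ratio returns the mixing fraction because x − yoruba equals α (farmer − yoruba) at every SNP; with real data the estimate depends on how well the chosen references match the true sources.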
Figure 4. Sources of the Eurasian Ancestry in Chad and Ethiopia
The plot shows significant Eurasian sources for the Toubou and Amhara according to a three-population test (Z score < −4). An increase in the absolute value of the f3 statistic implies an increase in the genetic affinity of the Eurasian population X to the Toubou and Amhara.
(A) With the exception of North Africans, who showed increased affinity to the Toubou, present-day populations showed correlated affinity to both the Toubou and Amhara. Among modern populations, Sardinians showed the highest genetic affinity to both the Toubou and Amhara.
(B) Ancient Eurasians also showed correlated affinity to both the Toubou and Amhara; the early Neolithic LBK (Linearbandkeramik, or Linear Pottery) population (~5,000 BCE) had the highest affinity.
Eurasian Gene Flow Shaped the Genomes of Admixed Africans
Our results from the PCA and MSMC analysis showed a deviation of the admixed populations from the patterns observed in unadmixed (or less admixed) populations in the same geographical region. The MSMC analysis, in particular, showed that admixed Africans had patterns indicative of a decline in heterozygosity (increased bottleneck ~60,000 ya), somewhat similarly to Eurasians. We tested whole-genome heterozygosity in these populations and found that it decreased in admixed Africans according to their Eurasian ancestry (Figure S4A). This decrease was not related to recent inbreeding, given that removing segments with long runs of homozygosity did not change the overall pattern. Our simulations suggest that decay in heterozygosity is expected after gene flow from a population with diversity comparable to that of Eurasians (Figures S4B and S4C). We further investigated heterozygosity in admixed Africans by assessing heterozygosity of the different ancestral segments in the Toubou genome. We found that admixed African-Eurasian segments had more heterozygosity (1.23 hets/kb) than segments of the genome where African-African haplotypes were present (1.19 hets/kb) (Figure S5). However, the Toubou genome segments with complete Eurasian ancestry (Eurasian-Eurasian) had considerably lower heterozygosity (~0.96 hets/kb; Figure S5), leading to the genome-wide pattern of decay in heterozygosity observed in Africans with Eurasian ancestry (Figure S4).
We wanted to understand the consequence of admixture on the models that use the density of heterozygous sites to infer the demographic history of populations. We first tested whether the coalescent history estimated by MSMC was affected by a small proportion of mixture, such as the African mixture found in Greeks and Lebanese (ranging from 0% to 5%). We tested the Greek, Lebanese, CEU, and CHB (Han Chinese in Beijing, China) split times from the Yoruba and found that all populations split from the Yoruba ~70,000–80,000 ya, implying that the low proportions of African admixture in the Greeks and Lebanese did not detectably affect the estimates of relative cross-coalescence rate (Figure S6A). We next tested the Toubou, who have ~30% Eurasian ancestry. The Toubou appeared to split from Eurasians ~30,000–40,000 ya, a time more recent than expected considering the African-Eurasian split 60,000–80,000 ya[20] (Figure S6B). We tested other Africans in our dataset and found that the Sara, Laal speakers, and Yoruba split from Eurasians, as expected, ~70,000–80,000 ya (Figure S6B). We then tested directly whether the Eurasian ancestry affected the relative cross-coalescence rate between the Toubou and Eurasians by masking some of the Eurasian ancestry in the Toubou. We used PCAdmix[21] to estimate the ancestry along each chromosome and then used the identified Eurasian segments as a negative mask in our analysis. The split times between the Toubou and Lebanese, for example, increased by ~15,000 years (Figure S6B), shifting the split date toward the expected African-Eurasian split time.
We found that, in addition to influencing the relative cross-coalescence rate, admixture can also inflate putative signals of positive selection. For example, using the PBS[31] to detect recent positive selection that occurred in the Toubou after their divergence from the Yoruba, we found signals of selection on MCM6 (MIM: 601806) rs4988235, a variant associated with the lactase-persistence phenotype. This SNP was previously found to be under strong positive selection in Europeans, where it was probably advantageous to individuals living in pastoralist societies.[34] The frequency of this variant in the Toubou is 2%, and it is absent from the sub-Saharan African and other Chadian samples (the Sara and Laal speakers) examined here. Although this SNP appears to be a candidate for selection, we suggest that it has probably drifted neutrally in the Toubou after the Eurasian gene flow: the Toubou have ~30% Eurasian ancestry from a population similar to the Greeks, who have 13% derived alleles at rs4988235, suggesting an expectation of ~3.9% of the derived allele simply from admixture. We similarly found in the Toubou signals at HERC2 (MIM: 605837) rs1129038, a major contributor to blue eye color in Europeans[35] (Toubou derived allele frequency [DAF] = 0.014; Greek DAF = 0.33; Yoruba, Sara, and Laal DAF = 0), as well as a signal at SLC24A5 (MIM: 609802) rs1834640, a major contributor to pigmentation[36] (Toubou DAF = 0.19; Greek DAF = 0.99; Yoruba, Sara, and Laal DAF = 0–0.04).
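The neutral expectation used in the rs4988235 argument is a simple weighted average of source frequencies. The sketch below reproduces that arithmetic under a two-way mixing model; treating the Greek frequency as the Eurasian source frequency and a frequency of zero as the African source frequency follows the reasoning in the text, while the function name is hypothetical.

```python
def expected_daf(eurasian_fraction, eurasian_daf, african_daf=0.0):
    """Expected derived allele frequency from admixture alone (no selection)."""
    return (eurasian_fraction * eurasian_daf
            + (1.0 - eurasian_fraction) * african_daf)


# Values from the text: ~30% Eurasian ancestry, Greek DAF of 13% at rs4988235.
lactase_expectation = expected_daf(0.30, 0.13)
```

The result is about 0.039, matching the ~3.9% quoted above, so the observed 2% does not exceed what admixture alone would produce. The same function could be applied to the HERC2 and SLC24A5 frequencies as a rough check, with the caveat that the true source population and its allele frequencies at the time of admixture are not known precisely.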
In addition to introducing to African populations genes that were positively selected in Europe, the recent African-Eurasian admixture carried Neanderthal alleles to Central and East Africa. Neanderthals are closer to the Amhara than to the Yoruba: D(Neanderthal, chimp; Amhara, Yoruba) = 0.0094. Neanderthals are also closer to the Toubou than to the Yoruba: D(Neanderthal, chimp; Toubou, Yoruba) = 0.0041. On the other hand, we found that Neanderthals are closer to Europeans than to Near Easterners: D(Neanderthal, chimp; French, Yemen) = 0.0056 and D(Neanderthal, chimp; French, Palestinian) = 0.0040. We estimated the archaic ancestry proportions by using the ratio

$$\alpha = \frac{f_4(\mathrm{Altai}, \mathrm{chimp}; X, \mathrm{Yoruba})}{f_4(\mathrm{Altai}, \mathrm{chimp}; \mathrm{Vindija}, \mathrm{Yoruba})}$$

and found that Neanderthal ancestry was ~0.5% in the Toubou and ~1% in the Amhara. We then computed the correlation between the Neanderthal ancestry proportions and the Eurasian and African ancestry proportions we identified. Neanderthal ancestry in admixed Africans and Near Easterners was highly correlated with their Eurasian ancestry (r = 0.98; p value < 2.2 × 10⁻¹⁶) and inversely correlated with their African ancestry (Figure 5).
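To make the f4-ratio estimator concrete, the following is a minimal sketch, assuming per-SNP allele frequencies are already available as plain lists. It is not the authors' pipeline: the function names and the toy frequencies are invented for illustration, and standard errors (usually obtained with a block jackknife) are omitted.

```python
def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d),
    where each argument is a list of per-SNP allele frequencies."""
    if not (len(a) == len(b) == len(c) == len(d)) or not a:
        raise ValueError("need equal-length, non-empty frequency lists")
    return sum((ai - bi) * (ci - di) for ai, bi, ci, di in zip(a, b, c, d)) / len(a)


def f4_ratio(altai, chimp, x, yoruba, vindija):
    """alpha = f4(Altai, chimp; X, Yoruba) / f4(Altai, chimp; Vindija, Yoruba)."""
    return f4(altai, chimp, x, yoruba) / f4(altai, chimp, vindija, yoruba)


# Toy allele frequencies at five SNPs (made up for illustration only).
altai = [1.0, 0.0, 1.0, 1.0, 0.0]
chimp = [0.0, 0.0, 0.0, 1.0, 1.0]
vindija = [1.0, 0.0, 0.5, 1.0, 0.0]
yoruba = [0.1, 0.2, 0.0, 0.9, 0.8]

# Build X as an exact 1% Vindija-like / 99% Yoruba-like mixture.
true_alpha = 0.01
x = [true_alpha * v + (1 - true_alpha) * y for v, y in zip(vindija, yoruba)]

print(round(f4_ratio(altai, chimp, x, yoruba, vindija), 6))
```

Because X is constructed as an exact linear mixture, the ratio recovers the mixing proportion used to build it (0.01). With real genotype data, the estimate depends on how well the chosen reference populations fit the assumed tree.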
Discussion
We have generated an extensive set of genotyping and high-coverage whole-genome sequencing data to study the genetic history of Chad and neighboring populations. We found substantial genetic differences between the ethnic groups inhabiting Chad today and suggest that multiple ancient Eurasian migrations played a major role in shaping the genetic diversity of the region (Figure 3C). Here, we discuss these migrations and how the mixed ancestry can confound proper interpretation of the evolutionary processes that occurred in their history and therefore needs to be thoroughly accounted for in the study of African genetic diversity.
6The American Journal of Human Genetics 99, 1–9, December 1, 2016
Please cite this article in press as: Haber et al., Chad Genetic Diversity Reveals an African History Marked by Multiple Holocene Eurasian
Migrations, The American Journal of Human Genetics (2016), http://dx.doi.org/10.1016/j.ajhg.2016.10.012
We detected the earliest Eurasian migrations to Africa in the Laal-speaking people, an isolated language group of fewer than 800 speakers who inhabit southern Chad. We estimate that mixture occurred 4,750–7,200 ya, thus after the Neolithic transition in the Near East, a period characterized by exponential growth in human population size. Environmental changes during this period (which possibly triggered the Neolithic transition) also facilitated human migrations. The African Humid Period, for example, was a humid phase across North Africa that peaked 6,000–9,000 ya³⁷ and biogeographically connected Africa to Eurasia, facilitating human movement across these regions.³⁸ In Chad, we found a Y chromosome lineage (R1b-V88) that we estimate emerged during the same period 5,700–7,300 ya (Figure 3B). The closest related Y chromosome groups today are widespread in Eurasia and have been previously associated with human expansions to Europe.³⁹,⁴⁰ We estimate that the Eurasian R1b lineages initially diverged 7,300–9,400 ya, at the time of the Neolithic expansions. However, we found that the African and Eurasian R1b lineages diverged 17,900–23,000 ya, suggesting that genetic structure was already established between the groups who expanded to Europe and Africa. R1b-V88 was previously found in Central and West Africa and was associated with a mid-Holocene migration of Afro-asiatic speakers through the central Sahara into the Lake Chad Basin.⁸ In the populations we examined, we found R1b in the Toubou and Sara, who speak Nilo-Saharan languages, and also in the Laal people, who speak an unclassified language. This suggests that R1b penetrated Africa independently of the Afro-asiatic language spread or passed to other groups through admixture.
In addition to the early Eurasian migration to Africa ~6,000 ya, a second migration ~3,000 ya affected the Toubou population in northern Chad but had no detectable genetic impact on other Chadian populations. This migration appears to be associated with the previously reported Eurasian backflow into East Africa, given that the source populations and dates of mixture are similar. Occurring at the start of the Iron Age, these migrations could have been facilitated by advances in warfare and transportation technology in the Near East. It is uncertain why the impact of this migration in Chad affected only the Toubou. The African ancestral component in the Toubou is best represented by the Laal-speaking population, suggesting that the African-Eurasian mixture probably occurred in Chad. However, ethnolinguistic barriers could have already been established at this time between the Chad groups, preventing a widespread dissemination of the Eurasian ancestry. The Toubou, despite their Islamic faith, do not show the genetic admixture detected in many Near Eastern and North African populations around 1,100 ya,⁴¹ suggesting conversion without population mixing at this time. They did, however, receive additional Eurasian ancestry in the past 200 years from a source represented by North African populations such as Tunisians, Mozabite, Algerians, and Sahrawi (Figure 3C). This recent interaction could have been promoted by the nomadic lifestyle of the present-day Toubou and a shared Muslim religion with North Africans. Unsurprisingly, we also detected a likely mixing of Chad populations in the sample from the capital, which could be even more recent.
Eurasian backflow into Africa thus appears to have been a recurrent event in the history of many Africans, given its considerable impact on their genomes. Although population mixture in general is a process that increases genetic diversity, we observed a decrease in heterozygosity in the admixed Africans. Our simulations showed that these results are expected after mixture at these proportions with the Eurasians who suffered a significant bottleneck at the time of their exodus from Africa ~60,000 ya. Consequently, we found that mixture can complicate interpretation of the coalescent history inferred from models that use the density of heterozygous sites in their implementations. In addition, we detected in admixed Africans an inflation of positive-selection signals on alleles associated with adult lactose tolerance and pigmentation in Europeans, but we suggest that these alleles have drifted neutrally in Africans after admixture. Furthermore, we detected Neanderthal ancestry in admixed Africans and found it to be proportional to their Eurasian ancestry.
Figure 5. Neanderthal Ancestry Correlation with the African-Eurasian Admixture
Neanderthal ancestry is not expected in Africa, yet today many Africans carry Neanderthal-derived alleles. The plot shows that the Neanderthal ancestry proportion in Africans is correlated with gene flow from Eurasians. For example, knowing that today Eurasians carry ~2% of Neanderthal ancestry, we observed that East Africans (Ethiopians) had ~1% Neanderthal ancestry and ~50% Eurasian ancestry. Correspondingly, Near Easterners showed a decline in Neanderthal ancestry proportional to their levels of African ancestry.
Similarly, in admixed Near Easterners, we found a decrease in Neanderthal ancestry proportional to the gene flow they have received from Africans. Although a higher genetic affinity of Neanderthals to Europeans than to Near Easterners was previously interpreted as additional Neanderthal admixture in the history of Europeans,⁴² we propose that a more parsimonious explanation for these observations is that African-Eurasian mixtures both introduced Neanderthal ancestry to Africa and "diluted" the Neanderthal ancestry in the Near East.
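The proportionality argument can be illustrated with back-of-the-envelope arithmetic. The sketch below assumes that Neanderthal ancestry reached Africa only through Eurasian admixture and that the Eurasian source carried ~2% (the figure given in the Figure 5 caption); the Toubou range of 20%–30% Eurasian ancestry is taken from the abstract. The function name is hypothetical, and this is a simplification rather than the authors' analysis.

```python
EURASIAN_NEANDERTHAL = 0.02  # ~2% Neanderthal ancestry in Eurasians (Figure 5 caption)


def expected_neanderthal(eurasian_fraction, source_level=EURASIAN_NEANDERTHAL):
    """Neanderthal ancestry expected if it arrived only via Eurasian admixture
    and the African source carried none."""
    return eurasian_fraction * source_level


# Eurasian ancestry fractions quoted in the paper.
cases = {
    "East Africans (~50% Eurasian)": 0.50,
    "Toubou (lower bound, 20%)": 0.20,
    "Toubou (upper bound, 30%)": 0.30,
}

for label, fraction in cases.items():
    print(f"{label}: {100 * expected_neanderthal(fraction):.1f}% expected")
```

Under this simple dilution model, the expected values (about 1% for East Africans and 0.4%–0.6% for the Toubou) are in line with the f4-ratio estimates reported above (~1% in the Amhara and ~0.5% in the Toubou).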
It is important to note that in this work we inevitably invoke Occam’s razor to support the simplest model consistent with our data; the history of the populations studied here, including the time and sources of the Eurasian admixture in Africa, could be more complex. Recovering aDNA from Chad and neighboring regions remains a challenge given the poor DNA preservation in hot climates, but future successful efforts in aDNA research could provide additional insights and reveal additional complexities not considered by the modern-DNA-based models favored here.⁴³
Our study has shown that human genetic diversity in Africa is still incompletely understood and that ancient admixture adds to its complexity. This work highlights the importance of exploring underrepresented populations, such as those from Chad, in genetic studies to improve our understanding of the demographic processes that shaped genetic variation in Africa and globally.
Accession Numbers
Whole-genome sequencing and SNP genotyping data are available through the European Genome-phenome Archive (EGA) under accession numbers EGA: EGAD00001002742 (sequences from Chad and Lebanon), EGAD00001001440 (sequences from Greece), and EGAS00001001231 (SNP data from Chad, Lebanon, and Yemen).
Supplemental Data
Supplemental Data include six figures and six tables and can be found with this article online at http://dx.doi.org/10.1016/j.ajhg.2016.10.012.
Acknowledgments
We thank all sample donors for making this work possible and the Wellcome Trust Sanger Institute pipelines for generating genotype and sequence data. We also thank Andrea Massaia for comments on analyzing Y chromosome haplogroups from array data and David Soria-Hernanz for comments on the ethnic groups and languages in Chad. M.H., M.M., A.B., J.P.-M., E.Z., Y.X., and C.T.-S. were supported by the Wellcome Trust (098051). P.H. was supported by Estonian Research Council grant PUT1036. J.B.-S. is a full-time employee of Karius, Inc., and R.S.W. is founder and CEO of Insitome.
Received: August 15, 2016
Accepted: October 24, 2016
Published: November 23, 2016
Web Resources
European Genome-phenome Archive (EGA), https://www.ebi.ac.uk/ega/home
ISOGG Y-DNA Haplotype Tree v.11.201, http://www.isogg.org/tree/
References
1. Central Intelligence Agency. (2014). The World Factbook 2014. https://www.cia.gov/library/publications/the-world-factbook/index.html.
2. Kröpelin, S., Verschuren, D., Lézine, A.M., Eggermont, H., Cocquyt, C., Francus, P., Cazet, J.P., Fagot, M., Rumes, B., Russell, J.M., et al. (2008). Climate-driven ecosystem succession in the Sahara: the past 6000 years. Science 320, 765–768.
3. Löhr, D. (2009). Lake Chad and the migratory routes to Borno: A linguistic trail. In Migrations and Spatial Mobility in the Lake Chad Basin, XIIIth Mega-Chad Conference, H. Tourneaux, ed. (Editions de l’IRD), pp. 665–681.
4. Pagani, L., Kivisild, T., Tarekegn, A., Ekong, R., Plaster, C., Gallego Romero, I., Ayub, Q., Mehdi, S.Q., Thomas, M.G., Luiselli, D., et al. (2012). Ethiopian genetic diversity reveals linguistic stratification and complex influences on the Ethiopian gene pool. Am. J. Hum. Genet. 91, 83–96.
5. Gallego Llorente, M., Jones, E.R., Eriksson, A., Siska, V., Arthur, K.W., Arthur, J.W., Curtis, M.C., Stock, J.T., Coltorti, M., Pieruccini, P., et al. (2015). Ancient Ethiopian genome reveals extensive Eurasian admixture in Eastern Africa. Science 350, 820–822.
6. Černý, V., Fernandes, V., Costa, M.D., Hájek, M., Mulligan, C.J., and Pereira, L. (2009). Migration of Chadic speaking pastoralists within Africa based on population structure of Chad Basin and phylogeography of mitochondrial L3f haplogroup. BMC Evol. Biol. 9, 63.
7. Cerezo, M., Černý, V., Carracedo, Á., and Salas, A. (2011). New insights into the Lake Chad Basin population structure revealed by high-throughput genotyping of mitochondrial DNA coding SNPs. PLoS ONE 6, e18682.
8. Cruciani, F., Trombetta, B., Sellitto, D., Massaia, A., Destro-Bisol, G., Watson, E., Beraud Colomb, E., Dugoujon, J.M., Moral, P., and Scozzari, R. (2010). Human Y chromosome haplogroup R-V88: a paternal genetic record of early mid Holocene trans-Saharan connections and the spread of Chadic languages. Eur. J. Hum. Genet. 18, 800–807.
9. Triska, P., Soares, P., Patin, E., Fernandes, V., Cerny, V., and Pereira, L. (2015). Extensive Admixture and Selective Pressure Across the Sahel Belt. Genome Biol. Evol. 7, 3484–3495.
10. Gurdasani, D., Carstensen, T., Tekola-Ayele, F., Pagani, L., Tachmazidou, I., Hatzikotoulas, K., Karthikeyan, S., Iles, L., Pollard, M.O., Choudhury, A., et al. (2015). The African Genome Variation Project shapes medical genetics in Africa. Nature 517, 327–332.
11. Abecasis, G.R., Altshuler, D., Auton, A., Brooks, L.D., Durbin, R.M., Gibbs, R.A., Hurles, M.E., McVean, G.A.; and 1000 Genomes Project Consortium (2010). A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073.
12. Pagani, L., Schiffels, S., Gurdasani, D., Danecek, P., Scally, A., Chen, Y., Xue, Y., Haber, M., Ekong, R., Oljira, T., et al. (2015). Tracing the route of modern humans out of Africa by using 225 human genome sequences from Ethiopians and Egyptians. Am. J. Hum. Genet. 96, 986–991.
13. Haak, W., Lazaridis, I., Patterson, N., Rohland, N., Mallick, S., Llamas, B., Brandt, G., Nordenfelt, S., Harney, E., Stewardson, K., et al. (2015). Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211.
14. Drmanac, R., Sparks, A.B., Callow, M.J., Halpern, A.L., Burns, N.L., Kermani, B.G., Carnevali, P., Nazarenko, I., Nilsen, G.B., Yeung, G., et al. (2010). Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays. Science 327, 78–81.
15. Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M.A., Bender, D., Maller, J., Sklar, P., de Bakker, P.I., Daly, M.J., and Sham, P.C. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575.
16. Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R.; and 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079.
17. Delaneau, O., Zagury, J.F., and Marchini, J. (2013). Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6.
18. Auton, A., Brooks, L.D., Durbin, R.M., Garrison, E.P., Kang, H.M., Korbel, J.O., Marchini, J.L., McCarthy, S., McVean, G.A., Abecasis, G.R.; and 1000 Genomes Project Consortium (2015). A global reference for human genetic variation. Nature 526, 68–74.
19. Patterson, N., Price, A.L., and Reich, D. (2006). Population structure and eigenanalysis. PLoS Genet. 2, e190.
20. Schiffels, S., and Durbin, R. (2014). Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925.
21. Brisbin, A., Bryc, K., Byrnes, J., Zakharia, F., Omberg, L., Degenhardt, J., Reynolds, A., Ostrer, H., Mezey, J.G., and Bustamante, C.D. (2012). PCAdmix: principal components-based assignment of ancestry along each chromosome in individuals with admixed ancestry from two or more populations. Hum. Biol. 84, 343–364.
22. Bergström, A., Nagle, N., Chen, Y., McCarthy, S., Pollard, M.O., Ayub, Q., Wilcox, S., Wilcox, L., van Oorschot, R.A., McAllister, P., et al. (2016). Deep roots for Aboriginal Australian Y chromosomes. Curr. Biol. 26, 809–813.
23. Forster, P., Harding, R., Torroni, A., and Bandelt, H.J. (1996). Origin and evolution of Native American mtDNA variation: a reappraisal. Am. J. Hum. Genet. 59, 935–945.
24. Fu, Q., Li, H., Moorjani, P., Jay, F., Slepchenko, S.M., Bondarev, A.A., Johnson, P.L., Aximu-Petri, A., Prüfer, K., de Filippo, C., et al. (2014). Genome sequence of a 45,000-year-old modern human from western Siberia. Nature 514, 445–449.
25. Patterson, N., Moorjani, P., Luo, Y., Mallick, S., Rohland, N., Zhan, Y., Genschoreck, T., Webster, T., and Reich, D. (2012). Ancient admixture in human history. Genetics 192, 1065–1093.
26. Alexander, D.H., Novembre, J., and Lange, K. (2009). Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664.
27. Loh, P.R., Lipson, M., Patterson, N., Moorjani, P., Pickrell, J.K., Reich, D., and Berger, B. (2013). Inferring admixture histories of human populations using linkage disequilibrium. Genetics 193, 1233–1254.
28. Pickrell, J.K., Patterson, N., Loh, P.R., Lipson, M., Berger, B., Stoneking, M., Pakendorf, B., and Reich, D. (2014). Ancient west Eurasian ancestry in southern and eastern Africa. Proc. Natl. Acad. Sci. USA 111, 2632–2637.
29. Danecek, P., Auton, A., Abecasis, G., Albers, C.A., Banks, E., DePristo, M.A., Handsaker, R.E., Lunter, G., Marth, G.T., Sherry, S.T., et al.; 1000 Genomes Project Analysis Group (2011). The variant call format and VCFtools. Bioinformatics 27, 2156–2158.
30. Peng, B., and Kimmel, M. (2005). simuPOP: a forward-time population genetics simulation environment. Bioinformatics 21, 3686–3687.
31. Yi, X., Liang, Y., Huerta-Sanchez, E., Jin, X., Cuo, Z.X., Pool, J.E., Xu, X., Jiang, H., Vinckenbosch, N., Korneliussen, T.S., et al. (2010). Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329, 75–78.
32. Ayub, Q., Mezzavilla, M., Pagani, L., Haber, M., Mohyuddin, A., Khaliq, S., Mehdi, S.Q., and Tyler-Smith, C. (2015). The Kalash genetic isolate: ancient divergence, drift, and selection. Am. J. Hum. Genet. 96, 775–783.
33. Li, H., and Durbin, R. (2011). Inference of human population history from individual whole-genome sequences. Nature 475, 493–496.
34. Enattah, N.S., Sahi, T., Savilahti, E., Terwilliger, J.D., Peltonen, L., and Järvelä, I. (2002). Identification of a variant associated with adult-type hypolactasia. Nat. Genet. 30, 233–237.
35. Sturm, R.A., Duffy, D.L., Zhao, Z.Z., Leite, F.P., Stark, M.S., Hayward, N.K., Martin, N.G., and Montgomery, G.W. (2008). A single SNP in an evolutionary conserved region within intron 86 of the HERC2 gene determines human blue-brown eye color. Am. J. Hum. Genet. 82, 424–431.
36. Stokowski, R.P., Pant, P.V., Dadd, T., Fereday, A., Hinds, D.A., Jarman, C., Filsell, W., Ginger, R.S., Green, M.R., van der Ouderaa, F.J., and Cox, D.R. (2007). A genomewide association study of skin pigmentation in a South Asian population. Am. J. Hum. Genet. 81, 1119–1132.
37. deMenocal, P.B., and Tierney, J.E. (2012). Green Sahara: African Humid Periods paced by Earth’s orbital changes. Nature Education Knowledge 3, 12.
38. Larrasoaña, J.C., Roberts, A.P., and Rohling, E.J. (2013). Dynamics of green Sahara periods and their role in hominin evolution. PLoS ONE 8, e76514.
39. Balaresque, P., Bowden, G.R., Adams, S.M., Leung, H.Y., King, T.E., Rosser, Z.H., Goodwin, J., Moisan, J.P., Richard, C., Millward, A., et al. (2010). A predominantly neolithic origin for European paternal lineages. PLoS Biol. 8, e1000285.
40. Batini, C., Hallast, P., Zadik, D., Delser, P.M., Benazzo, A., Ghirotto, S., Arroyo-Pardo, E., Cavalleri, G.L., de Knijff, P., Dupuy, B.M., et al. (2015). Large-scale recent expansion of European patrilineages shown by population resequencing. Nat. Commun. 6, 7152.
41. Haber, M., Gauguier, D., Youhanna, S., Patterson, N., Moorjani, P., Botigué, L.R., Platt, D.E., Matisoo-Smith, E., Soria-Hernanz, D.F., Wells, R.S., et al. (2013). Genome-wide diversity in the levant reveals recent structuring by culture. PLoS Genet. 9, e1003316.
42. Rodriguez-Flores, J.L., Fakhro, K., Agosto-Perez, F., Ramstetter, M.D., Arbiza, L., Vincent, T.L., Robay, A., Malek, J.A., Suhre, K., Chouchane, L., et al. (2016). Indigenous Arabs are descendants of the earliest split from ancient Eurasian populations. Genome Res. 26, 151–162.
43. Haber, M., Mezzavilla, M., Xue, Y., and Tyler-Smith, C. (2016). Ancient DNA and the rewriting of human history: be sparing with Occam’s razor. Genome Biol. 17, 1.
... components and linguistic backgrounds than populations from the southern region, suggesting that cultural factors or Lake Chad Basin were likely a barrier to population movements within the central Sahelian region (Haber et al. 2016;Magnavita et al. 2019). As expected, a large barrier to human migration was detected in the Sahara Desert ( fig. ...
... S23 and S24, Supplementary Material online), highlighting their common ancestral origins. In contrast to other Middle Eastern populations from previous studies (Haber et al. 2016;Fernandes et al. 2019), the Rashaayda Arab population has very low levels of genetic admixture with African groups (supplementary fig. S28D, Supplementary Material online). ...
... To investigate genetic affinities in our dataset, we built three genome-wide SNP datasets for different types of analyses (see details of populations included in each dataset in supplementary table S2, Supplementary Material online). First, we assembled the "Sahel-SNP" dataset by merging the QC-filtered dataset from this study with selected populations from the Sahel/Savannah belt presented in previous genome-wide studies using the Illumina HumanOmni2.5 array (Triska et al. 2015;Haber et al. 2016;Vicente et al. 2019). Before and after each merging, we applied the same QC steps using plink. ...
Article
Full-text available
The Sahel/Savannah belt harbors diverse populations with different demographic histories and different subsistence patterns. However, populations from this large African region are notably under-represented in genomic research. To investigate the population structure and adaptation history of populations from the Sahel/Savannah space, we generated dense genome-wide genotype data of 327 individuals-comprising 14 ethnolinguistic groups, including ten previously unsampled populations. Our results highlight fine-scale population structure and complex patterns of admixture, particularly in Fulani groups and Arabic-speaking populations. Among all studied Sahelian populations, only the Rashaayda Arabic-speaking population from eastern Sudan shows a lack of gene flow from African groups, which is consistent with the short history of this population in the African continent. They are recent migrants from Saudi Arabia with evidence of strong genetic isolation during the last few generations and a strong demographic bottleneck. This population also presents a strong selection signal in a genomic region around the CNR1 gene associated with substance dependence and chronic stress. In Western Sahelian populations, signatures of selection were detected in several other genetic regions, including pathways associated with lactase persistence, immune response, and malaria resistance. Taken together, these findings refine our current knowledge of genetic diversity, population structure, migration, admixture and adaptation of human populations in the Sahel/Savannah belt and contribute to our understanding of human history and health.
... From the published sources, we collected 99 high-coverage Y chromosome genomes sequenced with next-generation sequencing technologies targeting over 9 Mb regions of the chromosome (Table S2) 3,11,13,66,88,[100][101][102][103][104][105][106][107][108][109][110] . If fastq reads were unavailable in the public repositories, these were extracted from BAM or CRAM genome files using SAMtools v1.9 111 and BEDtools v2.24 112 . ...
Article
West and South Asian populations profoundly influenced Eurasian genetic and cultural diversity. We investigate the genetic history of the Y chromosome haplogroup L1-M22, which, while prevalent in these regions, lacks in-depth study. Robust Bayesian analyses of 165 high-coverage Y chromosomes favor a West Asian origin for L1-M22 ∼20.6 thousand years ago (kya). Moreover, this haplogroup parallels the genome-wide genetic ancestry of hunter-gatherers from the Iranian Plateau and the Caucasus. We characterized two L1-M22 harboring population groups during the Early Holocene. One expanded with the West Asian Neolithic transition. The other moved to South Asia ∼8-6 kya but showed no expansion. This group likely participated in the spread of Dravidian languages. These South Asian L1-M22 lineages expanded ∼4-3 kya, coinciding with the Steppe ancestry introduction. Our findings advance the current understanding of Eurasian historical dynamics, emphasizing L1-M22’s West Asian origin, associated population movements, and possible linguistic impacts.
... And, rather, to the time of its early stage, before 15000 BP. Moreover, among the variants for dating the time of occurrence of R1b-V88 proposed by geneticists today, there are also those («the African and Eurasian R1b lineages diverged 17,900-23,000 years ago» (Haber et al. 2016(Haber et al. : 1322) that correlate well with the proposed scenario. Considering the problem of localization of the Afrasian homeland, it is necessary, I think, to change our approach to interpreting the entire set of available facts. ...
Preprint
Full-text available
This January I have published a book (Романчук 2024), which studies the historical and archaeological context of coming of R1b-V88 haplogroup of Y-chromosome to Africa. To make this book more visible, I would like to publish the English summary (Романчук 2024: 91-99) as a single article. This is a first draft, and I hope that the discussion will help to improve this text.
... Whilst it is acknowledged that mixing between populations usually increases diversity, results amongst Africans of admixed heritage have been in line with genetic diversity (heterozygosity) falling with rising Eurasian ancestry (Haber et al., 2016). And so, as African populations may be particular determiners of where the origin is estimated to be, it has been reasoned that non-African ancestry in Africa may have a tangible effect on the estimate -this has previously been addressed to some point by estimating the origin using autosomal diversity of only sub-Saharan Africans, with the diversity having, or not having, been adjusted for the population level of non-African ancestry (Cenac, 2022b). ...
Preprint
Full-text available
Modern humans are acknowledged to have expanded across Earth from Africa. Biological measures have appeared to reflect this expansion, such as genetic diversity and cranial sexual size dimorphism. Admixture is known to be an issue for using diversity to locate where the expansion set out from in Africa; using cranial dimorphism should make admixture less of an issue. Therefore, cranial dimorphism could be of importance for clarifying the origin. This study used data sourced from the Howells dataset to infer if cranial form and shape dimorphisms indicate the expansion, and understand why cranial size dimorphism looks indicative. Form and shape dimorphisms were calculated through RMET. Size dimorphism was calculated beforehand. Cranial form and shape dimorphisms increased with distance from Africa. Form dimorphism suggested an area of origin which was predominantly in Africa and marginally in Asia. For shape dimorphism, locations spanned much farther beyond Africa. Hence, cranial form dimorphism seemed to be quite indicative of the expansion, unlike cranial shape dimorphism. Form is known to feature size and shape – cranial form dimorphism may signify the expansion mainly, or only, due to cranial size dimorphism. It seemed unsettled why cranial size and form dimorphisms seem related to the expansion because it was vague whether cranial size indicates the expansion more for males than females. Previously, a collective estimate of the origin (from biological measures including cranial size dimorphism) pointed to southern Africa; in the collective estimation process, cranial size dimorphism supported the south as would cranial form and shape dimorphisms.
Article
Modern and ancient genomes are not necessarily drawn from homogeneous populations, as they may have been collected from different places and at different times. This heterogeneous sampling can be an issue for demographic inferences and results in biased demographic parameters and incorrect model choice if not properly considered. When explicitly accounted for, it can result in very complex models and high data dimensionality that are difficult to analyse. In this paper, we formally study the impact of such spatial and temporal sampling heterogeneity on demographic inference, and we introduce a way to circumvent this problem. To deal with structured samples without increasing the dimensionality of the site frequency spectrum (SFS), we introduce a new structured approach to the existing program fastsimcoal2. We assess the efficiency and relevance of this methodological update with simulated and modern human genomic data. We particularly focus on spatial and temporal heterogeneities to evidence the interest of this new SFS-based approach, which can be especially useful when handling scattered and ancient DNA samples, as in conservation genetics or archaeogenetics.
Book
Full-text available
Die englische Originalausgabe dieser Monografie erschien 2021 unter den Titel The Prehistory of Language: A Triangulated Y-Chromosome-Based Perspective. Ich bin Linguist und habe diese Übersetzung für meine Kollegen aus dem Sprachbereich angefertigt. Dennoch hoffe ich, dass andere akademische Forscher sich für diese Arbeit interessieren werden, insbesondere Genetiker, Archäologen, Anthropologen und Geowissenschaftler. Diejenigen, die ein allgemeines Interesse an Sprache und Genetik haben, sind ebenfalls herzlich eingeladen, meine Monografie zu lesen. In den letzten vierzig Jahren haben Forscher dank der Sequenzierungstechnologie die molekulargenetische Variation genutzt, um die menschliche Evolutionsgeschichte zu erforschen. Einige haben versucht, diese neue Forschungsrichtung noch weiter auszudehnen mit der Idee, dass genetische Werkzeuge die Vorgeschichte der Sprache erklären können. Da wir unsere Gene und unsere Muttersprache von unseren Eltern geerbt haben, sollten genetische und sprachliche Variationen gut miteinander korrelieren. Die Entschlüsselung der sprachlichen Vorgeschichte anhand genetischer Daten erfordert jedoch die Klärung mehrerer Fragen. Sollen wir die heutige DNA oder die alte DNA oder beides verwenden? Sollen wir mitochondriale, Y-Chromosomen- oder autosomale Marker verwenden? Sollten wir Modelle der Sprachvorgeschichte mit statistischen Methoden erstellen? Oder sollten wir Modelle mit einer Synthese aus archäologischen und paläoklimatologischen Daten erstellen? Ich schlage vor, dass wir eine triangulierte Y-Chromosom-basierte Modellierung als methodische Lösung für die Entschlüsselung der Vorgeschichte der Sprache mit genetischen Werkzeugen verwenden. In meiner Forschung wurden mindestens 110 sprachlich informative Y-Chromosom-Mutationen identifiziert. Die Evolutionsgeschichte dieser Mutationen deutet darauf hin, dass die Geschichte der Sprache vor etwa 100 000 Jahren begann, als der Homo sapiens aus Afrika auswanderte. 
Nachfolgende Migrationen sowie kulturelle und evolutionäre Anpassungen erklären dann die Ausbreitung der Sprache in alle Teile der Welt. Zu dieser Ausbreitung gehören der Mungo-See-Mensch in Australien, die Mammutsteppen Eurasiens, die feuchte Phase der Sahara-Wüste, die bidirektionale Migration von Rentierzüchtern entlang des Polarkreises, der Ackerbau entlang der Flüsse des Amazonas-Regenwaldes, die Einführung des Reisanbaus in Südasien, Malaria in den Tropen und Hypoxie auf dem tibetischen Plateau.
Article
Full-text available
As the ancestral homeland of our species, Africa contains elevated levels of genetic diversity and substantial population structure. Importantly, African genomes are heterogeneous: they contain mixtures of multiple ancestries, each of which have experienced different evolutionary histories. In this review, we view population genetics through the lens of admixture, highlighting how multiple demographic events have shaped African genomes. Each of these historical vignettes paints a recurring picture of population divergence followed by secondary contact. First, we give a brief overview of African genetic variation and examine deep population structure within Africa, including evidence of ancient introgression from archaic "ghost" populations. Second, we describe the genetic legacies of admixture events that have occurred during the past 10,000 years. This includes gene flow between different click-speaking Khoe-San populations, the stepwise spread of pastoralism from eastern to southern Africa, multiple migrations of Bantu speakers across the continent, as well as admixture from the Middle East and Europe into the Sahel region and North Africa. Furthermore, the genomic signatures of more recent admixture can be found in the Cape Peninsula and throughout the African diaspora. Third, we highlight how natural selection has shaped patterns of genetic variation across the continent, noting that gene flow provides a potent source of adaptive variation and that selective pressures vary across Africa. Finally, we explore the biomedical implications of African population genetic structure on health and disease and call for more ethically conducted studies of African genetic variation.
Article
This review focuses on the Sahel/Savannah belt, a large region of Africa where two alternative subsistence systems (pastoralism and agriculture) now interact. It is a long-standing question whether the pastoralists became isolated here from other populations after cattle began to spread into Africa (~8 thousand years ago, kya) or, rather, began to merge with other populations, such as agropastoralists, after the domestication of sorghum and pearl millet (~5 kya) and with the subsequent spread of agriculture. If we look at lactase persistence, a trait closely associated with pastoral lifestyle, we see that its variants in current pastoralists distinguish them from their farmer neighbours. Most other (mostly neutral) genetic polymorphisms do not, however, indicate such clear differentiation between these groups; they suggest a common origin and/or extensive gene flow. Genetic affinity and ecological symbiosis between the two subsistence systems can help us better understand the population history of this African region. In this review, we show that genomic datasets of modern Sahel/Savannah belt populations, properly collected from local populations, can complement the still insufficient archaeological research of this region, especially when dealing with the prehistory of mobile populations with perishable material culture and therefore precarious archaeological visibility.
Preprint
Recent studies have identified Northeast Africa as an important area for human movements during the Holocene. Eurasian populations have moved back into Northeastern Africa and contributed to the genetic composition of its people. By gathering the largest reference dataset to date of Northeast, North, and East African as well as Middle Eastern populations, we give new depth to our knowledge of Northeast African demographic history. By employing local ancestry methods, we isolated the non-African parts of modern-day Northeast African genomes and identified the best putative source populations. Egyptians and Sudanese Copts were most similar to Levantine populations, whilst other populations in the region generally derived their non-African genetic component predominantly from Arabian Peninsula rather than Levantine populations. We also dated admixture events, investigated which factors influenced the date of admixture, and found that major linguistic families were associated with the date of Eurasian admixture. Taken as a whole, we detect complex patterns of admixture and diverse origins of Eurasian admixture in Northeast African populations of today.
Article
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.
Article
Characterizing genetic diversity in Africa is a crucial step for most analyses reconstructing the evolutionary history of anatomically modern humans. However, historic migrations from Eurasia into Africa have affected many contemporary populations, confounding inferences. Here, we present a 12.5× coverage ancient genome of an Ethiopian male (“Mota”) who lived approximately 4500 years ago. We use this genome to demonstrate that the Eurasian backflow into Africa came from a population closely related to Early Neolithic farmers, who had colonized Europe 4000 years earlier.
Article
Australia was one of the earliest regions outside Africa to be colonized by fully modern humans, with archaeological evidence for human presence by 47,000 years ago (47 kya) widely accepted [1, 2]. However, the extent of subsequent human entry before the European colonial age is less clear. The dingo reached Australia about 4 kya, indirectly implying human contact, which some have linked to changes in language and stone tool technology to suggest substantial cultural changes at the same time [3]. Genetic data of two kinds have been proposed to support gene flow from the Indian subcontinent to Australia at this time, as well: first, signs of South Asian admixture in Aboriginal Australian genomes have been reported on the basis of genome-wide SNP data [4]; and second, a Y chromosome lineage designated haplogroup C(∗), present in both India and Australia, was estimated to have a most recent common ancestor around 5 kya and to have entered Australia from India [5]. Here, we sequence 13 Aboriginal Australian Y chromosomes to re-investigate their divergence times from Y chromosomes in other continents, including a comparison of Aboriginal Australian and South Asian haplogroup C chromosomes. We find divergence times dating back to ∼50 kya, thus excluding the Y chromosome as providing evidence for recent gene flow from India into Australia.
Article
Ancient DNA research is revealing a human history far more complex than that inferred from parsimonious models based on modern DNA. Here, we review some of the key events in the peopling of the world in the light of the findings of work on ancient DNA.
Article
An open question in the history of human migration is the identity of the earliest Eurasian populations that have left contemporary descendants. The Arabian Peninsula was the initial site of the out of Africa migrations that occurred between 125,000 and 60,000 years ago, leading to the hypothesis that the first Eurasian populations were established on the Peninsula and that contemporary indigenous Arabs are direct descendants of these ancient peoples. To assess this hypothesis, we sequenced the entire genomes of 104 unrelated natives of the Arabian Peninsula at high coverage, including 56 of indigenous Arab ancestry. The indigenous Arab genomes defined a cluster distinct from other ancestral groups and these genomes showed clear hallmarks of an ancient out of Africa bottleneck. Similar to other Middle Eastern populations, the indigenous Arabs had higher levels of Neanderthal admixture compared to Africans but had lower levels than Europeans and Asians. These levels of Neanderthal admixture are consistent with an early divergence of Arab ancestors after the out of Africa bottleneck but before the major Neanderthal admixture events in Europe and other regions of Eurasia. When compared to worldwide populations sampled in the 1000 Genomes Project, while the indigenous Arabs had a signal of admixture with Europeans, they clustered in a basal, outgroup position to all 1000 Genomes non-Africans when considering pairwise similarity across the entire genome. These results place indigenous Arabs as the most distant relatives of all other contemporary non-Africans and identify these people as direct descendants of the first Eurasian populations established by the out of Africa migrations.
Article
Genome-wide studies of African populations have the potential to reveal powerful insights into the evolution of our species as these diverse populations have been exposed to intense selective pressures imposed by infectious diseases, diet and environmental factors. Within Africa, the Sahel Belt extensively overlaps the geographical center of several endemic infections such as malaria, trypanosomiasis, meningitis and hemorrhagic fevers. We screened 2.5 million SNPs in 161 individuals from 13 Sahelian populations, which, together with published data, cover Western, Central and Eastern Sahel and include both nomadic and sedentary groups. We confirmed the role of this belt as a main corridor for human migrations across the continent. Strong admixture was observed in both Central and Eastern Sahelian populations, with North Africans and Near Eastern/Arabians respectively, but it was absent in Western Sahelian populations. Genome-wide local ancestry inference in admixed Sahelian populations revealed several candidate regions that were significantly enriched for non-autochthonous haplotypes, and many showed evidence of positive selection. The DARC gene region in Arabs and Nubians was enriched for African ancestry, while the RAB3GAP1/LCT/MCM6 region in Oromo, the TAS2R gene family in Fulani and the ALMS1/NAT8 region in Turkana and Samburu were enriched for non-African ancestry. Signals of positive selection varied in terms of geographic amplitude. Some genomic regions were selected across the belt, the most striking example being the malaria-related DARC gene. Others were Western-specific (oxytocin, calcium and heart pathways), Eastern-specific (lipid pathways) or even population-restricted (TAS2R genes in Fulani, which may reflect sexual selection).
Article
Ancient African helps to explain the present
Tracing the migrations of anatomically modern humans has been complicated by human movements both out of and into Africa, especially in relatively recent history. Gallego Llorente et al. sequenced an Ethiopian individual, “Mota,” who lived approximately 4500 years ago, predating one such wave of individuals into Africa from Eurasia. The genetic information from Mota suggests that present-day Sardinians were the likely source of the Eurasian backflow. Furthermore, 4 to 7% of most African genomes, including Yoruba and Mbuti Pygmies, originated from this Eurasian gene flow. Science, this issue p. 820