Figure 3 - uploaded by Maxat Zhabagin
Genetic relationships of Asian (including Transoxiana) populations using 30 Y-SNPs. Multidimensional scaling plot; stress = 0.17. Populations from 18 countries are marked by colors. The ten populations from this study are shown as rhombuses within squares, while populations from the literature are indicated by circles. Blue lines link populations located along the Amu Darya and Syr Darya rivers. Population codes are explained more fully in Supplementary Table 2. Colored cloud areas represent geographic clusters, with colors on the main plot following colors on the inset (Asian regions according to UN classification). 
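The caption reports a multidimensional scaling plot with stress = 0.17. As a rough illustration of how such an ordination and its stress value could be computed from haplogroup frequencies, here is a minimal Python sketch. Everything in it is an assumption rather than the authors' procedure: the frequency table is invented, Euclidean distance stands in for whatever genetic distance was actually used, the embedding is classical (Torgerson) MDS, and the stress function is a metric form of Kruskal's stress-1 that skips the monotone regression step of nonmetric MDS.

```python
import numpy as np

def pairwise_euclidean(points):
    """Euclidean distance between every pair of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS: coordinates whose distances approximate `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double-centred matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]   # largest eigenvalues first
    top_vals = np.clip(eigvals[order], 0.0, None)       # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)

def stress1(dist, coords):
    """Metric stress-1: mismatch between input and embedded distances."""
    embedded = pairwise_euclidean(coords)
    upper = np.triu_indices_from(dist, k=1)
    num = ((dist[upper] - embedded[upper]) ** 2).sum()
    den = (embedded[upper] ** 2).sum()
    return float(np.sqrt(num / den))

# Hypothetical haplogroup frequencies: 4 populations x 3 haplogroups (rows sum to 1).
freqs = np.array([
    [0.70, 0.20, 0.10],
    [0.65, 0.25, 0.10],
    [0.10, 0.30, 0.60],
    [0.15, 0.25, 0.60],
])
dist = pairwise_euclidean(freqs)
coords = classical_mds(dist)
stress = stress1(dist, coords)
```

On this toy table the stress is essentially zero, because frequencies over three haplogroups that sum to one already lie in a plane. A real table with 30 haplogroups cannot be flattened into two dimensions without distortion, which is what a nonzero stress such as 0.17 quantifies.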

Source publication
Article
We have analyzed Y-chromosomal variation in populations from Transoxiana, a historical region covering the southwestern part of Central Asia. We studied 780 samples from 10 regional populations of Kazakhs, Uzbeks, Turkmens, Dungans, and Karakalpaks using 35 SNP and 17 STR markers. Analysis of haplogroup frequencies using multidimensional scaling an...

Contexts in source publication

Context 1
... on a narrower geographic scale (Transoxiana and the neighboring regions) is available in Supplementary Fig. 3 (Supplementary Table 4). This PCA plot is based on a smaller number of haplogroups, but includes more Central Asian populations. Both an MDS plot of 30 haplogroups and a PC plot of 19 haplogroups (Fig. 3, Supplementary Fig. 2, Supplementary Fig. 3) demonstrate the four following ...
Context 3
... Uzbek and Tajik populations practicing settled agriculture, as well as Kyrgyz, are genetically distant from most nomadic populations (Mongol, Kazakh, Hazaras). Second, despite originating from three countries (Uzbekistan, Iran, Afghanistan), Turkmen populations form their own firmly separated cluster. The reason lies ... (Table 2) in most Turkmen populations, which in particular forms the third PC (Supplementary Fig. 2), though this haplogroup is absent from the fourth Turkmen population 30 ... This is explained by the historically recent migration of Dungans from China and the maintenance of their Sino-Tibetan language, prevalent in China and northeastern India. Fourth, most of the Kazakh populations studied cluster with Mongols, Pakistani Hazaras (HAZ1) and Afghan Hazaras (HAZ2) due to their high frequency of haplogroup C2-M217. This correlates with the historically well-known Mongol origin of the Hazaras 34,35 . In addition, Fig. 3 shows the populations located along a stretch of the Amu Darya and Syr Darya rivers linked by blue lines symbolizing the rivers. However, the positions of these genetic "rivers" only loosely correlate with their geographical ...
Context 4
... prevalence of specific haplogroups is even more pronounced for tribal-clan groups than for geographic populations (Supplementary Fig. 1): C2*-M217(xM48) comprises 88% of the Y-chromosomes of the Konyrat tribe, C2b1a2-M48 reaches 75% in the Kazakh clan Alimuly, and Q-M242 accounts for 71% in the Turkmen tribe Yomut. Based on haplogroup frequency, the Konyrat tribe is the most homogeneous (HD = 0.23), while the Kozha-Sunak clan group is the most heterogeneous (HD = 0.94). The specificities of the clan pools of paternal lineages are the reason for the specificities of the geographic populations: the clan Alimuly prevails in the KAZ2 population (79% of samples are from this clan), the tribe Konyrat predominates in the KAZ1 population (62%), and the tribe Yomut predominates in the TUR1 population (88%). ... Tables 2 and 3). Clusters corresponding to geographic parts of Asia were revealed in the multidimensional scaling plot (Fig. 3). The Western Asian cluster was represented by Arab, Turkish and Iranian populations. Populations of India, Pakistan and Afghanistan made up the Southern Asian cluster. Chinese form the Eastern Asia cluster. All Transoxianan populations lie in the Central Asian ...
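The HD values quoted in this context (0.23 for the Konyrat tribe, 0.94 for the Kozha-Sunak group) are haplogroup diversity scores. The excerpt does not say which estimator was used; a common choice is Nei's unbiased gene diversity, HD = n/(n-1) * (1 - sum of squared frequencies). The Python sketch below assumes that estimator and uses invented sample labels, so it illustrates the idea rather than reproducing the published figures.

```python
from collections import Counter

def haplogroup_diversity(labels):
    """Nei's unbiased diversity: n/(n-1) * (1 - sum(p_i^2)).

    `labels` holds one haplogroup label per sampled individual.
    """
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(labels)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Hypothetical samples: one group dominated by a single haplogroup,
# one group spread evenly over five haplogroups.
dominated = ["C2"] * 9 + ["R1a"]
even = ["C2", "R1a", "J2", "Q", "G"] * 2

hd_dominated = haplogroup_diversity(dominated)
hd_even = haplogroup_diversity(even)
```

A group in which one lineage dominates scores close to 0, while a group with many lineages at similar frequencies scores close to 1, which matches the contrast the text draws between the most homogeneous and the most heterogeneous clan groups.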

Citations

... Polymorphism of the Y chromosome in the Kazakh population, particularly the southern area of Kazakhstan, is of interest both on a regional scale of Central Asia [17] and at the local tribal level [18,19]. It was discovered from the ancient Central Asian area of Transoxiana that two-thirds of the gene pool of southern Kazakhs (examined sample N = 780) is haplogroup C2-M217, which is often found among the Konyrat (88%) and Alimul (75%) tribes. ...
... A strong founder effect is also evidenced in studies of 12 tribes of Kazakhs in the Southern area (N = 567 samples [18] and N = 460 samples [19]). There is an exception to the rule: several more ancestors from other Y-chromosome haplogroups were identified for the clans of the steppe clergy (kozha and sunak), which, according to traditional genealogy, descend from a fellow tribesman of the Prophet Muhammad [17]. At the same time, the J1-L859 variant belonging to the Quraysh tribe of the Prophet Muhammad was not detected. ...
... At the same time, the J1-L859 variant belonging to the Quraysh tribe of the Prophet Muhammad was not detected. The steppe clergy's genealogy was based not on biological kinship, but on spiritual heritage passed down from the teacher of Islam, missionaries from various populations, to his disciples [17]. In general, the findings of the haplogroup diversity study, taking tribal organization into consideration, show that the gene pool of Southern Kazakhstan was established by not only genetically related, but also relatively distant tribes [18]. ...
Article
Background: The Kazakhs are one of the biggest Turkic-speaking ethnic groups, controlling vast swaths of land from the Altai to the Caspian Sea. In terms of area, Kazakhstan is ranked ninth in the world. Northern, Eastern, and Western Kazakhstan have already been studied in relation to the genetic polymorphism of 27 Y-STRs. However, current information on the genetic polymorphism of the Y-chromosome in Southern Kazakhstan is limited to 17 Y-STRs, and no geographical comparison with other regions has been made at this level of variation. Results: The Kazakhstan Y-chromosome Haplotype Reference Database was expanded with 468 Kazakh males from the Zhambyl and Turkestan regions of South Kazakhstan by having their 27 Y-STR loci and 23 Y-SNP markers analyzed. Discrimination capacity (DC = 91.23%), haplotype match probability (HPM = 0.0029) and haplotype diversity (HD = 0.9992) were calculated. Most of this Y-chromosome variability is attributed to haplogroups C2a1a1b1-F1756 (2.1%), C2a1a2-M48 (7.3%), C2a1a3-F1918 (33.3%) and C2b1a1a1a-M407 (6%). Median-joining network analysis was applied to understand the relationship between the haplotypes of the three regions. The position of the populations of the Southern region of Kazakhstan can be described in three genetic layers: the geographic Kazakh populations of Kazakhstan, the Kazakh tribal groups, and the people of bordering Asia. Conclusion: The Kazakhstan Y-chromosome Haplotype Reference Database was formed for 27 Y-STR loci with a total of 1796 samples of Kazakhs from 16 regions of Kazakhstan. The variability of the Y-chromosome of the Kazakhs in a geographical context can be divided into four main clusters: south, north, east, west. At the same time, in the genetic space of tribal groups, the population of southern Kazakhs clusters with tribes from the same region, and genetic proximity is determined with the populations of the Hazaras of Afghanistan and the Mongols of China.
... 10.2) and 15 Y-STR data (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, GATA_H4). We calculated the time to the most recent common ancestor (TMRCA) for each cluster using the average squared distance (ASD) method, with a generation time set to 25 years, as previously described (Wang et al., 2014; Xu and Li, 2017; Zhabagin et al., 2017). ...
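The ASD calculation mentioned here can be sketched in a few lines of Python. This is a simplified illustration, not the cited authors' pipeline: it assumes the founder haplotype is the per-locus modal allele, uses a single average mutation rate for all loci, and the rate of 0.002 per locus per generation and the four haplotypes are placeholder values chosen only for the example. The 25-year generation time is the one stated in the text.

```python
from statistics import mode

def modal_haplotype(haplotypes):
    """Per-locus most common allele, taken here as the assumed founder."""
    return [mode(column) for column in zip(*haplotypes)]

def asd_tmrca_years(haplotypes, mutation_rate, generation_years=25):
    """TMRCA estimate from the average squared distance to the founder.

    ASD is the squared repeat-count difference from the founder, averaged
    over haplotypes and loci; dividing by the per-locus mutation rate gives
    generations, which are then converted to years.
    """
    founder = modal_haplotype(haplotypes)
    n_loci = len(founder)
    total = sum(
        (allele - founder_allele) ** 2
        for hap in haplotypes
        for allele, founder_allele in zip(hap, founder)
    )
    asd = total / (len(haplotypes) * n_loci)
    return asd / mutation_rate * generation_years

# Hypothetical cluster: four haplotypes typed at three Y-STR loci.
cluster = [
    [14, 10, 23],
    [14, 10, 23],
    [14, 11, 23],
    [15, 10, 23],
]
tmrca = asd_tmrca_years(cluster, mutation_rate=0.002)  # placeholder rate
```

The estimate scales inversely with the chosen mutation rate, so the rate (and whether it is applied per locus or averaged) matters at least as much as the haplotype data.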
Article
The Kazakhs of Xinjiang province are characterized by their nomadic lifestyle and patrilineal clan system. However, compared to Central Asian Kazakhs, a lack of Y chromosomal high-resolution analysis has hindered our understanding of the paternal history of modern Xinjiang Kazakhs. Methods: In this study, we present the analysis of 110 Y-SNP data from 209 Altay Kazakhs and 201 Ili Kazakhs in Xinjiang, along with their previously reported 24 Y-STR loci data. Results and discussion: We found that the Y chromosome haplogroups exhibit greater diversity in Altay Kazakhs compared to Kazakhs in Kazakhstan, Russia, and other regions of China. Y-SNP-based PCA plots reveal that both the Altay and Ili Kazakhs are situated between the Turkic, Mongolia, and Tibeto-Burman clusters. The dominant haplogroup C2a1a3-F1918, which originated in northeast Asia during the Neolithic Age, accounts for nearly half of the Altay and Ili Kazakhs. The Y lineage network of C2a1a3-F1918 contained two subclusters. Approximately 60.6% of the Altay Kazakhs belong to the DYS448-23 subcluster, indicating their Kerey-Abakh ancestry. On the other hand, around three-quarters of the Ili Kazakhs belong to the DYS448-22 subcluster, suggesting their Kerey-Ashmaily heritage. Notably, the TMRCA ages of the DYS448-23 subcluster were calculated to be 289.4 ± 202.65 years, which aligns with the historical immigration of the Kerey clan back to the Altay Mountains after the defeat of the Dzungar by the Qing dynasty in the mid-18th century.
... Later, ancient Turkic and Mongolic tribes spread westward and scattered throughout Central Asia, the Middle East, and Eastern Europe [21]. The descendants of the above-mentioned ancient people mixed and formed the modern populations in Central Asia over a long historical period [22][23][24]. ...
... Previous research has also focused on the formation of modern populations in the recent millennium [23,24]. A strong tribe structure has been detected among populations from this region [26]. ...
... Previous studies have demonstrated that South Siberia is likely the center of diffusion of haplogroup Q-M242 beginning 30,000 years ago [6,50,51]. Furthermore, many minor sub-lineages of Q-M242 have been detected in populations from Inner Eurasia, including Mongolic- and Turkic-speaking populations [6,23,50,51]. We detected diverse sub-lineages of Q-M242 (Table S1). ...
Article
In the past two decades, studies of Y chromosomal single nucleotide polymorphisms (Y-SNPs) and short tandem repeats (Y-STRs) have shed light on the demographic history of Central Asia, the heartland of Eurasia. However, complex patterns of migration and admixture have complicated population genetic studies in Central Asia. Here, we sequenced and analyzed the Y-chromosomes of 187 male individuals from Kazakh, Kyrgyz, Uzbek, Karakalpak, Hazara, Karluk, Tajik, Uyghur, Dungan, and Turkmen populations. High diversity and admixture from peripheral areas of Eurasia were observed among the paternal gene pool of these populations. This general pattern can be largely attributed to the activities of ancient people in four periods, including the Neolithic farmers, Indo-Europeans, Turks, and Mongols. Most importantly, we detected the consistent expansion of many minor lineages over the past thousand years, which may correspond directly to the formation of modern populations in these regions. The newly discovered sub-lineages and variants provide a basis for further studies of the contributions of minor lineages to the formation of modern populations in Central Asia.
... To date, a multitude of studies has employed genetic markers to investigate the genetic diversity and differentiation of the Kazakh population in global (Wells et al., 2001; Underhill et al., 2010; Underhill et al., 2015; Unterländer et al., 2017), regional (Karafet et al., 2002; Lalueza-Fox et al., 2004; M. Zhabagin et al., 2017) and local (Gokcumen et al., 2008; Balmukhanov et al., 2013; Tarlykov et al., 2013; Wen et al., 2020; M. Zhabagin et al., 2021) contexts. The accumulated data provide preliminary insights into the demographic history of Kazakhs. For instance, Central Asian populations possess high levels of mtDNA and Y chromosomal haplotype diversity (Underh ...
Article
Ethnogenesis of Kazakhs took place in Central Asia, a region of high genetic and cultural diversity. Even though archaeological and historical studies have shed some light on the formation of modern Kazakhs, the process of establishment of hierarchical socioeconomic structure in the Steppe remains contentious. In this study, we analyzed haplotype variation at 15 Y-chromosomal short-tandem-repeats obtained from 1171 individuals from 24 tribes representing the three socio-territorial subdivisions (Senior, Middle and Junior zhuz) in Kazakhstan to comprehensively characterize the patrilineal genetic architecture of the Kazakh Steppe. In total, 577 distinct haplotypes were identified belonging to one of 20 haplogroups; 16 predominant haplogroups were confirmed by SNP-genotyping. The haplogroup distribution was skewed towards C2-M217, present in all tribes at a global frequency of 51.9%. Despite signatures of spatial differences in haplotype frequencies, a Mantel test failed to detect a statistically significant correlation between genetic and geographic distance between individuals. An analysis of molecular variance found that ∼8.9% of the genetic variance among individuals was attributable to differences among zhuzes and ∼20% to differences among tribes within zhuzes. The STRUCTURE analysis of the 1164 individuals indicated the presence of 20 ancestral groups and a complex three-subclade organization of the C2-M217 haplogroup in Kazakhs, a result supported by the multidimensional scaling analysis. Additionally, while the majority of the haplotypes and tribes overlapped, a distinct cluster of the O2 haplogroup, mostly of the Naiman tribe, was observed. Thus, firstly, our analysis indicated that the majority of Kazakh tribes share deep heterogeneous patrilineal ancestries, while a smaller fraction of them are descendants of a founder paternal ancestor. 
Secondly, we observed a high frequency of the C2-M217 haplogroups along the southern border of Kazakhstan, broadly corresponding to both the path of the Mongolian invasion and the ancient Silk Road. Interestingly, we detected three subclades of the C2-M217 haplogroup that broadly exhibit zhuz-specific clustering. Further study of Kazakh haplotype variation within a Central Asian context is required to untwist this complex process of ethnogenesis.
... Overall, we demonstrate here a remarkable example of genetic continuity since the Iron Age in Indo-Iranian populations from Central Asia despite the frenzy of population migrations in the area since the Bronze Age. Similar to Zhabagin et al. work 65 , the present study shows no impact of the Arab cultural expansion in Central Asia on the Indo-Iranian speaker's genetic diversity, despite the first one leading to a shift in language for Tajiks. We also do not see a gene flow from Iran despite the Persian cultural expansion which led to a language shift from an east-Iranian language to a west-Iranian in Tajiks-when Yaghnobis kept their east-Iranian language 66 . ...
Preprint
Since prehistoric times, South Central Asia has been at the crossroads of the movement of people, culture, and goods. Today, Central Asia's populations are divided into two cultural and linguistic groups: the Indo-Iranian and the Turko-Mongolian groups. Previous genetic studies unveiled that migrations from East Asia contributed to the spread of Turko-Mongolian populations in Central Asia and the partial replacement of the Indo-Iranian population. However, little is known about the origin of the latter. To shed light on this, we compare the genetic data on two current-day populations - Yaghnobis and Tajiks - with genome-wide data from published ancient individuals. The present Indo-Iranian populations from Central Asia display a strong genetic continuity with Iron Age samples from Turkmenistan and Tajikistan. We model Yaghnobis as a mixture of 93% Iron Age individual from Turkmenistan and 7% from Baikal. For the Tajiks, we observe a higher Baikal ancestry and an additional admixture event with a South Asian population. Our results, therefore, suggest that in addition to a complex history, Central Asia shows a remarkable genetic continuity since the Iron Age, with only limited gene flow.
... For use in these analyses, DYS389II was calculated by subtracting the DYS389I allele size. The time to the most recent common ancestor (TMRCA) of each cluster detected in common haplogroups was determined by using the average squared distance (ASD) estimator as described previously [14,15,17]. ASD method is based on the assumption that median or modal STR haplotype in a lineage is the founder haplotype. ...
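The DYS389II adjustment described above is simple arithmetic: commercial kits report DYS389II as a combined repeat count that includes the DYS389I segment, so the DYS389I value is subtracted before distance-based analyses. A minimal Python sketch follows; the dictionary layout and the allele values are hypothetical and chosen only for illustration.

```python
def adjust_dys389ii(haplotype):
    """Return a copy of the haplotype with DYS389I subtracted from DYS389II.

    `haplotype` maps locus names to repeat counts; the input is not modified.
    """
    adjusted = dict(haplotype)
    adjusted["DYS389II"] = haplotype["DYS389II"] - haplotype["DYS389I"]
    return adjusted

# Hypothetical kit output for one individual.
raw = {"DYS389I": 13, "DYS389II": 29, "DYS390": 24}
corrected = adjust_dys389ii(raw)
```

Without this step, a single mutation in the DYS389I segment would be counted twice, once at each locus, inflating squared-distance estimates such as ASD.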
... Notably, the basal lineage R1a1a1b2-Z93* is commonly distributed in the South Siberian Altai region of Russia. In the upper-right corner of Fig. 3, a marked recent descent cluster (we used the criterion that haplotypes linked to the modal haplotype fewer than 5 mutational steps in the shaded area [17]) can be easily observed. Except two R1a1a1b2-Z93* samples (a Khakassian and an Altaian, dark blue), Table 2 Shao-qing Wen et al. ...
Article
The Kyrgyz are a trans-border ethnic group, mainly living in Kyrgyzstan. Previous genetic investigations of Central Asian populations have repeatedly investigated the Central Asian Kyrgyz. However, from the standpoint of human evolution and genetic diversity, the Northwest Chinese Kyrgyz are one of the more poorly studied populations. In this study, we analyzed the non-recombining portion of the Y-chromosome from 298 male Kyrgyz samples from Xinjiang Uygur Autonomous Region in northwestern China, using a high-resolution analysis of 108 biallelic markers and 17 or 24 STRs. First, via a Y-SNP-based PCA plot, Northwest Chinese Kyrgyz tended to cluster with other Kyrgyz populations and are located in the West Asian and Central Asian group. Second, we found that the Northwest Chinese Kyrgyz display a high proportion of Y-lineage R1a1a1b2a2a-Z2125, related to Bronze Age Siberian, followed by Y-lineage C2b1a3a1-F3796, related to Medieval Niru’un Mongols, such as Uissun tribe from Kazakhs. In each of these two dominant lineages, a unique recent descent cluster has been detected via NETWORK analysis, and the two clusters have nearly the same TMRCA ages (about 13th–14th centuries). This finding once again shows that the expansions of Mongol Empire had a striking effect on the Central Asian gene pool.
... The closest phylogenetic relatives are found in the vicinity of South Asia, East Asia, or Oceania. ... haplogroup C2 on average reaches 80% [18][19][20][21][22]. One particular haplotype within Haplogroup C-M217 (star-cluster C2*(C2*-ST)) has received a great deal of attention, because of the possibility that it may represent direct patrilineal descent from Genghis Khan [23], though that hypothesis is controversial. ...
... According to the data, the estimated age of the C2* ... Haplogroup R1b arose from a mutation of the haplogroup R1 that occurred in a man who lived about 22,800 years before the present day (the date was determined from SNPs by YFull [26]). The last common ancestor of R1b carriers lived 20.4 thousand years ago [21]. ...
Article
A haplogroup is a group of similar alleles that have a common ancestor in which a mutation has occurred, inherited by all descendants. Haplogroups, particularly from the Y-chromosome (Y-DNA), are widely used in population genetics and genetic genealogy, a science that studies the genetic history of mankind. Recent studies of the Y-chromosome of modern Kazakhs have demonstrated the diversity of the Kazakh gene pool. During the expedition carried out in 2014-2016, clinical material was collected from various regions of Kazakhstan, representing samples of peripheral blood and buccal scrapings. All representatives of Kazakh nationality were familiarized with informed consent. In total 1623 respondents participated in the study, 169 of whom were representatives from Baiuly tribe of Junior zhuz. We analyzed the provided samples and found that the Baiuly is characterized by 10 haplogroups, the most prevailing of which is the C2 haplogroup (85%).
... Which one of them is the ancestor of the Uissuns? The only successor clan of the Darligin Mongols which has been genetically studied is Konyrat (Kungirat) [6,20]. The haplogroup C2-M407 is present at high frequency (86%) in Konyrat (Additional file 10), but not in the Uissuns. ...
... Genomic DNA extraction, genotyping, statistical analysis and median network analysis were done as described previously [20]. Phylogenetic networks of Y-STR haplotypes were constructed using the Network 5 and Network Publisher software [24,25], excluding DYS385a/b. ...
Article
Background: The majority of the Kazakhs from South Kazakhstan belong to the 12 clans of the Senior Zhuz. According to traditional genealogy, nine of these clans have a common ancestor and constitute the Uissun tribe. There are three main hypotheses of the clans’ origin, namely, origin from early Wusuns, from Niru’un Mongols, or from Darligin Mongols. We genotyped 490 samples of South Kazakhs by 35 Y-chromosomal SNPs (single nucleotide polymorphism) and 17 STRs (short tandem repeat). Additionally, 133 samples from citizen science projects were included into the study. Results: We found that three Uissun clans have unique Y-chromosomal profiles, but the remaining six Uissun clans and one non-Uissun clan share a common paternal gene pool. They share a high frequency (> 40%) of the C2*-ST haplogroup (marked by the SNP F3796), which is associated with the early Niru’un Mongols. Phylogenetic analysis of this haplogroup carried out on 743 individuals from 25 populations of Eurasia has revealed a set of haplotype clusters, three of which contain the Uissun haplotypes. The demographic expansion of these clusters dates back to the 13th–14th centuries, coinciding with the time of the Uissun’s ancestor Maiky-biy known from historical sources. In addition, it coincides with the expansion period of the Mongol Empire in the Late Middle Ages. A comparison of the results with published aDNA (ancient deoxyribonucleic acid) data and modern Y haplogroups frequencies suggests an origin of Uissuns from Niru’un Mongols rather than from Wusuns or Darligin Mongols. Conclusions: The Y-chromosomal variation in South Kazakh clans indicates their common origin in 13th–14th centuries AD, in agreement with the traditional genealogy, though genetically there were at least three ancestral lineages instead of the traditional single ancestor. The majority of the Y-chromosomal lineages of South Kazakhstan was brought by the migration of the population related to the medieval Niru’un Mongols.
... Y-chromosome data provided by Zhabagin et al (2017) may also support a south central Siberian refugium for haplogroups C2-M217, Q-M242 and R-M207. This study analyzed 780 samples from the nearby Central Asian region of Transoxiana. ...
... Source populations for the data include Kazakhs, Uzbeks, Turkmen, Dungan and Karakalpak. According to data provided by Zhabagin et al (2017), among the populations of the region, C2-М217 attains an overall frequency of thirty-one percent, R1a1a-M198 attains sixteen percent, and Q-M242 attains thirteen percent. ...
... It should be noted that although G-M201 attains a heavy frequency among some of the Kazakh tribes, overall G-M201 frequencies in Central Asia are, nevertheless, low (i.e. Zhabagin et al. 2017). Turning now to the internal phylogeny of G-M201, within this mutation one finds two main branches, G2-P287 (see Supplementary Figure 7.1) and G1-M285 (Supplementary Figure 7.2). ...
Preprint
In the last five years the quest to explain the correlation between genetic and linguistic diversity has employed a methodology called palaeogenomic modeling. Such models were published in prestigious science journals including Nature. They have also been reported in mainstream media such as the BBC. Furthermore, the metrics data reflect that they are cited frequently in scholarly journals. These models, however, are flagrantly inconsistent with the archaeological record. They employ the wrong genetic marker and not enough data. I strongly believe that the palaeogenomic modeling “fad” will soon dissipate because of these deficiencies. My research stands ready to yield desperately needed models of language prehistory that are highly reliable. Researchers will have, for the first time, a robust methodology for exploring the correlation between genetic and linguistic diversity.
... STR genotyping systems from East and South Kazakhstan [14,15]. Results of previous studies revealed association between Y chromosome variations and clans. ...
Article
To improve available databases of forensic interest, all Y-STR haplotypes from the Kazakh population were presented in this study. The reference database accumulated almost 3650 samples from academic and citizen science. Additionally, 27 Y-STR from Yfiler Plus System were first analyzed in 300 males from Kazakh (Qazaq) populations residing in Kazakhstan. The data is available in the YHRD under accession numbers YA004316 and YA004322. A total of 270 unique haplotypes were observed. Discrimination capacity was 90%. Obtained Y-STR haplotypes exhibited a high intra-population diversity. Analysis of pairwise genetic distances showed lowest RST values from Uighur and Mongolian populations.
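The figures in this abstract allow a quick check of the discrimination capacity, which is conventionally the number of distinct haplotypes divided by the number of individuals typed. The Python sketch below uses the two counts given above (270 unique haplotypes among 300 males); the function name and the small haplotype list are illustrative assumptions, not part of the cited study.

```python
def discrimination_capacity(n_unique, n_total):
    """Share of sampled individuals represented by distinct haplotypes."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_unique / n_total

def count_unique(haplotypes):
    """Number of distinct haplotypes in a list of allele sequences."""
    return len({tuple(hap) for hap in haplotypes})

# Counts reported in the abstract: 270 unique haplotypes in 300 males.
dc_reported = discrimination_capacity(270, 300)

# Hypothetical mini-sample showing the same calculation from raw haplotypes.
sample = [[14, 10, 23], [14, 10, 23], [15, 10, 23], [14, 11, 22]]
dc_sample = discrimination_capacity(count_unique(sample), len(sample))
```

With the reported counts the ratio is 0.90, consistent with the 90% stated in the abstract.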