Table 1 - uploaded by Kristin L Young
Y-Chromosome STRs Used in the Present Analysis

Source publication
Article
This study examines the genetic variation in Basque Y chromosome lineages using data on 12 Y-short tandem repeat (STR) loci in a sample of 158 males from four Basque provinces of Spain (Alava, Vizcaya, Guipuzcoa, and Navarre). As reported in previous studies, the Basques are characterized by high frequencies of haplogroup R1b (83%). AMOVA analysis...

Context in source publication

Context 1
... extraction was performed using a standard phenol:chloroform protocol. Y-STR profiles were characterized using the Y-Plex™ 12 kit (Reliagene Technologies, Inc., New Orleans, Louisiana) to type 11 Y-STR loci (detailed in Table 1) plus the sex-determining amelogenin locus, according to manufacturer's instructions. Amplified products were detected using an ABI 377 DNA sequencer. ...

Citations

... Finally, among the Basques, who are also included in the Dene-Caucasian macrofamily, haplogroup R1b accounts for more than 80% (Young et al. 2011). This, too, is higher (though not as sharply) than in the neighboring populations of the Iberian Peninsula (Lopez-Parra et al. 2008: 45). ...
... While the accuracy of in silico haplogroup assignment methods was initially under question, a number of recent validation studies including sample groups with both Y-SNP and Y-STR data acclaimed accuracy of more than 95%. The reliability of analysis is further reinforced when using datasets of at least twelve Y-STR loci with more inflexible definite haplogroup assignment thresholds [21][22][23][24]. In the forensic perspective, there are adequate Y-STR data accessible that can be explored for population genetics [25][26][27]. ...
Article
Amelogenin is a common sex typing marker encountered in forensic casework. Phenotypically normal males who exhibit an anomalous amelogenin allele have been reported in the literature. These males express only a single amelogenin peak representing AMEL-X and are called AMEL-Y-null males. Gender misclassification of such individuals is an obvious consequence of this mutation, as a male sample would falsely appear to be a female sample. This study aimed to attribute the AMEL-Y-null male DNA profiles encountered in forensic casework in the Pakistani population to the appropriate phylogenetic clade based on shared ancestry. A total of 18 AMEL-Y-null males were screened out of the sample pool of 5000 male individuals, reflecting a mutational frequency of 0.36%. A common phylogenetic ancestor is suggested for 17 individuals, based on computational analysis of the Y-STR haplotypes, which were shown to belong to the J haplogroup, while only one sample belonged to the R group. The samples in the J group showed homology with subclades J2b2a M241 and J2b2a PH1648, while the R group individual showed 100% homology with R1a. Data are reported after haplotype network development of AMEL-Y-null Pakistani males using Network 10.0 for the study of evolutionary distances and emergence of nodes.
... Retrieving short-term genomic information has mainly consisted in Y-STR profiling in accessing the maximum of STRs variants and polymorphism either by (i) designing Y-STR multiplexes including highly mutable markers to better discriminate closely related individuals [19,20] or (ii) by sequencing and extracting length-based Y-STR polymorphism STR loci from Next Generation Sequencing technologies as implemented in STRait Razor [21] to get rid of the excess of variants. To access short and long-term information while diminishing costs, some studies have chosen to generate high resolution Y-STR data and to use previously developed tools to predict haplogroup classes [22][23][24][25]. Among these methods, Neural Network-based models (Felix Immanuel website [55] http://www.y-str.org/) ...
Article
We developed a new mutationally well-balanced 32 Y-STR multiplex (CombYplex) together with a machine learning (ML) program PredYMaLe to assess the impact of STR mutability on haplogroup prediction, while respecting forensic community criteria (high DC/HD). We designed CombYplex around two sub-panels M1 and M2 characterized by average and high-mutation STR panels. Using these two sub-panels, we tested how our program PredYMaLe reacts to mutability when considering basal branches and, moving down, terminal branches. We tested first the discrimination capacity of CombYplex on 996 human samples using various forensic and statistical parameters and showed that its resolution is sufficient to separate haplogroup classes. In parallel, PredYMaLe was designed and used to test whether a ML approach can predict haplogroup classes from Y-STR profiles. Applied to our kit, SVM and Random Forest classifiers perform very well (average 97%), better than Neural Network (average 91%) and Bayesian methods (<90%). We observe heterogeneity in haplogroup assignation accuracy among classes, with most haplogroups having high prediction scores (99-100%) and two (E1b1b and G) having lower scores (67%). The small sample sizes of these classes explain the high tendency to misclassify the Y-profiles of these haplogroups; results were measurably improved as soon as more training data were added. We provide evidence that our ML approach is a robust method to accurately predict haplogroups when it is combined with a sufficient number of markers, well-balanced mutation rate Y-STR panels, and large ML training sets. Further research on confounding factors (such as gene conversion) and ideal STR panels in regard to the branches analysed can be developed to help classifiers further optimize prediction scores.
... While the in silico Y-haplogroup assignments for the Turkish Cypriots are presented herein for the very first time, they still constitute a relatively approximate approach. However, established accuracy levels of higher than 95% from the recent validation studies suggest that they can still be used for preliminary anthropological studies (Nunez et al., 2012;Petrejcikova et al., 2014;Young et al., 2011). To further establish the accuracy rate for the specific haplogroup assignment algorithm to be used in the current study, representative calculations were carried out first using a previously published dataset (with both 11-loci Y-STR and 58 Y-SNP data) for a different, yet smaller Cypriot population which was revised following a Bonferroni correction for multiple comparisons to account for potential Type I errors (Hochberg, 1988). ...
... Further validation studies were obviously needed so that more precise estimates of the error-rates for in silico haplogroup assignment tools could be made. In one such attempt, a previously published 19-loci Y-STR dataset with associated Y-SNP haplogroup assignments for a Basque population sample (n=116) was used to assess the accuracy of the Whit Athey algorithm (Adams et al., 2008;Young et al., 2011). Armed with a prior validation of this in silico haplogroup assignment tool for their target population, these investigators then proceeded with the use of the same algorithm for a new Basque population sample (n=158), this time with an 11-loci Y-STR dataset without Y-SNP data (Young et al., 2011). ...
... In one such attempt, a previously published 19-loci Y-STR dataset with associated Y-SNP haplogroup assignments for a Basque population sample (n=116) was used to assess the accuracy of the Whit Athey algorithm (Adams et al., 2008; Young et al., 2011). Armed with a prior validation of this in silico haplogroup assignment tool for their target population, these investigators then proceeded with the use of the same algorithm for a new Basque population sample (n=158), this time with an 11-loci Y-STR dataset without Y-SNP data (Young et al., 2011). In sharp contrast with Muzzio et al. (2011), results from this new validation exercise revealed an unambiguous Y-haplogroup assignment accuracy as high as 99.14% (115/116), whereby the only ambiguous assignment result was that for a sample with Y-SNP assignment of the J2 4 haplogroup, for which the given algorithm assignment was J2 with only 43.2% probability and L with 50.6% probability (Muzzio et al., 2011; Young et al., 2011). ...
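The accuracy figure and the ambiguous call quoted in this citation context can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the function names and the 90% threshold are assumptions made for this example, and they are not taken from the Whit Athey algorithm or from the cited studies.

```python
from typing import Dict, Optional


def assignment_accuracy(n_unambiguous_correct: int, n_total: int) -> float:
    """Return the percentage of samples assigned correctly and unambiguously."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_unambiguous_correct / n_total


def call_haplogroup(probabilities: Dict[str, float],
                    threshold: float = 90.0) -> Optional[str]:
    """Return the most probable haplogroup if it meets the threshold (in %).

    The threshold is a hypothetical value chosen for illustration.
    None is returned when no haplogroup reaches it (an ambiguous call).
    """
    best = max(probabilities, key=probabilities.get)
    return best if probabilities[best] >= threshold else None


# Figures quoted in the citation context: 115 of 116 samples, and the
# single ambiguous sample with J2 at 43.2% and L at 50.6%.
accuracy = assignment_accuracy(115, 116)
ambiguous_call = call_haplogroup({"J2": 43.2, "L": 50.6})
print(f"{accuracy:.2f}%")  # 99.14%
print(ambiguous_call)      # None
```

Under this assumed threshold the J2/L sample is left unassigned rather than counted as a misassignment, which matches how the quoted passage describes it as ambiguous.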
Article
Background: Cyprus is an island in the Eastern Mediterranean Sea with a documented history of human settlements dating back over 10000 years before present. Aim: To investigate the paternal lineages of a representative population from Cyprus in the context of the larger Near Eastern/Southeastern European genetic landscape Subjects and methods: 380 samples from the second most populous ethnic group in Cyprus (Turkish Cypriots) were analyzed at 17 Y-chromosomal short tandem repeat (Y-STR) loci. Results: A haplotype diversity of 0.9991 was observed, along with a number of allelic variants, multi-allelic patterns and a most frequent haplotype that were not previously reported elsewhere. Pairwise genetic distance comparisons of the Turkish Cypriot Y-STR dataset and Y-chromosomal haplogroup distribution with those from Near East/Southeastern Europe both suggested a closer genetic connection with the Near Eastern populations. Median-joining network analyses of the most frequent haplogroups also revealed some evidence towards in situ radiation. Conclusion: Turkish Cypriot paternal lineages seem to bear an autochthonous character and closest genetic connection with the neighboring Near Eastern populations. These observations are further underscored by the fact that the haplogroups associated with the spread of Neolithic Agricultural Revolution from the Fertile Crescent (E1b1b/J1/J2/G2a) dominate (>70%) the Turkish Cypriot haplogroup distribution.
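The abstract reports a haplotype diversity of 0.9991 but does not say which estimator was used. As a minimal sketch, assuming Nei's unbiased estimator, HD = n/(n-1) × (1 - Σ p_i²), where p_i is the frequency of the i-th distinct haplotype among n samples, the statistic could be computed as follows. The function name and the toy haplotypes are hypothetical and are not data from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def haplotype_diversity(haplotypes: Sequence[Hashable]) -> float:
    """Nei's unbiased haplotype diversity for a list of haplotypes.

    Each haplotype must be hashable, e.g. a tuple of repeat counts
    ordered by locus.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy data (made up): four samples typed at two loci, one haplotype shared.
toy = [(14, 13), (14, 13), (15, 12), (16, 11)]
hd = haplotype_diversity(toy)
print(round(hd, 4))  # 0.8333
```

A value close to 1, as in the abstract, means that almost every sampled individual carries a distinct haplotype.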
... Since Y-SNP analyses are time-consuming and labor intensive, novel approaches have also recently been investigated, such as through the use of in silico assignment algorithms based on already available Y-STR data for a given sample (Athey, 2006, 2013; Cullen, 2008; Ćetković Gentula and Nevski, 2015; Urasin, 2013). While there is still an ongoing discussion on the accuracy rates for such in silico haplogroup assignment methods, a number of recent validation studies based on sample pools with both reliable Y-SNP and Y-STR data suggested that accuracy levels over 95% can be attained, especially when datasets with at least 12 Y-STR loci and/or more stringent unambiguous haplogroup assignment thresholds were used (Gurkan et al., 2016; Nunez et al., 2012; Petrejčikova et al., 2014; Young et al., 2011). Constituting a microcosm of the larger genetic landscape of the Balkans, Bosnia and Herzegovina (B&H) is a particularly suitable geographic location for studying the mechanisms responsible for the current distribution of the Paleolithic and Neolithic genetic signals observed throughout Europe (Mirabal et al., 2010). ...
... Since Y-SNP analyses are time-consuming and labor intensive, novel approaches have also recently been investigated, such as through the use of in silico assignment algorithms based on already available Y-STR data for a given sample (Athey, 2006, 2013; Cullen, 2008; Ćetković Gentula and Nevski, 2015; Urasin, 2013). While there is still an ongoing discussion on the accuracy rates for such in silico haplogroup assignment methods, a number of recent validation studies based on sample pools with both reliable Y-SNP and Y-STR data suggested that accuracy levels over 95% can be attained, especially when datasets with at least 12 Y-STR loci and/or more stringent unambiguous haplogroup assignment thresholds were used (Gurkan et al., 2016; Nunez et al., 2012; Petrejčikova et al., 2014; Young et al., 2011). Constituting a microcosm of the larger genetic landscape of the Balkans, Bosnia and Herzegovina (B&H) is a particularly suitable geographic location for studying the mechanisms responsible for the current distribution of the Paleolithic and Neolithic genetic signals observed throughout Europe (Mirabal et al., 2010). ...
Article
Y-chromosomal haplogroups are sets of ancestrally related paternal lineages, traditionally assigned by the use of Y-chromosomal single nucleotide polymorphism (Y-SNP) markers. An increasingly popular and a less labor-intensive alternative approach has been Y-chromosomal haplogroup assignment based on already available Y-STR data using a variety of different algorithms. In the present study, such in silico haplogroup assignments were made based on 23-loci Y-STR data for 100 unrelated male individuals from the Tuzla Canton, Bosnia and Herzegovina (B&H) using the following four different algorithms: Whit Athey's Haplogroup Predictor, Jim Cullen's World Haplogroup & Haplogroup-I Subclade Predictor, Vadim Urasin's YPredictor and the NevGen Y-DNA Haplogroup Predictor. Prior in-house assessment of these four different algorithms using a previously published dataset (n = 132) from B&H with both Y-STR (12-loci) and Y-SNP data suggested haplogroup misassignment rates between 0.76% and 3.02%. Subsequent analyses with the Tuzla Canton population sample revealed only a few differences in the individual haplogroup assignments when using different algorithms. Nevertheless, the resultant Y-chromosomal haplogroup distribution by each method was very similar, where the most prevalent haplogroups observed were I, R and E with their sublineages I2a, R1a and E1b1b, respectively, which is also in accordance with the previously published Y-SNP data for the B&H population. In conclusion, results presented herein not only constitute a concordance study on the four most popular haplogroup assignment algorithms, but they also give a deeper insight into the inter-population differentiation in B&H on the basis of Y haplogroups for the first time.
... R1b, J1, and other minor haplogroups (Table 2 and Supporting Information Table S2) are classified as of European descent. R1b is present at high frequencies in the Iberian Peninsula, and it ranges from 56% in Portugal to 68% in Spain (Young, Sun, Deka, & Crawford, 2011). J1 is most common in Arabs and Jews, and its frequency in the Portuguese population is 14% (Adams et al., 2008). ...
... Haplotypes H16 (Bernardo Furquim) and H62 (Luiz Francisco de Souza) were both classified in haplogroup R1b1a2a1a2 (of European origin) through Y-SNPs. As previously reported, R1b haplogroup is present in high frequencies in the Iberian Peninsula (Young et al., 2011). The Y-SNPs showed that both Furquim's and Souza's Y chromosomes are probably of the same Portuguese stock, since they were classified into the same haplogroup, in both analyses (Y-STRs and Y-SNPs). ...
Article
Objectives: Quilombo remnants are relics of communities founded by runaway or abandoned African slaves, but often with subsequent extensive and complex admixture patterns with European and Native Americans. We combine a genetic study of Y-chromosome markers with anthropological surveys in order to obtain a portrait of quilombo structure and history in the region that has the largest number of quilombo remnants in the state of São Paulo. Methods: Samples from 289 individuals from quilombo remnants were genotyped using a set of 17 microsatellites on the Y chromosome (AmpFlSTR-Yfiler). A subset of 82 samples was also genotyped using SNPs array (Axiom Human Origins-Affymetrix). We estimated haplotype and haplogroup frequencies, haplotype diversity and sharing, and pairwise genetic distances through FST and RST indexes. Results: We identified 95 Y chromosome haplotypes, classified into 15 haplogroups. About 63% are European, 32% are African, and 6% Native American. The most common were: R1b (European, 34.2%), E1b1a (African, 32.3%), J1 (European, 6.9%), and Q (Native American, 6.2%). Genetic differentiation among communities was low (FST = 0.0171; RST = 0.0161), and haplotype sharing was extensive. Genetic, genealogical and oral surveys allowed us to detect five main founder haplotypes, which explained a total of 27.7% of the Y chromosome lineages. Conclusions: Our results showed a high European patrilineal genetic contribution among the founders of quilombos, high amounts of gene flow, and a recent common origin of these populations. Common haplotypes and genealogical data indicate the origin of quilombos from a few male individuals. Our study reinforces the importance of a dual approach, involving the analysis of both anthropological and genetic data.
... While the in silico Y-haplogroup assignments for the Turkish Cypriots are presented herein for the very first time, they still constitute a relatively approximate approach. However, established accuracy levels of higher than 95% from the recent validation studies suggest that they can still be used for preliminary anthropological studies (Nunez et al., 2012; Petrejcikova et al., 2014; Young et al., 2011). To further establish the accuracy rate for the specific haplogroup assignment algorithm to be used in the current study, representative calculations were carried out first using a previously published dataset (with both 11-loci Y-STR and 58 Y-SNP data) for a different, yet smaller Cypriot population sample (n = 162) (Zalloua et al., 2008a). ...
... Specifically: R-M207 among the Burusho is 10% (Pakistan average: 0.6%); R2 among the Burusho is 14.4% (Pakistan average: 7.8%); R1a among the Burusho is 26% (Pakistan average: 37%) (Firasat et al. 2007). Finally, among the Basques, who are also included in the Dene-Caucasian macrofamily, haplogroup R1b accounts for more than 80% (Young et al. 2011). This is also higher than the frequency of R1b in the neighboring populations of the Iberian Peninsula (Lopez-Parra et al. 2008: 45) and France (Bekada et al. 2013: Suppl.). ...
... Thus, haplogroup R in Basques is near 90% (Young et al. 2011). This is 10% higher than their neighbors have (Lopez-Parra et al. 2008: 45;Becada et al. 2013: Suppl.). ...
Book
This monograph attempts to confirm once more the East Eurasian hypothesis of the Dene-Caucasian homeland, using data on Y-chromosome haplogroups. It examines the distribution of haplogroups R and Q among the Dene-Caucasian peoples and their neighbors, as well as a number of other questions. The work will be of interest to a wide range of readers, primarily linguists, geneticists, anthropologists, historians, and archaeologists.
... The exception to the results of the present study is thus nicely justified in this scenario, suggesting that when a Finno-Ugric language was introduced in Hungary, the genetic buildup of the population changed only in part, thus retaining similarities with its geographic neighbors, an example of the process called elite dominance by Renfrew (1992). On the contrary, the same case cannot be easily made for Basques (Alonso et al., 2005; Rodríguez-Ezpeleta et al., 2010; Young et al., 2011; Martinez-Cruz et al., 2012) or Finns, for whom, to the best of our knowledge, no available evidence suggests a similar model of partial demographic replacement associated with language replacement (Nelis et al., 2009). Thus, the comparative linguistic/genomic analysis, attempted in the present study, seems able to single out and precisely assess these differences in the population histories of the three non-IE members of our sample. ...
Article
The notion that patterns of linguistic and biological variation may cast light on each other and on population histories dates back to Darwin's times; yet, turning this intuition into a proper research program has met with serious methodological difficulties, especially affecting language comparisons. This article takes advantage of two new tools of comparative linguistics: a refined list of Indo-European cognate words, and a novel method of language comparison estimating linguistic diversity from a universal inventory of grammatical polymorphisms, and hence enabling comparison even across different families. We corroborated the method and used it to compare patterns of linguistic and genomic variation in Europe. Two sets of linguistic distances, lexical and syntactic, were inferred from these data and compared with measures of geographic and genomic distance through a series of matrix correlation tests. Linguistic and genomic trees were also estimated and compared. A method (Treemix) was used to infer migration episodes after the main population splits. We observed significant correlations between genomic and linguistic diversity, the latter inferred from data on both Indo-European and non-Indo-European languages. Contrary to previous observations, on the European scale, language proved a better predictor of genomic differences than geography. Inferred episodes of genetic admixture following the main population splits found convincing correlates also in the linguistic realm. These results pave the ground for previously unfeasible cross-disciplinary analyses at the worldwide scale, encompassing populations of distant language families. Am J Phys Anthropol, 2015. © 2015 Wiley Periodicals, Inc.