Article

A Bayesian approach to inferring population structure from dominant markers

Abstract

Molecular markers derived from PCR amplification of genomic DNA are an important part of the toolkit of evolutionary geneticists. RAPDs, AFLPs, and ISSR polymorphisms allow analysis of species for which prior DNA sequence information is lacking, but dominance makes it impossible to apply standard techniques to calculate F-statistics. We describe a Bayesian method that allows direct estimates of Fst from dominant markers. In contrast to existing alternatives, we do not assume prior knowledge of the degree of within-population inbreeding. In particular, we do not assume that genotypes within populations are in Hardy-Weinberg proportions. Our estimate of Fst incorporates uncertainty about the magnitude of within-population inbreeding. Simulations show that samples from even a relatively small number of loci and populations produce reliable estimates of Fst. Moreover, some information about the degree of within-population inbreeding (Fis) is available from data sets with a large number of loci and populations. We illustrate the method with a reanalysis of RAPD data from 14 populations of a North American orchid, Platanthera leucophaea.
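
As an illustration of why dominance complicates the analysis, the sketch below computes the probability of the band-absent phenotype at one dominant locus and the corresponding binomial log-likelihood. It assumes the textbook relationship x = q²(1 − f) + fq between the null-allele frequency q, the inbreeding coefficient f, and the band-absent frequency x; the function names are hypothetical, and this is not the authors' implementation or the full hierarchical model.

```python
import math


def absent_band_probability(q, f):
    """Probability that an individual shows no band at a dominant locus.

    q: frequency of the null (recessive) allele in the population.
    f: within-population inbreeding coefficient (0 = Hardy-Weinberg).
    Only null homozygotes lack the band: q^2 (1 - f) + f q.
    """
    return q * q * (1.0 - f) + f * q


def log_likelihood(n_absent, n_total, q, f):
    """Binomial log-likelihood of n_absent band-absent individuals out of n_total.

    Assumes 0 < q < 1 so that both phenotype probabilities are positive.
    """
    x = absent_band_probability(q, f)
    n_present = n_total - n_absent
    log_choose = (
        math.lgamma(n_total + 1)
        - math.lgamma(n_absent + 1)
        - math.lgamma(n_present + 1)
    )
    return log_choose + n_absent * math.log(x) + n_present * math.log(1.0 - x)


# Two different (q, f) pairs that predict the same phenotype frequency.
q_inbred = (math.sqrt(3.0) - 1.0) / 2.0
same_x = (absent_band_probability(0.5, 0.0), absent_band_probability(q_inbred, 0.5))
```

Both pairs in `same_x` give a band-absent probability of 0.25, so a single locus in a single population cannot separate allele frequency from inbreeding. This is consistent with the abstract's remark that information about Fis only emerges from data sets with many loci and populations.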

... AFLPsurv v1.0 software (Vekemans 2002) was used to estimate allele frequency at AFLP loci and expected heterozygote using the Bayesian approach. The f (analogous to F IS ) and θ II (analogous to F ST ) statistics were calculated using Bayesian inference for different models using HICKORY v1.1 software (Holsinger et al. 2002). Also, population-specific inbreeding coefficients (F IS ) were estimated by an approximate Bayesian computation using the software ABC4 (Foll et al. 2008). ...
... At the population level, 6% of the total molecular variation was attributed to inter-population differentiation, and 96% to individual differentiation within populations (Φ ST = 0.060, p < 0.001). From Bayesian analysis (Holsinger et al. 2002), the smallest deviance information criterion (DIC = 6889.8) was obtained from full model among four different models and the obtained f (analogous to F IS ) was 0.865 and θ II (analogous to F ST ) was 0.056, assuming the full model (Table 4). ...
... The value of inbreeding coefficient (f = 0.865) estimated by the Bayesian procedure (Holsinger et al. 2002). It was higher than the value (F IS = 0.618) estimated by an approximate Bayesian computation (ABC) (Foll et al. 2008). ...
Article
We applied eight primer-restriction enzyme combinations to investigate genetic diversity, genetic differentiation, and genetic structure of Carpinus laxiflora populations with AFLP markers. The average of effective alleles (Ae), the proportion of polymorphic loci (%P), Shannon’s diversity index (I), and the expected heterozygosity (He) were 1.4, 82.2%, 0.371, and 0.241, respectively. The expected heterozygosity (Hj) from Bayesian method was 0.270. The level of genetic diversity was high compared to those of Carpinus species and other species with a similar life history. The inbreeding coefficient (FIS) from approximated Bayesian method was 0.618, which was smaller than that for Acer pseudosieboldianum (FIS=0.712). Genetic differentiation was 0.060 from AMOVA (ΦST) and 0.056 from Bayesian method (θII). The level of genetic differentiation was very small compared to that of Carpinus species and other species with a similar life history. According to UPGMA and Bayesian clustering, 10 populations were divided into two genetic groups. Except Mt. Chilgap and Minjuji, most of the populations were detected as weak genetic structures according to the geographical distribution such as mountain ranges. We might consider that demographic disturbance, local specific vegetation change history, and forest succession interrupted the genetic structure of C. laxiflora in South Korea.
... F statistics under all three assumptions were obtained with AFLP-SURV 1.0 (Vekemans 2002) following the method by Lynch and Milligan (1994). We also included an additional estimate of F ST using a recently developed Bayesian approximation procedure (Holsinger et al. 2002) with the program Hickory 0.8 (Holsinger and Lewis 2003). Unlike the methods described above, this Bayesian procedure does not assume previous knowledge of the degree of within-population inbreeding and, under simulation, has shown to give more accurate and reliable estimates of population subdivision (Holsinger et al. 2002). ...
... We also included an additional estimate of F ST using a recently developed Bayesian approximation procedure (Holsinger et al. 2002) with the program Hickory 0.8 (Holsinger and Lewis 2003). Unlike the methods described above, this Bayesian procedure does not assume previous knowledge of the degree of within-population inbreeding and, under simulation, has shown to give more accurate and reliable estimates of population subdivision (Holsinger et al. 2002). Last, we used a phenetic approach (AMOVA) for testing genetic population structure and genetic variability within populations (Stewart and Excoffier 1996). ...
... It has been suggested that for angiosperms with primarily outcrossing mating systems and large data sets, different gene diversity estimates should converge on similar values, and thus, their respective F ST estimators should also converge (Krauss 2000). However, considerable differences among estimators may result when inbreeding is substantial in the system (Holsinger et al. 2002). Thus, similar F ST estimates could be used as an indication of weak within-population inbreeding. ...
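
The dependence of dominant-marker estimates on the assumed level of inbreeding can be sketched as follows. The code inverts the standard relationship x = q²(1 − f) + fq to recover the null-allele frequency q from an observed band-absent frequency x under an assumed inbreeding coefficient f, then reports the gene diversity 2q(1 − q). Function names are hypothetical and the example is an illustration under these assumptions, not the procedure used by any of the cited programs.

```python
import math


def null_allele_frequency(x, f):
    """Solve x = q^2 (1 - f) + f q for q, given band-absent frequency x.

    x: observed proportion of individuals lacking the band (0..1).
    f: assumed within-population inbreeding coefficient (0..1).
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("x and f must both lie in [0, 1]")
    if f == 1.0:
        # Complete inbreeding: no heterozygotes, so q equals x directly.
        return x
    a = 1.0 - f
    # Positive root of a*q^2 + f*q - x = 0.
    return (-f + math.sqrt(f * f + 4.0 * a * x)) / (2.0 * a)


def gene_diversity(x, f):
    """Gene diversity 2q(1 - q) implied by x under the assumed f."""
    q = null_allele_frequency(x, f)
    return 2.0 * q * (1.0 - q)


# Same data (25% band-absent), two different assumptions about inbreeding.
diversity_outcrossing = gene_diversity(0.25, 0.0)  # assumes Hardy-Weinberg
diversity_selfing = gene_diversity(0.25, 1.0)      # assumes complete inbreeding
```

With a quarter of individuals lacking the band, the implied gene diversity is 0.5 under Hardy-Weinberg proportions but 0.375 under complete inbreeding, which illustrates how estimators built on different inbreeding assumptions can diverge.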
Article
Heliconia bihai is a highly polymorphic species. Populations of the species are distributed throughout the Lesser Antilles and northern South America; the most obvious sign of polymorphism is manifested through differences in bract color. This variation has been attributed to pollinator-mediated selection, but other processes are also plausible. To test the potential contribution of drift in the evolution of phenotypic variation in this system, we examined the distribution of morphological and genetic variation within and between populations of H. bihai in the Caribbean islands of St. Vincent and St. Lucia. Morphological characterization was limited to flower and inflorescence characteristics, including bract color. AFLPs were used to investigate levels of genetic diversity within and between populations. Genetic similarity among individuals was equivalent to that expected for conspecific individuals. Levels of AFLP and population subdivision were high and were comparable to those of outcrossing species in one island, but they were low and comparable with selfing or clonal species in the other, indicating that genetic variation between islands may be under different evolutionary regimes. There was a significant geographical structure in morphological variation, but this was considerably less pronounced than that found at the genetic level except for bract color patterns, which were different for different islands. We found no correlation between genetic similarity and geographic distance or between genetic similarity and morphological similarity. Observed patterns of genetic and morphological variation in H. bihai suggest that genetic drift is likely to have had a minor role in their development. Instead, these patterns are consistent with phenomena such as selection, inbreeding, and habitat fragmentation that may be operating independently at different spatial scales.
... The number of polymorphic markers and private alleles were calculated for each population separately. We used the Bayesian method of Holsinger & al. (2002) to estimate θ B (an analogue of F st ) and within-population expected heterozygosities (H s ). The software HICKORY-v1.0.4 (Holsinger & al., 2002) was applied with a non-informative prior on F is (within-population inbreeding coefficient) using the f free model. ...
... We used the Bayesian method of Holsinger & al. (2002) to estimate θ B (an analogue of F st ) and within-population expected heterozygosities (H s ). The software HICKORY-v1.0.4 (Holsinger & al., 2002) was applied with a non-informative prior on F is (within-population inbreeding coefficient) using the f free model. This model was favored over the f = 0, θ = 0 and full models according to the Deviance Information Criterion (DIC 1,693.9, ...
... Geographic distances between populations were calculated using the 'Haversine' formula (Sinnott, 1994). Genetic distances among populations were calculated as pairwise θ B (F st ) values applying the f free model of HICKORY-v1.0.4 (Holsinger & al., 2002). Mantel correlations were calculated among the genetic and geographical distance matrices and their significance was tested using 1,000 random permutations in ARLEQUIN-v3.11 ...
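
For readers unfamiliar with the Haversine formula mentioned in the excerpt above, a minimal sketch is given here. The Earth radius constant and the function name are assumptions made for illustration; the cited study does not state which radius or implementation it used.

```python
import math

# Mean Earth radius in kilometres (an assumed value, not taken from the study).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2.0) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2.0) ** 2
    )
    # Clamp to 1.0 to guard against rounding just above 1 for antipodal points.
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# A quarter of the equator, as a sanity check.
quarter_equator = haversine_km(0.0, 0.0, 0.0, 90.0)
```

A matrix of such pairwise distances between sampling sites is the geographic input that a Mantel test compares against the matrix of pairwise genetic distances.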
Article
The high species diversity and endemism of the Maritime Alps suggest that this region may have held several refugia during the Pleistocene glaciations. Nevertheless, this assumption has rarely been examined. Here we investigate the genetic diversity of Saxifraga florulenta, a rare endemic restricted to siliceous substrates in the Maritime Alps. Overlaying the maximum extension of the ice sheet during the Pleistocene, the current distributions of S. florulenta and siliceous substrates suggest the existence of two putative refugial areas in the Maritime Alps. By using evidence from amplified fragment length polymorphisms we aim at elucidating whether genetic structure of the species corresponds to this two‐refugia hypothesis and how this genetic information can be used to ensure its long‐term conservation. Low levels of species‐wide and within‐population genetic diversity were detected, suggesting strong historical bottlenecks. Bayesian and principal coordinate analyses identified two population groups in agreement with the two refugia hypothesis. However, weak genetic divergence between these groups suggests that their separation happened more recently, and that S. florulenta survived the Pleistocene glaciations in one main refugium. The lack of a significant correlation among genetic and geographic distances implies that populations are not at migration‐drift equilibrium and current levels of gene flow among them do not appear to be sufficient to balance the effect of genetic drift. Hence, in future conservation strategies, special care should be taken to preserve both gene pools and prevent further fragmentation of populations.
... AFLPsurv v1.0 software (Vekemans 2002) was used to estimate allele frequencies at AFLP loci and expected heterozygotes using a Bayesian approach. The f (analogous to F IS ) and h II (analogous to F ST ) statistics were calculated using a Bayesian inference for different models using HICKORY v1.1 software (Holsinger et al. 2002). In addition, population-specific inbreeding coefficients (F IS ) were estimated by an approximate Bayesian computation using the software ABC4 (Foll et al. 2008). ...
... At the population level, 25% of the total molecular variation was attributed to inter-population differentiation, and 75% to individual differentiation within populations (Φ ST = 0.245, p < 0.01). From Bayesian analysis (Holsinger et al. 2002), the smallest deviance information criterion (DIC = 8905.1) was obtained from the full model among the 4 different models, and the obtained f (analogous to F IS ) was 0.963 and θ II (analogous to F ST ) was 0.278, assuming the full model (Table 4). ...
Article
We applied seven pairs of primer-restriction enzyme combinations to investigate the genetic diversity, genetic differentiation, and genetic structure of Prunus padus populations with AFLP markers. The values obtained for average of effective alleles (A e), percentage of polymorphic loci (%P), Shannon’s diversity index (I), and expected heterozygosity (H e) were 1.38, 81.4%, 0.357, and 0.223, respectively. The expected heterozygosity (Hj) obtained by using a Bayesian method was 0.256. The level of genetic diversity obtained for P. padus was low compared to that of Prunus species and other species with a similar life history. The inbreeding coefficient (F IS) from the approximated Bayesian method was 0.767. This value was lower than that obtained for Ulmus davidiana, which undergoes both sexual and asexual reproduction. However, the value obtained was larger than that for other species that undergo sexual reproduction, such as Carpinus laxiflora, Phellodendron amurense, and Acer pseudosieboldianum. The value of genetic differentiation was 0.245 from AMOVA (ΦST) and 0.278 from the Bayesian method (θII). The obtained level of genetic differentiation was large compared to that of other Prunus species and other species with a similar life history. According to UPGMA and Bayesian clustering, 11 populations were divided into two genetic groups. However, some populations were detected as weak genetic structures according to the geographical distribution. This was attributed to forest succession, asexual propagation strategies for adapting to local environmental change, and a gradual decrease in gene flow due to population fragmentation by demographic disturbances.
... Analysis of molecular variance (AMOVA) was conducted in Arlequin V3.5 in the same way as for the COI dataset. In addition, an unbiased estimate of differentiation among populations, θ (II) was obtained using the Bayesian method proposed by Holsinger et al. (2002) and implemented in the software Hickory v1.1. The data were run with the default parameters using the f-free model. ...
... The N-net provides good visualization of the data when it presents complex evolutionary steps or reticulate relationship among genotypes (Huson and Bryant 2006). The networks were constructed based on Nei's distance (GD) matrix between populations calculated with a Bayesian method using AFLP-Surv version 1.0 (Vekemans et al. 2002) with non-uniform distribution by assuming deviation from Hardy-Weinberg equilibrium; F is values were estimated by Hickory software using the full model (f = 0.033, Holsinger et al. 2002). Analyses were done with 1000 permutations and 1000 bootstrap values. ...
Article
Pontoscolex corethrurus is a well-known invasive earthworm in the tropical zone which is believed to have originated from the Guayana Shield in South America and was described as parthenogenetic. A recent phylogenetic study revealed four cryptic species in the P. corethrurus complex (L1, L2, L3 and L4); among them, L1 was particularly widespread and was proposed as P. corethrurus sensu stricto. Here, our aims were to investigate the genetic variation of P. corethrurus L1 in its presumed native and introduced ranges and to examine its reproductive strategy. An extensive dataset of 478 cytochrome oxidase I gene (COI) sequences, obtained from specimens sampled all around the world, revealed a weak COI haplotype diversity with one major haplotype (H1) present in 76% of the specimens. Analyses of the genetic variation of 12 L1 populations were done using both nuclear (226 AFLP profiles) and mitochondrial (269 COI sequences) genetic information. The high AFLP genotype diversity at the worldwide scale and the fact that no genotype was shared among populations allowed us to reject the ‘super-clone’ invasion hypothesis. Moreover, similar levels of mean genetic diversity indices were observed between the introduced and native ranges, a pattern explained by a history of multiple introductions of specimens from different parts of the world. Finally, the occurrence of identical AFLP genotypes (i.e. clones) in several populations confirmed asexual reproduction, but recombination was also revealed by gametic equilibrium analysis in some populations, suggesting that P. corethrurus L1 may have a mixed reproductive strategy.
... One thousand random permutations were used to infer the significance of the variance components. In addition, an unbiased estimate of differentiation among populations, θ (II) was obtained using the Bayesian method proposed by Holsinger et al. (2002) and implemented in the software Hickory v1.1. The data were run with the default parameters using the f-free model. ...
... The N-net provides good visualization of the data when it presents complex evolutionary steps or reticulate relationship among genotypes (Huson & Bryant, 2006). The networks were constructed based on Nei's distance (GD) matrix between populations calculated with a Bayesian method using AFLP-Surv version 1.0 (Vekemans et al., 2002) with non-uniform distribution by assuming deviation from Hardy-Weinberg equilibrium; Fis values were estimated by Hickory software using the full model (Holsinger et al., 2002). Analyses were done with 1000 permutations and 1000 bootstrap values. ...
Thesis
Pontoscolex corethrurus is the most widespread earthworm species in the tropical and sub-tropical zones, it is hence one of the most studied earthworm in soil science. Ecological aspects of P. corethrurus, which is known to be present in a wide range of habitats from poor soils of pasture to rich soils of primary forest, were intensively investigated but biological aspects are less addressed. In particular, information on the genetic variation within the morphospecies is scarce except for the finding of two genetically differentiated lineages in São Miguel Island of Azores archipelago in 2014. Moreover, the ploidy degree of the morphospecies is not yet known and its reproduction strategy is not well understood. One of the objectives of this thesis was to understand the mechanisms and characteristics which make P. corethrurus a successful invader. Our second objective was to look for cryptic lineages in the whole world and to describe the phylogenetic relationships between them. A third objective was to identify which lineage was invasive and to characterize its population genetic structure in the native and the introduced ranges. The last objective was to test if the different species of the complex have different ploidy degrees (polyploid complex). This could eventually explain the reproductive isolation among these species. A bibliographic synthesis of 265 studies covering all subjects of knowledge on P. corethrurus showed that the r strategy and plasticity of this earthworm are the key characteristics which make it a successful invader in different habitats. In order to investigate the cryptic diversity within P. corethrurus in a world-wide scale, I examined 792 specimens collected from 25 different countries and islands. 
These specimens were analyzed using two mitochondrial (COI and 16S rDNA) and two nuclear (internal transcribed spacers 2 and 28S rDNA) markers and a large-scale multilocus sequence data matrix obtained using the Anchored Hybrid Enrichment (AHE) method. In addition, a total of 11 morphological characters, both internal and external, were investigated in all genetically characterized lineages. Four cryptic species (L1, L2, L3 and L4) were found within the P. corethrurus species complex, and four potentially new species within the genus Pontoscolex. The cryptic species were observed in sympatry at several localities, and analyses based on AFLP markers showed no hybridization between L1 and L3. The possibility of reproductive isolation among species of the complex because of different ploidy degrees was investigated by cytogenetic experiments. Due to different obstacles encountered at different steps of the experiments, results were obtained only for L4 (2n=70). One of the species belonging to the complex, L1, was particularly widespread in comparison with the others. This species corresponded to topotype specimens (samples from Fritz Müller’s garden where P. corethrurus was first described in 1856). Thus, we focused on this invasive species in a population genetics and phylogeography study. Using COI gene and AFLP markers, we revealed low genetic diversity through the tropical zone, probably due to recent colonization events and asexual reproduction type. Meanwhile, due to weak linkage disequilibrium and relatively high genetic diversity in some populations, sexual reproduction was suggested for L1. To date, this is the first study investigating, at a world-wide scale, cryptic species diversity, population genetics and phylogeography of a peregrine earthworm species throughout the tropical zone. I produced the first comprehensive review of all ecological and biological aspects of P. corethrurus. Moreover, the taxonomic status of P. corethrurus was clarified, as well as its reproduction strategy, which is mixed (parthenogenetic and sexual). All these findings represent potentially useful information for future experiments and research on species of the P. corethrurus complex.
... The Bayesian approach [43] was employed to obtain the θ B genetic divergence index between pairs of populations, with the assistance of the HICKORY v1.1 program [40]. The values of total genetic heterozygosity (H T-B ) and mean heterozygosity within the population (H S-B ) was also obtained using the Bayesian method. ...
Article
Seasonally Dry Tropical Forests (SDTFs) located on limestone outcrops are vulnerable to degradation caused by timber logging and limestone extraction for cement production. Some of these forests represent the last remnants of native vegetation cover, functioning as isolated islands. Ceiba pubiflora (Malvaceae) is a tree frequently found on limestone outcrops in the central region of Brazil. This study aimed to evaluate the genetic diversity and identify suitable populations for the establishment of Management Units (MUs) for conservation. Inter-simple sequence repeat markers were employed to assess the genetic diversity in ten populations sampled from the Caatinga, Cerrado, and Atlantic Forest biomes. The species exhibited substantial genetic diversity (H T = 0.345; P LP = 97.89%). Populations SAH, JAN, and MON demonstrated elevated rates of polymorphic loci (> 84.2%) along with notable genetic diversity (He > 0.325). Additionally, these populations were the primary contributors to gene flow. The analysis of molecular variance (AMOVA) indicated that most genetic variation occurs within populations (91.5%) than between them. In the Bayesian analysis, the ten populations were clustered into five groups, revealing the presence of at least three barriers to gene flow in the landscape: 1) the Central Plateau or Paranã River valley; 2) near the Espinhaço mountains or the São Francisco River valley; and 3) around the Mantiqueira mountain range, Chapada dos Veadeiros plateau, and disturbed areas. A positive and statistically significant correlation was observed between genetic (θ B ) and geographic distances (r = 0.425, p = 0.008). Based on these findings, we propose the establishment of Management Units in Minas Gerais state, encompassing the (1) southern region (MIN population), (2) central region (SAH population), and (3) north region (MON population), as well as in Goiás state, covering the (4) Central Plateau region. 
These units can significantly contribute to preserving the genetic diversity of these trees and protecting their habitat against ongoing threats.
... Molecular sex determination is a highly accurate method that can be utilized for young birds and monomorphic birds. This approach directly targets the sex chromosomes, resulting in a precise determination of the bird's gender (Holsinger et al., 2002). ...
Article
Sex identification of Sun Conures (Aratinga solstitialis) is crucial for breeding and preservation, as well as for increasing sun conure populations. These birds are sexually monomorphic. Therefore, males and females cannot be reliably distinguished by morphological examination. The Polymerase Chain Reaction (PCR) method, utilizing molecular-based technology, was employed to determine the sex of Aratinga solstitialis in this study. The P2 and P8 primers were utilized in this method, which has been deemed suitable and accurate for sex identification through calamus samples. The research focused on two 28-month-old Aratinga solstitialis birds. Calamus samples were collected, and DNA extracted from the calamus was subjected to PCR amplification. The resulting PCR products were then visualized using electrophoresis with a 1% agarose gel. In the electrophoresis photo, the presence of two bands indicated a female specimen, whereas a single band indicated a male specimen. The gel electrophoresis results showed that both Aratinga solstitialis birds were male, each with a single band in the 300-400 base pair range. The results show that the Polymerase Chain Reaction method is very effective for sex identification in monomorphic birds, especially Aratinga solstitialis, in both young birds and adults.
... The components of genetic differentiation were estimated using the ɸ PT , with GenAlEx software. The genetic structure of our dataset was derived with the STRUCTURE software (Pritchard et al., 2000) and the data were tested for fit to four different population models proposed in the HICKORY package (Holsinger et al., 2002). More details on the calculation and analysis of these parameters can be found in the Supplementary Material. ...
Article
Density and genetic variability of both soil macro- and meso-fauna are disturbed by productive practices. This study aimed to analyze the genetic diversity and genetic structure of populations of the earthworm Aporrectodea caliginosa caliginosa (Briones, 1996) in two sites of the Argentine Pampa under different levels of disturbance: i) grassland, ii) livestock-raising plot, iii) agricultural-livestock raising plots and iv) agricultural plot (the most disturbed). The genetic diversity of the earthworm population was determined based on the allele number, polymorphism percentage and Similarity Index. Allele number and polymorphism percentage were lower in populations from one of the soils under agricultural-livestock practices, but differed significantly only from the values of the other agricultural-livestock raising plot and the livestock-raising plot. Population structure was low although significant. This study shows that allele number and polymorphism are very useful metrics to provide historical and functional information of soils. However, the genetic differences here recorded probably depend on multiple historical and recent causes. The stability of environmental conditions along with the degree of disturbance must be considered to understand their impact on population genetic structure.
... This package uses Hamiltonian Monte Carlo as implemented in Stan to provide Bayesian estimates of the relevant parameters (including the entire posterior distribution of allele frequencies, within-patch inbreeding [F IS ], and diversity within and among patches with the fixation index [F st ]). Hickory implements an improved version of a Bayesian model first described in Holsinger et al. (2002) and in Holsinger and Wallace (2004), which addresses shortcomings noted by Foll et al. (2008). We used custom R scripts to extract the posterior distribution of allele frequencies and used them in several estimates of diversity: 1. ...
Article
Premise: The distribution of genetic diversity on the landscape has critical ecological and evolutionary implications. This may be especially the case on a local scale for foundation plant species since they create and define ecological communities, contributing disproportionately to ecosystem function. Methods: We examined the distribution of genetic diversity and clones, which we defined first as unique multilocus genotypes (MLG), and then by grouping similar MLGs into multilocus lineages (MLL). We used 186 markers from inter-simple sequence repeats (ISSR) across 358 ramets from 13 patches of the foundation grass Leymus chinensis. We examined the relationship between genetic and clonal diversities, their variation with patch-size, and the effect of the number of markers used to evaluate genetic diversity and structure in this species. Results: Every ramet had a unique MLG. Almost all patches consisted of individuals belonging to a single MLL. We confirmed this with a clustering algorithm to group related genotypes. The predominance of a single lineage within each patch could be the result of the accumulation of somatic mutations, limited dispersal, some sexual reproduction with partners mainly restricted to the same patch, or a combination of all three. Conclusions: We found strong genetic structure among patches of L. chinensis. Consistent with previous work on the species, the clustering of similar genotypes within patches suggests that clonal reproduction combined with somatic mutation, limited dispersal, and some degree of sexual reproduction among neighbors causes individuals within a patch to be more closely related than among patches. This article is protected by copyright. All rights reserved.
... bFst accounts for genotype uncertainty in the model using genotype likelihoods. For a more detailed description see [21]. The likelihood function has been modified to use genotype likelihoods provided by variant callers. ...
Preprint
Since its introduction in 2011 the variant call format (VCF) has been widely adopted for processing DNA and RNA variants in practically all population studies — as well as in somatic and germline mutation studies. VCF can present single nucleotide variants, multi-nucleotide variants, insertions and deletions, and simple structural variants called against a reference genome. Here we present over 125 useful and much used free and open source software tools and libraries, part of vcflib tools and bio-vcf . We also highlight cyvcf2 , hts-nim and slivar tools. Application is typically in the comparison, filtering, normalisation, smoothing, annotation, statistics, visualisation and exporting of variants. Our tools run daily and invisibly in pipelines and countless shell scripts. Our tools are part of a wider bioinformatics ecosystem and we consider it very important to make these tools available as free and open source software to all bioinformaticians so they can be deployed through software distributions, such as Debian, GNU Guix and Bioconda. vcflib , for example, was installed over 40,000 times and bio-vcf was installed over 15,000 times through Bioconda by December 2020. We shortly discuss the design of VCF, lessons learnt, and how we can address more complex variation that can not easily be represented by the VCF format. All source code is published under free and open source software licenses and can be downloaded and installed from https://github.com/vcflib . Author summary Most bioinformatics workflows deal with DNA/RNA variations that are typically represented in the variant call format (VCF) — a file format that describes mutations (SNP and MNP), insertions and deletions (INDEL) against a reference genome. Here we present a wide range of free and open source software tools that are used in biomedical sequencing workflows around the world today.
... The F ST values were estimated as in Weir and Cockerham (1984), and interpreted as in Hartl and Clark (1997). The program Hickory v. 1.1 (Holsinger and Lewis, 2003) was used to analyze RAPD data for the population structure between species. The program allows estimation of θ II , an analogue of F ST from dominant marker data such as RAPDs without prior information on inbreeding (F IS ) and without assuming Hardy-Weinberg equilibrium of genotypes (Holsinger et al., 2002). The RAPD data set was analyzed under four different models as implemented in the program. ...
... Holsinger (HOLSINGER 1999) and implemented in previous studies (HOLSINGER et al. 2002;BALDING 2003). Because no longer follows the uniform distribution, we used rejection sampling to ensure that ̅ ∑ is uniformly distributed across 100 bins across simulations to avoid artifacts caused by systematic differences in allele frequencies. ...
Article
Traditional Hardy-Weinberg equilibrium (HWE) tests (the χ2 test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls often can be identified as deviations from HWE. However, in datasets comprised of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence datasets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently amongst the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.
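The point that pooled samples from structured populations can fail a standard HWE test without any genotyping error can be illustrated with a minimal sketch of the classical 1-df χ2 test. This is the textbook test only, not RUTH; the function name `hwe_chi_square` and the genotype counts in the usage note are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Classical chi-square test for HWE at one biallelic locus.

    Takes observed genotype counts (AA, AB, BB) and returns
    (chi2, p_value) using a chi-square distribution with 1 df.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # sample frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `hwe_chi_square(25, 50, 25)` gives a statistic of zero, whereas pooling one population fixed for A with one fixed for B, `hwe_chi_square(50, 0, 50)`, gives a statistic of 100 even though each population on its own shows no deviation.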
... This method is computationally efficient but may not account for the uncertainties in the estimated allele frequencies as well as ANGSD does. In the second method (bFst), GPAT implements a Bayesian framework as described by Holsinger, Lewis, & Dey (2002), with a modification in its original likelihood function such that genotype likelihoods can be used as input instead of called genotypes. This Bayesian approach has the advantage of being able to provide a confidence interval for FST, but it is computationally expensive. ...
Preprint
Low-coverage whole genome sequencing (lcWGS) has emerged as a powerful and cost-effective approach for population genomic studies in both model and non-model species. However, with read depths too low to confidently call individual genotypes, lcWGS requires specialized analysis tools that explicitly account for genotype uncertainty. A growing number of such tools have become available, but it can be difficult to get an overview of what types of analyses can be performed reliably with lcWGS data and how the distribution of sequencing effort between the number of samples analyzed and per-sample sequencing depths affects inference accuracy. In this introductory guide to lcWGS, we first illustrate that the per-sample cost for lcWGS is now comparable to RAD-seq and Pool-seq in many systems. We then provide an overview of software packages that explicitly account for genotype uncertainty in different types of population genomic inference. Next, we use both simulated and empirical data to assess the accuracy of allele frequency estimation, detection of population structure, and selection scans under different sequencing strategies. Our results show that spreading a given amount of sequencing effort across more samples with lower depth per sample consistently improves the accuracy of most types of inference compared to sequencing fewer samples each at higher depth. Finally, we assess the potential for using imputation to bolster inference from lcWGS data in non-model species, and discuss current limitations and future perspectives for lcWGS-based analysis. With this overview, we hope to make lcWGS more approachable and stimulate broader adoption.
... It is a simple, reliable, fast, relatively low-cost and straightforward technique to apply, and the number of loci that can be examined is unlimited. RAPD markers have generally been used for the detection of genetic variation within and among populations in several plant species without the need for detailed knowledge of DNA (Holsinger et al., 2002; Wei et al., 2008; Zarek, 2009). Recently, RAPD analysis of PCR-amplified DNA regions has been used to detect DNA variations for taxonomic purposes, particularly at the intra- and interspecific levels, the genotype stability of species and others (Abkenar et al., 2004; Liesebach and Gotz, 2008; Ibrahim et al., 2010; Rahman and Al Munsur, 2009; Agbidinoukoun et al., 2017). ...
Article
Full-text available
Balanites aegyptiaca (L.) Del. (Heglig) is a multipurpose tree with considerable potential. The wide range of environments in which it grows suggests a high level of variation among and within locations. This study was undertaken to assess genetic diversity among and within four geographical genotypes of Heglig growing naturally in four different regions of Sudan based on their seed performance using random amplified polymorphic DNA (RAPD) markers. The DNA was extracted from dried leaf materials and subjected to PCR using ten primers. A total of 57 bands were scored and 42 were polymorphic, with polymorphism level ranging from 50 to 87.5% and a mean of 71.7%. Cluster analysis using NTSYS-PC software showed three main clusters. The dissimilarity values ranged between 77 and 93%. Effective gene flow among the three populations of Obied, Damazin and Gedaref and limited gene flow with Genaina was observed. Populations of B. aegyptiaca from different provenances were confirmed to have significant genetic diversity. Results indicate that RAPD could be efficiently used for studying genetic variation of B. aegyptiaca. The study recommends that local provenances of B. aegyptiaca be properly conserved and that immediate efforts be made to widen the genetic base through research and collection from other regions. Key words: Balanites aegyptiaca, random amplified polymorphic DNA (RAPD), marker, genetic variation, DNA, Sudan
... These two methods are complementary because, while one gives the probability that each accession belongs to a population, the other clearly shows the divergence between populations and the genetic diversity within each population. Other Bayesian analysis methods are available to the scientific community for studying genetic structure from molecular markers (Holsinger, Lewis et al. 2002; Corander, Waldmann et al. 2003; Corander, Waldmann et al. 2004; Holsinger and E. 2004). These methods were not tested here but may offer certain advantages, such as not forcing a structure when none exists (Corander, Waldmann et al. 2003), which is not the case for STRUCTURE. ...
... Different classes of markers provide different levels of resolution and statistical power, as well as advantages and disadvantages (Avise, 2004; Freeland, 2005; Pradeep et al., 2002). For example, the inter-simple sequence repeat (ISSR) marker technique reveals levels of variation in the microsatellite regions dispersed throughout the nuclear genome (Zietkiewicz et al., 1994; Gupta et al., 1999; Wolfe, 2000; Rakoczy-Trojanowska and Bolibok, 2004). ISSRs are dominant, and their PCR primer consists of a microsatellite sequence such as (GACA)n anchored at the 3' or 5' end by two to four arbitrary, usually degenerate, nucleotides (Zietkiewicz et al., 1994). The sequence need not be known to design the primers, which avoids the construction of a genomic library and saves time and money (Rakoczy-Trojanowska and Bolibok, 2004); the technique is an effective way to infer levels of genetic variation within and among populations in the absence of prior genetic knowledge of a species (Holsinger et al., 2002; Nybom, 2004; Pérez de la Torre, 2011). ISSRs can reveal genetic differentiation at a very fine scale (Ormond et al., 2010), have been widely used to assess genetic diversity because they are quick to implement and inexpensive (Zizumbo et al., 2005; Pérez de la Torre, 2011), and have proven to be an effective tool for analysing genetic diversity in a variety of insect species (Luque et al., 2002; Martay et al., 2014). ...
Chapter
Full-text available
Time is the fundamental tool from which a methodological description for the analysis of meteorological data series can be built. The observation of meteorological parameters changes radically with the time scale. When we speak of weather, we refer to the meteorological parameters of a given place at a given moment. When we speak of climate, we mean the average weather (that is, the mean and variability of the meteorological parameters) measured over a period of no less than 30 years. This work focuses on the analysis of time series of between 3 and 20 years of wind speed and direction data, for preliminary weather and climate studies. The data came from the "Tierra del Diablo" wind farm in Bahía Blanca and from the station of the Servicio Meteorológico Nacional (SMN) at Comandante Espora airport, located about 12 km from the wind farm. One of the fundamental things to analyse when installing and maintaining a wind farm is the wind (its speed and direction). The wind farm has three meteorological towers that record wind speed and direction at different heights and temperature near the ground, and the airport has one tower that measures wind speed and direction at a height of 10 m.
... A total of 167 bands were amplified for the fifteen primers and presence/absence data were recorded for each locus (Table 3). Statistics obtained for populations and regions included: percentage of polymorphic loci (PLP), expected heterozygosity (H e ) (Vekemans 2002), and the θ II statistic (Holsinger et al. 2002). Additionally, polymorphic information criteria (PIC) was estimated for loci as described by Tao et al. (2014). ...
... Only two software programs allow the estimation of the coefficient of inbreeding f (F IS ) from dominant marker data. The first method described (Holsinger, Lewis & Dey, 2002) sometimes shows biased estimates, mainly due to intrinsic characteristics of the dominant markers and the selection of only polymorphic markers (Foll, Beaumont & Gaggiotti, 2008), which is common practice in most studies using dominant markers (Nybom, 2004). developed the ABC4F software that corrects these biases and gives individual F ST and F IS values for each population as a way of considering the differences in population history (Balding, 2003), so this software was used to calculate these statistics. ...
Article
Anomochlooideae (Poaceae) represent the earliest-diverging extant lineage of grasses. One of the two genera is the monotypic Anomochloa, which is extremely rare and restricted to the Atlantic Forest of southern Bahia state in Brazil, where only two natural populations have been recorded to date. Knowledge of A. marantoidea is considered crucial to understanding evolutionary and diversification patterns in Poaceae. Despite this, knowledge of the biology and distribution of A. marantoidea remain incomplete, and thus the conservation of this poorly known species is problematic. We used niche modelling to estimate its current distribution and assess potential ranges in situ to explore new occurrences. In addition, genetic diversity and the factors that disrupt gene flow between populations of this species were estimated using molecular markers. Two new populations were documented; the modelled ecological niche indicates high climatic restriction, but also revealed suitable sites for the establishment of new populations. Genetic diversity is correlated to population size, and genetic structure analysis suggests recent fragmentation and low gene flow among the remaining populations, which exhibit high levels of inbreeding. These levels also indicate the capacity of A. marantoidea to respond favourably to selection and, thus, that a conservation plan could be designed to maintain the current genetic diversity.
... native) was performed using the software STRUCTURE 2.3.4 (Pritchard et al. 2010). Bayesian analysis, which infers the number of genetically homogeneous clusters (K), is one of the most appropriate methods for detecting patterns of genetic structure based on dominant markers because it does not assume prior knowledge of inbreeding or Hardy-Weinberg equilibrium (HWE), and it performs well even with relatively small numbers of loci and populations (Pritchard et al. 2000; Holsinger et al. 2002). Ten independent runs were performed for K = 1 to K = 12, with 1,000,000 Markov chain Monte Carlo (MCMC) iterations for each one and a burn-in of 100,000 iterations. ...
Article
Full-text available
Furcraea foetida (L.) Haw. (Asparagaceae) is an invasive plant in tropical and subtropical regions of the world. In Brazil, this species is mainly invasive along the Atlantic coast. The sexual reproduction is unknown in native and invaded regions, and individuals efficiently produce thousands of vegetative propagules by clonal reproduction. The actual distribution of F. foetida seems to be related to its historical use as a cultivated species and human-driven propagule pressure. Our hypotheses are that the multiple historical introduction events could originate high genetic diversity in invaded areas, and the posterior recurrent propagules migration could maintain the variability and produce strong cohesion among clonal populations along coastal environments in South America. To test these hypotheses, we used inter-simple sequence repeat molecular markers to investigate: (1) the level of genetic diversity of clonal populations; (2) the distribution of genetic diversity among native and invaded areas; and (3) the genetic structure of populations in invaded areas in the southeast and southern Brazil. Invasive populations showed similar levels of genetic diversity to the native populations, and in both areas was explained by the variation among individuals within populations instead of between populations and between regions. Based on the history of human use we believe that, in a larger time–space scale, the introduction from multiple sources is the main factor related to the high genetic diversity of this clonal species and a mixed genetic composition is reinforced by the efficient water dispersion of propagules among islands and coastal environments in South America.
... Molecular markers derived from PCR amplification of genomic DNA are an important part of the toolkit of evolutionary geneticists [1]. RAPD marker analysis can show high levels of polymorphism even among closely related species. ...
Research
Molecular tools play an important role in conservation genetics. RAPD, the simplest among all PCR techniques, has been used in the present study to reveal the inter-generic relatedness among pheasants such as Indian Peafowl (Pavo cristatus), Silver pheasant (Lophura nycthemera), Domesticated Turkey (Meleagris gallopavo) and Domesticated fowl (Gallus gallus). Among 38 bands amplified by three primers, 34 (89.4%) were highly polymorphic. The bands produced were checked for reproducibility and further statistical analyses were made with Jaccard's and UPGMA method for genetic distance and developing dendrogram. The results clearly revealed the clade divergence of Pavo cristatus and Gallus gallus with that of Lophura nycthemera and Meleagris gallopavo.
... (Schlüter & Harris, 2006) to estimate Shannon's diversity index (H Sh ) and describe band characteristics per population for the AFLP results. AFLP profiles were also analysed using a Bayesian approach implemented in HICKORY (Holsinger, Lewis & Dey, 2002), that allowed evaluating potential slight deviations from Hardy-Weinberg equilibrium in dominant markers and calculated the inbreeding coefficient (f) and the level of genetic differentiation between populations (Ɵ I values, similar to Wright's F ST ). HICKORY analyses were performed using default parameters (burn-in = 5000, sample = 100 000, thin = 20). ...
Article
Full-text available
We explored the effects of Quaternary climate changes on the campos rupestre sky island ecosystem in central and eastern Brazil studying the phylogeography of Richterago discoidea (Asteraceae) to better understand the effect of historical biogeographic processes on species diversification in this region. DNA sequences of nuclear (ITS) and plastid (psbA-trnH, rpl32-trnL UAG , trnK UUU-rps16 and ycf3-trnS) markers and 83 AFLP loci were used to genotype up to 90 individuals from 19 populations of R. discoidea. We investigated intraspecific genetic structure, demographic history and spatiotemporal diversification. Also, ecological niche modelling was used to infer palaeodistribution. Three lineages (without strong geographical structure) were identified in the Bayesian genetic structure analysis of 25 haplotypes, whereas AFLPs revealed two lineages with considerable levels of admixture. The origin of diversification of R. discoidea is on the Diamantina Plateau (Espinhaço Meridional, Minas Gerais), from where lineages expanded to the central highlands of Brazil during glacial periods in the Mid-Pleistocene. The current disjunct distribution is a relict of an ancient wide distribution, resulting from retraction during interglacial periods, confining populations to mountain-top sky islands that acted as refugia (Espinhaço Range and Goiás highlands). Expansion within refugium sites during the LGM blurred deeper genetic structure, and pressure of selection in ecological outliers favoured high genetic diversity outside refugia.
... For overall genome-level comparisons, Nei's genetic diversity estimate (H) was calculated by HICKORY software (Holsinger et al. 2002) with default parameters. To assess patterns of differentiation between barley subspecies, STRUCTURE 2.3 (Pritchard et al. 2010) was used to estimate the genetic differentiation among barley subspecies via a Bayesian method previously described by Wang et al. (2014). ...
Article
Full-text available
As one of the world's earliest domesticated crops, barley is a model species for the study of evolution and domestication. Domestication is an evolutionary process whereby a population adapts, through selection, to new environments created by human cultivation. We describe the genome-scanning of molecular diversity to assess the evolution of barley in the Tibetan Plateau. We used 667 Diversity Arrays Technology (DArT) markers to genotype 185 barley landraces and wild barley accessions from the Tibetan Plateau. Genetic diversity in wild barley was greater than in landraces at both genome and chromosome levels, except for chromosome 3H. Landraces and wild barley accessions were clearly differentiated genetically, but a limited degree of introgression was still evident. Significant differences in diversity between barley subspecies at the chromosome level were observed for genes known to be related to physiological and phenotypical traits, disease resistance, abiotic stress tolerance, malting quality and agronomic traits. Selection on the genome of six-rowed naked barley has shown clear multiple targets related to both its specific end-use and the extreme environment in Tibet. Our data provide a platform to identify the genes and genetic mechanisms that underlie phenotypic changes, and provide lists of candidate domestication genes for modified breeding strategies.
... Estimates were calculated under three different models: an f = 0 model where data are set to have no inbreeding, a θ = 0 model where differentiation is forced to zero, and a full model where f and θ can vary. Five runs of each model were conducted to ensure consistency of results (burn-in = 50,000, sample = 250,000, thin = 50) as suggested by Holsinger et al. (2002). We report the full model with f and θ because it had the best fit estimated by the deviance information criterion, a metric similar to Akaike's information criterion that takes into account the fit of a model and the number of factors used. ...
Article
Vicia ocalensis Godfrey & Kral (Ocala vetch) is a declining endemic plant that is state-listed as endangered in Florida and as critically imperiled globally. This perennial vine is historically known from only three populations along the shorelines of separate spring systems within the Ocala National Forest. One of these populations recently rejuvenated after being absent for 15 years. All populations have experienced great annual fluctuations in size, and are increasingly threatened by recreational use of the springs, herbicide treatment, and other habitat disturbances. The aims of this study are to assess genetic diversity within and between the three historical populations of V. ocalensis and characterize habitat supporting the two extant populations of V. ocalensis to inform conservation strategies. We analyzed 743 amplified fragment length polymorphism (AFLP) markers. Our AFLP results show evidence of high inbreeding and weak genetic differentiation among populations. Genetic diversity within Population 1 and Population 3 was lower than Population 2. Complementary monitoring data over the last 10 yr shows population size declines in these three populations. The low genetic diversity and inbreeding is a concern for this species, especially coupled with habitat loss, degradation, and fragmentation. Our analysis of habitat suggests that germination substrate is a key component in determining suitable habitat for V. ocalensis, but understory and overstory plants may also influence habitat suitability. Habitat protection, restoration of similar habitats, population introductions, and ex situ preservation are all recommended for this species.
... The mean estimates of θ II were analyzed and compared under the different models by using the Deviance Information Criterion (DIC). Models with the smaller DIC are preferred (Holsinger et al., 2002). ...
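As a rough illustration of how models are ranked by DIC, the sketch below uses the standard definition DIC = D̄ + pD, where D̄ is the posterior mean deviance and pD = D̄ − D(θ̄) is the effective number of parameters. It assumes the deviance values have already been produced by an MCMC run; the function name and the model labels and numbers in the usage note are hypothetical and are not taken from any of the studies listed here.

```python
def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, mean deviance, effective number of parameters).

    deviance_samples: deviance evaluated at each retained MCMC draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior
    mean of the parameters.
    """
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean  # penalty for model complexity
    return d_bar + p_d, d_bar, p_d


def preferred_model(dic_by_model):
    """Return the label of the model with the smallest DIC."""
    return min(dic_by_model, key=dic_by_model.get)
```

With made-up values such as `{"full": 105.0, "f = 0": 118.0, "theta = 0": 240.0}`, `preferred_model` returns `"full"`. Differences of only a few DIC units are usually treated as weak evidence, so the ranking alone should not be over-interpreted.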
Article
The Argentine Pampas region is recognized by its high productivity and fertility, which make its soils suitable for agricultural use. However, the intensification of agricultural activities over the last forty years has led to an intense disturbance regime, characterized by an increase in the soil degradation rate. Soil degradation and its relationship with soil fauna communities are crucial issues in resource management. In this study, we investigated the effects of land-use change on the genetic variability of the terrestrial isopod Armadillidium vulgare, chosen as a biological model. The diversity and population genetic structure of this species were analyzed using inter-simple sequence repeat (ISSR)-PCR markers, in three land uses in two localities of Luján, Buenos Aires, Argentina. Genetic variability was high in natural grassland populations and lower in agricultural land uses. Both conventional FST analysis and Bayesian approach for dominant markers showed significant genetic differences between land uses within each locality. The loss of genetic variability and the population genetic structure can be used as indicators of system disturbance. Thus, in the soils studied, the degree of genetic variability of representative populations of the soil fauna can be a good indicator of the disturbance degree.
... The Bayesian statistical approach finds a wide application in evolutionary genetic studies, circumventing some of the difficulties posed by the dominant marker data for their analysis (Zhivotovsky 1999; Holsinger et al. 2002). On the other hand, there is a significant correlation of genetic differentiation measures like F-statistics (Wright 1951, 1965) with isolation by distance (Sork et al. 2013). ...
Article
Full-text available
The genetic differentiation of teak meta-population in India was investigated in relation to geographical and climatic variations employing dominant ISSR markers followed by Bayesian statistical analysis to understand adaptability of the species. The analysis based on 290 teak genotypes representing 29 locations of its natural distribution and 43 ISSR loci exhibited an insignificant structure and low 2.76% LD (≥ 0.1 R² values, p < 0.001) in teak meta-population. The genetic and geographical variables despite acting independently with each other resulted in three sub-population clusters in the meta-population. The geographical barrier played a significant role in direction/restriction of gene flow. The integration of spatial/climatic variables altered the clustering pattern of the teak meta-population with signature of the adaptation to the temperature and longitudinal gradients that was also verified by the similar adaptation pattern of meta-population towards predicted global climate modeling for year 2050. The findings can help tackle the sustainable management and conservation of the species and its survival quotient in threat of changing climatic conditions. © 2018 Springer Science+Business Media, LLC, part of Springer Nature
... The dataset was cleaned by excluding loci with high amounts of missing data. The binary matrix created was used in Hickory v-1.1 [42], which implements a Bayesian method that estimates deviation from Hardy-Weinberg equilibrium by Markov chain Monte Carlo (MCMC), so it does not calculate using allele frequency (details in [41]). The Bayesian differentiation index, θ ST , was calculated with the f -free model, using 250,000 iterations and a burn-in of 50,000 in the Hickory software. ...
Article
Full-text available
Vital for many marine and terrestrial species, and for several other environmental services, such as carbon sink areas, the mangrove ecosystem is highly threatened due to the proximity of large urban centers and climate change. The forced fragmentation of this ecosystem affects the genetic diversity distribution among natural populations. Moreover, while restoration efforts have increased, few studies have analyzed how recently-planted areas impact the original mangrove genetic diversity. We analyzed the genetic diversity of two mangrove species (Laguncularia racemosa and Avicennia schaueriana) in three areas in Brazil, using inter-simple sequence repeat (ISSR) markers. Using the local approach, we identified the genetic diversity pool of a restored area compared to nearby areas, including the remnant plants inside the restored area, one well-conserved population at the shore of Guanabara Bay, and one impacted population in Araçá Bay. The results for L. racemosa showed that the introduced population has lost genetic diversity by drift, but remnant plants with high genetic diversity or incoming propagules could help improve overall genetic diversity. Avicennia schaueriana showed similar genetic diversity, indicating an efficient gene flow. The principal component analysis, showing different connections between both species, indicates differences in gene flow and dispersal efficiencies, highlighting the need for further studies. Our results emphasize that genetic diversity knowledge and monitoring associated with restoration actions can help avoid bottlenecks and other pitfalls, especially for the mangrove ecosystem.
... Genetic variation within and among predefined groups and pair-wise F ST genetic distances were measured by Analysis of Molecular Variance (AMOVA) [40][41][42] using ARLEQUIN 2.0 [43]. A Bayesian partition method of genetic differentiation among population groups was applied using HICKORY [44] software for direct estimation of F ST without prior knowledge of inbreeding history [45]. ...
Article
Full-text available
Limited polymorphism and narrow genetic base, due to genetic bottleneck through historic domestication, highlight a need for comprehensive characterization and utilization of existing genetic diversity in cotton germplasm collections. In this study, 288 worldwide Gossypium barbadense L. cotton germplasm accessions were evaluated in two diverse environments (Uzbekistan and USA). These accessions were assessed for genetic diversity, population structure, linkage disequilibrium (LD), and LD-based association mapping (AM) of fiber quality traits using 108 genome-wide simple sequence repeat (SSR) markers. Analyses revealed structured population characteristics and a high level of intra-variability (67.2%) and moderate interpopulation differentiation (32.8%). Eight percent and 4.3% of markers revealed LD in the genome of the G. barbadense at critical values of r² ≥ 0.1 and r² ≥ 0.2, respectively. The LD decay was on average 24.8 cM at the threshold of r² ≥ 0.05. LD retained on average distance of 3.36 cM at the threshold of r² ≥ 0.1. Based on the phenotypic evaluations in the two diverse environments, 100 marker loci revealed a strong association with major fiber quality traits using mixed linear model (MLM) based association mapping approach. Fourteen marker loci were found to be consistent with previously identified quantitative trait loci (QTLs), and 86 were found to be new unreported marker loci. Our results provide insights into the breeding history and genetic relationship of G. barbadense germplasm and should be helpful for the improvement of cotton cultivars using molecular breeding and omics-based technologies.
... Nei's gene diversity (H j ) with its standard error was computed using AFLP-SURV version 1.0 [61] using the approach of Lynch and Milligan and the Bayesian method with non-uniform prior distribution to calculate allelic frequencies assuming Hardy-Weinberg equilibrium [62]. Nevertheless, taking into account a possible bias caused by this assumption, Bayesian gene diversity (H B ) and its standard error were also calculated using Hickory version 1.1 [63], which does not assume Hardy-Weinberg equilibrium within accessions. ...
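To make the bias concern concrete, here is a minimal sketch of how an allele frequency estimated from dominant (band present/absent) data depends on the assumed inbreeding coefficient. It uses the textbook relation that the band-absent (recessive homozygote) fraction is x = q² + f·q(1 − q), which reduces to the square-root estimator q = √x when f = 0. This is a plain illustration, not the Lynch and Milligan or Bayesian estimators named above; the function name is hypothetical and f is restricted to [0, 1] for simplicity.

```python
from math import sqrt


def null_allele_freq(absent_fraction, f=0.0):
    """Frequency q of the recessive (band-absent) allele.

    Solves x = q**2 + f * q * (1 - q) for q, where x is the fraction
    of individuals lacking the band and f is an assumed inbreeding
    coefficient. With f = 0 this is the Hardy-Weinberg estimate sqrt(x).
    """
    x = absent_fraction
    if not 0.0 <= x <= 1.0:
        raise ValueError("absent_fraction must lie in [0, 1]")
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1] in this sketch")
    if f == 1.0:
        return x  # complete inbreeding: no heterozygotes, so q = x
    a = 1.0 - f
    # Positive root of (1 - f) * q**2 + f * q - x = 0.
    return (-f + sqrt(f * f + 4.0 * a * x)) / (2.0 * a)
```

The same band-absent fraction is compatible with different allele frequencies depending on f: with 25% of individuals lacking a band, `null_allele_freq(0.25)` is 0.5 under Hardy-Weinberg proportions, while `null_allele_freq(0.25, f=0.5)` is about 0.37. A single locus therefore cannot separate q from f, which is why an assumed f can bias diversity estimates.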
Article
Full-text available
Indigofera pseudotinctoria Mats is an agronomically and economically important perennial legume shrub with a high forage yield, protein content and strong adaptability, which is subject to natural habitat fragmentation and serious human disturbance. Until now, our knowledge of the genetic relationships and intraspecific genetic diversity for its wild collections is still poor, especially at small spatial scales. Here amplified fragment length polymorphism (AFLP) technology was employed for analysis of genetic diversity, differentiation, and structure of 364 genotypes of I. pseudotinctoria from 15 natural locations in Wushan Mountain, a highly structured mountain with typical karst landforms in Southwest China. We also tested whether eco-climate factors have affected genetic structure by correlating genetic diversity with habitat features. A total of 515 distinctly scoreable bands were generated, and 324 of them were polymorphic. The polymorphic information content (PIC) ranged from 0.694 to 0.890 with an average of 0.789 per primer pair. On species level, Nei’s gene diversity (Hj), the Bayesian genetic diversity index (HB) and the Shannon information index (I) were 0.2465, 0.2363 and 0.3772, respectively. High differentiation among all sampling sites was detected (FST = 0.2217, GST = 0.1746, G’ST = 0.2060, θB = 0.1844), and gene flow among accessions (Nm = 1.1819) was restricted. The population genetic structure resolved by the UPGMA tree, principal coordinate analysis, and Bayesian-based cluster analyses irrefutably grouped all accessions into two distinct clusters, i.e., lowland and highland groups. This structural pattern may indicate joint effects of neutral evolution and natural selection.
Restricted Nm was observed across all accessions, and genetic barriers were detected between adjacent accessions due to the specific geographical landform.
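The gene flow value quoted in this abstract can be related to the differentiation statistics with a short worked example. The abstract does not say how Nm was computed; the sketch below assumes Wright's island-model approximation Nm ≈ (1 − FST) / (4 FST), and the function name is hypothetical. Under that assumption, the reported GST = 0.1746 gives a value that matches the reported Nm = 1.1819 to within rounding.

```python
def nm_from_differentiation(fst):
    """Island-model approximation of the number of migrants per generation.

    fst : a differentiation statistic (FST or GST), strictly between 0 and 1.
    Assumes Wright's infinite-island model at equilibrium.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("fst must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


nm_gst = nm_from_differentiation(0.1746)   # GST reported in the abstract
nm_fst = nm_from_differentiation(0.2217)   # FST reported in the abstract
print(f"Nm from GST: {nm_gst:.4f}")
print(f"Nm from FST: {nm_fst:.4f}")
```

The two statistics give noticeably different Nm values, so it matters which one an analysis feeds into the approximation.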
Article
Full-text available
The classification system for the genus Aconitum is highly complex. It is also the subject of ongoing debate. Aconitum pendulum Busch and Aconitum flavum Hand.-Mazz. are perennial herbs of the genus Aconitum. Dried roots of these two plants are used in traditional Chinese medicine. In this study, morphological observations and ISSR molecular markers were employed to discriminate between A. flavum and A. pendulum, with the objective of gaining insights into the interspecies classification of Aconitum. The pubescence on the inflorescence of A. flavum was found to be appressed, while that on the inflorescence of A. pendulum was spread. UPGMA (unweighted pair-group method with arithmetic average) cluster analysis, PCoA (principal coordinates analysis), and Bayesian structural analysis divided the 199 individuals (99 individuals from DWM population and 100 individuals from QHL population) into two main branches, which is consistent with the observations of the morphology of pubescence on the inflorescence. These analyses indicated that A. flavum and A. pendulum are distinct species. No diagnostic bands were found between the two species. Two primer combinations (UBC808 and UBC853) were ultimately selected for species identification of A. flavum and A. pendulum. This study revealed high levels of genetic diversity in both A. flavum (He = 0.254, I = 0.395, PPB = 95.85%) and A. pendulum (He = 0.291, I = 0.445, PPB = 94.58%). We may say, therefore, that ISSR molecular markers are useful for distinguishing A. flavum and A. pendulum, and they are also suitable for revealing genetic diversity and population structure.
Article
Full-text available
In Neotropical regions, plantations and remnant forest populations of native trees coexist in a highly fragmented matrix and may be affected by isolation and reduction in population size, leading to genetic structure, inbreeding, and genetic bottlenecks that reduce the population’s genetic diversity. Tabebuia rosea variability in the Mayan Forest was studied by genotyping 30 trees from three plantations and three remnant natural populations using simple sequence repeats (SSRs) and inter-simple sequence repeats (ISSRs). Ho-SSR estimates were lower than He; the mean inbreeding coefficient was 0.07 and did not differ among populations, but was eight times higher in plantations than in remnant populations. Using ISSR data, the individuals were assigned to k = 5 and k = 4 clusters under admixture without and with geographic information used as priors in Bayesian analysis assignments. Genetic differentiation estimated with the Bayesian estimator II (0.0275 ± 0.0052) was significantly different from 0, but FST was not (0.0985 ± 0.1826), while paired FST among populations ranged from 0.05 up to 0.16. Only one remnant population displayed evidence of a genetic bottleneck. T. rosea displays a genetic structure in which the isolated remnant forest populations show moderate inbreeding levels.
Preprint
Full-text available
The development of modern society depends on the connectivity of different regions, and in particular, the establishment of roads and highways. While roads may be fundamental to development, they tend to have negative impacts on the biodiversity of a region. The present study verifies the roadkill patterns of the crab-eating fox, Cerdocyon thous , in the context of the characteristics of the surrounding matrix. Specimens were collected from four highways in the Brazilian states of Pernambuco and Alagoas, between August 2019 and December 2020. The date, time, and location of each specimen was recorded, and the carcasses were removed from the highway to avoid replicating records. A total of 101 specimens were collected, with a 1:1 sex ratio. Roadkill hotspots were observed in environments to which C. thous is adapted, in particular open fields and the margins of areas with denser vegetation. Rainfall may influence the observed roadkill patterns, given that the hotspots coincided with areas that have annual rainfall of between 600 mm and 700 mm. The molecular analyses indicated that the sample collected encompasses individuals from two distinct genetic groups, which are both distributed throughout the study area, reflecting its reduced genetic diversity. This scenario is influenced by factors such as the pattern of anthropogenic impacts within the study area and the ecological characteristics of the species. The impacts of the roadkill hotspots can be mitigated by measures such as signposting during recruitment periods.
Article
Full-text available
Key Message Interspecific comparison of two Paspalum species has demonstrated that mating systems (selfing and outcrossing) contribute to variation (genetically and morphologically) within species through similar but mutually exclusive processes. Abstract Mating systems play a key role in the genetic dynamics of populations. Studies show that populations of selfing plants have less genetic diversity than outcrossing plants. Yet, many such studies have ignored morphological diversity. Here, we compared the morphological and molecular diversity patterns in populations of two phylogenetically-related sexual diploids that differ in their mating system: self-sterile Paspalum indecorum and self-fertile P. pumilum. We assessed the morphological variation using 16 morpho-phenological characters and the molecular diversity using three combinations of AFLPs. We compared the morphological and molecular diversity within and among populations in each mating system. Contrary to expectations, selfers showed higher morphological variation within populations, mainly in vegetative and phenological traits, compared to outcrossers. The high morphological variation within populations of selfers led to a low differentiation among populations. At molecular level, selfing populations showed lower levels of genotypic and genetic diversity than outcrossing populations. As expected, selfers showed higher population structure than outcrossers (PhiST = 0.301 and PhiST = 0.108, respectively). Increased homozygous combinations for the same trait/locus enhance morphological variation and reduce molecular variation within populations in selfing P. pumilum. Thus, selfing outcomes are opposite when comparing morphological and molecular variation in P. pumilum. Meanwhile, pollen flow in obligate outcrossing populations of P. indecorum increases within-population molecular variation, but tends to homogenize phenotypes within-population. 
Pollen flow in obligate outcrossers tends to merge geographically closer populations; but isolation by distance can lead to a weak differentiation among distant populations of P. indecorum.
Chapter
The capacity of adaptation is very important to mangrove plants not only to face harsh environmental conditions but also climate changes. Genetic and epigenetic studies aim to assess the capacity of evolution and adaptation of each species. The greater the genetic diversity, the higher the probability of populations and species to have adaptations for future scenarios. There are a few evolutionary theories that explain the current status of mangrove ecosystems; the most likely for Atlantic-East Pacific mangroves is a vicariance event as per the Central American Isthmus and long-distance dispersal through the Atlantic Ocean. These were corroborated by genetic studies here unveiled. We also present traditional methods of molecular markers and how to analyze them. The chapter covers many (if not all) published studies assessing genetic and epigenetic diversity of Brazilian mangrove plants. In the future, it is probable that genetics will be completely substituted by genomics with the advent of next-generation sequencing platforms. However, many questions about Brazilian mangrove plants’ molecular diversity remain unanswered to inform on the adaptive and evolutionary potential of populations and species to cope with the enormous threats facing this vulnerable ecosystem, including climatic changes.
Keywords: Diversity index, Molecular markers, Vicariance, Evolutionary theories
Article
Full-text available
Since its introduction in 2011 the variant call format (VCF) has been widely adopted for processing DNA and RNA variants in practically all population studies—as well as in somatic and germline mutation studies. The VCF format can represent single nucleotide variants, multi-nucleotide variants, insertions and deletions, and simple structural variants called and anchored against a reference genome. Here we present a spectrum of over 125 useful, complimentary free and open source software tools and libraries, we wrote and made available through the multiple vcflib , bio-vcf , cyvcf2 , hts-nim and slivar projects. These tools are applied for comparison, filtering, normalisation, smoothing and annotation of VCF, as well as output of statistics, visualisation, and transformations of files variants. These tools run everyday in critical biomedical pipelines and countless shell scripts. Our tools are part of the wider bioinformatics ecosystem and we highlight best practices. We shortly discuss the design of VCF, lessons learnt, and how we can address more complex variation through pangenome graph formats, variation that can not easily be represented by the VCF format.
Article
Full-text available
Rubia cordifolia L. is a widely used medicinal plant belonging to the family Rubiaceae. The evaluation of the alizarin, purpurin and genetic fidelity of Rubia cordifolia L. is crucial for the identification of this economically and ecologically valuable climber. In this study, we quantified the phytochemicals (alizarin and purpurin) using RP-HPLC analysis, and genetic fidelity studies were carried out using RAPD markers. Phytochemical studies showed gradual variation among the samples. The maximum amount of alizarin and purpurin was found in the plant samples of Kalrayan hills (GBR3- 32.4% and 33.4%) followed by Kolli Hills (GBR1- 27.0% and 16.5%), Pachamalai (GBR6- 23.8% and 20.7%), Shervaroy hills (GBR2- 16.6% and 9.4%), Jawadhu hills (GBR5- 13.4% and 11.7%), Chitteri hills (GBR4- 12.9% and 0.59%) and Yelagiri hills (GBR7- 8.7% and 0.87%) respectively. Fifteen RAPD primers were used to assess the genetic fidelity among Rubia cordifolia L. collected from the Eastern Ghats of Tamil Nadu; together they amplified 246 polymorphic bands. Each primer classified the species under investigation into clear, completely separated clusters, although maximum conformity was achieved with respect to species relationships when the RAPD method was employed. Overall, results indicated that RP-HPLC represents an effective tool for phytochemical studies and RAPD represents an efficient molecular marker system for the assessment of genetic fidelity and diversity.
Article
Full-text available
Psammochloa villosa is an ecologically important desert grass that occurs in the Inner Mongolian Plateau where it is frequently the dominant species and is involved in sand stabilization and wind breaking. We sought to generate a preliminary demographic framework for P. villosa to support the future studies of this species, its conservation, and sustainable utilization. To accomplish this, we characterized the genetic diversity and structure of 210 individuals from 43 natural populations of P. villosa using amplified fragment length polymorphism (AFLP) markers. We obtained 1,728 well-defined amplified bands from eight pairs of primers, of which 1,654 bands (95.7%) were polymorphic. Results obtained from the AFLPs suggested effective alleles among populations of 1.32, a Nei's standard genetic distance value of 0.206, a Shannon index of 0.332, a coefficient of gene differentiation (GST) of 0.469, and a gene flow parameter (Nm) of 0.576. All these values indicate that there is abundant genetic diversity in P. villosa, but limited gene flow. An analysis of molecular variance (AMOVA) showed that genetic variation mainly exists within populations (64.2%), and we found that the most genetically similar populations were often not geographically adjacent. Thus, this suggests that the mechanisms of gene flow are surprisingly complex in this species and may occur over long distances. In addition, we predicted the distribution dynamics of P. villosa based on the spatial distribution modeling and found that its range has contracted continuously since the last interglacial period. We speculate that dry, cold climates have been critical in determining the geographic distribution of P. villosa during the Quaternary period. Our study provides new insights into the population genetics and evolutionary history of P. villosa in the Inner Mongolian Plateau and provides a resource that can be used to design in situ conservation actions and prioritize sustainable utilization.
Article
Population genetic structure and diversity and phylogeographical dispersal routes were assessed for the Azorean endemic grass Deschampsia foliosa using AFLP markers. This species occurs on seven islands in the archipelago and a sampling of populations from the three main geographical groups of islands was used, covering its known distribution. Principal coordinates analyses (PCoAs), Bayesian analyses and phylogenetic networks revealed different degrees of admixture for the central group (C) populations and a clear differentiation for the western group (W) and São Miguel island (in the eastern group, E) populations. The best K values corresponded to nine and 11 genetic groups, which were also confirmed by analysis of molecular variance. A low but significant correlation between genetic data and geography was observed, with most relevant barriers to gene flow generally placed between sub-archipelagos. We suggest a west-to-east isolation by distance dispersal model across an island age continuum with Flores–Corvo (W) and Pico (C) at the extremes of the dispersal path. An alternative scenario, also supported by the genetic data, implies an initial colonization of São Jorge (C), dispersal within C and following bidirectional dispersal to the W and E. The phylogeographical framework detected might be related to island age and to highly destructive volcanic events, and it supports the occurrence of cryptic diversity within D. foliosa. Genetic diversity estimators were highest for Pico island populations (C), lowest for São Miguel (E) and Flores (W) populations, and more divergent for the Corvo population (W). Conservation measures should be taken to preserve the genetic structure found across sub-archipelagos and islands.
Preprint
Full-text available
Traditional Hardy-Weinberg equilibrium (HWE) tests (the χ² test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls often can be identified as deviations from HWE. However, in datasets comprised of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence datasets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently amongst the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.
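For orientation, the classical χ² test that this abstract uses as a baseline can be sketched in a few lines. This is the textbook single-population test for a biallelic locus with hard genotype calls, not RUTH itself, and it makes no adjustment for population structure or genotype uncertainty. The function name and arguments are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic for Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb : observed genotype counts at a biallelic locus.
    Under HWE the statistic is compared with a chi-square distribution
    with 1 degree of freedom (3 classes - 1 - 1 estimated allele frequency).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic locus).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


stat_in_hwe = hwe_chi_square(25, 50, 25)   # exact HWE proportions
stat_no_het = hwe_chi_square(50, 0, 50)    # complete heterozygote deficit
```

Pooling two subpopulations with different allele frequencies produces a heterozygote deficit like the second call even when every genotype is correct, which is the situation the abstract describes.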
Article
Full-text available
Salvia officinalis is a perennial species, native and endemic to the Western Balkans and the Apennine Peninsula. Due to its medicinal and aromatic properties, it is used in pharmaceutical, cosmetic, and food industries. The main objectives of the study were to infer the genetic structure of S. officinalis populations in the northern and central parts of the eastern Adriatic coast, to detect the phylogeographical barriers among the putative microrefugia and to assess the genetic diversity among the resulting ancestral clusters. Twenty-five populations were assessed using amplified fragment length polymorphism markers. High polymorphism and high diversity within populations were typical for this outcrossing long-lived species. The Fitch–Margoliash tree based on Nei’s genetic distance matrix showed that most of the populations tended to group in accordance with the geographical position of their collecting sites. Spatial analysis of the genetic diversity revealed a typical pattern of isolation by distance. Very low overall among-population differentiation and detection of only three private alleles indicate that there has been high gene flow among populations. By using Bayesian Analysis of Population Structure on population level, two distinct ancestral clusters were obtained. It is likely that these two ancestral clusters were separated for a longer period by Pleistocene glaciation, although the subsequent fast recolonization resulted in diminished genetic differences. High rarity of northern and southern populations of the investigated area indicates that S. officinalis presumably survived in both northern and southern microrefugia and expanded from there resulting in secondary contact zones, characterized by lower rarity and equal genetic diversity.
Article
Full-text available
Sapindus emarginatus Vahl (Sapindaceae), also known as ‘Indian Soap nut’, is significantly important for the saponin content of its fruits. However, its current population in India is heavily fragmented due to a lack of sustainable harvesting practices. Moreover, changing climatic regimes may further limit its distribution and possibly compromise the survival of the species in nature. The aims of the present study were to predict the future distribution range of S. emarginatus, to identify the bioclimatic variables limiting this distribution, and to evaluate its adaptive fitness and genomic resilience towards these variables. To determine future species distribution range and identify limiting bioclimatic variables, we applied two different ecological niche models (ENMs; BioClim and MaxEnt) on real occurrence data (n = 88 locations). The adaptive fitness of the species was evaluated by quantifying the genetic variability with AFLP markers and marker-environmental associations, using AFLP-associated Bayesian statistics. We found 77% overlap between the baseline (2030) and predicted (2100) species distribution ranges, which were primarily determined by maximum temperature (TMAX) and mean annual precipitation (MAP). The TMAX and MAP contributed 43.1% and 27.1%, respectively to ENM model prediction. Furthermore, AFLP loci significantly associated with bioclimatic variables, and TMAX and MAP represent the lowest proportion (6.15%), confirming to the severe response of the species genome towards these variables. Nevertheless, the very low linkage disequilibrium (LD) in these loci (4.54%) suggests that the current sensitivity to TMAX and MAP is subject to change during recombination. Moreover, a combination of high heterozygosity (0.40–0.43) and high within-population variability (91.63 ± 0.31%) confirmed high adaptive fitness to maintain reproductive success. Therefore, the current populations of S. emarginatus have substantial genomic resilience towards future climate change, although significant conservation efforts (including mass multiplication) are warranted to avoid future deleterious impacts of inbreeding depression on the fragmented populations.
Article
Full-text available
Logic regression has been recognized as a tool that can identify and model non-additive genetic interactions using Boolean logic groups. Logic regression, TASSEL-GLM and SAS-GLM were compared for analytical precision using a previously characterized model system to identify the best genetic model explaining epistatic interaction for vernalization-sensitivity in barley. A genetic model containing two molecular markers identified in vernalization response in barley was selected using logic regression while both TASSEL-GLM and SAS-GLM included spurious associations in their models. The results also suggest the logic regression can be used to identify dominant/recessive relationships between epistatic alleles through its use of conjugate operators.
Article
Full-text available
Genetic diversity is the basis for present-day diversified living systems and future genetic improvement needs. Within the framework of breed conservation, genetic characterization is important in guarding breeds and is a prerequisite for managing genetic resources. The objective of this study was to use the RAPD technique to evaluate genetic diversity and relatedness within and among four horse lines (White, Gray, Brown, Black). To our knowledge there is currently no information about RAPD genetic markers that detect genetic polymorphism in Erbil/Iraqi horse breeds. Information from this work provides basic genetic knowledge that is critical for conservation and breeding programs. Random amplification of polymorphic DNA (RAPD-PCR) was performed using 10 primers from GenScript (USA). Six of the 10 primers (OPQ-05, OPQ-06, OPQ-08, OPQ-09, OPQ-10, OPQ-12) found complementary sites in the genomic DNA and yielded amplification products. These primers amplified on average 7 to 53 bands of sizes varying from 100 bp to 1500 bp. A total of 150 diagnostic bands were scored within RAPD profiles amplified by these 6 primers. Among the 150 scorable bands, 28 (18.67%) were recognized as polymorphic. A UPGMA dendrogram based on Nei's genetic distance grouped the investigated horse line genotypes into two clusters. The first cluster includes White, Brown and Black, whereas the second cluster includes Gray, which appeared to be most distant from the other lines. In conclusion, these results indicated the effectiveness of RAPD in detecting polymorphism between horse lines and its applicability in line studies and in establishing genetic relationships among the horse lines.
Article
Full-text available
Estuarine organisms grow in highly heterogeneous habitats, and their genetic differentiation is driven by selective and neutral processes as well as population colonization history. However, the relative importance of the processes that underlie genetic structure is still puzzling. Scirpus mariqueter is a perennial grass almost restricted to the Changjiang River estuary and the adjacent Qiantang River estuary. Here, using amplified fragment length polymorphism (AFLP), a moderate‐to‐high level of genetic differentiation among populations (range FST: 0.0310–0.3325) was shown despite large ongoing dispersal. FLOCK assigned all individuals to 13 clusters and revealed a complex genetic structure. Some genetic clusters were limited to the peripheries, in contrast to the highly mixed composition of central populations, suggesting local adaptation was more likely to occur in peripheral populations. Twenty‐one candidate outliers under positive selection were detected, and the differentiation patterns correlated with geographic distance, salinity difference, and colonization history were analyzed with or without the outliers. Combined results of AMOVA and IBD analyses based on different datasets showed that the effects of geographic distance and population colonization history on isolation seemed to be promoted by divergent selection. However, the non‐linear IBE pattern indicates that the effects of salinity were overwhelmed by spatial distance or other ecological processes in certain areas and also suggests that salinity was not the only selective factor driving population differentiation. These results together indicate that geographic distance, salinity difference, and colonization history co‐contributed in shaping the genetic structure of S. mariqueter and that their relative importance was correlated with spatial scale and environment gradient.
Article
Full-text available
A long‐standing debate in evolutionary biology concerns the relative importance of different evolutionary forces in explaining phenotypic diversification at large geographic scales. For example, natural selection is typically assumed to underlie divergence along environmental gradients. However, neutral evolutionary processes can produce similar patterns. We collected molecular genetic data from 14 European populations of Plantago lanceolata to test the contributions of natural selection versus neutral evolution to population divergence in temperature‐sensitive phenotypic plasticity of floral reflectance. In P. lanceolata, reflectance plasticity is positively correlated with latitude/altitude. We used population pairwise comparisons between neutral genetic differentiation (FST and Jost's D) and phenotypic differentiation (PST) to assess the contributions of geographic distance and environmental parameters of the reproductive season in driving population divergence. Data are consistent with selection having shaped large‐scale geographic patterns in thermal plasticity. The aggregate pattern of PST versus FST was consistent with divergent selection. FST explained thermal plasticity differences only when geographic distance was not included in the model. Differences in the extent of cool reproductive season temperatures, and not overall temperature variation, explained plasticity differences independent of distance. Results are consistent with the hypothesis that thermal plasticity is adaptive where growing seasons are shorter and cooler, that is, at high latitude/altitude.
Article
Full-text available
Milk thistle (Silybum marianum) is among the world’s popular medicinal plants. Start Codon Targeted (SCoT) marker system was utilized to investigate the genetic variability of 80 S. marianum genotypes from eight populations in Iran. SCoT marker produced 255 amplicons and 84.03% polymorphism was generated. The SCoT marker system’s polymorphism information content value was 0.43. The primers’ resolving power values were between 4.18 and 7.84. The percentage of polymorphic bands was between 33.3 and 100%. The Nei’s gene diversity (h) was 0.19–1.30 with an average 0.72. The Shannon’s index (I) ranged from 0.29 to 1.38 with an average value of 0.83. The average gene flow (0.37) demonstrated a high genetic variation among the studied populations. The variation of 42% was displayed by the molecular variance analysis among the populations while a recorded variation of 58% was made within the populations. Current investigation suggested that SCoT marker system could effectively evaluate milk thistle genotypes genetic diversity.
Article
Full-text available
As currently defined, DNA fingerprint profiles do not uniquely identify individuals. For criminal cases involving DNA evidence, forensic scientists evaluate the conditional probability that an unknown, but distinct, individual matches the crime sample, given that the defendant matches. Estimates of the conditional probability of observing matching profiles are based on reference populations maintained by forensic testing laboratories. Each of these databases is heterogeneous, being composed of subpopulations of different heritages. This heterogeneity has an impact on the weight of the evidence. A hierarchical Bayes model is formulated that incorporates the key physical characteristics inherent in these data. With the help of Markov chain Monte Carlo sampling, levels of heterogeneity are estimated for three major ethnic groups in the database of Lifecodes Corporation.
Article
Full-text available
Formulae are given for estimators for the parameters F, θ, f (FIT, FST, FIS) of population structure. As with all such estimators, ratios are used so that their properties are not known exactly, but they have been found to perform satisfactorily in simulations. Unlike the estimators in general use, the formulae do not make assumptions concerning numbers of populations, sample sizes, or heterozygote frequencies. As such, they are suited to small data sets and will aid the comparisons of results of different investigators. A simple weighting procedure is suggested for combining information over alleles and loci, and sample variances may be estimated by a jackknife procedure.
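The jackknife mentioned at the end of this abstract can be sketched as follows. The sketch assumes that each locus contributes a numerator and a denominator variance component, that the multilocus estimate is the ratio of their sums, and that loci are the units deleted in turn. The function names and the toy numbers are hypothetical, and the per-locus components themselves (the paper's actual formulae) are not reproduced here.

```python
from math import sqrt


def ratio_of_sums(numerators, denominators):
    """Combine per-locus components into one multilocus ratio estimate."""
    return sum(numerators) / sum(denominators)


def jackknife_se(numerators, denominators):
    """Delete-one-locus jackknife standard error of the ratio estimate."""
    n = len(numerators)
    if n < 2:
        raise ValueError("need at least two loci to jackknife")
    # Recompute the estimate with each locus left out in turn.
    leave_one_out = [
        ratio_of_sums(numerators[:i] + numerators[i + 1:],
                      denominators[:i] + denominators[i + 1:])
        for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((t - mean_loo) ** 2 for t in leave_one_out)
    return sqrt(variance)


# Toy per-locus components (made-up numbers for illustration only).
num = [0.02, 0.05, 0.03, 0.04]
den = [0.20, 0.25, 0.30, 0.25]
theta_hat = ratio_of_sums(num, den)
theta_se = jackknife_se(num, den)
```

Because the estimator is a ratio, its sampling variance has no simple closed form, which is the reason a resampling procedure of this kind is suggested.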
Article
Full-text available
We present here a framework for the study of molecular variation within a single species. Information on DNA haplotype divergence is incorporated into an analysis of variance format, derived from a matrix of squared-distances among all pairs of haplotypes. This analysis of molecular variance (AMOVA) produces estimates of variance components and F-statistic analogs, designated here as phi-statistics, reflecting the correlation of haplotypic diversity at different levels of hierarchical subdivision. The method is flexible enough to accommodate several alternative input matrices, corresponding to different types of molecular data, as well as different types of evolutionary assumptions, without modifying the basic structure of the analysis. The significance of the variance components and phi-statistics is tested using a permutational approach, eliminating the normality assumption that is conventional for analysis of variance but inappropriate for molecular data. Application of AMOVA to human mitochondrial DNA haplotype data shows that population subdivisions are better resolved when some measure of molecular differences among haplotypes is introduced into the analysis. At the intraspecific level, however, the additional information provided by knowing the exact phylogenetic relations among haplotypes or by a nonlinear translation of restriction-site change into nucleotide diversity does not significantly modify the inferred population genetic structure. Monte Carlo studies show that site sampling does not fundamentally affect the significance of the molecular variance components. The AMOVA treatment is easily extended in several different directions and it constitutes a coherent and flexible framework for the statistical analysis of molecular data.
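The variance partition described in this abstract can be sketched for the simplest case of a single level of subdivision (individuals within populations). The sketch below is a minimal illustration, not the authors' software: the function names `phi_st` and `permutation_p_value`, the input layout (a square matrix of squared distances plus one population label per individual), and the default of 999 permutations are all assumptions made for the example.

```python
import random
from itertools import combinations


def phi_st(sq_dist, labels):
    """Single-level AMOVA-style Phi_ST from a squared-distance matrix.

    sq_dist[i][j] is the squared distance between individuals i and j;
    labels[i] is the population of individual i (hypothetical layout).
    """
    labels = list(labels)
    n_total = len(labels)
    pops = sorted(set(labels))
    n_pops = len(pops)
    if n_pops < 2 or n_total <= n_pops:
        raise ValueError("need at least 2 populations and more individuals than populations")

    # Total sum of squared deviations: all pairs, divided by N.
    ssd_total = sum(sq_dist[i][j] for i, j in combinations(range(n_total), 2)) / n_total

    # Within-population sum of squared deviations: pairs inside each population.
    ssd_within = 0.0
    sizes = []
    for pop in pops:
        members = [i for i, lab in enumerate(labels) if lab == pop]
        sizes.append(len(members))
        ssd_within += sum(sq_dist[i][j] for i, j in combinations(members, 2)) / len(members)

    ssd_among = ssd_total - ssd_within
    msd_among = ssd_among / (n_pops - 1)
    msd_within = ssd_within / (n_total - n_pops)

    # Average-sample-size coefficient used for unbalanced designs.
    n0 = (n_total - sum(s * s for s in sizes) / n_total) / (n_pops - 1)

    var_among = (msd_among - msd_within) / n0
    var_within = msd_within
    total = var_among + var_within
    return var_among / total if total != 0 else 0.0


def permutation_p_value(sq_dist, labels, n_perm=999, seed=0):
    """Permute individuals among populations and recompute Phi_ST each time."""
    rng = random.Random(seed)
    observed = phi_st(sq_dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if phi_st(sq_dist, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Shuffling the labels rather than assuming normality is what the abstract refers to as the permutational approach. Deeper hierarchies (for example, populations within regions) would need additional variance components that this sketch does not cover.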
Article
Full-text available
A novel DNA fingerprinting technique called AFLP is described. The AFLP technique is based on the selective PCR amplification of restriction fragments from a total digest of genomic DNA. The technique involves three steps: (i) restriction of the DNA and ligation of oligonucleotide adapters, (ii) selective amplification of sets of restriction fragments, and (iii) gel analysis of the amplified fragments. PCR amplification of restriction fragments is achieved by using the adapter and restriction site sequence as target sites for primer annealing. The selective amplification is achieved by the use of primers that extend into the restriction fragments, amplifying only those fragments in which the primer extensions match the nucleotides flanking the restriction sites. Using this method, sets of restriction fragments may be visualized by PCR without knowledge of nucleotide sequence. The method allows the specific co-amplification of high numbers of restriction fragments. The number of fragments that can be analyzed simultaneously, however, is dependent on the resolution of the detection system. Typically 50–100 restriction fragments are amplified and detected on denaturing polyacrylamide gels. The AFLP technique provides a novel and very powerful DNA fingerprinting technique for DNAs of any origin or complexity.
Article
Full-text available
Multilocus DNA markers [random amplified polymorphic DNA (RAPDs), amplified fragment length polymorphism (AFLPs)] are important for population studies because they reveal many polymorphic loci distributed over the genome. The markers are dominant; that is, two phenotypes are distinguished at each locus, with a band and with no band. The latter represents null-homozygotes with unamplified, recessive null-alleles. The frequency of a null-allele can be estimated by taking the square root of the fraction of individuals with no band. Lynch and Milligan (1994) have suggested a modified procedure that reduces bias introduced by the square-root transform. However, the procedure recommends ignoring those samples in which fewer than four null-homozygotes are observed. This may lead to significant bias in estimates of genetic diversity. In this study, I introduce a Bayesian approach to estimation of null-allele frequencies for dominant DNA markers. It follows from computer simulations and data on two conifer species that the Bayesian method gives nearly unbiased estimates of heterozygosity, genetic distances and F-statistics. The influence of a prior distribution and departure from Hardy-Weinberg proportions on the estimates is also considered.
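The square-root estimator mentioned in this abstract, and its sensitivity to departures from Hardy-Weinberg proportions, can be illustrated with a short sketch. This is not the Bayesian estimator the abstract introduces. The function names are hypothetical, and the inbreeding-adjusted version assumes the standard genotype-frequency relation P(no band) = (1 - f)q^2 + fq with an inbreeding coefficient f supplied by the user.

```python
import math


def null_allele_freq_hw(n_no_band, n_total):
    """Square-root estimator: under Hardy-Weinberg, P(no band) = q**2."""
    if n_total <= 0 or not 0 <= n_no_band <= n_total:
        raise ValueError("counts must satisfy 0 <= n_no_band <= n_total and n_total > 0")
    return math.sqrt(n_no_band / n_total)


def null_allele_freq_inbred(n_no_band, n_total, f):
    """Null-allele frequency when the inbreeding coefficient f is assumed known.

    Solves x = (1 - f) * q**2 + f * q for q, where x is the observed
    fraction of individuals with no band and 0 <= f <= 1.
    """
    if n_total <= 0 or not 0 <= n_no_band <= n_total:
        raise ValueError("counts must satisfy 0 <= n_no_band <= n_total and n_total > 0")
    if not 0.0 <= f <= 1.0:
        raise ValueError("this sketch only handles 0 <= f <= 1")
    x = n_no_band / n_total
    if f == 1.0:
        # Complete inbreeding: no heterozygotes, so q equals the no-band fraction.
        return x
    # Positive root of (1 - f) q^2 + f q - x = 0.
    return (-f + math.sqrt(f * f + 4.0 * (1.0 - f) * x)) / (2.0 * (1.0 - f))
```

With 25 band-absent individuals out of 100, the Hardy-Weinberg estimate is 0.5, while assuming f = 0.5 gives roughly 0.37. The same band counts therefore imply different allele frequencies depending on the inbreeding assumed, which is why dominant-marker estimates depend on how within-population inbreeding is handled.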
Article
We generalize the method proposed by Gelman and Rubin (1992a) for monitoring the convergence of iterative simulations by comparing between and within variances of multiple chains, in order to obtain a family of tests for convergence. We review methods of inference from simulations in order to develop convergence-monitoring summaries that are relevant for the purposes for which the simulations are used. We recommend applying a battery of tests for mixing based on the comparison of inferences from individual sequences and from the mixture of sequences. Finally, we discuss multivariate analogues, for assessing convergence of several parameters simultaneously.
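The comparison of between-chain and within-chain variances can be sketched as a basic potential scale reduction factor for a single scalar parameter. This is a minimal version under stated assumptions: the function name is hypothetical, all chains are assumed to have equal length with burn-in already discarded, and the degrees-of-freedom correction and the generalizations discussed in the abstract are omitted.

```python
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Basic potential scale reduction factor for one scalar parameter.

    chains is a list of equal-length lists of draws, one list per chain.
    Values near 1 suggest the chains are sampling the same distribution.
    """
    m = len(chains)
    if m < 2:
        raise ValueError("need at least two chains")
    n = len(chains[0])
    if n < 2 or any(len(c) != n for c in chains):
        raise ValueError("chains must have equal length of at least 2")

    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: average within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    if within == 0:
        raise ValueError("within-chain variance is zero; the ratio is undefined")

    # Pooled variance estimate that combines both sources of variation.
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5
```

Chains started from dispersed points that have not yet mixed give a value well above 1, because the between-chain term dominates. The value approaches 1 as the chains come to overlap.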
Article
Bayesian approaches have been widely applied to partitioning diversity within and among levels in many different multi-level modeling contexts. In spite of the structural similarities between these Bayesian models and hierarchical approaches to partitioning diversity in population genetics, population geneticists have not explored the use of hierarchical Bayesian models to provide estimates of Wright's F-statistics. In this paper I describe and illustrate the application of a simple multilocus, two-allele model sufficient for partitioning diversity within and among populations. Extensions of the model incorporate both fixed-effect and random-effect models of population sampling at multiple hierarchical levels with multiple alleles per locus. The Bayesian approach developed here is closely related to previously developed methods for likelihood analysis of the same problem. I illustrate the utility of the Bayesian approach with a reanalysis of previously published allozyme data from Argania spinosa.
Article
A method for estimating and comparing population genetic variation using random amplified polymorphic DNA (RAPD) profiling is presented. An analysis of molecular variance (AMOVA) is extended to accommodate phenotypic molecular data in diploid populations in Hardy-Weinberg equilibrium or with an assumed degree of selfing. We present a two-step strategy: 1) Estimate RAPD site frequencies without preliminary assumptions on the unknown population structure, then perform significance testing for population substructuring. 2) If population structure is evident from the first step, use these data to calculate better estimates for RAPD site frequencies and sub-population variance components. A nonparametric test for the homogeneity of molecular variance (HOMOVA) is also presented. This test was designed to test statistically for differences in intrapopulational molecular variances (heteroscedasticity among populations). These theoretical developments are applied to a RAPD data set in Vaccinium macrocarpon (American cranberry) using small sample sizes, where a gradient of molecular diversity is found between central and marginal populations. The AMOVA and HOMOVA methods provide flexible population analysis tools when using data from RAPD or other DNA methods that provide many polymorphic markers with or without direct allelic data.
Article
The relevance of using dominant random amplified polymorphic DNA (RAPD) fingerprints for estimating population differentiation was investigated when typically small population sample sizes were used. Haploid sexual tissues were first used to determine genotypes at RAPD loci for 75 eastern white pines (Pinus strobus L.) representing five populations. Dominant RAPD fingerprints were then inferred from genotypic data for each individual at each locus, and gene diversity estimates from both sources of data were compared. Genotypic information at RAPD loci indicated little or no differentiation among populations, similar to allozyme loci. However, estimates of population differentiation derived from dominant RAPD fingerprints according to various common methods of analysis were generally inflated, especially when all fragments were considered. Simulations showed that an increase in loci sampling and population sample sizes did not significantly alleviate the biases observed.
Article
It has long been recognized that highly polymorphic genetic markers can lead to underestimation of divergence between populations when migration is low. Microsatellite loci, which are characterized by extremely high mutation rates, are particularly likely to be affected. Here, we report genetic differentiation estimates in a contact zone between two chromosome races of the common shrew (Sorex araneus), based on 10 autosomal microsatellites, a newly developed Y-chromosome microsatellite, and mitochondrial DNA. These results are compared to previous data on proteins and karyotypes. Estimates of genetic differentiation based on F- and R-statistics are much lower for autosomal microsatellites than for all other genetic markers. We show by simulations that this discrepancy stems mainly from the high mutation rate of microsatellite markers for F-statistics and from deviations from a single-step mutation model for R-statistics. The sex-linked genetic markers show that all gene exchange between races is mediated by females. The absence of male-mediated gene flow most likely results from male hybrid sterility.
Article
The role that inbreeding and coancestry play in the distribution of neutral genes is incorporated into the variance of a linear function to provide a simple cumulative expression of the variance among mean gene frequencies of groups of individuals. The total variance of gene frequencies is subdivided into components corresponding to genes within individuals, among individuals within groups, and among groups. Various intraclass correlations, some of which may be negative, of gene frequencies are formulated. The various types of parameters are considered for and extended to include further structuring of the population, separate sexes, systems of mating, and effective numbers. Estimators and tests of hypotheses for the parameters are developed.
Article
The Gibbs sampler, the algorithm of Metropolis and similar iterative simulation methods are potentially very helpful for summarizing multivariate distributions. Used naively, however, iterative simulation can give misleading answers. Our methods are simple and generally applicable to the output of any iterative simulation; they are designed for researchers primarily interested in the science underlying the data and models they are analyzing, rather than for researchers interested in the probability theory underlying the iterative simulations themselves. Our recommended strategy is to use several independent sequences, with starting points sampled from an overdispersed distribution. At each step of the iterative simulation, we obtain, for each univariate estimand of interest, a distributional estimate and an estimate of how much sharper the distributional estimate might become if the simulations were continued indefinitely. Because our focus is on applied inference for Bayesian posterior distributions in real problems, which often tend toward normality after transformations and marginalization, we derive our results as normal-theory approximations to exact Bayesian inference, conditional on the observed simulations. The methods are illustrated on a random-effects mixture model applied to experimental measurements of reaction times of normal and schizophrenic patients.
Article
"Wright's views about population genetics and evolution are so fundamental and so comprehensive that every serious student must examine these books firsthand. . . . Publication of this treatise is a major event in evolutionary biology."-Daniel L. Hartl, BioScience
Article
A mathematical model for the evolutionary change of restriction sites in mitochondrial DNA is developed. Formulas based on this model are presented for estimating the number of nucleotide substitutions between two populations or species. To express the degree of polymorphism in a population at the nucleotide level, a measure called "nucleotide diversity" is proposed.
Article
Molecular genetic maps are commonly constructed by analyzing the segregation of restriction fragment length polymorphisms (RFLPs) among the progeny of a sexual cross. Here we describe a new DNA polymorphism assay based on the amplification of random DNA segments with single primers of arbitrary nucleotide sequence. These polymorphisms, simply detected as DNA segments which amplify from one parent but not the other, are inherited in a Mendelian fashion and can be used to construct genetic maps in a variety of species. We suggest that these polymorphisms be called RAPD markers, after Random Amplified Polymorphic DNA.
Article
As pointed out in the first paper of this series (HUBBY and LEWONTIN 1966), no one knows at the present time the kinds and frequencies of variant alleles present in natural populations of any organism, with the exception of certain special classes of genes. For human populations we know a good deal about certain polymorphisms for blood cell antigens, serum proteins, and metabolic disorders of various kinds but we can hardly regard these, a priori, as typical of the genome as a whole. Clearly we need a method that will randomly sample the genome and detect a major proportion of the individual allelic substitutions that are segregating in a population. In our previous paper, we discussed a method for accomplishing this end by means of a study of electrophoretic variants at a large number of loci and we showed that the variation picked up by this method behaves in a simple Mendelian fashion so that phenotypes can be equated to homozygous and heterozygous genotypes at single loci. It is the purpose of this second paper to show the results of an application of the method to a series of samples chosen from natural populations of Drosophila pseudoobscura. In particular, we will show that there is a considerable amount of genic variation segregating in all of the populations studied and that the real variation in these populations must be greater than we are able to demonstrate. This study does not make clear what balance of forces is responsible for the genetic variation observed, but it does make clear the kind and amount of variation at the genic level that we need to explain. An exactly similar method has recently been applied by HARRIS (1966) for the enzymes of human blood. In a preliminary report on ten randomly chosen enzymes, HARRIS describes two as definitely polymorphic genetically and a third as phenotypically polymorphic but with insufficient genetic data so far. Clearly these methods are applicable to any organism of macroscopic dimensions.
Article
Random amplified polymorphic DNA (RAPD) fragments were prepared from samples of Calonectris diomedea (Cory's shearwater, Aves) and Haemonchus contortus (Nematoda) DNA by polymerase chain reaction (PCR) using decamers containing two restriction enzyme sites as primers. Six of 19 studied RAPD fragments probably originated from traces of commensal microorganisms. Many rearranged fragments, absent in the original genomic DNA, were synthesized and amplified during the processing of all the DNA samples, indicating that interactions occur within and between strands during the annealing step of PCR. The model of interactions between molecular species during DNA amplification with a single arbitrary oligonucleotide primer was modified to include nested primer annealing and interactions within and between strands. The presence of these artefacts in the final RAPD has a major effect on the interpretation of polymorphism studies.
The significance of population size on reproductive success and genetic variability in the Eastern prairie white fringed orchid, Platanthera leucophaea
  • L E Wallace
Wallace LE (2000) The significance of population size on reproductive success and genetic variability in the Eastern prairie white fringed orchid, Platanthera leucophaea. American Journal of Botany, 87, 165.
Analysis of population genetic structure with RAPD markers
  • M Lynch
  • B G Milligan
Lynch M, Milligan BG (1994) Analysis of population genetic structure with RAPD markers. Molecular Ecology, 3, 91-99.
Using RAPD's to assess suitable units of conservation in a threatened orchid, Platanthera leucophaea
  • L E Wallace
Wallace LE (submitted) Using RAPD's to assess suitable units of conservation in a threatened orchid, Platanthera leucophaea. Plant Species Biology.