
Arlequin (version 3.0): An integrated software package for population genetics data analysis

Evolutionary Bioinformatics

November 2017
1(1):1-47

DOI:10.1177/117693430500100003

License
CC BY-NC 3.0

Authors:

Laurent Excoffier

Universität Bern

Guillaume Laval

Institut Pasteur International Network

Stefan Schneider

University of Geneva

Arlequin ver 3.0 is a software package integrating several basic and advanced methods for population genetics data analysis, like the computation of standard genetic diversity indices, the estimation of allele and haplotype frequencies, tests of departure from linkage equilibrium, departure from selective neutrality and demographic equilibrium, estimation of parameters from past population expansions, and thorough analyses of population subdivision under the AMOVA framework. Arlequin 3 introduces a completely new graphical interface written in C++, a more robust semantic analysis of input files, and two new methods: a Bayesian estimation of gametic phase from multi-locus genotypes, and an estimation of the parameters of an instantaneous spatial expansion from DNA sequence polymorphism. Arlequin can handle several data types like DNA sequences, microsatellite data, or standard multi-locus genotypes. A Windows version of the software is freely available on http://cmpg.unibe.ch/software/arlequin3.


APPLICATION NOTE

Arlequin (version 3.0): An integrated software package for population genetics data analysis

Laurent Excoffier, Guillaume Laval, Stefan Schneider

Computational and Molecular Population Genetics Lab, Zoological Institute, University of Berne, Baltzerstrasse 6, 3012 Berne, Switzerland

Introduction

Most genetic studies on non-model organisms require a description of the pattern of diversity within and between populations, based on a variety of markers often including mitochondrial DNA (mtDNA) sequences and microsatellites. The genetic data are processed to extract information on the mating system, the extent of population subdivision, the past demography of the population, or on departure from selective neutrality at some loci. Over the last 10 years, a series of computer packages, such as Arlequin2 (Schneider et al. 2000), DnaSP (Rozas et al. 2003), FSTAT (Goudet 1995), GENEPOP (Raymond and Rousset 1995b), and GENETIX (Belkhir et al. 2004), have been developed to assist researchers in performing basic population genetics analyses. These programs have been widely used in the molecular ecology and conservation genetics community (Labate 2000; Luikart and England 1999; Schnabel et al. 1998). Among these, Arlequin is a very versatile (though not universal) program, and complements the other programs listed above. It can handle several data types like RFLPs, DNA sequences, microsatellite data, allele frequencies, or standard multi-locus genotypes, while allowing the user to carry out the same types of analyses irrespective of the data types.

We present here version 3 of Arlequin, with additional methods extending its capacities for the handling of unphased multi-locus genotypes and for the estimation of parameters of a spatial expansion. Note that these new developments are mainly implementations of new methodologies developed in our lab. We believe these methods will be useful to the research community, but we do not claim that alternative methods implemented by other groups in other programs are inadequate. A new graphical interface has been developed to provide a better integration of the different analyses into a common framework, and an easier exploration of the data by performing a wide variety of analyses with different settings. The tight coupling of Arlequin with the simulation programs SIMCOAL2 (Laval and Excoffier 2004) and SPLATCHE (Currat et al. 2004) should also make it useful to describe patterns of genetic diversity under complex evolutionary scenarios.

Keywords: Computer package, population genetics, genetic data analysis, AMOVA, EM algorithm, gametic phase estimation, spatial expansion.

Correspondence: Laurent Excoffier, Tel: +41 31 631 30 31, Fax: +41 31 631 48 88, Email: laurent.excoffier@zoo.unibe.ch

Evolutionary Bioinformatics Online 2005:1 47-50

Methods implemented in Arlequin

Arlequin provides methods to analyse patterns of genetic diversity within and between population samples.

Intra-population methods

• Computation of different standard genetic indices, like the number of segregating sites, the number of different alleles, the heterozygosity, the base composition of DNA sequences, gene diversity, or the population effective size Ne scaled by the mutation rate μ as θ = 4Neμ.

• Maximum-likelihood estimation of allele and haplotype frequencies via the EM algorithm (Excoffier and Slatkin 1995).

• Estimation of the gametic phase from multi-locus genotypes via the Excoffier-Laval-Balding (ELB) algorithm (Excoffier et al. 2003).

• Estimation of the parameters of a demographic (Rogers and Harpending 1992; Schneider and Excoffier 1999) or a spatial (Excoffier 2004; Ray et al. 2003) expansion, from the mismatch distribution computed on DNA sequences.

• Calculation of several measures of linkage disequilibrium (LD) like D, D', or r² (Hedrick 1987), and test of non-random association of alleles at different loci when the gametic phase is known (Weir 1996) or unknown (Slatkin and Excoffier 1996).

• Exact test of departure from Hardy-Weinberg equilibrium (Guo and Thompson 1992).

• Computation of Tajima’s D (Tajima 1989) and Fu’s FS (Fu 1997) statistics, and test of their significance by coalescent simulations (Hudson 1990; Nordborg 2003) under the infinite-site model.

• Tests of selective neutrality under the infinite-alleles model, like the Ewens-Watterson test (Slatkin 1996; Watterson 1978), and Chakraborty’s amalgamation test (Chakraborty 1990).
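As an illustration of a few of the intra-population indices listed above, the following Python sketch computes the number of segregating sites S, the mean number of pairwise differences π, Watterson's estimator of θ, and Tajima's D (Tajima 1989) from a small set of aligned sequences. It is not Arlequin's code: the function names and the toy alignment are assumptions made for this example, the sequences are assumed to be aligned, of equal length and free of missing data, and the significance test by coalescent simulation is not reproduced.

```python
import math
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns showing more than one state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences (pi)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def tajima_d(seqs):
    """Tajima's D, using the variance constants of Tajima (1989)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        # The variance term is zero for n < 4, and D is undefined when S = 0.
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = mean_pairwise_differences(seqs)
    theta_w = s / a1  # Watterson's estimator of theta, based on S
    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment of four sequences (S = 2, pi = 7/6).
toy_alignment = ["ACGT", "ACGA", "ACTA", "ACTA"]
print(segregating_sites(toy_alignment))
print(mean_pairwise_differences(toy_alignment))
print(tajima_d(toy_alignment))
```

Both π and S divided by the harmonic term a1 estimate θ under neutrality and constant population size, so D measures the standardized difference between the two estimators.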

Inter-population methods

• Search for shared haplotypes between populations.

• Analysis of population subdivision under the AMOVA framework (Excoffier 2003; Excoffier et al. 1992), with three hierarchical levels: genes within individuals, individuals within demes, demes within groups of demes. Computation of F-statistics like the local inbreeding coefficient FIS or the index of population differentiation FST.

• Computation of genetic distances between populations related to the pairwise FST index (Gaggiotti and Excoffier 2000; Reynolds et al. 1983; Slatkin 1995).

• Exact test of population differentiation (Goudet et al. 1996; Raymond and Rousset 1995a).

• A simple assignment test of individual genotypes to populations according to their likelihood (Paetkau et al. 1997).

• Computation of correlations or partial correlations between a set of 2 or 3 distance matrices (Mantel test: Smouse et al. 1986).
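The Mantel test mentioned in the last item above can be sketched in Python for the simple case of two distance matrices. This is a minimal illustration, not Arlequin's implementation: the function names, the default number of permutations and the seed are assumptions made for the example, and the partial correlation with a third matrix is not covered. Significance is assessed by jointly permuting the rows and columns of one matrix.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(d1, d2, n_perm=999, seed=0):
    """Return (r, p) for the correlation between two distance matrices.

    p is the one-sided proportion of permutations whose correlation is
    at least as large as the observed one (observed value included).
    """
    rng = random.Random(seed)
    n = len(d1)
    x = upper_triangle(d1)
    r_obs = pearson(x, upper_triangle(d2))
    hits = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        # Permute rows and columns of d2 with the same ordering.
        permuted = [[d2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical genetic and geographic distances among four populations.
genetic = [[0.0, 1.0, 2.0, 3.0],
           [1.0, 0.0, 1.5, 2.5],
           [2.0, 1.5, 0.0, 1.0],
           [3.0, 2.5, 1.0, 0.0]]
geographic = [[0.0, 10.0, 25.0, 40.0],
              [10.0, 0.0, 15.0, 30.0],
              [25.0, 15.0, 0.0, 12.0],
              [40.0, 30.0, 12.0, 0.0]]
print(mantel_test(genetic, geographic))
```

Permuting whole rows and columns, rather than individual matrix entries, preserves the dependence among distances that share a population, which is the reason a standard correlation test is not appropriate here.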

New features in Arlequin 3

• Version 3 of Arlequin integrates the core computational routines and the interface in a single program written in C++ for the Windows environment. The interface has been entirely redesigned to provide better usability.

• Incorporation of two new methods to estimate gametic phase and haplotype frequencies:

◊ The ELB algorithm (Excoffier et al. 2003) is a pseudo-Bayesian approach aiming at reconstructing the gametic phase of multi-locus genotypes; the estimation of the haplotype frequencies is a by-product of this process. Phase updates are made on the basis of a window of neighbouring loci, and the window size varies according to the local level of linkage disequilibrium.

◊ The EM zipper algorithm, which is an extension of the EM algorithm for estimating haplotype frequencies (Excoffier and Slatkin 1995), aims at estimating the haplotype frequencies in unphased multi-locus genotypes. The estimation of the gametic phases is a by-product of this process. It proceeds by adding loci one at a time and progressively extending the length of the reconstructed haplotypes. With this method, Arlequin does not need to build all possible genotypes for each individual as in the conventional EM algorithm; it only considers the genotypes whose sub-haplotypes have non-null estimated frequencies. It can thus handle a much larger number of polymorphic sites than the strict EM algorithm. It also gives final haplotype frequencies that often have a higher likelihood than those estimated under the strict EM algorithm, due to the difficulty in exploring the space of all possible genotypes when the number of polymorphic loci in the sample is large. Note that this version of the EM algorithm is equivalent to that implemented in the SNPHAP program by David Clayton, fully described at http://www-gene.cimr.cam.ac.uk/clayton/software/snphap.txt, and whose efficiency for inferring gametic phase has been favorably evaluated (Adkins 2004).

• Estimation of the parameters of a spatial expansion (age of the expansion and deme size scaled by the mutation rate, as well as the number of migrants exchanged between neighbouring demes) from the patterns of polymorphism in a sample of DNA sequences. The estimation is based on a simple model of instantaneous and infinite range expansion, where some time ago, a single deme instantaneously colonized an infinite number of demes subsequently interconnected by migration (as under an infinite-island model) (Excoffier 2004). The parameters are obtained by a least-squares approach maximizing the fit between the observed and expected distribution of pairwise differences (the mismatch distribution) computed on DNA sequences. Confidence intervals of the estimates are obtained under a parametric bootstrap approach involving the simulation of an instantaneous expansion under a coalescent framework.

• Estimation of confidence intervals for F-statistics estimated under the AMOVA framework by bootstrapping over loci for multi-locus data. A minimum of 8 loci is necessary for the computation of these confidence intervals.

• A completely rewritten and more robust input file parsing procedure, giving more precise information on the location of potential syntax and format errors in input files.

• Use of the ELB algorithm described above to generate samples of phased multi-locus genotypes, which allows one to analyse unphased multi-locus genotype data as if the phase were known. The phased data sets are output in Arlequin projects that can be analysed in a batch mode to obtain the distribution of statistics taking phase uncertainty into account.

• New output files fully compatible with modern web browsers.
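The conventional ("strict") EM algorithm of Excoffier and Slatkin (1995), which the EM zipper described above extends, can be sketched for biallelic loci as follows. This is a minimal illustration under assumptions that are not taken from Arlequin itself: genotypes are coded as tuples giving, per locus, the number of copies (0, 1 or 2) of an arbitrarily chosen allele, Hardy-Weinberg proportions are assumed, and all function names are hypothetical. The sketch enumerates every haplotype pair compatible with each genotype, which is the step that becomes intractable with many heterozygous loci and that the zipper version is designed to avoid.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """Unordered haplotype pairs compatible with an unphased genotype."""
    het = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous loci: 0 -> allele 0 on both haplotypes, 2 -> allele 1.
    base = [g // 2 for g in genotype]
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for locus, b in zip(het, bits):
            h1[locus], h2[locus] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Maximum-likelihood haplotype frequencies from unphased genotypes."""
    pair_lists = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted({h for pairs in pair_lists for pair in pairs for h in pair})
    freqs = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in pair_lists:
            # E-step: posterior probability of each possible phase.
            weights = [(1 if a == b else 2) * freqs[a] * freqs[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts divided by the number of gene copies.
        new_freqs = {h: counts[h] / (2 * len(genotypes)) for h in haplotypes}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in haplotypes)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# Hypothetical sample: double homozygotes plus two ambiguous double heterozygotes.
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
for haplotype, freq in em_haplotype_frequencies(sample).items():
    print(haplotype, round(freq, 4))
```

In this toy sample the unambiguous homozygotes carry only haplotypes (0, 0) and (1, 1), so the iterations resolve the double heterozygotes in favour of that pair.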

Availability

A Windows executable version of Arlequin ver 3 can be freely downloaded from http://cmpg.unibe.ch/software/arlequin3, together with an up-to-date user manual in Adobe Acrobat PDF format incorporating more technical details on the methods used in Arlequin 3, as well as several example files.

Acknowledgements

This work was partially made possible thanks to a Swiss NSF grant No 31-56755.99 to LE.


References

Adkins RM. 2004. Comparison of the accuracy of methods of computational haplotype inference using a large empirical dataset. BMC Genet. 5: 22.

Belkhir K, Borsa P, Chikhi L et al. 2004. GENETIX 4.05, logiciel sous Windows pour la génétique des populations. Laboratoire Génome, Populations, Interactions, CNRS UMR 5000, Université de Montpellier II, Montpellier.

Chakraborty R. 1990. Mitochondrial DNA polymorphism reveals hidden heterogeneity within some Asian populations. Am J Hum Genet. 47: 87-94.

Currat M, Ray N and Excoffier L. 2004. SPLATCHE: a program to simulate genetic diversity taking into account environmental heterogeneity. Mol Ecol Notes. 4: 139-142.

Excoffier L. 2003. Analysis of Population Subdivision. In Balding D, Bishop M and Cannings C, eds. Handbook of Statistical Genetics, 2nd Edition. New York: John Wiley & Sons, Ltd. p 713-750.

Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol. 13: 853-864.

Excoffier L, Laval G and Balding D. 2003. Gametic phase estimation over large genomic regions using an adaptive window approach. Hum Genomics. 1: 7-19.

Excoffier L and Slatkin M. 1995. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol Biol Evol. 12: 921-927.

Excoffier L, Smouse P and Quattro J. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics. 131: 479-491.

Fu Y-X. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics. 147: 915-925.

Gaggiotti O and Excoffier L. 2000. A simple method of removing the effect of a bottleneck and unequal population sizes on pairwise genetic distances. Proceedings of the Royal Society London B. 267: 81-87.

Goudet J. 1995. Fstat version 1.2: a computer program to calculate F-statistics. J Heredity. 86: 485-486.

Goudet J, Raymond M, de Meeüs T et al. 1996. Testing differentiation in diploid populations. Genetics. 144: 1933-1940.

Guo S and Thompson E. 1992. Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics. 48: 361-372.

Hedrick P. 1987. Gametic disequilibrium measures: proceed with caution. Genetics. 117: 331-341.

Hudson RR. 1990. Gene genealogies and the coalescent process. In Futuyma DJ and Antonovics JD, eds. Oxford Surveys in Evolutionary Biology. New York: Oxford University Press. p 1-44.

Labate JA. 2000. Software for Population Genetic Analyses of Molecular Marker Data. Crop Sci. 40: 1521-1528.

Laval G and Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics. 20: 2485-2487.

Luikart G and England PR. 1999. Statistical analysis of microsatellite DNA data. Trends Ecol Evol. 14: 253-256.

Nordborg M. 2003. Coalescent Theory. In Balding D, Bishop M and Cannings C, eds. Handbook of Statistical Genetics, 2nd Edition. New York: John Wiley & Sons, Ltd. p 602-635.

Paetkau D, Waits LP, Clarkson PL et al. 1997. An empirical evaluation of genetic distance statistics using microsatellite data from bear (Ursidae) populations. Genetics. 147: 1943-1957.

Ray N, Currat M and Excoffier L. 2003. Intra-Deme Molecular Diversity in Spatially Expanding Populations. Mol Biol Evol. 20: 76-86.

Raymond M and Rousset F. 1995a. An exact test for population differentiation. Evolution. 49: 1280-1283.

Raymond M and Rousset F. 1995b. GENEPOP Version 1.2: Population genetics software for exact tests and ecumenicism. J Heredity. 86: 248-249.

Reynolds J, Weir BS and Cockerham CC. 1983. Estimation for the coancestry coefficient: basis for a short-term genetic distance. Genetics. 105: 767-779.

Rogers AR and Harpending H. 1992. Population growth makes waves in the distribution of pairwise genetic differences. Mol Biol Evol. 9: 552-569.

Rozas J, Sanchez-DelBarrio JC, Messeguer X et al. 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics. 19: 2496-2497.

Schnabel A, Beerli P, Estoup A et al. 1998. A guide to software packages for data analysis in molecular ecology. In Carvalho G, ed. Advances in Molecular Ecology. Amsterdam: IOS Press. p 291-303.

Schneider S and Excoffier L. 1999. Estimation of demographic parameters from the distribution of pairwise differences when the mutation rates vary among sites: Application to human mitochondrial DNA. Genetics. 152: 1079-1089.

Schneider S, Roessli D and Excoffier L. 2000. Arlequin: a software for population genetics data analysis. User manual ver 2.000. Genetics and Biometry Lab, Dept. of Anthropology, University of Geneva, Geneva.

Slatkin M. 1995. A measure of population subdivision based on microsatellite allele frequencies. Genetics. 139: 457-462.

Slatkin M. 1996. A correction to the exact test based on the Ewens sampling distribution. Genet Res. 68: 259-260.

Slatkin M and Excoffier L. 1996. Testing for linkage disequilibrium in genotypic data using the EM algorithm. Heredity. 76: 377-383.

Smouse PE, Long JC and Sokal RR. 1986. Multiple regression and correlation extensions of the Mantel Test of matrix correspondence. Syst Zool. 35: 627-632.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 123: 585-595.

Watterson G. 1978. The homozygosity test of neutrality. Genetics. 88: 405-417.

Weir BS. 1996. Genetic Data Analysis II: Methods for Discrete Population Genetic Data. Sinauer Assoc., Inc.: Sunderland, MA, USA.

Content uploaded by Stefan Schneider

Content may be subject to copyright.

Multilocus genetic analysis of Trypanosoma cruzi supports non-domestic intrusion into domestic transmission in an endemic region of Colombia

Article

Full-text available

Jun 2024

Complex patterns of genetic structure in the sea cucumber Holothuria (Metriatyla) scabra from the Philippines: implications for aquaculture and fishery management

Article

Full-text available

Jun 2024

The sandfish Holothuria (Metriatyla) scabra, is a high-value tropical sea cucumber harvested from wild stocks for over four centuries in multi-species fisheries across its Indo-Pacific distribution, for the global bêche-de-mer (BDM) trade. Within Southeast Asia, the Philippines is an important centre of the BDM trade, however overharvesting and largely open fishery management have resulted in declining catch volumes. Sandfish mariculture has been developed to supplement BDM supply and assist restocking efforts; however, it is heavily reliant on wild populations for broodstock supply. Consequently, to inform fishery, mariculture, germplasm and translocation management policies for both wild and captive resources, a high-resolution genomic audit of 16 wild sandfish populations was conducted, employing a proven genotyping-by-sequencing approach for this species (DArTseq). Genomic data (8,266 selectively-neutral and 117 putatively-adaptive SNPs) were used to assess fine-scale genetic structure, diversity, relatedness, population connectivity and local adaptation at both broad (biogeographic region) and local (within-biogeographic region) scales. An independent hydrodynamic particle dispersal model was also used to assess population connectivity. The overall pattern of population differentiation at the country level for H. scabra in the Philippines is complex, with nine genetic stocks and respective management units delineated across 5 biogeographic regions: (1) Celebes Sea, (2) North and (3) South Philippine Seas, (4) South China and Internal Seas and (5) Sulu Sea. Genetic connectivity is highest within proximate marine biogeographic regions (mean F st=0.016), with greater separation evident between geographically distant sites (F st range=0.041–0.045). Signatures of local adaptation were detected among six biogeographic regions, with genetic bottlenecks at 5 sites, particularly within historically heavily-exploited locations in the western and central Philippines. 
Genetic structure is influenced by geographic distance, larval dispersal capacity, species-specific larval development and settlement attributes, variable ocean current-mediated gene flow, source and sink location geography and habitat heterogeneity across the archipelago. Data reported here will inform accurate and sustainable fishery regulation, conservation of genetic diversity, direct broodstock sourcing for mariculture and guide restocking interventions across the Philippines.

Genetic diversity and population structure of Piper nigrum (black pepper) accessions based on next-generation SNP markers

Article

Full-text available

Jun 2024
PLOS ONE

Despite the economic importance of Piper nigrum (black pepper), a highly valued crop worldwide, development and utilization of genomic resources have remained limited, with diversity assessments often relying on only a few samples or DNA markers. Here we employed restriction-site associated DNA sequencing to analyze 175 P. nigrum accessions from eight main black pepper growing regions in Sri Lanka. The sequencing effort resulted in 1,976 million raw reads, averaging 11.3 million reads per accession, revealing 150,356 high-quality single nucleotide polymorphisms (SNPs) distributed across 26 chromosomes. Population structure analysis revealed two subpopulations (K = 2): a dominant group consisting of 152 accessions sourced from both home gardens and large-scale cultivations, and a smaller group comprising 23 accessions exclusively from native collections in home gardens. This clustering was further supported by principal component analysis, with the first two principal components explaining 35.2 and 12.1% of the total variation. Genetic diversity analysis indicated substantial gene flow (Nm = 342.21) and a low fixation index (FST = 0.00073) between the two subpopulations, with no clear genetic differentiation among accessions from different agro-climatic regions. These findings demonstrate that most current black pepper genotypes grown in Sri Lanka share a common genetic background, emphasizing the necessity to broaden the genetic base to enhance resilience to biotic and abiotic stresses. This study represents the first attempt at analyzing black pepper genetic diversity using high-resolution SNP markers, laying the foundation for future genome-wide association studies for SNP-based gene discovery and breeding.

Genomic and morphological data reveal a critically endangered new species from the Atlantic Forest, Paepalanthus salimenae (Eriocaulaceae)

Article

Jun 2024

During our investigation of the genetic and morphological variation within the distribution range of Paepalanthus calvus, we observed that specimens from Juiz de Fora municipality, Minas Gerais (Brazil) were morphologically divergent. Through genome-wide analyses using MIG-seq and detailed morphological comparisons, we were able to establish consistent differences between P. calvus and the Juiz de Fora specimens, which supported the recognition of a new species, Paepalanthus salimenae sp. nov. This new species is primarily distinguished by the narrow leaves; trichomes on the leaves, spathes, and scapes; densely pilose involucral bracts; and the tufted apex of some floral organs. Paepalanthus salimenae exhibits low genetic diversity and strong population structure. A preliminary conservation risk assessment suggests this species as Critically Endangered, highlighting the urgent need for conservation efforts to protect its habitat and intraspecific diversity. We provide photographs, line drawings and additional commentaries on the distribution, habitat, and morphological affinities of P. salimenae to congeneric taxa.

Population structure and diversification of Gymnospermium kiangnanense , a plant species with extremely small populations endemic to eastern China

Article

Full-text available

Jun 2024

Background Gymnospermium kiangnanense is the only species distributed in the subtropical region within the spring ephemeral genus Gymnospermium . Extensive human exploitation and habitat destruction have resulted in a rapid shrink of G. kiangnanense populations. This study utilizes microsatellite markers to analyze the genetic diversity and structure and to deduce historical population events of extant populations of G. kiangnanense . Methods A total of 143 individuals from eight extant populations of G. kiangnanense , including two populations from Anhui Province and six populations from Zhejiang Province, were analyzed with using 21 pairs of microsatellite markers. Genetic diversity indices were calculated using Cervus, GENEPOP, GenALEX. Population structure was assessed using genetic distance (UPGMA), principal coordinate analysis (PCoA), Bayesian clustering method (STRUCTURE), and molecular variation analysis of variance (AMOVA). Population history events were inferred using DIYABC. Results The studied populations of G. kiangnanense exhibited a low level of genetic diversity ( He = 0.179, I = 0.286), but a high degree of genetic differentiation ( F ST = 0.521). The mean value of gene flow ( N m ) among populations was 1.082, indicating prevalent gene exchange via pollen dispersal. Phylogeographic analyses suggested that the populations of G. kiangnanense were divided into two lineages, Zhejiang (ZJ) and Anhui (AH). These two lineages were separated by the Huangshan-Tianmu Mountain Range. AMOVA analysis revealed that 36.59% of total genetic variation occurred between the two groups. The ZJ lineage was further divided into the Hangzhou (ZJH) and Zhuji (ZJZ) lineages, separated by the Longmen Mountain and Fuchun River. DIYABC analyses suggested that the ZJ and AH lineages were separated at 5.592 ka, likely due to the impact of Holocene climate change and human activities. Subsequently, the ZJZ lineage diverged from the ZJH lineage around 2.112 ka. 
Given the limited distribution of G. kiangnanense and the significant genetic differentiation among its lineages, both in-situ and ex-situ conservation strategies should be implemented to protect the germplasm resources of G. kiangnanense .

Genomic and morphological data reveal a critically endangered new species from the Atlantic Forest, Paepalanthus salimenae (Eriocaulaceae)

Article

Jun 2024

Marcelo Trovó

Conservation genetics of the endangered California Freshwater Shrimp (Syncaris pacifica): watershed and stream networks define gene pool boundaries

Article

Full-text available

Jun 2024
CONSERV GENET

Understanding genetic structure and diversity among remnant populations of rare species can inform conservation and recovery actions. We used a population genetic framework to spatially delineate gene pools and estimate gene flow and effective population sizes for the endangered California Freshwater Shrimp Syncaris pacifica. Tissues of 101 individuals were collected from 11 sites in 5 watersheds, using non-lethal tissue sampling. Single Nucleotide Polymorphism markers were developed de novo using ddRAD-seq methods, resulting in 433 unlinked loci scored with high confidence and low missing data. We found evidence for strong genetic structure across the species range. Two hierarchical levels of significant differentiation were observed: (i) five clusters (regional gene pools, FST = 0.38–0.75) isolated by low gene flow were associated with watershed limits and (ii) modest local structure among tributaries within a watershed that are not connected through direct downstream flow (local gene pools, FST = 0.06–0.10). Sampling sites connected with direct upstream-to-downstream water flow were not differentiated. Our analyses suggest that regional watersheds are isolated from one another, with very limited (possibly no) gene flow over recent generations. This isolation is paired with small effective population sizes across regional gene pools (Ne = 62.4–147.1). Genetic diversity was variable across sites and watersheds (He = 0.09–0.22). Those with the highest diversity may have been refugia and are now potential sources of genetic diversity for other populations. These findings highlight which portions of the species range may be most vulnerable to future habitat fragmentation and provide management consideration for maintaining local effective population sizes and genetic connectivity.

Population genetic structure and demographic history of short mackerel, Rastrelliger brachysoma, in the Gulf of Thailand

Article

Jun 2024

The effect of terrain on the fine‐scale genetic diversity of sub‐Antarctic Collembola: A landscape genetics approach

Article

Full-text available

Jun 2024

Biodiversity patterns are shaped by the interplay between geodiversity and organismal characteristics. Superimposing genetic structure onto landscape heterogeneity (i.e., landscape genetics) can help to disentangle their interactions and better understand population dynamics. Previous studies on the sub‐Antarctic Prince Edward Islands (located midway between Antarctica and Africa) have highlighted the importance of landscape and climatic barriers in shaping spatial genetic patterns and have drawn attention to the value of these islands as natural laboratories for studying fundamental concepts in biology. Here, we assessed the fine‐scale spatial genetic structure of the springtail, Cryptopygus antarcticus travei, which is endemic to Marion Island, in tandem with high‐resolution geological data. Using a species‐specific suite of microsatellite markers, a fine‐scale sampling design incorporating landscape complexity and generalised linear models (GLMs), we examined genetic patterns overlaid onto high‐resolution digital surface models and surface geology data across two 1‐km sampling transects. The GLMs revealed that genetic patterns across the landscape closely track landscape resistance data in concert with landscape discontinuities and barriers to gene flow identified at a scale of a few metres. These results show that the island's geodiversity plays an important role in shaping biodiversity patterns and intraspecific genetic diversity. This study illustrates that fine‐scale genetic patterns in soil arthropods are markedly more structured than anticipated, given that previous studies have reported high levels of genetic diversity and evidence of genetic structing linked to landscape changes for springtail species and considering the homogeneity of the vegetation complexes characteristic of the island at the scale of tens to hundreds of metres. 
By incorporating fine‐scale and high‐resolution landscape features into our study, we were able to explain much of the observed spatial genetic patterns. Our study highlights geodiversity as a driver of spatial complexity. More widely, it holds important implications for the conservation and management of the sub‐Antarctic islands.

Characterisation of the Cinnamomum parthenoxylon (Jack) Meisn (Lauraceae) transcriptome using Illumina paired-end sequencing and EST-SSR markers development for population genetics

Article

Full-text available

Jun 2024

Cinnamomum parthenoxylon is an endemic and endangered species with significant economic and ecological value in Vietnam. A better understanding of the genetic architecture of the species will be useful when planning management and conservation. We aimed to characterize the transcriptome of C. parthenoxylon, develop novel molecular markers, and assess the genetic variability of the species. First, transcriptome sequencing of five trees (C. parthenoxylon) based on root, leaf, and stem tissues was performed for functional annotation analysis and development of novel molecular markers. The transcriptomes of C. parthenoxylon were analyzed via an Illumina HiSeq™ 4000 sequencing system. A total of 27,363,199 bases were generated for C. parthenoxylon. De novo assembly indicated that a total of 160,435 unigenes were generated (average length = 548.954 bp). The 51,691 unigenes were compared against different databases, i.e. COG, GO, KEGG, KOG, Pfam, Swiss-Prot, and NR for functional annotation. Furthermore, a total of 12,849 EST-SSRs were identified. Of the 134 primer pairs, 54 were randomly selected for testing, with 15 successfully amplified across nine populations of C. parthenoxylon. We uncovered medium levels of genetic diversity (PIC = 0.52, Na = 3.29, Ne = 2.18, P = 94.07%, Ho = 0.56 and He = 0.47) within the studied populations. The molecular variance was 10% among populations and low genetic differentiation (Fst = 0.06) indicated low gene flow (Nm = 2.16). A reduction in the population size of C. parthenoxylon was detected using BOTTLENECK (VP population). The structure analysis suggested two optimal genetic clusters related to gene flow among the populations. Analysis of molecular variance (AMOVA) revealed higher genetic variation within populations (90%) than among populations (10%). The UPGMA approach and DAPC divided the nine populations into three main clusters.
Our findings revealed a significant fraction of the transcriptome sequences, and these newly developed EST-SSR markers are a very efficient tool for germplasm evaluation, genetic diversity and molecular marker-assisted selection in C. parthenoxylon. This study provides comprehensive genetic resources for the breeding and conservation of different varieties of C. parthenoxylon.

An exact test for population differentiation

Dec 1995
EVOLUTION

GENEPOP (Version 1.2): Population Genetics Software for Exact Tests and Ecumenicism

May 1995
J HERED

GENEPOP (Version 1.22): Population Genetics Software for Exact Tests and Ecumenicism

Jan 1995
HEREDITY

A measure of population subdivision based on microsatellite allele frequencies.

Jan 1995
GENETICS

Montgomery Slatkin

A new measure of the extent of population subdivision as inferred from allele frequencies at microsatellite loci is proposed and tested with computer simulations. This measure, called R(ST), is analogous to Wright's F(ST) in representing the proportion of variation between populations. It differs in taking explicit account of the mutation process at microsatellite loci, for which a generalized stepwise mutation model appears appropriate. Simulations of subdivided populations were carried out to test the performance of R(ST) and F(ST). It was found that, under the generalized stepwise mutation model, R(ST) provides relatively unbiased estimates of migration rates and times of population divergence while F(ST) tends to show too much population similarity, particularly when migration rates are low or divergence times are long.

Genetic Data Analysis: Methods for Discrete Population Genetic Data.

Jun 1991
Syst Zool

Population growth makes waves in the distribution of pairwise genetic differences.

May 1992

Episodes of population growth and decline leave characteristic signatures in the distribution of nucleotide (or restriction) site differences between pairs of individuals. These signatures appear in histograms showing the relative frequencies of pairs of individuals who differ by i sites, where i = 0, 1, .... In this distribution an episode of growth generates a wave that travels to the right, traversing 1 unit of the horizontal axis in each 1/2u generations, where u is the mutation rate. The smaller the initial population, the steeper will be the leading face of the wave. The larger the increase in population size, the smaller will be the distribution's vertical intercept. The implications of continued exponential growth are indistinguishable from those of a sudden burst of population growth. Bottlenecks in population size also generate waves similar to those produced by a sudden expansion, but with elevated upper-tail probabilities. Reductions in population size initially generate L-shaped distributions with high probability of identity, but these converge rapidly to a new equilibrium. In equilibrium populations the theoretical curves are free of waves. However, computer simulations of such populations generate empirical distributions with many peaks and little resemblance to the theory. On the other hand, agreement is better in the transient (nonequilibrium) case, where simulated empirical distributions typically exhibit waves very similar to those predicted by theory. Thus, waves in empirical distributions may be rich in information about the history of population dynamics.
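
The histogram described in this abstract (the relative frequency of pairs of individuals differing at i sites) can be illustrated with a minimal Python sketch. The function name `mismatch_distribution` and the toy sequences are hypothetical and chosen only for illustration; the sketch assumes aligned sequences of equal length and counts raw site differences, with no correction for multiple hits and no model fitting. It is not the implementation used by Arlequin or by the cited paper.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Return relative frequencies of pairs differing at i = 0, 1, ... sites.

    Assumes `sequences` is a list of aligned strings of equal length
    (an assumption of this sketch, not a requirement stated in the source).
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # Count the number of differing sites for every unordered pair.
    counts = Counter(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    n_pairs = len(sequences) * (len(sequences) - 1) // 2

    # Index i of the returned list holds the fraction of pairs with i differences.
    return [counts.get(i, 0) / n_pairs for i in range(max(counts) + 1)]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = ["AAAA", "AAAT", "AATT", "AAAA"]
    for i, freq in enumerate(mismatch_distribution(toy)):
        print(i, round(freq, 3))
```

A wave in the sense of the abstract would appear as a peak at some i > 0 in the returned list, whereas a high value at index 0 corresponds to the L-shaped distributions with high probability of identity.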

Coalescent Theory
