
Computer note. BOTTLENECK: a computer program for detecting recent reductions in the effective size using allele frequency data

July 1999
Journal of Heredity 90(4)


DOI:10.1093/jhered/90.4.502

Authors:

Sylvain Piry

French National Institute for Agriculture, Food, and Environment (INRAE)

Gordon Luikart

University of Montana


Computer Notes

Simulation of Effects of

Dominance on Estimates of

Population Genetic Diversity

and Differentiation

K. V. Krutovskii, S. Y. Erofeeva,

J. E. Aagaard, and S. H. Strauss

The advent of PCR-based molecular mark-

ers has led to a rapid expansion in studies

describing the levels and distribution of

genetic variation among populations at

the DNA level. Randomly ampliﬁed poly-

morphic DNA (RAPD; Williams et al. 1990)

and ampliﬁed fragment length polymor-

phism (AFLP; Vos et al. 1995) markers are

now commonly used in population genetic

studies (e.g., Aagaard et al. 1998; Isabel et

al. 1995; Liu and Furnier 1993; Mosseler et

al. 1992; Peakall et al. 1995; Szmidt et al.

1996; Travis et al. 1996; Wu et al., in press).

However, these PCR-based markers have

limitations compared to allozymes, which

had been the prevalent means for popu-

lation studies prior to the use of PCR. At

the majority of RAPD and AFLP loci the

dominant allele masks the presence of the

null allele in heterozygotes when assaying

diploid tissues (e.g., about 97%–98%; Kru-

tovskii et al. 1998), thus sampling variance

for dominant allele frequencies is typically

greater than that for codominant alleles

(Lynch and Milligan 1994). The frequen-

cies of null and dominant alleles are

inferred from the frequency of null allele

homozygotes; the precision of their esti-

mation thus depends on mating system as-

sumptions and is strongly affected by the

sample size. Empirical studies have also

suggested that dominant markers can bias

estimates of genetic diversity and differ-

entiation among populations (e.g., Isabel

et al. 1995; Szmidt et al. 1996).

Although RAPD markers have proved to

be useful for population studies, and their

gross patterns of diversity usually agree

with that of allozymes, the levels of genet-

ic variation, differentiation, and ﬁne-scale

genetic structures often differ (e.g., Barufﬁ

et al. 1995; Dawson et al. 1996; Heun et al.

1994; Lannér-Herrera et al. 1996; Latta and

Mitton 1997; le Corre et al. 1997; Liu and

Furnier 1993; Peakall et al. 1995; Puterka

et al. 1993). To help assess whether these

differences are biological or a simple con-

sequence of the dominance and biallelism

of RAPD and AFLP markers, we developed

a dominance simulation program, DOM-

SIM, that transforms codominant popula-

tion data into a biallelic dominant dataset.

The program then estimates population

genetic statistics with which dominant

and codominant markers can be directly

compared. We use data from a widespread

North American conifer, Douglas-ﬁr [Pseu-

dotsuga menziesii (Mirb.) Franco], and

three California closed-cone pine species

to illustrate the program’s function. The

test simulation suggests that dominant

biallelic markers, such as RAPDs, can

strongly underestimate population diversity but can still reasonably estimate population differentiation ($G_{ST}$), if sample sizes are larger than about 30 individuals.

Program Functions

The program DOMSIM uses multiallelic da-

tasets with a maximum number of six al-

leles per locus for which population allele

frequencies are deﬁned. Assuming Hardy–

Weinberg equilibrium and no linkage

among loci, the program generates N basic populations ($N_{max} = 20$) of up to 1,000 individuals each with multilocus genotypes that maintain the specified allele frequencies within populations. A total of S subpopulations ($S_{max} = 400$) of n individuals (n = 10–200) are then drawn with replacement for each of the N populations. The sampling is done in two different ways: by sampling subpopulations of size n with replacement directly from the initially generated basic population, and by resampling subpopulations of size n with replacement within the first sampled subpopulation of n individuals (bootstrap resampling). Population genetic parameters ($H_S$, $H_T$, $F_{ST}$, and $G_{ST}$) are calculated for each

cycle of resampling in three ways. First,

for a codominant dataset, calculations are

made considering all alleles and geno-

types present in the subpopulations. Sec-

ond, the same subpopulations and data

are used to simulate a dominant biallelic

dataset by randomly selecting one allele

as dominant, with the rest treated as re-

cessive to it. The synthetic null allele fre-

quency is then calculated from the null

homozygote frequencies assuming Hardy–

Weinberg equilibrium. Average parame-

ters and their variance are calculated for

each set of S subpopulations. Gene diversity is evaluated using $H_S$ and $H_T$, either unmodified (Nei 1973) or modified (Nei and Chesser 1983) for the sample size. Genetic differentiation is evaluated via $F_{ST}$ (Weir and Cockerham 1984) and $G_{ST}$ parameters that are either unmodified (Nei 1973), modified for the sample size (Nei and Chesser 1983), or modified for both the sample size and population number (Nei 1986). Finally, null allele frequencies are corrected for dominance using Lynch and Milligan's (1994) equation 2a, and their asymptotically unbiased estimate of $F_{ST}$ recommended for dominant markers is also calculated following equation 14a.
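The dominance transformation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DOMSIM source (which is FORTRAN-77): the function names, the allele frequencies, and the sample size are hypothetical, and the sketch applies only the simple square-root estimate of the null allele frequency, not the Lynch and Milligan (1994) correction.

```python
import math
import random


def sample_genotypes(freqs, n, rng):
    """Draw n diploid genotypes under Hardy-Weinberg proportions."""
    alleles = list(range(len(freqs)))
    return [tuple(rng.choices(alleles, weights=freqs, k=2)) for _ in range(n)]


def codominant_freqs(genotypes, n_alleles):
    """Allele frequencies by direct gene counting (codominant data)."""
    counts = [0] * n_alleles
    for a, b in genotypes:
        counts[a] += 1
        counts[b] += 1
    return [c / (2 * len(genotypes)) for c in counts]


def dominant_freqs(genotypes, dominant):
    """Treat one allele as dominant and pool all others as a 'null' allele.

    Only the null homozygote class is observable, so the null frequency q
    is estimated as the square root of its frequency (Hardy-Weinberg).
    """
    null_homozygotes = sum(1 for g in genotypes if dominant not in g)
    q = math.sqrt(null_homozygotes / len(genotypes))
    return [1.0 - q, q]


def gene_diversity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), without sample-size correction."""
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical three-allele locus sampled in 50 individuals.
rng = random.Random(1)
genotypes = sample_genotypes([0.5, 0.3, 0.2], n=50, rng=rng)
dominant = rng.randrange(3)  # randomly chosen dominant allele
h_codominant = gene_diversity(codominant_freqs(genotypes, 3))
h_dominant = gene_diversity(dominant_freqs(genotypes, dominant))
```

Because the dominant dataset has only two allele classes, its gene diversity cannot exceed 0.5, whereas the codominant value for the same locus can be higher.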

Installing and Running the Program

The program DOMSIM is written in FOR-

TRAN-77 (simulation routines) and in Lab-

Windows CVI (interface routines). The

source code ﬁle, domsimd.f, was compiled

using Microsoft FORTRAN Power Station

Compiler version 1.0. DOMSIM runs on IBM PCs and compatibles under MS Windows 95 and NT for 32-bit operating environments. To install the program, run the compressed self-extracting file domsimpr.exe which can be downloaded from the web site http://www.fsl.orst.edu/tgerc/protocol.htm. It will automatically decompress five files: domsimd.f, domsim.001, domsim.002, read.me, and setup.exe. Next, run the setup file and follow the instructions on your screen during installation. Run the program by either clicking the icon or executing the program file domsim.exe. A read.me file contains additional instructions for installing and running the program.

The Journal of Heredity 1999:90(4)

Figure 1. Levels of diversity and differentiation for codominant, multiallelic allozymes versus biallelic, dominant markers, as simulated from an allozyme dataset from Douglas-fir studied with varying sample sizes. Standard deviations (error bars) were calculated from the variance among 400 bootstrap subsamples and represent the variance due to resampling of individuals at each level of sampling from the master population of 1,000 individuals. The arrow shows the population sample size between 30 and 40 needed to eliminate the tendency for overestimation of population differentiation caused by dominance and biallelism.

Input and Output Files

The input format is an ASCII ﬁle similar to

GeneStat input ﬁles (Lewis 1994), but

does not require population, locus, and al-

lele names, and there should be no empty

lines. An example (sample.dat) and brief

help, which explicitly explains an input ﬁle

structure, are provided with the program.

The output ﬁle has all the parameters cal-

culated for each resampled and bootstrap

set, their average values, and standard de-

viations.

Examples of Simulation Based on

Allozyme Data in Douglas-ﬁr and

California Closed-Cone Pines

In order to facilitate comparisons between

dominant and codominant markers, and

to help understand the effects of RAPD

dominance and biallelism on our studies

of genetic diversity and differentiation in

Douglas-ﬁr (Aagaard et al. 1998) and the

California closed-cone pines (Wu et al., in

press), we simulated dominance and bial-

lelism in these two allozyme datasets (Li

and Adams 1989; Wu et al., in press). The

ﬁrst allozyme dataset included six popu-

lations of three races of Douglas-ﬁr—

coastal, north interior, and south interi-

or—with two populations per race. The

second one included four, ﬁve, and three

populations of Pinus attenuata, P. muricata,

and P. radiata, respectively. These popu-

lations are described in detail elsewhere

(Aagaard et al. 1998; Wu et al., in press).

From allozyme allele frequencies within populations we generated simulated populations of 1,000 individuals each, and a total of 400 subpopulations of n individuals were drawn with replacement from each of the populations. The program also performed 400 bootstrap resamplings using a subpopulation of size n. Population genetic parameters ($H_S$, $H_T$, $G_{ST}$, and $F_{ST}$) were then calculated for each set of

400 subpopulations in the three ways de-

scribed above. We varied the number of

individuals (n) within the subsamples

from 10 to 200 to bracket the range of sam-

ple sizes that might reasonably be em-

ployed in population studies, and the sam-

ple size of 30–50 trees per population that

was used in our RAPD studies (Aagaard et

al. 1998; Wu et al., in press). The results of

the simulations are summarized in Figures

1 and 2. The simulations showed that diversity measurements ($H_S$ and $H_T$) were likely to be underestimated by dominant biallelic markers approximately twofold regardless of sample size.
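The two resampling schemes used in these simulations (drawing subsamples directly from the base population, and bootstrapping within the first subsample) can be sketched as follows. This is a minimal Python illustration for a single locus, assuming hypothetical allele frequencies and function names; it reports only the uncorrected gene diversity and is not the DOMSIM code.

```python
import random
import statistics


def make_population(freqs, size, rng):
    """Generate a base population of diploid genotypes (Hardy-Weinberg)."""
    alleles = list(range(len(freqs)))
    return [tuple(rng.choices(alleles, weights=freqs, k=2)) for _ in range(size)]


def gene_diversity(genotypes, n_alleles):
    """Uncorrected gene diversity, 1 - sum(p_i^2), from gene counts."""
    counts = [0] * n_alleles
    for a, b in genotypes:
        counts[a] += 1
        counts[b] += 1
    total = 2 * len(genotypes)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def resample(freqs, n, reps, rng, base_size=1000):
    """Return (mean, SD) of gene diversity under both resampling schemes."""
    k = len(freqs)
    base = make_population(freqs, base_size, rng)
    # Scheme 1: each subsample is drawn with replacement from the base population.
    direct = [gene_diversity(rng.choices(base, k=n), k) for _ in range(reps)]
    # Scheme 2: bootstrap, resampling with replacement within the first subsample.
    first = rng.choices(base, k=n)
    boot = [gene_diversity(rng.choices(first, k=n), k) for _ in range(reps)]
    return {
        "direct": (statistics.mean(direct), statistics.stdev(direct)),
        "bootstrap": (statistics.mean(boot), statistics.stdev(boot)),
    }


# Hypothetical biallelic locus, 50 individuals per subsample, 400 replicates.
summary = resample([0.5, 0.5], n=50, reps=400, rng=random.Random(7))
```

Repeating the call over a range of n (for example 10 to 200) gives the kind of sample-size profile summarized in the figures.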

Figure 2. Genetic diversity ($H_S$) and differentiation ($G_{ST}$; Nei 1986) values averaged over populations of each California closed-cone pine species for codominant multiallelic allozyme and dominant biallelic markers simulated in the samples of different sizes. Standard deviations (error bars) were calculated from the variance among 400 bootstrap subsamples simulated for each population of each species. Observed RAPD values are also shown as a star.

When 30 or more diploid individuals per population were sampled, there was little effect on differentiation estimates ($G_{ST}$ and $F_{ST}$) in Douglas-fir. However, though

still very similar to the estimates for co-

dominant markers, the estimates for the

simulated dominant markers began to di-

verge slightly but signiﬁcantly downward

at large population sizes in Douglas-ﬁr. In

the California closed-cone pines, the esti-

mates for the simulated dominant markers

converge toward the estimates for codom-

inant multiallelic markers at large popula-

tion sizes, but were always signiﬁcantly

higher (Wu et al., in press). Our simula-

tions were in close agreement with our

empirical studies of Douglas-ﬁr where, de-

spite dominance and biallelism of RAPD

markers, we have found that RAPDs and

allozymes exhibit similar levels of differ-

entiation at the population and race levels

with adequate sample sizes (Aagaard et al.

1998). However, the California closed-cone pine allozyme data showed that larger sample sizes than we employed in our RAPD study (Wu et al., in press) are desirable for a fair comparison of RAPD and

allozyme data. Finally, despite the expec-

tation of much reduced diversity for dom-

inant biallelic markers predicted by the

simulations, our RAPD data gave higher

estimates of diversity than did allozymes

in both Douglas-ﬁr (Aagaard et al. 1998)

and the California closed-cone pines (Wu

et al., in press). This suggests that RAPD

markers may have much higher intrinsic

genetic diversity than do allozyme mark-

ers. Our results demonstrate the impor-

tance of simulations to help compare and

interpret the results of population studies

with dominant markers.

From the Departments of Forest Science (Aagaard, Kru-

tovskii, and Strauss) and the College of Oceanic and

Atmospheric Sciences (Erofeeva), Oregon State University, Corvallis, OR 97331-7501. We thank Tom Adams for

providing allozyme data and Vladislav Erofeev for help

with computer software. This work was supported in

part by NSF grants DEB 9300083 and BSR 895702 to

S.H.S. The dominance simulation program (DOMSIM)

is available for public use and can be downloaded as

a self-extracting ﬁle domsimpr.exe from the TGERC web

site: http://www.fsl.orst.edu/tgerc/protocol.htm. Ad-

dress correspondence to Dr. K. V. Krutovskii at the ad-

dress above or e-mail: krutovskiik@fsl.orst.edu.

© 1999 The American Genetic Association

References

Aagaard JE, Krutovskii KV, and Strauss SH, 1998. RAPDs

and allozymes exhibit similar levels of diversity and

differentiation among populations and races of Doug-

las-ﬁr. Heredity 81:69–78.

Barufﬁ L, Damiani G, Guglielmino CR, Bandi C, Malacri-

da AR, and Gasperi G, 1995. Polymorphism within and

between populations of Ceratitis capitata: comparison

between RAPD and multilocus enzyme electrophoresis

data. Heredity 74:425–437.

Dawson IK, Simons AJ, Waugh R, and Powell W, 1996.

Diversity and genetic differentiation among subpopulations of Gliricidia sepium revealed by PCR-based assays.

Heredity 74:10–18.

Heun M, Murphy JP, and Phillips TD, 1994. A comparison of RAPD and isozyme analyses for determining the

genetic relationships among Avena sterilis L. acces-

sions. Theor Appl Genet 87:689–696.

Isabel N, Beaulieu J, and Bousquet J, 1995. Complete

congruence between gene diversity estimates derived

from genotypic data at enzyme and random ampliﬁed

polymorphic DNA loci in black spruce. Proc Natl Acad

Sci USA 92:6369–6373.

Krutovskii KV, Vollmer SS, Sorensen FC, Adams WT,

Knapp SJ, and Strauss SH, 1998. RAPD genome maps of

Douglas-ﬁr. J Hered 89:197–205.

Lannér-Herrera C, Gustafsson M, Fält AS, and Bryngelsson T, 1996. Diversity of wild Brassica oleraceae as estimated by isozyme and RAPD analysis. Genet Resources Crop Evol 43:13–23.

Latta RG and Mitton JB, 1997. A comparison of popu-

lation differentiation across four classes of gene mark-

er in limber pine (Pinus ﬂexilis James). Genetics 146:

1153–1163.

le Corre V, Dumolin-Lapègue S, and Kremer A, 1997.

Genetic variation at allozyme and RAPD loci in sessile

oak Quercus petraea (Matt.) Liebl.: the role of history

and geography. Mol Ecol 6:519–529.

Lewis PO, 1994. GeneStat-PC 3.3. Raleigh, North Carolina: Department of Statistics, North Carolina State University.

Li P and Adams WT, 1989. Range-wide patterns of allo-

zyme variation in Douglas-ﬁr (Pseudotsuga menziesii).

Can J For Res 19:149–161.

Liu Z and Furnier GR, 1993. Comparison of allozyme,

RFLP, and RAPD markers for revealing genetic variation

within and between trembling aspen and bigtooth as-

pen. Theor Appl Genet 87:97–105.

Lynch M and Milligan BG, 1994. Analysis of population

genetic structure with RAPD markers. Mol Ecol 3:91–

99.

Mosseler A, Egger KN, and Hughes GA, 1992. Low levels

of genetic diversity in red pine conﬁrmed by random

ampliﬁed polymorphic DNA markers. Can J For Res 22:

1332–1337.

Nei M, 1973. Analysis of gene diversity in subdivided

populations. Proc Natl Acad Sci USA 70:3321–3323.

Nei M, 1986. Deﬁnition and estimation of ﬁxation indi-

ces. Evolution 40:643–645.


Nei M and Chesser RK, 1983. Estimation of ﬁxation in-

dices and gene diversities. Ann Hum Genet 47:253–259.

Peakall R, Smouse PE, and Huff DR, 1995. Evolutionary

implications of allozyme and RAPD variation in diploid

populations of dioecious buffalograss Buchloë dactyloides. Mol Ecol 4:135–147.

Puterka GJ, Black IV WC, Steiner WM, and Burton RL, 1993. Genetic variation and phylogenetic relationships

among worldwide collections of the Russian wheat

aphid, Diuraphis noxia (Mordvilko), inferred from allo-

zyme and RAPD-PCR markers. Heredity 70:604–618.

Szmidt AE, Wang X, and Lu M, 1996. Empirical assess-

ment of allozyme and RAPD variation in Pinus sylvestris

(L.) using haploid tissue analysis. Heredity 76:412–420.

Travis SE, Maschinski J, and Keim P, 1996. An analysis

of genetic variation in Astragalus cremnophylax var.

cremnophylax, a critically endangered plant, using

AFLP markers. Mol Ecol 5:735–745.

Vos P, Hogers R, Bleeker M, Reijans M, van de Lee T,

Hornes M, Frijters A, Pot J, Peleman J, Kuiper M, and

Zabeau M, 1995. AFLP: a new technique for DNA ﬁn-

gerprinting. Nucleic Acids Res 23:4407–4414.

Weir BS and Cockerham CC, 1984. Estimating F-statis-

tics for the analysis of population structure. Evolution

38:1358–1370.

Williams JG, Kubelik AR, Livak KJ, Rafalski JA, and Tin-

gey SV, 1990. DNA polymorphisms ampliﬁed by arbi-

trary primers are useful as genetic markers. Nucleic

Acids Res 18:6531–6535.

Wu J, Krutovskii KV, and Strauss SH, in press. Nuclear

DNA diversity, population differentiation and phyloge-

netic relationships in the California closed-cone pines

based on RAPD and allozyme markers. Genome.

Received April 12, 1998

Accepted February 18, 1999

Corresponding Editor: Robert Angus

BOTTLENECK: A Computer

Program for Detecting

Recent Reductions in the

Effective Population Size

Using Allele Frequency Data

S. Piry, G. Luikart, and J-M.

Cornuet

BOTTLENECK (current version 1.2) is a population genetics computer program that conducts four tests for identifying populations that have recently experienced a severe reduction in effective population size ($N_e$). "Recently" is defined as within approximately the past $2N_e$–$4N_e$ generations, depending on several factors such as the severity of the bottleneck and the mutation rate of the loci being studied (Cornuet and Luikart 1996). The program runs on Windows 95. It requires allele frequency data obtained from one sample of individuals (e.g., 20–30 diploid individuals) and at least four polymorphic loci.

Signiﬁcant deviations from population

mutation-drift equilibrium (e.g., bottle-

necks) are important to detect because

equilibrium is an assumption required for

numerous analyses of population genetics

data (e.g., see Nei 1987, p. 251). Bottle-

necks are important to detect in conser-

vation biology because they can increase

the risk of population extinction. Founder-

ﬂush events (i.e., short but severe bottle-

necks) are important to detect because

they may play a role in some modes of

speciation [for reviews see Harrison

(1991) and Howard (1993)].

Principle

Populations that have experienced a recent reduction of their effective population size exhibit a correlative reduction of the allele number and heterozygosity at polymorphic loci. But the allele number is reduced faster than the heterozygosity ($H_e$). Thus the $H_e$ becomes larger than the heterozygosity ($H_{eq}$) expected at mutation-drift equilibrium because $H_{eq}$ is calculated from the allele number (and the sample size; see Description below and Cornuet and Luikart 1996). Note that $H_e$ is calculated from allele frequencies (e.g., $1 - \sum_i p_i^2$, where $p_i$ is the frequency of the ith allele). Here both the measured heterozygosity ($H_e$) and the expected equilibrium heterozygosity ($H_{eq}$) refer to heterozygosity in the sense of Nei's (1987) gene diversity. Heterozygosity never refers to the proportion of heterozygotes observed ($H_o$). Thus we are not testing for an excess of heterozygotes ($H_o > H_e$), but rather an excess of heterozygosity ($H_e > H_{eq}$).
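A small numerical illustration may help show why allele number falls faster than gene diversity. The Python sketch below uses hypothetical allele counts (not data from any study) and the uncorrected formula 1 minus the sum of squared allele frequencies; it mimics a bottleneck simply by dropping the three rarest alleles.

```python
def gene_diversity(allele_counts):
    """Gene diversity, 1 - sum(p_i^2), from allele counts at one locus."""
    total = sum(allele_counts)
    return 1.0 - sum((c / total) ** 2 for c in allele_counts)


# Hypothetical counts of gene copies per allele in a sample of 100 genes.
before = [40, 30, 20, 5, 3, 2]   # six alleles, three of them rare
after = [45, 33, 22]             # same locus after the rare alleles are lost

k_before, k_after = len(before), len(after)
he_before = gene_diversity(before)
he_after = gene_diversity(after)
# The allele number halves (6 to 3), while gene diversity falls only
# from about 0.706 to about 0.640, because rare alleles contribute
# little to the sum of squared frequencies.
```

Since the equilibrium expectation is driven by the allele number, a locus in this state would tend to show more gene diversity than expected for its number of alleles.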

Strictly speaking, heterozygosity excess

has been demonstrated only for loci evolving under the infinite allele model (IAM;

Kimura and Crow 1964) by Maruyama and

Fuerst (1985). If the locus evolves under

the strict one-step stepwise mutation

model (SMM; Ohta and Kimura 1973),

there can be situations where this hetero-

zygosity excess is not observed (Cornuet

and Luikart 1996). However, few loci fol-

low the strict SMM, and as soon as loci

depart slightly from the SMM toward the

IAM they will exhibit a heterozygosity ex-

cess as a consequence of a genetic bottle-

neck. When testing for bottlenecks, the

BOTTLENECK program uses both the

SMM and IAM independently, because

they represent two extreme models of mutation along a continuum of possible models (Chakraborty and Jin 1992). All loci will follow a mutation model somewhere in between the two extreme models.

For selectively neutral loci in a population near mutation-drift equilibrium (i.e., a population in which $N_e$ has remained fairly

constant in the past), there is approxi-

mately an equal probability that a locus

will show a slight heterozygosity excess or

a heterozygosity deﬁcit. However, in re-

cently bottlenecked populations, a major-

ity of loci will exhibit an excess of hetero-

zygosity (Luikart and Cornuet 1998). To

determine if a population exhibits a signif-

icant number of loci with heterozygosity

excess, we proposed three statistical

tests: a sign test, a standardized differences

test (Cornuet and Luikart 1996; Luikart

and Cornuet 1998), and a Wilcoxon’s

signed rank test (Luikart et al., submitted;

Luikart 1997). We also proposed a graph-

ical descriptor of the shape of the allele

frequency distribution (‘‘mode-shift’’ indi-

cator) which can differentiate between

bottlenecked and stable populations (Lui-

kart et al. 1998).

Interpretation of output from the sign

and standardized differences tests is thor-

oughly discussed in Cornuet and Luikart

(1996) and Luikart and Cornuet (1998). In-

terpretation of output from the graphical

descriptor is discussed in Luikart et al.

(1998). Guidelines for interpreting the output from the Wilcoxon's test are less easy to find (Luikart 1997: chapter 4; Luikart et al., submitted), although this test is analogous to the sign test. The Wilcoxon's test is generally the most useful of all the tests because it is the most powerful (along with the standardized differences test), and robust (like the sign test) when used with few (< 20) polymorphic loci. When testing for bottlenecks, the null hypothesis of the Wilcoxon's test is no significant heterozygosity excess (on average across loci). Thus the alternate hypothesis is significant heterozygosity excess (and thus evidence of a recent bottleneck). This is a one-tailed test that requires at least four polymorphic loci to have any possibility of obtaining a significant (P < .05) test result.
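For readers unfamiliar with the test, the following Python sketch shows a generic, textbook one-tailed exact Wilcoxon signed-rank test applied to per-locus differences between measured and expected equilibrium heterozygosity. It is not taken from the BOTTLENECK source, which may differ in detail; the function name and the eight difference values are hypothetical.

```python
from itertools import product


def wilcoxon_excess(diffs):
    """One-tailed exact Wilcoxon signed-rank test for positive differences.

    Returns (W+, P), where W+ is the sum of ranks of positive differences
    and P is the probability of a W+ at least this large when signs are
    assigned at random (null hypothesis: no excess).
    """
    d = [x for x in diffs if x != 0]          # zero differences are dropped
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1        # tied values share the mean rank
        for m in range(i, j + 1):
            ranks[order[m]] = average_rank
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    # Enumerate every possible assignment of signs to the ranks.
    as_extreme = sum(
        1
        for signs in product((0, 1), repeat=len(d))
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus
    )
    return w_plus, as_extreme / 2 ** len(d)


# Hypothetical differences (measured minus equilibrium heterozygosity) at 8 loci.
diffs = [0.05, 0.02, -0.01, 0.04, 0.03, 0.06, -0.005, 0.015]
w_plus, p_value = wilcoxon_excess(diffs)
```

Full enumeration is practical only for small numbers of loci (roughly 20 or fewer), which is the range where the text recommends this test.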

Description

The BOTTLENECK program computes for each population sample and for each locus the distribution of the heterozygosity ($H_{eq}$) expected from the observed number of alleles (k), given the sample size (n) under the assumption of mutation-drift equilibrium. This distribution is obtained through simulating the coalescent process of n genes under each of two possible mutation models, the IAM and the SMM. This distribution enables the computation of the average expected equilibrium heterozygosity ($H_{eq}$) for each locus which is compared to the Hardy–Weinberg heterozygosity ($H_e$, i.e., gene diversity) in order to establish whether there is a heterozygosity excess or deficit at each locus. In addition, the standard deviation (SD) of the mutation-drift equilibrium distribution of the heterozygosity is used to compute the standardized difference for each locus [($H_e - H_{eq}$)/SD]. The distribution obtained through simulation also enables the computation of a P-value for the measured heterozygosity ($H_e$). The P-value is the probability of obtaining the measured $H_e$ in a sample (n) from an equilibrium population that has the observed number of alleles (k).
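The per-locus summary described above can be sketched as follows, assuming a list of simulated equilibrium heterozygosities is already available for the locus. The function name and numbers are hypothetical, and two choices here are assumptions rather than documented program behavior: the SD is the sample standard deviation, and the P-value is taken as the one-sided fraction of simulated values at least as large as the measured one.

```python
import statistics


def compare_to_equilibrium(h_e, simulated_heq):
    """Summarize one locus against its simulated equilibrium distribution.

    Returns (mean H_eq, standardized difference, one-sided empirical P).
    """
    mean_heq = statistics.mean(simulated_heq)
    sd = statistics.stdev(simulated_heq)          # assumption: sample SD
    standardized_difference = (h_e - mean_heq) / sd
    # Assumption: P is the share of simulated values >= the measured H_e.
    p_excess = sum(1 for h in simulated_heq if h >= h_e) / len(simulated_heq)
    return mean_heq, standardized_difference, p_excess


# Hypothetical locus: measured H_e and a tiny set of simulated values.
mean_heq, std_diff, p_excess = compare_to_equilibrium(0.6, [0.4, 0.5, 0.6])
```

A positive standardized difference indicates a heterozygosity excess at that locus, and a negative one a deficit.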

The way in which the coalescent process is simulated is unconventional due to conditioning by the observed number of alleles. The phylogeny of the n genes is simulated as usual (Hudson 1990). Under the IAM, a single mutation is allocated at a time and the resulting number of alleles is computed. The process is repeated until the simulation reaches the number of alleles (k) observed in the population sample. Under the SMM, a Bayesian approach is used as explained in Cornuet and Luikart (1996). Briefly, the likelihood distribution of the parameter $\theta$ ($= 4N_e\mu$) given the number of alleles (k) and the sample size (n) is evaluated as the proportion of iterations (in the simulation process) producing exactly k alleles for a varying set of $\theta$'s. As a second step, drawing random values of $\theta$ according to the likelihood distribution, the coalescent process is simulated as usual. Only heterozygosities found in iterations producing exactly k alleles are considered. Once all loci in a population sample have been processed the three statistical tests are performed for each mutation model, as explained in Cornuet and Luikart (1996), and the allele frequency distribution is graphed to determine whether a bottleneck-induced mode shift has recently occurred. Note that a mode shift is a transient distortion in the distribution of allele frequencies such that the frequency of alleles at low frequency (frequency < 0.10) becomes lower than the frequency of alleles in an intermediate allele frequency class (see Luikart et al. 1998).
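The IAM step described above (simulate a genealogy, then add mutations one at a time until exactly k alleles are present) can be sketched in Python as below. This is a simplified reading of the text, not the program's Delphi code: all function names are hypothetical, mutations are assumed to fall on branches with probability proportional to branch length, and heterozygosity is the uncorrected 1 minus the sum of squared allele frequencies. The SMM branch is not shown.

```python
import random


def simulate_tree(n, rng):
    """Standard coalescent genealogy for n genes.

    Leaves are nodes 0..n-1. Returns dicts giving each non-root node's
    parent and the length of the branch leading to that parent.
    """
    parent, length = {}, {}
    height = {i: 0.0 for i in range(n)}
    active = list(range(n))
    t, next_id = 0.0, n
    while len(active) > 1:
        j = len(active)
        t += rng.expovariate(j * (j - 1) / 2)   # waiting time with j lineages
        for child in rng.sample(active, 2):     # the two lineages that merge
            parent[child] = next_id
            length[child] = t - height[child]
            active.remove(child)
        height[next_id] = t
        active.append(next_id)
        next_id += 1
    return parent, length


def leaf_alleles(n, parent, mutated):
    """Under the IAM a leaf carries the allele of its nearest mutated branch."""
    alleles = []
    for leaf in range(n):
        node = leaf
        while node in parent and node not in mutated:
            node = parent[node]
        alleles.append(node)                    # mutated branch id, or the root
    return alleles


def simulate_heq_iam(n, k, rng):
    """One simulated heterozygosity, conditional on exactly k alleles (k <= n)."""
    assert 1 <= k <= n
    parent, length = simulate_tree(n, rng)
    branches = list(length)
    weights = [length[b] for b in branches]
    mutated = set()
    alleles = leaf_alleles(n, parent, mutated)
    while len(set(alleles)) < k:                # one mutation at a time
        mutated.add(rng.choices(branches, weights=weights)[0])
        alleles = leaf_alleles(n, parent, mutated)
    return 1.0 - sum((alleles.count(a) / n) ** 2 for a in set(alleles))


# Hypothetical locus: 40 sampled genes, 5 observed alleles, 200 iterations.
rng = random.Random(42)
heq_values = [simulate_heq_iam(40, 5, rng) for _ in range(200)]
```

Each added mutation raises the allele count by at most one, so the loop always stops at exactly k alleles; the mean and SD of `heq_values` then play the role of the equilibrium expectation for that locus.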

Input File Format

Five input data ﬁle formats are accepted

and automatically recognized by BOTTLE-

NECK. All are text ﬁles. One is the GENE-

POP computer program format (Raymond

and Rousset 1995). The second is the GENETIX computer program format (Belkhir

et al. 1996). The other three formats con-

cern single population data and are de-

scribed in the help ﬁle of the program.

General Comments

BOTTLENECK is written in the Delphi 4

(Inprise Co.) computer language. The per-

formance of BOTTLENECK has been thor-

oughly evaluated using simulated datasets

(Cornuet and Luikart 1996; Luikart et al.

1998) and allozyme and microsatellite datasets (Luikart and Cornuet 1998). To achieve reasonably high statistical power (> 0.80), we recommend typing at least 10

polymorphic loci (microsatellites or allo-

zymes) and sampling at least 30 individu-

als. The standardized differences test is

recommended when using approximately

20 or more polymorphic loci (Cornuet and

Luikart 1996). For fewer than 20 loci, the

Wilcoxon’s test is the most appropriate

and powerful. The IAM is recommended

for allozyme data and the SMM is gener-

ally more appropriate when testing micro-

satellite loci (i.e., dinucleotide repeat loci)

(Luikart and Cornuet 1998). For most mi-

crosatellites, the TPM (two-phase model)

is apparently even more appropriate than

the SMM (Di Rienzo et al. 1994; Luikart G,

unpublished data). The TPM was recently

added as an option in BOTTLENECK.

When using microsatellites we recom-

mend the TPM with 95% single-step mu-

tations and 5% multiple-step mutations

(and a variance among multiple steps of

approximately 12). When using the quali-

tative test for mode-shift distortion, we

recommend using at least 30 individuals

and 10–20 polymorphic loci to avoid un-

reasonably high type 1 error rates (i.e., to

avoid concluding that a stable population

has been recently bottlenecked).

BOTTLENECK runs on any computer with Windows 95. However, we recommend a computer at least as fast as a Pentium PC. A fast Pentium is especially recommended for analyzing datasets containing many individuals (> 30) and loci with many alleles (e.g., > 3). Analyzing data under the SMM is far slower than analyses assuming only the IAM. On a Pentium 166 it takes about 15 minutes to analyze a dataset of 44 individuals and 7 loci (with 2–8 alleles) when using both mutation models and 1000 simulation iterations. The number of iterations influences the precision of the $H_{eq}$ estimates. A minimum of

1000 iterations is recommended. The pro-

gram and example input and help ﬁles can

be obtained from the World Wide Web at

http://www.ensam.inra.fr/URLB.

From the Laboratoire de Modélisation et de Biologie Evolutive, INRA-URLB, 488 rue de la Croix-Lavit, F-34090 Montpellier, France (Piry and Cornuet), and the Division of Biological Sciences, University of Montana, Missoula, Montana (Luikart). G. Luikart is now at the Laboratoire de Biologie des Populations d'Altitude, Université Joseph Fourier, Grenoble, France. This work was funded by the Institut National de la Recherche Agronomique, the Fulbright Foundation (to G.L.), and the Graduate School of the University of Montana (to G.L.). I. Till-Bottraud

provided helpful comments. Address correspondence

to J-M. Cornuet at the address above or e-mail:

Cornuet@ensam.inra.fr.

© 1999 The American Genetic Association

References

Belkhir K, Borsa P, Goudet J, Chikhi L, and Bonhomme

F, 1996. GENETIX, logiciel sous Windows pour la génétique des populations. Version 3.0. Montpellier, France: Université Montpellier II.

Chakraborty R and Jin L, 1992. Heterozygote deﬁciency,

population substructure and their implications in DNA

ﬁngerprinting. Hum Genet 88:267–272.

Cornuet J-M and Luikart G, 1996. Description and pow-

er analysis of two tests for detecting recent population

bottlenecks from allele frequency data. Genetics 144:

2001–2014.

Di Rienzo A, Peterson AC, Garza JC, Valdes AM, Slatkin

M, and Freimer NB, 1994. Mutational processes of sim-

ple sequence repeat loci in human populations. Proc

Natl Acad Sci USA 91:3166–3170.

Harrison RG, 1991. Molecular changes at speciation.

Annu Rev Ecol Syst 22:281–308.

Howard DJ, 1993. Small populations, inbreeding, and

speciation. In: The natural history of inbreeding and

outbreeding (Thornhill NW, ed). Chicago: University of

Chicago Press; 118–142.

Hudson RR, 1990. Gene genealogies and the coalescent

process. In: Oxford survey in evolutionary biology, vol. 7 (Futuyma D and Antonovics J, eds). Oxford: Oxford

University Press; 1–42.

Kimura M and Crow JF, 1964. The number of alleles that

can be maintained in a ﬁnite population. Genetics 49:

725–738.

Luikart G, 1997. Usefulness of molecular markers for

detecting population bottlenecks and monitoring ge-

netic change (PhD dissertation). Missoula, Montana:

University of Montana.

Luikart G and Cornuet J-M, 1998. Empirical evaluation

of a test for identifying recently bottlenecked popula-

tions from allele frequency data. Conserv Biol 12:228–

237.

Luikart G, Allendorf FW, Sherwin B, and Cornuet J-M,

1998. Distortion of allele frequency distributions provides a test for recent population bottlenecks. J Hered 89:238–247.

Maruyama T and Fuerst PA, 1985. Population bottle-

necks and non-equilibrium models in population ge-

netics. II. Number of alleles in a small population that

was formed by a recent bottleneck. Genetics 111:675–

689.

Nei M, 1987. Molecular evolutionary genetics. New

York: Columbia University Press.

Ohta T and Kimura M, 1973. A model of mutation ap-

propriate to estimate the number of electrophoretically

detectable alleles in a ﬁnite population. Genet Res

Cambr 22:201–204.

Raymond M and Rousset F, 1995. GENEPOP (version

1.2): population genetics software for exact tests and

ecumenicism. J Hered 86:248–249.

Received December 15, 1997

Accepted February 26, 1999

Corresponding Editor: Robert Angus

Sex-mediated gene flow of grayfoot chacma baboons (Papio ursinus griseipes) in a highly seasonal habitat in Gorongosa National Park

Preprint

Full-text available

Jun 2024

Investigating primates’ behavioral variation at the inter-population level is important for the understanding of the evolutionary processes leading to species-specific patterns. The study of behavioral diversity among populations also contributes to improving’ primate conservation efforts. Dispersal patterns tend to be similar among close phylogenetic lineages but may vary in response to individual-based responses. Here, we investigate dispersal patterns of chacma baboons (Papio ursinus griseipes) living in Gorongosa National Park (GNP) and the Catapu Forest Reserve (CFR) in central Mozambique. The park consists of a mosaic landscape, located in a seasonally variable area. GNP was the epicenter of a major war, which severely reduced most apex predators resulting in limited mammalian predation on baboons and a steep increase in number of groups and/or group’s fission. We used a genetic dataset of 121 non-invasive DNA samples analyzed for uni- and bi-parentally inherited markers aiming to characterize the spatial distribution of genetic variation and investigate the extent and direction of sex-mediated gene flow at different time scales. We found high levels of genetic diversity as estimated using autosomal microsatellite loci data and no evidence for a significant contraction of the population size in the last generations. A very distinct mitochondrial DNA haplotype was sampled in CFR. We found evidence for historical and instantaneous male-biased dispersal and female philopatry, estimated among localities and at short distances in GNP, respectively. Our study highlights the strong conservation of sex-biased dispersal patterns and philopatry in chacma baboons and suggests that dispersal behaviors in chacma baboons are resilient to environmental changes and seasonality.

Genetic diversity of the critically endangered Blue‐crowned laughingthrush (Garrulax courtoisi)

Article

Jun 2024

To evaluate the genetic quality and provide available management strategies for Blue‐crowned laughingthrush (BCLT), fifteen polymorphic microsatellite loci were developed and applied. The genetic diversity of wild individuals was indicated to be higher than the two captive populations. The average number of alleles (5.50 ± 0.317), the number of effective alleles (3.417 ± 0.222), observed heterozygosity (0.828 ± 0.04), and genetic differentiation index (0.028 ± 0.007) of 64 wild individuals showed high genetic diversity despite drastic bottleneck and low genetic differentiation. The number of effective migrants (22.737 ± 8.318) indicated the intriguing wintering grounds may be surrounded by the breeding sites where the syncheimadia occurred in Wuyuan. Efficient conservation, winter flocking, and cooperative breeding may facilitate gene exchange and inclusive fitness. We recommend that monitoring concentrated distribution areas for BCLT should be strengthened, and geographical barriers, interference types, and the inner mechanism of distribution patterns should be further explored.

Genetic Diversity and Population Structure of Rhodeus uyekii in the Republic of Korea Revealed by Microsatellite Markers from Whole Genome Assembly

Article

Jun 2024
INT J MOL SCI

This study is the first report to characterize the Rhodeus uyekii genome, develop microsatellite markers from it, and apply those markers to the genetic structure of the wild population. Genome assembly was based on PacBio HiFi and Illumina HiSeq paired-end sequencing, resulting in a draft genome assembly of R. uyekii. The draft genome was assembled into 2652 contigs. The integrity assessment of the assemblies indicates that the quality of the draft assemblies is high, with 3259 complete BUSCOs (97.2%) in the database of Vertebrata. A total of 31,166 predicted protein-coding genes were annotated in the protein database. The phylogenetic tree showed that R. uyekii is a close but distinct relative of Onychostoma macrolepis. Among the 10 fish genomes, there were significant gene family expansions (8–2387) and contractions (16–2886). The average number of alleles amplified by the 21 polymorphic markers ranged from 6 to 23, and the average PIC value was 0.753, which will be useful for evolutionary and genetic analysis. Using population genetic analysis, we analyzed genetic diversity and the genetic structures of 120 individuals from 6 populations. The average number of alleles per population ranged from 7.6 to 9.9, observed heterozygosity ranged from 0.496 to 0.642, and expected heterozygosity ranged from 0.587 to 0.783. According to discriminant analysis of principal components, the populations were divided into three groups (BS vs. DC vs. GG, GC, MS, DC). In conclusion, our study provides a useful resource for comparative genomics, phylogeny, and future population studies of R. uyekii.

Population genetic insights into the conservation of common walnut (Juglans regia) in Central Asia

Article

Jun 2024

Ocean circulation contributes to genetic connectivity of limpet populations at deep‐sea hydrothermal vents in a back‐arc basin

Article

Jun 2024
EVOL APPL

For endemic benthos inhabiting hydrothermal vent fields, larval recruitment is critical for population maintenance and colonization via migration among separated sites. The vent‐endemic limpet, Lepetodrilus nux, is abundant at deep‐sea hydrothermal vents in the Okinawa Trough, a back‐arc basin in the northwestern Pacific; nonetheless, it is endangered due to deep‐sea mining. This species is associated with many other vent species and is an important successor in these vent ecosystems. However, limpet genetic diversity and connectivity among local populations have not yet been examined. We conducted a population genetics study of L. nux at five hydrothermal vent fields (maximum geographic distance, ~545 km; depths ~700 m to ~1650 m) using 14 polymorphic microsatellite loci previously developed. Genetic diversity has been maintained among these populations. Meanwhile, fine population genetic structure was detected between distant populations, even within this back‐arc basin, reflecting geographic distances between vent fields. There was a significant, positive correlation between genetic differentiation and geographic distance, but no correlation with depth. Contrary to dispersal patterns predicted by an ocean circulation model, genetic migration is not necessarily unidirectional, based on relative migration rates. While ocean circulation contributes to dispersal of L. nux among vent fields in the Okinawa Trough, genetic connectivity may be maintained by complex, bidirectional dispersal processes over multiple generations.

Genetic tracing of the illegal trade of the white-bellied pangolin (Phataginus tricuspis) in western Central Africa

Article

Jun 2024

The white-bellied pangolin is subject to intense trafficking, feeding both local and international trade networks. In order to assess its population genetics and trace its domestic trade, we genotyped 562 pangolins from local to large bushmeat markets in western central Africa. We show that the two lineages described from the study region (WCA and Gab) were overlapping in ranges, with limited introgression in southern Cameroon. There was a lack of genetic differentiation across WCA and a significant signature of isolation-by-distance possibly due to unsuspected dispersal capacities involving a Wahlund effect. We detected a c. 74.1–82.5% decline in the effective population size of WCA during the Middle Holocene. Private allele frequency tracing approach indicated up to 600 km sourcing distance by large urban markets from Cameroon, including Equatorial Guinea. The 20 species-specific microsatellite loci provided individual-level genotyping resolution and should be considered as valuable resources for future forensic applications. Because admixture was detected between lineages, we recommend a multi-locus approach for tracing the pangolin trade. The Yaoundé market was the main hub of the trade in the region, and thus should receive specific monitoring to mitigate pangolins’ domestic trafficking. Our study also highlighted the weak implementation of CITES regulations at European borders.

Spatial population genetic structure of Caquetaia kraussii (Steindachner, 1878) evidenced by species-specific microsatellite loci in the middle and low basin of the Cauca River, Colombia

Article

Jun 2024
PLOS ONE

The adaptive responses and divergent evolution shown in the environments inhabited by the Cichlidae family allow us to understand different biological properties, including fish genetic diversity and structure. In a zone that has historically been subjected to different anthropogenic pressures, this study assessed the genetic diversity and population structure of the cichlid Caquetaia kraussii, a sedentary species with parental care that has a significant ecological role for its contribution to redistribution and maintenance of sedimentologic processes in its distribution area. This study developed de novo 16 highly polymorphic species-specific microsatellite loci that allowed the estimation of the genetic diversity and differentiation in 319 individuals from natural populations in the area influenced by the Ituango hydroelectric project in the Colombian Cauca River. Caquetaia kraussii exhibits high genetic diversity levels (Ho: 0.562–0.885; He: 0.583–0.884) in relation to the average for neotropical cichlids and a three-group spatial structure: two natural groups upstream and downstream of the Nechí River mouth, and one group of individuals with a high degree of relatedness, possibly independently formed by founder effect in the dam zone. The three genetic groups show recent bottlenecks, but only the two natural groups have effective population sizes that suggest their long-term permanence. The information generated is relevant not only for management programs and species conservation purposes, but also for broadening the available knowledge on the factors influencing the population genetics of neotropical cichlids.

Unveiling the genetic structure of pig population in a Himalayan state Uttarakhand through microsatellite and mitochondrial DNA analyses

Article

Jun 2024
TROP ANIM HEALTH PRO

This study traced the maternal lineage of the domestic swine populations using mitochondrial DNA control region markers and genetic diversity using microsatellite markers in Uttarakhand, an Indian state situated at the foothills of the world’s youngest (geo-dynamically sensitive) mountain system, “the Himalayas”. Analysis of 68 maternally unrelated individuals revealed 20 haplotypes. The maternal signature of the Pacific, Southeast Asian, European, and ubiquitously distributed Chinese haplotypes was present in Uttarakhand’s domestic pig population. The D3 haplotype reported in wild pigs from North India was also identified in 47 domestic samples. A unique gene pool, UKD (Uttarakhand Domestic), as another lineage specific to this region has been proposed. Genotypes were analyzed, using 13 sets of microsatellite markers. The observed (Ho) and expected (He) heterozygosities were 0.83 ± 0.02 and 0.84 ± 0.01, respectively. The average polymorphic information content value of 0.83 ± 0.01 indicated the high informativeness of the marker. The overall mean FIS value for all the microsatellite markers was low (F = 0.04, P < 0.01). Seven loci deviated from Hardy-Weinberg equilibrium (HWE) at a significant level (p < 0.05). Two clusters were identified, indicating overlapping populations. These results suggested that though belonging to different maternal lineages, the traditional management practices in Uttarakhand have allowed for genetic mixing and the sharing of genetic material among pig populations. It could contribute to increased genetic diversity but might also result in the loss of distinct genetic characteristics or breed purity of the local breeds if not carefully managed.

Genetic structuring and conservation of sockeye salmon on the Asian coast of the North Pacific: identification of regional stock complexes

Article

Jun 2024
HYDROBIOLOGIA

Anastasia Khrustaleva

In order to describe large-scale spatial structure of sockeye salmon on the Asian part of the range, the variability of 45 SNP loci was analyzed in 22 samples from the Northwest coast of the Pacific Ocean. Three large regional population complexes were identified: Southwest Kamchatka, Kamchatka River basin, and the Northeast (comprising stocks from Koryak Highlands). Populations within the identified complexes are connected by gene migration and have a common origin, close geographic proximity, comparable climatic, landscape, and environmental conditions in the freshwater and early marine periods of sockeye salmon life. Populations confined to watersheds of the North coast of the Sea of Okhotsk (Palana and Okhota rivers), along with island populations, displayed distinctions from the isolated population complexes. It is hypothesized that the marked divergence observed in island populations is primarily caused by genetic drift occurring during long periods of isolation. The pronounced divergence of Palana River population may be the result of both genetic drift and natural selection, driven by the challenging smoltification and specific conditions of freshwater period in this watershed. At the same time in the Okhota River population, demographic factors such as genetic drift and bottlenecks played a key role.

Characterisation of the Cinnamomum parthenoxylon (Jack) Meisn (Lauraceae) transcriptome using Illumina paired-end sequencing and EST-SSR markers development for population genetics

Article

Jun 2024

Cinnamomum parthenoxylon is an endemic and endangered species with significant economic and ecological value in Vietnam. A better understanding of the genetic architecture of the species will be useful when planning management and conservation. We aimed to characterize the transcriptome of C. parthenoxylon, develop novel molecular markers, and assess the genetic variability of the species. First, transcriptome sequencing of five trees (C. parthenoxylon) based on root, leaf, and stem tissues was performed for functional annotation analysis and development of novel molecular markers. The transcriptomes of C. parthenoxylon were analyzed via an Illumina HiSeqTM 4000 sequencing system. A total of 27,363,199 bases were generated for C. parthenoxylon. De novo assembly indicated that a total of 160,435 unigenes were generated (average length = 548.954 bp). The 51,691 unigenes were compared against different databases, i.e. COG, GO, KEGG, KOG, Pfam, Swiss-Prot, and NR for functional annotation. Furthermore, a total of 12,849 EST-SSRs were identified. Of the 134 primer pairs, 54 were randomly selected for testing, with 15 successfully amplified across nine populations of C. parthenoxylon. We uncovered medium levels of genetic diversity (PIC = 0.52, Na = 3.29, Ne = 2.18, P = 94.07%, Ho = 0.56 and He = 0.47) within the studied populations. The molecular variance was 10% among populations and low genetic differentiation (Fst = 0.06) indicated low gene flow (Nm = 2.16). A reduction in the population size of C. parthenoxylon was detected using BOTTLENECK (VP population). The structure analysis suggested two optimal genetic clusters related to gene flow among the populations. Analysis of molecular variance (AMOVA) revealed higher genetic variation within populations (90%) than among populations (10%). The UPGMA approach and DAPC divided the nine populations into three main clusters. 
Our findings revealed a significant fraction of the transcriptome sequences, and these newly developed EST-SSR markers are a very efficient tool for germplasm evaluation, genetic diversity and molecular marker-assisted selection in C. parthenoxylon. This study provides comprehensive genetic resources for the breeding and conservation of different varieties of C. parthenoxylon.

Empirical Evaluation of a Test for Identifying Recently Bottlenecked Populations from Allele Frequency Data

Article

Feb 1998

Detecting bottlenecks or population declines

Distortion of allele frequency distributions provides a test for recent population bottlenecks

Article

May 1998
J HERED

We use population genetics theory and computer simulations to demonstrate that population bottlenecks cause a characteristic mode-shift distortion in the distribution of allele frequencies at selectively neutral loci. Bottlenecks cause alleles at low frequency (< 0.1) to become less abundant than alleles in one or more intermediate allele frequency classes (e.g., 0.1–0.2). This distortion is transient and likely to be detectable for only a few dozen generations. Consequently, only recent bottlenecks are likely to be detected by tests for distortions in distributions of allele frequencies. We illustrate and evaluate a qualitative graphical method for detecting a bottleneck-induced distortion of allele frequency distributions. The simple novel method requires no information on historical population sizes or levels of genetic variation; it requires only samples of 5 to 20 polymorphic loci and approximately 30 individuals. The graphical method often differentiates between empirical datasets from bottlenecked and nonbottlenecked natural populations. Computer simulations show that the graphical method is likely (P > 0.80) to detect an allele frequency distortion after a bottleneck of ≤ 20 breeding individuals when 8 to 10 polymorphic microsatellite loci are analyzed.
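The mode-shift idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed ten classes of width 0.1, and the choice to treat every class above 0.1 as "intermediate" are assumptions made for illustration only.

```python
def allele_frequency_classes(freqs):
    """Count alleles in ten frequency classes: (0, 0.1), [0.1, 0.2), ..., [0.9, 1.0].

    `freqs` is a flat list of allele frequencies pooled over all loci
    (hypothetical input format chosen for this sketch).
    """
    counts = [0] * 10
    for f in freqs:
        if not 0.0 < f <= 1.0:
            raise ValueError(f"allele frequency out of range: {f}")
        # Small epsilon guards against floating-point error at class boundaries.
        idx = min(int(f * 10 + 1e-9), 9)
        counts[idx] += 1
    return counts


def is_mode_shifted(freqs):
    """Return True if the lowest class (< 0.1) holds fewer alleles than
    at least one higher class, i.e. the distribution is not L-shaped.

    Assumption: every class above 0.1 is treated as 'intermediate'.
    """
    counts = allele_frequency_classes(freqs)
    return counts[0] < max(counts[1:])


# Illustrative (made-up) pooled allele frequencies.
stable = [0.02, 0.05, 0.03, 0.08, 0.15, 0.67, 0.04, 0.96]
bottlenecked = [0.15, 0.12, 0.18, 0.05, 0.35, 0.45, 0.55, 0.85]

print(is_mode_shifted(stable))        # rare alleles dominate
print(is_mode_shifted(bottlenecked))  # rare alleles are depleted
```

The check is qualitative, as the abstract describes the graphical method; it gives no significance level, and the result depends on the number of loci and individuals sampled.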

GENEPOP (Version 1.22): Population Genetics Software for Exact Tests and Ecumenicism

Article

Jan 1995
HEREDITY

GENEPOP (version 1.2): population genetic software for exact tests and ecumenicism

Article

May 1995

Note that an updated reference for Genepop is Rousset (2008) genepop’007: a complete re-implementation of the genepop software for Windows and Linux (DOI: 10.1111/j.1471-8286.2007.01931.x)

Description and Power Analysis of Two Tests for Detecting Recent Population Bottlenecks From Allele Frequency Data

Article

Dec 1996
GENETICS

When a population experiences a reduction of its effective size, it generally develops a heterozygosity excess at selectively neutral loci, i.e., the heterozygosity computed from a sample of genes is larger than the heterozygosity expected from the number of alleles found in the sample if the population were at mutation drift equilibrium. The heterozygosity excess persists only a certain number of generations until a new equilibrium is established. Two statistical tests for detecting a heterozygosity excess are described. They require measurements of the number of alleles and heterozygosity at each of several loci from a population sample. The first test determines if the proportion of loci with heterozygosity excess is significantly larger than expected at equilibrium. The second test establishes if the average of standardized differences between observed and expected heterozygosities is significantly different from zero. Type I and II errors have been evaluated by computer simulations, varying sample size, number of loci, bottleneck size, time elapsed since the beginning of the bottleneck and level of variability of loci. These analyses show that the most useful markers for bottleneck detection are those evolving under the infinite allele model (IAM) and they provide guidelines for selecting sample sizes of individuals and loci. The usefulness of these tests for conservation biology is discussed.
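The two tests summarized in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the published procedure. The per-locus expected heterozygosities and their standard deviations are taken as given inputs (in practice they would come from simulations under a mutation model). The equilibrium probability of an excess at a locus is fixed at a hypothetical `p0 = 0.5`. The standardized statistic is assumed to be approximately standard normal. All function and parameter names are invented for this example.

```python
import math


def sign_test_excess(h_obs, h_exp, p0=0.5):
    """Upper-tail binomial probability of seeing at least the observed
    number of loci with heterozygosity excess.

    p0 is an assumed equilibrium probability of excess at a single locus.
    """
    n = len(h_obs)
    if n == 0 or n != len(h_exp):
        raise ValueError("h_obs and h_exp must be non-empty and equal in length")
    k = sum(1 for o, e in zip(h_obs, h_exp) if o > e)
    p_value = sum(
        math.comb(n, j) * p0**j * (1 - p0) ** (n - j) for j in range(k, n + 1)
    )
    return k, p_value


def standardized_difference_test(h_obs, h_exp, sd_exp):
    """Sum of per-locus standardized differences divided by sqrt(n),
    with a two-sided p-value from the standard normal approximation.
    """
    n = len(h_obs)
    if n == 0 or not (n == len(h_exp) == len(sd_exp)):
        raise ValueError("inputs must be non-empty and equal in length")
    z = [(o - e) / s for o, e, s in zip(h_obs, h_exp, sd_exp)]
    t = sum(z) / math.sqrt(n)
    p_value = math.erfc(abs(t) / math.sqrt(2))  # two-sided normal tail
    return t, p_value


# Illustrative (made-up) values for four loci.
h_obs = [0.70, 0.65, 0.80, 0.75]
h_exp = [0.60, 0.60, 0.70, 0.70]
sd_exp = [0.05, 0.05, 0.05, 0.05]

print(sign_test_excess(h_obs, h_exp))
print(standardized_difference_test(h_obs, h_exp, sd_exp))
```

With only four loci the normal approximation is crude. The abstract notes that power depends on the number of loci, sample size, and mutation model, so the p-values from this sketch should be read as illustrative only.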

Mutational processes of simple sequence repeats in human populations

Article

Jan 1994

Molecular Changes At Speciation

Article

Nov 2003
Annu Rev Ecol Systemat

Richard G Harrison

Empirical Evaluation of a Test for Identifying Recently Bottlenecked Populations from Allele Frequency Data

Article

Jul 2008
CONSERV BIOL

Gene Genealogies and the Coalescent Process

Book

Jan 1990

R. R. Hudson

In Molecular Evolutionary Genetics

Book

Dec 1987

Masatoshi Nei

Spectacular progress has been made recently in the study of evolution at the molecular level, primarily due to new biochemical techniques such as gene cloning and DNA sequencing. In this book, the author summarizes new developments and seeks to unify studies of evolutionary histories of organisms and the mechanisms of evolution into a single science - molecular evolutionary genetics.
