Computer Notes
Simulation of Effects of
Dominance on Estimates of
Population Genetic Diversity
and Differentiation
K. V. Krutovskii, S. Y. Erofeeva,
J. E. Aagaard, and S. H. Strauss
The advent of PCR-based molecular mark-
ers has led to a rapid expansion in studies
describing the levels and distribution of
genetic variation among populations at
the DNA level. Randomly amplified poly-
morphic DNA (RAPD; Williams et al. 1990)
and amplified fragment length polymor-
phism (AFLP; Vos et al. 1995) markers are
now commonly used in population genetic
studies (e.g., Aagaard et al. 1998; Isabel et
al. 1995; Liu and Furnier 1993; Mosseler et
al. 1992; Peakall et al. 1995; Szmidt et al.
1996; Travis et al. 1996; Wu et al., in press).
However, these PCR-based markers have
limitations compared to allozymes, which
had been the prevalent means for popu-
lation studies prior to the use of PCR. At
the majority of RAPD and AFLP loci the
dominant allele masks the presence of the
null allele in heterozygotes when assaying
diploid tissues (e.g., about 97%–98%; Kru-
tovskii et al. 1998), thus sampling variance
for dominant allele frequencies is typically
greater than that for codominant alleles
(Lynch and Milligan 1994). The frequen-
cies of null and dominant alleles are
inferred from the frequency of null allele
homozygotes; the precision of their esti-
mation thus depends on mating system as-
sumptions and is strongly affected by the
sample size. Empirical studies have also
suggested that dominant markers can bias
estimates of genetic diversity and differ-
entiation among populations (e.g., Isabel
et al. 1995; Szmidt et al. 1996).
Although RAPD markers have proved to
be useful for population studies, and their
gross patterns of diversity usually agree
with those of allozymes, the levels of genetic
variation, differentiation, and fine-scale
genetic structures often differ (e.g., Baruffi
et al. 1995; Dawson et al. 1996; Heun et al.
1994; Lannér-Herrera et al. 1996; Latta and
Mitton 1997; le Corre et al. 1997; Liu and
Furnier 1993; Peakall et al. 1995; Puterka
et al. 1993). To help assess whether these
differences are biological or a simple con-
sequence of the dominance and biallelism
of RAPD and AFLP markers, we developed
a dominance simulation program, DOM-
SIM, that transforms codominant popula-
tion data into a biallelic dominant dataset.
The program then estimates population
genetic statistics with which dominant
and codominant markers can be directly
compared. We use data from a widespread
North American conifer, Douglas-fir [Pseu-
dotsuga menziesii (Mirb.) Franco], and
three California closed-cone pine species
to illustrate the program’s function. The
test simulation suggests that dominant
biallelic markers, such as RAPDs, can
strongly underestimate population diversity
but can still reasonably estimate population
differentiation ($G_{ST}$), if sample
sizes are larger than about 30 individuals.
Program Functions
The program DOMSIM uses multiallelic da-
tasets with a maximum number of six al-
leles per locus for which population allele
frequencies are defined. Assuming Hardy–
Weinberg equilibrium and no linkage
among loci, the program generates N basic
populations ($N_{max} = 20$) of up to 1,000 individuals each, with
multilocus genotypes that maintain the specified allele frequencies
within populations. A total of S subpopulations ($S_{max} = 400$) of
n individuals (n = 10–200) are then drawn with replacement for each
of the N populations. The sampling is done in two different ways: by
sampling subpopulations of size n with replacement directly from the
initially generated basic population, and by resampling subpopulations
of size n with replacement within the first sampled subpopulation of
n individuals (bootstrap resampling). Population genetic parameters
($H_S$, $H_T$, and $G_{ST}$) are calculated for each
cycle of resampling in three ways. First,
for a codominant dataset, calculations are
made considering all alleles and geno-
types present in the subpopulations. Sec-
ond, the same subpopulations and data
are used to simulate a dominant biallelic
dataset by randomly selecting one allele
as dominant, with the rest treated as re-
cessive to it. The synthetic null allele fre-
quency is then calculated from the null
homozygote frequencies assuming Hardy–
Weinberg equilibrium. Average parame-
ters and their variance are calculated for
each set of S subpopulations. Gene diversity
is evaluated using $H_S$ and $H_T$, either unmodified (Nei 1973) or
modified for the sample size (Nei and Chesser 1983). Genetic
differentiation is evaluated via $\theta_w$ (Weir and Cockerham 1984)
and $G_{ST}$ parameters that are either unmodified (Nei 1973),
modified for the sample size (Nei and Chesser 1983), or modified for
both the sample size and population number (Nei 1986). Finally, null
allele frequencies are corrected for dominance using Lynch and
Milligan’s (1994) equation 2a, and their asymptotically unbiased
estimate of $F_{ST}$ recommended for dominant markers is
also calculated following equation 14a.
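The codominant-to-dominant transformation described above can be sketched for a single locus as follows. This is a minimal illustration in Python, not the DOMSIM FORTRAN source: the function names, the example allele frequencies, and the sample size are assumptions made for the sketch, and only the unmodified Nei (1973) statistics are shown (no sample-size or Lynch and Milligan corrections).

```python
import random
from math import sqrt

def sample_genotypes(freqs, n, rng):
    """Draw n diploid genotypes under Hardy-Weinberg proportions."""
    alleles = list(range(len(freqs)))
    return [tuple(rng.choices(alleles, weights=freqs, k=2)) for _ in range(n)]

def codominant_freqs(genotypes, n_alleles):
    """Allele frequencies by direct gene counting (codominant data)."""
    counts = [0] * n_alleles
    for a, b in genotypes:
        counts[a] += 1
        counts[b] += 1
    return [c / (2 * len(genotypes)) for c in counts]

def dominant_freqs(genotypes, dominant):
    """Biallelic dominant view: only null homozygotes can be scored."""
    null_homozygotes = sum(1 for a, b in genotypes if dominant not in (a, b))
    q = sqrt(null_homozygotes / len(genotypes))  # Hardy-Weinberg: x = q^2
    return [1.0 - q, q]

def gene_diversity(freqs):
    """1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def nei_gst(pop_freqs):
    """Unmodified Nei (1973) H_S, H_T, and G_ST for one locus."""
    k = len(pop_freqs)
    hs = sum(gene_diversity(f) for f in pop_freqs) / k
    mean_freqs = [sum(f[i] for f in pop_freqs) / k
                  for i in range(len(pop_freqs[0]))]
    ht = gene_diversity(mean_freqs)
    gst = (ht - hs) / ht if ht > 0 else 0.0
    return hs, ht, gst

# Hypothetical example: two populations, one three-allele locus.
rng = random.Random(1)
true_freqs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
samples = [sample_genotypes(f, 50, rng) for f in true_freqs]
dominant = rng.randrange(3)  # one allele chosen at random as dominant
print("codominant:", nei_gst([codominant_freqs(g, 3) for g in samples]))
print("dominant:  ", nei_gst([dominant_freqs(g, dominant) for g in samples]))
```

Repeating the last four lines over many resampled subpopulations gives the averages and variances that the program reports.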
Installing and Running the Program
The program DOMSIM is written in FOR-
TRAN-77 (simulation routines) and in Lab-
Windows CVI (interface routines). The
source code file, domsimd.f, was compiled
using Microsoft FORTRAN Power Station
Compiler version 1.0. DOMSIM runs on
IBM PCs and compatibles under MS Windows 95 and NT for 32-bit
operating environments. To install the program, run the compressed
self-extracting file domsimpr.exe, which can be downloaded from
the web site http://www.fsl.orst.edu/tgerc/protocol.htm. It will
automatically decompress five files: domsimd.f, domsim.001,
domsim.002, read.me, and setup.exe. Next,
run the setup file and follow the instructions
on your screen during installation.
Run the program by either clicking the
icon or executing the program file domsim.exe.
A read.me file contains additional
instructions for installing and running the
program.
The Journal of Heredity 1999:90(4)
Figure 1. Levels of diversity and differentiation for codominant, multiallelic allozymes versus biallelic, dominant
markers, as simulated from an allozyme dataset from Douglas-fir studied with varying sample sizes. Standard
deviations (error bars) were calculated from the variance among 400 bootstrap subsamples and represent the
variance due to resampling of individuals at each level of sampling from the master population of 1,000 individuals.
The arrow shows the population sample size between 30 and 40 needed to eliminate the tendency for
overestimation of population differentiation caused by dominance and biallelism.
Input and Output Files
The input format is an ASCII file similar to
GeneStat input files (Lewis 1994), but
does not require population, locus, and al-
lele names, and there should be no empty
lines. An example (sample.dat) and brief
help, which explicitly explains an input file
structure, are provided with the program.
The output file has all the parameters cal-
culated for each resampled and bootstrap
set, their average values, and standard de-
viations.
Examples of Simulation Based on
Allozyme Data in Douglas-fir and
California Closed-Cone Pines
In order to facilitate comparisons between
dominant and codominant markers, and
to help understand the effects of RAPD
dominance and biallelism on our studies
of genetic diversity and differentiation in
Douglas-fir (Aagaard et al. 1998) and the
California closed-cone pines (Wu et al., in
press), we simulated dominance and bial-
lelism in these two allozyme datasets (Li
and Adams 1989; Wu et al., in press). The
first allozyme dataset included six popu-
lations of three races of Douglas-fir—
coastal, north interior, and south interi-
or—with two populations per race. The
second one included four, five, and three
populations of Pinus attenuata, P. muricata,
and P. radiata, respectively. These popu-
lations are described in detail elsewhere
(Aagaard et al. 1998; Wu et al., in press).
From allozyme allele frequencies within
populations we generated simulated populations
of 1,000 individuals each, and a total of 400 subpopulations of
n individuals were drawn with replacement from
each of the populations. The program also
performed 400 bootstrap resamplings using
a subpopulation of size n. Population
genetic parameters ($H_S$, $H_T$, $G_{ST}$, $\theta_w$, and
$F_{ST}$) were then calculated for each set of
400 subpopulations in the three ways described
above. We varied the number of
individuals (n) within the subsamples
from 10 to 200 to bracket the range of sam-
ple sizes that might reasonably be em-
ployed in population studies, and the sam-
ple size of 30–50 trees per population that
was used in our RAPD studies (Aagaard et
al. 1998; Wu et al., in press). The results of
the simulations are summarized in Figures
1 and 2. The simulations showed that diversity
measurements ($H_S$ and $H_T$) were
likely to be underestimated by dominant
biallelic markers approximately twofold
regardless of sample size.
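The two resampling schemes used in these simulations (direct draws from the basic population versus bootstrap resampling within one first-drawn subpopulation) can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the program's code: the function names, the single three-allele locus, and the seed are hypothetical, and only uncorrected gene diversity is summarized.

```python
import random
from statistics import mean, stdev

def gene_diversity(genotypes):
    """1 - sum(p_i^2) from gene counts in a list of diploid genotypes."""
    counts = {}
    for genotype in genotypes:
        for allele in genotype:
            counts[allele] = counts.get(allele, 0) + 1
    total = 2 * len(genotypes)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def resample(pool, n, replicates, rng):
    """Draw `replicates` samples of n individuals, with replacement."""
    return [rng.choices(pool, k=n) for _ in range(replicates)]

def compare_schemes(freqs, n, replicates=400, basic_size=1000, seed=0):
    rng = random.Random(seed)
    alleles = list(range(len(freqs)))
    # Basic population generated under Hardy-Weinberg proportions.
    basic = [tuple(rng.choices(alleles, weights=freqs, k=2))
             for _ in range(basic_size)]
    # Scheme 1: each subpopulation is drawn directly from the basic population.
    direct = resample(basic, n, replicates, rng)
    # Scheme 2: one subpopulation is drawn first, then resampled within itself.
    first = rng.choices(basic, k=n)
    bootstrap = resample(first, n, replicates, rng)
    summary = {}
    for name, sets in (("direct", direct), ("bootstrap", bootstrap)):
        values = [gene_diversity(s) for s in sets]
        summary[name] = (mean(values), stdev(values))
    return summary

# Hypothetical allele frequencies; sample sizes span part of the 10-200 range.
for n in (10, 30, 100):
    print(n, compare_schemes([0.6, 0.3, 0.1], n))
```

The direct scheme reflects sampling variance around the basic population, whereas the bootstrap scheme reflects the variance recoverable from a single field sample of size n.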
Figure 2. Genetic diversity ($H_S$) and differentiation ($G_{ST}$; Nei 1986) values averaged over populations of each
California closed-cone pine species for codominant multiallelic allozyme and dominant biallelic markers simulated
in the samples of different sizes. Standard deviations (error bars) were calculated from the variance among 400
bootstrap subsamples simulated for each population of each species. Observed RAPD values are also shown as a
star.
When 30 or more diploid individuals per
population were sampled, there was little
effect on differentiation estimates ($G_{ST}$, $\theta_w$,
and $F_{ST}$) in Douglas-fir. However, though
still very similar to the estimates for co-
dominant markers, the estimates for the
simulated dominant markers began to di-
verge slightly but significantly downward
at large population sizes in Douglas-fir. In
the California closed-cone pines, the esti-
mates for the simulated dominant markers
converged toward the estimates for codominant
multiallelic markers at large population
sizes, but were always significantly
higher (Wu et al., in press). Our simula-
tions were in close agreement with our
empirical studies of Douglas-fir where, de-
spite dominance and biallelism of RAPD
markers, we have found that RAPDs and
allozymes exhibit similar levels of differ-
entiation at the population and race levels
with adequate sample sizes (Aagaard et al.
1998). However, the California closed-cone
pine allozyme data showed that larger
sample sizes than those we employed in our
RAPD study (Wu et al., in press) are desirable
for a fair comparison of RAPD and
allozyme data. Finally, despite the expec-
tation of much reduced diversity for dom-
inant biallelic markers predicted by the
simulations, our RAPD data gave higher
estimates of diversity than did allozymes
in both Douglas-fir (Aagaard et al. 1998)
and the California closed-cone pines (Wu
et al., in press). This suggests that RAPD
markers may have much higher intrinsic
genetic diversity than do allozyme mark-
ers. Our results demonstrate the impor-
tance of simulations to help compare and
interpret the results of population studies
with dominant markers.
From the Departments of Forest Science (Aagaard, Kru-
tovskii, and Strauss) and the College of Oceanic and
Atmospheric Sciences (Erofeeva), Oregon State University,
Corvallis, OR 97331-7501. We thank Tom Adams for
providing allozyme data and Vladislav Erofeev for help
with computer software. This work was supported in
part by NSF grants DEB 9300083 and BSR 895702 to
S.H.S. The dominance simulation program (DOMSIM)
is available for public use and can be downloaded as
a self-extracting file domsimpr.exe from the TGERC web
site: http://www.fsl.orst.edu/tgerc/protocol.htm. Ad-
dress correspondence to Dr. K. V. Krutovskii at the ad-
dress above or e-mail: krutovskiik@fsl.orst.edu.
© 1999 The American Genetic Association
References
Aagaard JE, Krutovskii KV, and Strauss SH, 1998. RAPDs
and allozymes exhibit similar levels of diversity and
differentiation among populations and races of Doug-
las-fir. Heredity 81:69–78.
Baruffi L, Damiani G, Guglielmino CR, Bandi C, Malacri-
da AR, and Gasperi G, 1995. Polymorphism within and
between populations of Ceratitis capitata: comparison
between RAPD and multilocus enzyme electrophoresis
data. Heredity 74:425–437.
Dawson IK, Simons AJ, Waugh R, and Powell W, 1996.
Diversity and genetic differentiation among subpopulations
of Gliricidia sepium revealed by PCR-based assays.
Heredity 74:10–18.
Heun M, Murphy JP, and Phillips TD, 1994. A comparison
of RAPD and isozyme analyses for determining the
genetic relationships among Avena sterilis L. acces-
sions. Theor Appl Genet 87:689–696.
Isabel N, Beaulieu J, and Bousquet J, 1995. Complete
congruence between gene diversity estimates derived
from genotypic data at enzyme and random amplified
polymorphic DNA loci in black spruce. Proc Natl Acad
Sci USA 92:6369–6373.
Krutovskii KV, Vollmer SS, Sorensen FC, Adams WT,
Knapp SJ, and Strauss SH, 1998. RAPD genome maps of
Douglas-fir. J Hered 89:197–205.
Lannér-Herrera C, Gustafsson M, Fält AS, and Bryngelsson
T, 1996. Diversity of wild Brassica oleraceae as es-
timated by isozyme and RAPD analysis. Genet Re-
sources Crop Evol 43:13–23.
Latta RG and Mitton JB, 1997. A comparison of popu-
lation differentiation across four classes of gene mark-
er in limber pine (Pinus flexilis James). Genetics 146:
1153–1163.
le Corre V, Dumolin-Lapègue S, and Kremer A, 1997.
Genetic variation at allozyme and RAPD loci in sessile
oak Quercus petraea (Matt.) Liebl.: the role of history
and geography. Mol Ecol 6:519–529.
Lewis PO, 1994. GeneStat-PC 3.3. Raleigh, North Carolina:
Department of Statistics, North Carolina State University.
Li P and Adams WT, 1989. Range-wide patterns of allo-
zyme variation in Douglas-fir (Pseudotsuga menziesii).
Can J For Res 19:149–161.
Liu Z and Furnier GR, 1993. Comparison of allozyme,
RFLP, and RAPD markers for revealing genetic variation
within and between trembling aspen and bigtooth as-
pen. Theor Appl Genet 87:97–105.
Lynch M and Milligan BG, 1994. Analysis of population
genetic structure with RAPD markers. Mol Ecol 3:91–
99.
Mosseler A, Egger KN, and Hughes GA, 1992. Low levels
of genetic diversity in red pine confirmed by random
amplified polymorphic DNA markers. Can J For Res 22:
1332–1337.
Nei M, 1973. Analysis of gene diversity in subdivided
populations. Proc Natl Acad Sci USA 70:3321–3323.
Nei M, 1986. Definition and estimation of fixation indi-
ces. Evolution 40:643–645.
Nei M and Chesser RK, 1983. Estimation of fixation in-
dices and gene diversities. Ann Hum Genet 47:253–259.
Peakall R, Smouse PE, and Huff DR, 1995. Evolutionary
implications of allozyme and RAPD variation in diploid
populations of dioecious buffalograss Buchloë dactyloides.
Mol Ecol 4:135–147.
Puterka GJ, Black IV WC, Steiner WM, and Burton RL,
1993. Genetic variation and phylogenetic relationships
among worldwide collections of the Russian wheat
aphid, Diuraphis noxia (Mordvilko), inferred from allo-
zyme and RAPD-PCR markers. Heredity 70:604–618.
Szmidt AE, Wang X, and Lu M, 1996. Empirical assess-
ment of allozyme and RAPD variation in Pinus sylvestris
(L.) using haploid tissue analysis. Heredity 76:412–420.
Travis SE, Maschinski J, and Keim P, 1996. An analysis
of genetic variation in Astragalus cremnophylax var.
cremnophylax, a critically endangered plant, using
AFLP markers. Mol Ecol 5:735–745.
Vos P, Hogers R, Bleeker M, Reijans M, van de Lee T,
Hornes M, Frijters A, Pot J, Peleman J, Kuiper M, and
Zabeau M, 1995. AFLP: a new technique for DNA fin-
gerprinting. Nucleic Acids Res 23:4407–4414.
Weir BS and Cockerham CC, 1984. Estimating F-statis-
tics for the analysis of population structure. Evolution
38:1358–1370.
Williams JG, Kubelik AR, Livak KJ, Rafalski JA, and Tin-
gey SV, 1990. DNA polymorphisms amplified by arbi-
trary primers are useful as genetic markers. Nucleic
Acids Res 18:6531–6535.
Wu J, Krutovskii KV, and Strauss SH, in press. Nuclear
DNA diversity, population differentiation and phyloge-
netic relationships in the California closed-cone pines
based on RAPD and allozyme markers. Genome.
Received April 12, 1998
Accepted February 18, 1999
Corresponding Editor: Robert Angus
BOTTLENECK: A Computer
Program for Detecting
Recent Reductions in the
Effective Population Size
Using Allele Frequency Data
S. Piry, G. Luikart, and J-M.
Cornuet
BOTTLENECK (current version 1.2) is a
population genetics computer program
that conducts four tests for identifying
populations that have recently experienced
a severe reduction in effective population
size ($N_e$). “Recently” is defined as
within approximately the past $2N_e$–$4N_e$
generations, depending on several factors
such as the severity of the bottleneck and
the mutation rate of the loci being studied
(Cornuet and Luikart 1996). The program
runs on Windows 95™. It requires allele
frequency data obtained from one sample
of individuals (e.g., 20–30 diploid individ-
uals) and at least four polymorphic loci.
Significant deviations from population
mutation-drift equilibrium (e.g., bottle-
necks) are important to detect because
equilibrium is an assumption required for
numerous analyses of population genetics
data (e.g., see Nei 1987, p. 251). Bottle-
necks are important to detect in conser-
vation biology because they can increase
the risk of population extinction. Founder-
flush events (i.e., short but severe bottle-
necks) are important to detect because
they may play a role in some modes of
speciation [for reviews see Harrison
(1991) and Howard (1993)].
Principle
Populations that have experienced a re-
cent reduction of their effective popula-
tion size exhibit a correlative reduction of
the allele number and heterozygosity at
polymorphic loci. But the allele number is
reduced faster than the heterozygosity
($H_e$). Thus $H_e$ becomes larger than the
heterozygosity ($H_{eq}$) expected at mutation-drift
equilibrium, because $H_{eq}$ is calculated
from the allele number (and the sample
size; see Description below and Cornuet
and Luikart 1996). Note that $H_e$ is calculated
from allele frequencies (e.g., $1 - \sum p_i^2$,
where $p_i$ is the frequency of the ith
allele). Here both the measured heterozygosity
($H_e$) and the expected equilibrium
heterozygosity ($H_{eq}$) refer to heterozygosity
in the sense of Nei’s (1987) gene diversity.
Heterozygosity never refers to the
proportion of heterozygotes observed
($H_o$). Thus we are not testing for an excess
of heterozygotes ($H_o > H_e$), but rather an
excess of heterozygosity ($H_e > H_{eq}$).
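The distinction between gene diversity and observed heterozygosity can be made concrete with a short calculation. The sketch below is illustrative Python with hypothetical genotypes and function names; it uses the uncorrected formula given in the text and applies no sample-size correction.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Allele frequencies by gene counting over diploid genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}

def expected_heterozygosity(genotypes):
    """Gene diversity: H_e = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in allele_frequencies(genotypes).values())

def observed_heterozygosity(genotypes):
    """H_o: proportion of individuals that are heterozygous."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)

# Hypothetical sample of four diploid individuals at one locus.
genotypes = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "C")]
print(expected_heterozygosity(genotypes))  # 0.59375
print(observed_heterozygosity(genotypes))  # 0.5
```

Only the first quantity enters the comparison with the equilibrium expectation; the second is shown solely to make clear what is not being tested.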
Strictly speaking, heterozygosity excess
has been demonstrated only for loci evolving
under the infinite allele model (IAM;
Kimura and Crow 1964) by Maruyama and
Fuerst (1985). If the locus evolves under
the strict one-step stepwise mutation
model (SMM; Ohta and Kimura 1973),
there can be situations where this hetero-
zygosity excess is not observed (Cornuet
and Luikart 1996). However, few loci fol-
low the strict SMM, and as soon as loci
depart slightly from the SMM toward the
IAM they will exhibit a heterozygosity ex-
cess as a consequence of a genetic bottle-
neck. When testing for bottlenecks, the
BOTTLENECK program uses both the
SMM and IAM independently, because
they represent two extreme models of mutation
along a continuum of possible models
(Chakraborty and Jin 1992). All loci will
follow a mutation model somewhere in between
the two extreme models.
For selectively neutral loci in a popula-
tion near mutation-drift equilibrium (i.e., a
population in which N
e
has remained fairly
constant in the past), there is approxi-
mately an equal probability that a locus
will show a slight heterozygosity excess or
a heterozygosity deficit. However, in re-
cently bottlenecked populations, a major-
ity of loci will exhibit an excess of hetero-
zygosity (Luikart and Cornuet 1998). To
determine if a population exhibits a signif-
icant number of loci with heterozygosity
excess, we proposed three statistical
tests: a sign test, a standardized differences
test (Cornuet and Luikart 1996; Luikart
and Cornuet 1998), and a Wilcoxon’s
signed rank test (Luikart et al., submitted;
Luikart 1997). We also proposed a graph-
ical descriptor of the shape of the allele
frequency distribution (‘‘mode-shift’’ indi-
cator) which can differentiate between
bottlenecked and stable populations (Lui-
kart et al. 1998).
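The mode-shift indicator rests on a histogram of allele frequencies: in a bottlenecked population, alleles with frequency below 0.10 become less numerous than alleles in an intermediate frequency class. A minimal Python sketch of such a check is given here. The ten equal-width classes, the function names, and the rule of comparing the lowest class against every other class are assumptions for illustration, and the example frequencies are invented; the program itself presents the distribution graphically.

```python
def frequency_classes(allele_freqs, n_classes=10):
    """Count alleles in equal-width frequency classes covering 0 to 1."""
    counts = [0] * n_classes
    for p in allele_freqs:
        if p <= 0:
            continue  # alleles absent from the sample are not counted
        index = min(int(p * n_classes), n_classes - 1)
        counts[index] += 1
    return counts

def mode_shifted(allele_freqs, n_classes=10):
    """True if the lowest class holds fewer alleles than some other class."""
    counts = frequency_classes(allele_freqs, n_classes)
    return counts[0] < max(counts[1:])

# Hypothetical allele frequencies pooled over several loci.
stable = [0.02, 0.03, 0.05, 0.93, 0.04, 0.08, 0.88, 0.06, 0.25, 0.69]
bottlenecked = [0.35, 0.65, 0.05, 0.45, 0.5, 0.25, 0.75, 0.45, 0.55]
print(frequency_classes(stable), mode_shifted(stable))
print(frequency_classes(bottlenecked), mode_shifted(bottlenecked))
```

In the stable example most alleles are rare, giving the L-shaped distribution expected near equilibrium; in the bottlenecked example the rare class has been depleted.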
Interpretation of output from the sign
and standardized differences tests is thor-
oughly discussed in Cornuet and Luikart
(1996) and Luikart and Cornuet (1998). In-
terpretation of output from the graphical
descriptor is discussed in Luikart et al.
(1998). Guidelines for interpreting the output
from the Wilcoxon’s test are less easy
to find (Luikart 1997: chapter 4; Luikart et
al., submitted), although this test is analogous
to the sign test. The Wilcoxon’s test
is generally the most useful of all the tests
because it is the most powerful (along
with the standardized differences test)
and robust (like the sign test) when used
with few (< 20) polymorphic loci. When
testing for bottlenecks, the null hypothesis
of the Wilcoxon’s test is no significant
heterozygosity excess (on average across
loci). Thus the alternate hypothesis is significant
heterozygosity excess (and thus
evidence of a recent bottleneck). This is a
one-tailed test that requires at least four
polymorphic loci to have any possibility
of obtaining a significant (P < .05) test result.
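For readers unfamiliar with the signed rank approach, the following Python sketch shows a generic textbook form of an exact one-tailed Wilcoxon signed rank test applied to per-locus differences ($H_e - H_{eq}$). It is offered only as an illustration of the idea: the function name and the example differences are hypothetical, and the computation in BOTTLENECK itself may differ in its details.

```python
from itertools import product

def wilcoxon_excess_p(differences):
    """Exact one-tailed P for an excess of positive signed ranks."""
    diffs = [d for d in differences if d != 0]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Group tied absolute values and give them their average rank.
        j = i
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        average = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for position in range(i, j + 1):
            ranks[order[position]] = average
        i = j + 1
    observed = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Enumerate every sign assignment, all equally likely under the null.
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        if sum(r for r, s in zip(ranks, signs) if s) >= observed:
            hits += 1
    return hits / 2 ** len(ranks)

# Hypothetical H_e - H_eq differences at six loci.
print(wilcoxon_excess_p([0.08, 0.05, -0.01, 0.12, 0.03, 0.07]))
```

Full enumeration of sign assignments is practical only for a modest number of loci, which matches the situation (fewer than about 20 loci) in which this test is recommended.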
Description
The BOTTLENECK program computes for
each population sample and for each locus
the distribution of the heterozygosity
($H_{eq}$) expected from the observed number
of alleles (k), given the sample size (n) under
the assumption of mutation-drift equilibrium.
This distribution is obtained
through simulating the coalescent process
of n genes under each of two possible mutation
models, the IAM and the SMM. This
distribution enables the computation of the
average expected equilibrium heterozygosity
($H_{eq}$) for each locus, which is compared
to the Hardy–Weinberg heterozygosity ($H_e$,
i.e., gene diversity) in order to establish
whether there is a heterozygosity excess
or deficit at each locus. In addition, the
standard deviation (SD) of the mutation-drift
equilibrium distribution of the heterozygosity
is used to compute the standardized
difference for each locus [$(H_e - H_{eq})/\mathrm{SD}$].
The distribution obtained
through simulation also enables the computation
of a P-value for the measured heterozygosity
($H_e$). The P-value is the probability
of obtaining the measured $H_e$ in a
sample (n) from an equilibrium population
that has the observed number of alleles
(k).
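Given a set of simulated equilibrium heterozygosities for one locus, the per-locus quantities described above reduce to a few lines. The Python sketch below is illustrative only: the function name, the use of the sample standard deviation, and the one-tailed definition of the P-value (the share of simulated values at least as large as the measured $H_e$) are assumptions, and the input values are invented.

```python
from statistics import mean, stdev

def locus_summary(h_e, simulated_heq):
    """Compare a measured H_e with simulated equilibrium heterozygosities."""
    h_eq = mean(simulated_heq)
    sd = stdev(simulated_heq)
    standardized = (h_e - h_eq) / sd if sd > 0 else float("nan")
    # One-tailed empirical P: share of simulated values >= measured H_e.
    p_excess = sum(1 for h in simulated_heq if h >= h_e) / len(simulated_heq)
    return {
        "H_eq": h_eq,
        "SD": sd,
        "excess": h_e > h_eq,
        "standardized": standardized,
        "P_excess": p_excess,
    }

# Hypothetical simulated values; a real run would use 1000 or more iterations.
simulated = [0.41, 0.47, 0.52, 0.38, 0.55, 0.49, 0.44, 0.50]
print(locus_summary(0.58, simulated))
```

The "excess" flag feeds a sign-type test, and the standardized values feed the standardized differences test across loci.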
The way in which the coalescent pro-
cess is simulated is unconventional due to
conditioning by the observed number of
alleles. The phylogeny of the n genes is
simulated as usual (Hudson 1990). Under
the IAM, a single mutation is allocated at
a time and the resulting number of alleles
is computed. The process is repeated until
the simulation reaches the number of alleles
(k) observed in the population sample.
Under the SMM, a Bayesian approach
is used as explained in Cornuet and Luikart
(1996). Briefly, the likelihood distribution
of the parameter $\theta$ ($= 4N_e\mu$) given
the number of alleles (k) and the sample
size (n) is evaluated as the proportion of
iterations (in the simulation process) producing
exactly k alleles for a varying set
of $\theta$ values. As a second step, drawing random
values of $\theta$ according to the likelihood
distribution, the coalescent process is simulated
as usual. Only heterozygosities
found in iterations producing exactly k alleles
are considered. Once all loci in a population
sample have been processed, the
three statistical tests are performed for
each mutation model, as explained in Cornuet
and Luikart (1996), and the allele frequency
distribution is graphed to determine
whether a bottleneck-induced mode
shift has recently occurred. Note that a
mode shift is a transient distortion in the
distribution of allele frequencies such that
the frequency of alleles at low frequency
(frequency < 0.10) becomes lower than
the frequency of alleles in an intermediate
allele frequency class (see Luikart et al.
1998).
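The IAM branch of the simulation described above (build a genealogy, then allocate mutations one at a time until exactly k alleles are present) can be outlined as follows. This Python sketch is an interpretation under stated assumptions rather than the program's Delphi code: the function names are hypothetical, each mutation is placed on a branch chosen in proportion to its length, and heterozygosity is computed without a sample-size correction.

```python
import random

def simulate_tree(n, rng):
    """Coalescent genealogy of n genes: parent pointers and node times."""
    parent = {}
    time = {leaf: 0.0 for leaf in range(n)}
    active = list(range(n))
    now, next_id = 0.0, n
    while len(active) > 1:
        j = len(active)
        now += rng.expovariate(j * (j - 1) / 2)  # waiting time with j lineages
        a, b = rng.sample(active, 2)
        active.remove(a)
        active.remove(b)
        parent[a] = parent[b] = next_id
        time[next_id] = now
        active.append(next_id)
        next_id += 1
    return parent, time

def allele_of(leaf, parent, mutated):
    """Label a gene by the most recent mutated branch on its path to the root."""
    node = leaf
    while node in parent:  # the root has no parent, hence no branch
        if node in mutated:
            return node
        node = parent[node]
    return "ancestral"

def iam_heterozygosity(n, k, rng):
    """One simulated heterozygosity for n genes with exactly k alleles."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    parent, time = simulate_tree(n, rng)
    branches = list(parent)  # each non-root node owns the branch above it
    lengths = [time[parent[b]] - time[b] for b in branches]
    mutated = set()
    while True:
        labels = [allele_of(leaf, parent, mutated) for leaf in range(n)]
        if len(set(labels)) == k:
            break
        # Allocate one more mutation; every mutation creates a new allele.
        mutated.add(rng.choices(branches, weights=lengths, k=1)[0])
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

rng = random.Random(42)
values = [iam_heterozygosity(40, 5, rng) for _ in range(200)]
print(sum(values) / len(values))  # average simulated equilibrium heterozygosity
```

Because one added mutation can raise the number of distinct alleles by at most one, the loop reaches exactly k alleles whenever k does not exceed n. The SMM branch requires the additional likelihood step over $\theta$ and is not sketched here.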
Input File Format
Five input data file formats are accepted
and automatically recognized by BOTTLENECK.
All are text files. One is the GENEPOP
computer program format (Raymond
and Rousset 1995). The second is the GENETIX
computer program format (Belkhir
et al. 1996). The other three formats con-
cern single population data and are de-
scribed in the help file of the program.
General Comments
BOTTLENECK is written in the Delphi 4™
(Inprise Co.) computer language. The performance
of BOTTLENECK has been thoroughly
evaluated using simulated datasets
(Cornuet and Luikart 1996; Luikart et al.
1998) and allozyme and microsatellite da-
tasets (Luikart and Cornuet 1998). To
achieve reasonably high statistical power
(> 0.80), we recommend typing at least 10
polymorphic loci (microsatellites or allo-
zymes) and sampling at least 30 individu-
als. The standardized differences test is
recommended when using approximately
20 or more polymorphic loci (Cornuet and
Luikart 1996). For fewer than 20 loci, the
Wilcoxon’s test is the most appropriate
and powerful. The IAM is recommended
for allozyme data and the SMM is gener-
ally more appropriate when testing micro-
satellite loci (i.e., dinucleotide repeat loci)
(Luikart and Cornuet 1998). For most mi-
crosatellites, the TPM (two-phase model)
is apparently even more appropriate than
the SMM (Di Rienzo et al. 1994; Luikart G,
unpublished data). The TPM was recently
added as an option in BOTTLENECK.
When using microsatellites we recom-
mend the TPM with 95% single-step mu-
tations and 5% multiple-step mutations
(and a variance among multiple steps of
approximately 12). When using the quali-
tative test for mode-shift distortion, we
recommend using at least 30 individuals
and 10–20 polymorphic loci to avoid un-
reasonably high type 1 error rates (i.e., to
avoid concluding that a stable population
has been recently bottlenecked).
BOTTLENECK runs on any computer with
Windows 95™. However, we recommend
a computer at least as fast as a Pentium
PC. A fast Pentium is especially recommended
for analyzing datasets containing
many individuals (>> 30) and loci with
many alleles (e.g., > 3). Analyzing data under
the SMM is far slower than analyses
assuming only the IAM. On a Pentium 166
it takes about 15 minutes to analyze a da-
taset of 44 individuals and 7 loci (with 2–
8 alleles) when using both mutation mod-
els and 1000 simulation iterations. The
number of iterations influences the preci-
sion of the H
eq
estimates. A minimum of
1000 iterations is recommended. The pro-
gram and example input and help files can
be obtained from the World Wide Web at
http://www.ensam.inra.fr/URLB.
From the Laboratoire de Modélisation et de Biologie
Evolutive, INRA-URLB, 488 rue de la Croix-Lavit, F-34090
Montpellier, France (Piry and Cornuet), and the Division
of Biological Sciences, University of Montana, Missoula,
Montana (Luikart). G. Luikart is now at the Laboratoire
de Biologie des Populations d’Altitude, Université Joseph
Fourier, Grenoble, France. This work was funded by the
Institut National de la Recherche Agronomique, the Fulbright
Foundation (to G.L.), and the Graduate School
of the University of Montana (to G.L.). I. Till-Bottraud
provided helpful comments. Address correspondence
to J-M. Cornuet at the address above or e-mail:
Cornuet@ensam.inra.fr.
© 1999 The American Genetic Association
References
Belkhir K, Borsa P, Goudet J, Chikhi L, and Bonhomme
F, 1996. GENETIX, logiciel sous Windows™ pour la génétique
des populations. Version 3.0. Montpellier,
France: Université Montpellier II.
Chakraborty R and Jin L, 1992. Heterozygote deficiency,
population substructure and their implications in DNA
fingerprinting. Hum Genet 88:267–272.
Cornuet J-M and Luikart G, 1996. Description and pow-
er analysis of two tests for detecting recent population
bottlenecks from allele frequency data. Genetics 144:
2001–2014.
Di Rienzo A, Peterson AC, Garza JC, Valdes AM, Slatkin
M, and Freimer NB, 1994. Mutational processes of sim-
ple sequence repeat loci in human populations. Proc
Natl Acad Sci USA 91:3166–3170.
Harrison RG, 1991. Molecular changes at speciation.
Annu Rev Ecol Syst 22:281–308.
Howard DJ, 1993. Small populations, inbreeding, and
speciation. In: The natural history of inbreeding and
outbreeding (Thornhill NW, ed). Chicago: University of
Chicago Press; 118–142.
Hudson RR, 1990. Gene genealogies and the coalescent
process. In: Oxford survey in evolutionary biology, vol.
7 (Futuyma D and Antonovics J, eds). Oxford: Oxford
University Press; 1–42.
Kimura M and Crow JF, 1964. The number of alleles that
can be maintained in a finite population. Genetics 49:
725–738.
Luikart G, 1997. Usefulness of molecular markers for
detecting population bottlenecks and monitoring ge-
netic change (PhD dissertation). Missoula, Montana:
University of Montana.
Luikart G and Cornuet J-M, 1998. Empirical evaluation
of a test for identifying recently bottlenecked popula-
tions from allele frequency data. Conserv Biol 12:228–
237.
Luikart G, Allendorf FW, Sherwin B, and Cornuet J-M,
1998. Distortion of allele frequency distributions provides
a test for recent population bottlenecks. J Hered
89:238–247.
Maruyama T and Fuerst PA, 1985. Population bottle-
necks and non-equilibrium models in population ge-
netics. II. Number of alleles in a small population that
was formed by a recent bottleneck. Genetics 111:675–
689.
Nei M, 1987. Molecular evolutionary genetics. New
York: Columbia University Press.
Ohta T and Kimura M, 1973. A model of mutation ap-
propriate to estimate the number of electrophoretically
detectable alleles in a finite population. Genet Res
Cambr 22:201–204.
Raymond M and Rousset F, 1995. GENEPOP (version
1.2): population genetics software for exact tests and
ecumenicism. J Hered 86:248–249.
Received December 15, 1997
Accepted February 26, 1999
Corresponding Editor: Robert Angus