Institute for Animal Science and Health, AB Lelystad, the Netherlands
Marker-based estimates of between and within population
kinships for the conservation of genetic diversity
By H. Eding and T. H. E. Meuwissen
Summary
In this article coefficients of kinship between and within populations are proposed as a tool to assess
genetic diversity for conservation of genetic variation. However, pedigree-based kinships are often not
available, especially between populations. A method of estimation of kinship from genetic marker data
was applied to simulated data from random breeding populations in order to study the suitability of
this method for livestock conservation plans. Average coefficients of kinship between populations can
be estimated with low Mean Square Error of Prediction, although a bias will occur from alleles that are
alike in state in the founder population. The bias is similar for all populations, so the ranking of
populations will not be affected. Possible ways of diminishing this bias are discussed. The estimation of
kinships between individuals is imprecise unless the number of marker loci is large (> 200). However,
it allows distinction between highly related animals (full sibs, half sibs and equivalent relations) and
animals that are not directly related if about 30-50 polymorphic marker genes are used. The marker-based
estimates of kinship coefficients yielded higher correlations than genetic distance measures with
pedigree-based kinships and thus to this measure of genetic diversity, although correlations were high
overall. The relation between coefficients of kinship and genetic distances is discussed. Kinship-based
diversity measures conserve the founder population allele frequencies, whereas genetic distances will
conserve populations in which allele frequencies are the most different. Marker-based kinship
estimates can be used for the selection of breeds and individuals as contributors to a genetic
conservation programme.
Zusammenfassung
Marker-based estimates of kinship between and within populations for the conservation of
genetic diversity
In this publication, coefficients of kinship between and within populations are proposed as a
tool for assessing genetic diversity for the conservation of genetic variation. Pedigree information
on kinship is often unavailable, especially between populations. In this article an estimation
method for the degree of kinship based on genetic markers is applied to simulated data from
randomly mated populations in order to test the suitability of this method for animal conservation
programmes. Average coefficients of kinship between populations can be estimated with low
average standard errors, although biases arise for alleles in populations that are similar to the
founder population. This bias is similar for all populations, so that the values for the populations
do not shift. Possible ways of reducing the bias are discussed. The estimation of kinship between
individual animals is imprecise unless a large number of markers (> 200) is used. Nevertheless, it
allows closely related animals (full sibs, half sibs and comparable relationships) to be distinguished
from animals that are not directly related when 30-50 polymorphic markers are used. The
marker-based estimation of kinship coefficients gives higher correlations with pedigree information
than the distance measures derived from the same data. The relationship between kinship
coefficients and genetic distances is discussed. Kinship-based diversity measures preserve the allele
frequencies of the founder population, whereas the use of genetic distances conserves populations
with extreme allele frequencies. Marker-based estimation of kinship can be used for the selection of
breeds and individual animals for genetic conservation programmes.
J. Anim. Breed. Genet. 118 (2001), 141-159
© 2001 Blackwell Wissenschafts-Verlag, Berlin
ISSN 0931-2668
Ms. received: 31.10.2000
U.S. Copyright Clearance Center Code Statement: 0931-2668/2001/1803-0141 $15.00/0 www.blackwell.de/synergy
Introduction
The importance of conservation of genetic diversity in livestock has received widespread
attention in recent years. Food security (Hammond 1994) and sustainable livestock
production (De Wit et al. 1995) are the main reasons. A major problem with regard to
conservation efforts is the assessment of genetic diversity within and between populations.
Many studies have described the genetic diversity of several populations within species
based on genetic distances (Moazami-Goudarzi et al. 1997; Thaon d'Arnoldi et al.
1998; Eding and Laval 1999; Ruane 1999). On the other hand there are measures that are
based on some form of genetic similarity index (Lynch 1988). These similarity indices can
be adjusted to estimate relatedness between individuals within a population (Li et al. 1993;
Lynch and Ritland 1999).
As a third option, minimizing the mean kinship between animals within a population
selected for conservation purposes has been suggested as a general approach to
conservation of genetic diversity (Haig et al. 1990; Frankham 1994; Johnston and
Lacy 1995; Zheng et al. 1997; Toro et al. 1998). The coefficient of kinship is defined as
the probability that two alleles randomly sampled from the same locus in two individuals
are identical by descent (IBD, Malécot 1948). Therefore, if the mean kinship in a set of
individuals is minimized, duplicates of alleles descending from the same ancestor will also
be minimized. Furthermore, this parameter is, on average, valid for the entire genome and
is not limited to the loci under study.
Kinships are calculated from pedigree records using for instance path analysis
(Falconer and MacKay 1996). The need for pedigree records means that in situations
where they do not exist (poor administration or between breed analysis), pedigree-based
kinships cannot be used as a measure of genetic diversity. In plant breeding a method was
developed to estimate kinship between individuals and populations using marker gene data
(Bernardo 1993). This method consists of calculating a similarity index S between individuals and
correcting for alleles being alike in state (AIS). The similarity index S was calculated as the
proportion of shared restriction fragment length polymorphism (RFLP) marker alleles
between lines of maize that were assumed to be inbred. Probabilities of alleles AIS were
estimated as the proportion of shared alleles with distantly related maize strains and
assumed to be different for different pairs of strains.
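As an illustration of what a pedigree-based kinship calculation involves, the sketch below uses the tabular (recursive) method rather than path analysis; both give the same coefficients. The function name and the pedigree layout (a list of parent index pairs, with parents listed before their offspring and `None` for unknown parents) are assumptions made for this example only.

```python
def kinship_matrix(pedigree):
    """Pedigree-based coefficients of kinship via the tabular method.

    pedigree: list of (sire, dam) index pairs, ordered so that parents
    precede offspring; None marks an unknown (founder) parent.
    Founders are assumed unrelated and non-inbred.
    """
    n = len(pedigree)
    f = [[0.0] * n for _ in range(n)]
    for i, (sire, dam) in enumerate(pedigree):
        # Kinship of i with itself is 0.5 * (1 + F_i), where the inbreeding
        # coefficient F_i equals the kinship between the parents of i.
        if sire is not None and dam is not None:
            parental_kinship = f[sire][dam]
        else:
            parental_kinship = 0.0
        f[i][i] = 0.5 * (1.0 + parental_kinship)
        # Kinship of i with an older individual j is the mean of the
        # kinships of j with the two parents of i.
        for j in range(i):
            with_sire = f[j][sire] if sire is not None else 0.0
            with_dam = f[j][dam] if dam is not None else 0.0
            f[i][j] = f[j][i] = 0.5 * (with_sire + with_dam)
    return f


# Three founders (0, 1, 2), two full sibs (3, 4) and a half sib of them (5).
example = [(None, None), (None, None), (None, None), (0, 1), (0, 1), (0, 2)]
kin = kinship_matrix(example)
```

In this small pedigree the full sibs have a kinship of 0.25 and the half sibs a kinship of 0.125, the classes of relationship that the marker-based estimates are later asked to separate.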
In this article a similarity index will be used for microsatellite markers. An extension of
the method by Bernardo (1993) to include non-inbred populations will be presented. The
estimation of the probability of alleles AIS will be discussed and alternatives presented. The
main focus of this article will be to question to what extent missing pedigree data can be
substituted by kinship estimates based on marker information in conservation decision
making. First, the behaviour of kinship (actual pedigree-based and estimated from a
similarity index) between and within (sub) populations over time will be studied. Next, the
degree to which kinships can be predicted by a similarity index using marker gene
information will be investigated by simulation. As a secondary aim the relationship
between coefficients of kinship and marker-based estimates of genetic diversity, specifically
genetic distances and similarity indices, will be investigated. It will be argued that the
similarity index used in this article has the most consistent relation with both actual kinship
coefficients and genetic diversity.
Methods
Similarity index
The similarity index that is used is based on the concept of identity by descent (IBD,
Jacquard 1983; Lynch 1988). The scoring rules can be written mathematically as:
$$S_{xy,l} = \tfrac{1}{4}\left[I_{11} + I_{12} + I_{21} + I_{22}\right] \qquad (1)$$
where $I_{ij}$ is an indicator variable which is 1 when allele i on locus l in the first individual and
allele j on the same locus in the second individual are identical, otherwise it is 0. Note that
$S_{xy,l}$ can have four possible values: 1, 1/2, 1/4 and 0. When three indicators have value 1 the
fourth will necessarily be 1 also, eliminating the possibility of a value of 3/4. Under
the assumption of founder alleles, $S_{xy}$ averaged over multiple loci is an estimator of the
coefficient of kinship $f_{xy}$ (i.e. probability of IBD). Using Jacquard's (1974) identity
coefficients, Appendix A shows $S_{xy}$ is an unbiased estimator of kinship when founder
alleles are unique.
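The scoring rule of expression (1) can be sketched in a few lines. The function names and the genotype layout (one pair of allele labels per locus) are assumptions made for this illustration, not part of the original method description.

```python
def locus_similarity(genotype_x, genotype_y):
    """Similarity score S_xy,l for one locus, following expression (1).

    Each genotype is a pair of allele labels. All four allele comparisons
    between the two individuals are scored 1 (identical) or 0 and averaged.
    """
    matches = sum(a == b for a in genotype_x for b in genotype_y)
    return matches / 4.0


def mean_similarity(individual_x, individual_y):
    """Average of the single-locus scores over all loci typed in both animals."""
    scores = [locus_similarity(gx, gy) for gx, gy in zip(individual_x, individual_y)]
    return sum(scores) / len(scores)


# Two animals typed at two loci.
animal_x = [("A", "B"), ("C", "C")]
animal_y = [("A", "C"), ("C", "C")]
similarity = mean_similarity(animal_x, animal_y)
```

The first locus shares one of four allele pairs (score 1/4) and the second is identical in all four comparisons (score 1), so the averaged score for this pair is 0.625.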
When founder alleles are not unique, the pairwise similarity between two individuals
is determined not only by the probability that two randomly sampled alleles are IBD,
but also by the probability that they are alike in state (AIS). Let $f_{ij}$ be the probability
that two alleles are IBD and s the probability that two alleles are AIS. Then the expected
value of the similarity score for a locus l between two individuals i and j becomes
(Lynch 1988):
$$E(S_{ij}) = f_{ij} + (1 - f_{ij})\,s \qquad (2)$$
i.e. S is biased upwards by $(1 - f_{ij})\,s$. It is assumed that there is a founder population from which
all populations descend. All populations are therefore related at least through this founder
population. It is further assumed that all relations in the founder population are zero, i.e.
$f_{ff} = 0$. It follows that the probability of two alleles being AIS, but not IBD, is
$s = S_{ff} = \sum_k q_k^2$, where $S_{ff}$ is the similarity in the founder population and $q_k$ is the frequency
of the kth allele in the founder population. Note that s is defined by the founder population
only, as in this population all relations are assumed to be zero. If it is assumed that this
founder population is the ancestor to all populations in a study, this implies the probability
s is equal for all populations (ignoring mutations).
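To make the role of s concrete, the following sketch computes the AIS probability from founder allele frequencies and the expected similarity of equation (2). The function names are illustrative assumptions.

```python
def ais_probability(founder_frequencies):
    """Probability s that two alleles drawn from the founder population are
    alike in state: the sum of squared founder allele frequencies."""
    return sum(q * q for q in founder_frequencies)


def expected_similarity(kinship, s):
    """Expected similarity score from equation (2): f + (1 - f) * s."""
    return kinship + (1.0 - kinship) * s


# Two equally frequent founder alleles (the 'worst case' in the simulations).
s_two_alleles = ais_probability([0.5, 0.5])
# Expected similarity between full sibs (f = 0.25) under that founder population.
s_full_sibs = expected_similarity(0.25, s_two_alleles)
```

With two equally frequent founder alleles s equals 0.5, so even unrelated animals are expected to score 0.5, and full sibs 0.625.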
The similarity between two populations is expected to remain
constant after population fission (when no gene flow is assumed). The smallest between
population similarity is therefore equal to the within population similarity of the founding
population just prior to first fission (see Discussion for further information). Thus s can
be set equal to the smallest between population similarity. This defines the generation just
prior to fission as the founder population, in which all animals are unrelated. Hence, if the
breeds are more distantly related, i.e. the first fission occurs earlier, the founder generation
occurred earlier in time as well, and within population kinships are increased. It also
follows that the kinship estimates depend on the set of breeds that is considered.
However, it is their relative values that are important when prioritizing breeds for
conservation.
Rearrangement of equation 2 gives (Lynch 1988):
$$\hat f_{ij} = \frac{S_{ij} - s}{1 - s} \qquad (3)$$
where s can be of assumed value or be estimated per locus from founder population data.
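A minimal sketch of equation (3) follows, combined with the choice of s as the smallest between-population similarity described above. The function names and the example similarity matrix are hypothetical and serve only to show the arithmetic.

```python
def kinship_from_similarity(similarity, s):
    """Marker estimated kinship from equation (3): (S - s) / (1 - s)."""
    if not 0.0 <= s < 1.0:
        raise ValueError("s must lie in [0, 1)")
    return (similarity - s) / (1.0 - s)


def population_kinships(similarity_matrix):
    """Set s to the smallest between-population similarity and convert a
    matrix of average similarities into marker estimated kinships."""
    n = len(similarity_matrix)
    s = min(similarity_matrix[x][y] for x in range(n) for y in range(n) if x != y)
    kinships = [
        [kinship_from_similarity(similarity_matrix[x][y], s) for y in range(n)]
        for x in range(n)
    ]
    return s, kinships


# Hypothetical average similarities for three populations
# (diagonal: within population, off-diagonal: between populations).
similarities = [
    [0.60, 0.50, 0.50],
    [0.50, 0.70, 0.55],
    [0.50, 0.55, 0.65],
]
s, kinships = population_kinships(similarities)
```

The pair of populations with the smallest similarity receives an estimated kinship of zero, which is the sense in which the generation just before the first fission is treated as the founder population.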
The estimate of $f_{ij}$ between two individuals i and j can be obtained through
averaging over the L analysed loci. If, however, the probability s differs per locus, the inverse
of the variance of the estimate of $f_{ij}$ can be used as weights (see Appendix B for
derivation):
$$\hat f_{ij} = \frac{\sum_{l=1}^{L} \hat f_{ij,l}\,\dfrac{1 - s_l}{s_l + f_{ij,l}(1 - 2 s_l) - f_{ij,l}^2 (1 - s_l)}}{\sum_{l=1}^{L} \dfrac{1 - s_l}{s_l + f_{ij,l}(1 - 2 s_l) - f_{ij,l}^2 (1 - s_l)}} \qquad (4)$$
Average similarities between and within populations
On the level of populations the average pairwise similarity between populations x and y for
a locus with K alleles can be expressed in terms of allele frequencies as:
$$S_{xy} = \sum_{k} p_{xk}\,p_{yk} \qquad (5)$$
where $p_{xk}$ is the frequency of the kth allele in population x. This expression has been used
many times in the field of conservation genetics. Applied within a population (x = y)
it expresses homozygosity under Hardy-Weinberg equilibrium. Its complement,
heterozygosity, has been used as a measure of genetic diversity (Toro et al. 1998).
Moreover, the coefficient of inbreeding has been proposed as a measure of genetic
diversity (notably $F_{ST}$) and is defined as the excess of homozygosity relative to
Hardy-Weinberg equilibrium genotype frequencies. The reciprocal of expression (5) was used
by Kimura (Crow and Kimura 1970) to estimate the effective number of alleles and in
Nei's standard distance D, expression (5) appears in the numerator of the coefficient of
identity.
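The uses of expression (5) listed above can be illustrated with a short sketch. The function names are assumptions; Nei's standard distance is written here in its usual form, D = -ln(J_xy / sqrt(J_x J_y)), with each J averaged over loci.

```python
import math


def population_similarity(p_x, p_y):
    """Expression (5): sum over alleles of the product of the allele
    frequencies in populations x and y at one locus."""
    return sum(a * b for a, b in zip(p_x, p_y))


def effective_number_of_alleles(p_x):
    """Reciprocal of expression (5) applied within one population."""
    return 1.0 / population_similarity(p_x, p_x)


def nei_standard_distance(loci_x, loci_y):
    """Nei's standard distance from per-locus allele frequency lists."""
    n_loci = len(loci_x)
    j_xy = sum(population_similarity(px, py) for px, py in zip(loci_x, loci_y)) / n_loci
    j_x = sum(population_similarity(px, px) for px in loci_x) / n_loci
    j_y = sum(population_similarity(py, py) for py in loci_y) / n_loci
    return -math.log(j_xy / math.sqrt(j_x * j_y))


# One locus, two alleles, with opposite frequencies in the two populations.
distance = nei_standard_distance([[0.8, 0.2]], [[0.2, 0.8]])
```

Note that the distance is undefined when two populations share no alleles at any locus, since the between-population similarity is then zero.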
Simulation
The behaviour of similarity index S and the estimates of $f_{ij}$ were tested by simulation. A
base population was simulated, which developed into five separate populations according
to the phylogeny given in Fig. 1. Divergence was obtained by doubling the number of
offspring in the generation in which fission occurred to avoid bottleneck effects. The
population of each line consisted of 50 individuals with equal numbers of males and
females. Each round of mating produced again 25 males and 25 females. Parents of each
offspring were sampled at random from the preceding generation. Generations were
discrete. For each individual a genome was simulated consisting of 200 autosomal, unlinked
selectively neutral loci. Every generation the information on all alleles of every individual
was recorded. Simultaneously a pedigree file was written containing all pedigree
information. For reasons of simplicity, linkage was ignored in this study, as were
selection, mutation and migration, such that the relationship between the similarity and the
actual kinship was not affected by these effects.
Fig. 1. General structure of the phylogenetic tree used in the simulation for the case of five
populations
The size of each population was limited to a maximum of 50 breeding individuals, to save
on computer time. The length and structure of the history was variable. The results will be
presented as a function of $t/N_e$, since genetic drift depends on $t/N_e$ rather than only $N_e$ or
time t (Crow and Kimura 1970).
The simulation was run for founder alleles (all founder animals have a unique set
of alleles per locus) and for founder populations with a limited number of alleles per
locus (2, 5, 10 and 20, respectively), with approximately equal allele frequencies in
the founder population. Before the first population fission, the founder population was
allowed to breed for a number of generations to generate a realistic distribution of
frequencies.
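One generation of the simulation scheme described above can be sketched as follows, for the case of unique founder alleles. The data layout, function names and random seed are assumptions of this example; the original simulation program is not described in enough detail to reproduce it exactly.

```python
import random


def founder_population(n_males, n_females, n_loci):
    """Founders with unique alleles: animal i carries alleles 2i and 2i + 1
    at every locus, so no two founders share an allele."""
    population = []
    for i in range(n_males + n_females):
        population.append({
            "sex": "M" if i < n_males else "F",
            "parents": None,  # founders have no recorded parents
            "genotype": [(2 * i, 2 * i + 1)] * n_loci,
        })
    return population


def next_generation(parents, n_males, n_females, rng):
    """Discrete generation of random mating with unlinked loci.

    Each offspring gets a sire and a dam sampled at random (with replacement)
    from the previous generation and, per locus, one randomly chosen allele
    from each parent. Parent indices are stored as the pedigree record.
    """
    sires = [k for k, animal in enumerate(parents) if animal["sex"] == "M"]
    dams = [k for k, animal in enumerate(parents) if animal["sex"] == "F"]
    offspring = []
    for i in range(n_males + n_females):
        sire, dam = rng.choice(sires), rng.choice(dams)
        genotype = [
            (rng.choice(g_sire), rng.choice(g_dam))
            for g_sire, g_dam in zip(parents[sire]["genotype"], parents[dam]["genotype"])
        ]
        offspring.append({
            "sex": "M" if i < n_males else "F",
            "parents": (sire, dam),
            "genotype": genotype,
        })
    return offspring


rng = random.Random(2001)  # arbitrary seed for a repeatable run
founders = founder_population(25, 25, n_loci=200)
generation_1 = next_generation(founders, 25, 25, rng)
```

A fission event could be mimicked by calling `next_generation` with twice the usual numbers and dividing the offspring over two lines, in the spirit of the doubling of offspring described in the text.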
Over generations a number of statistics were calculated: average pairwise f between and
within populations calculated from the full pedigree ($f_{ij}$; this statistic was taken to be the
`true' value of genetic similarity and was used to test the other statistics against), marker
estimated kinships (MEK) from average pairwise similarities ($S_{ij}$) and average population
similarities from allele frequencies ($S_{xy}$), Nei's standard distance D (Nei 1972), Reynold's
distance $D_R$ (Reynolds 1983) and $F_{ST}$ based on marker gene information (Nagylaki 1998).
Results
Actual average kinships between populations
Figure 2 shows scatter plots of the development of the average actual kinship between
and within populations for a single replicate. Figure 2a shows f calculated from the
recorded pedigree and Fig. 2b MEK from the 200 loci, where the number of alleles per
locus was 2 (`worst case'). Correction for alleles AIS was done by setting s to 0.5, the
expected probability of AIS. Data on all 200 loci was used to eliminate random drift
effects. This was done to verify MEK does behave according to actual kinships. The
population has a phylogeny as given in Fig. 1. In the figure a main line (´) can be
distinguished which increases with time. This line corresponds to the within population
average actual kinship. At intervals of 0.2 $N_e$ generations a horizontal line separates from
the main line. These lines (h, n, e, s) show the average actual kinship between one
population and the cluster of populations that are the descendants of this population,
and their value is equal to the average population kinship within the population just
prior to fission. The lowermost of these lines in the figure (at $f_{ij}$ = 0.098; h) corresponds
to the kinship between population 1 (the oldest population) and the cluster of
populations (2, 3, 4, 5). The next line (at $f_{ij}$ = 0.189; n) depicts the kinship between
population 2 and the cluster (3, 4, 5), the third line (e) corresponds to the kinship
between 3 and (4, 5) and the last line (s) is the average actual kinship between
populations 4 and 5. Note that after splitting the average kinship between populations
remains constant in both 2a and 2b, even though genetic distances between populations
would increase over time (see Discussion). Although some sampling deviations occur,
Fig. 2b generally depicts the same trend as Fig. 2a.
Fig. 2. Scatterplot of the actual coefficient of kinship f (calculated from pedigree) versus $t/N_e$ (a) and
estimated f using markers with two alleles per locus in the founder population (b) versus $t/N_e$ for a
single replicate. Five populations were simulated. The populations have a phylogeny as given in Fig. 1.
(´) corresponds to the within population average actual kinship. (h) corresponds to the kinship
between population 1 (the oldest population) and the cluster of populations (2, 3, 4, 5). (n) depicts the
kinship between population 2 and the cluster (3, 4, 5), (e) corresponds to the kinship between 3 and
(4, 5) and (s) is the average actual kinship between populations 4 and 5
Estimation of average kinships
In Table 1 the regression factor and the mean square error of prediction (MSEP), calculated
as $\sqrt{\sum_{ij} (\hat f_{ij} - f_{ij})^2 / n}$, of average population f are given for a relatively short ($t/N_e$ = 0.4)
and a relatively long ($t/N_e$ = 1) period of time. The case with M = 200 refers to the full
genetic model with which the simulation was done and is included for reference. In the
upper half of the table founder alleles were assumed.
The lower half of Table 1 gives the regression factors and MSEP of $\hat f$ with increasing
numbers of alleles per locus at time $t/N_e$ = 0.4 and 1, respectively. Regression coefficients
between f and $\hat f$ were close to 1, indicating the estimator was approximately unbiased. The
MSEP approached that of founder alleles. The estimation of $\hat f$ for non-founder alleles was
based on expression (5), with s assumed known.
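The two summary statistics reported in the tables can be computed as in the sketch below: the square root of the mean squared difference between estimated and actual kinship, and the ordinary least squares regression of the estimate on the actual value. Function names and the example numbers are illustrative assumptions.

```python
import math


def root_msep(estimates, actual):
    """Square root of the mean squared difference between estimated and
    actual kinships, as in the MSEP formula given in the text."""
    n = len(actual)
    return math.sqrt(sum((e - a) ** 2 for e, a in zip(estimates, actual)) / n)


def regression_on_actual(estimates, actual):
    """Least squares fit of f_hat = b0 + b1 * f + error.
    Returns (b0, b1); values near (0, 1) indicate little bias."""
    n = len(actual)
    mean_f = sum(actual) / n
    mean_est = sum(estimates) / n
    s_ff = sum((f - mean_f) ** 2 for f in actual)
    s_fe = sum((f - mean_f) * (e - mean_est) for f, e in zip(actual, estimates))
    b1 = s_fe / s_ff
    b0 = mean_est - b1 * mean_f
    return b0, b1


# Hypothetical actual kinships and estimates that are all 0.05 too high.
actual = [0.0, 0.1, 0.2, 0.3]
estimates = [0.05, 0.15, 0.25, 0.35]
b0, b1 = regression_on_actual(estimates, actual)
error = root_msep(estimates, actual)
```

A constant upward shift such as the one in this example leaves the slope at 1 but shows up in the intercept and in the MSEP, which is why the two statistics are reported side by side.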
Within populations estimates of kinship
The regression coefficient of the pairwise MEKs on the actual kinships was close to 1 and the
MSEP of the regression was relatively small. The right-hand portion of Table 2 shows that the
regression factors, $b_0$ and $b_1$, are close to 0 and 1, respectively, which indicates an
approximately unbiased estimation of $f_{ij}$.
For the left-hand portion of Table 2 two situations were compared: one with a relatively
short history ($t/N_e$ = 0.4) and another with relatively long history ($t/N_e$ = 1). Numbers of
loci used were varied as was the number of alleles per locus in the founder population.
The general trend is a decreasing MSEP with increasing numbers of loci and increasing
number of alleles per locus in the founder population. There is not a clear distinction in the
importance between number of loci used and the number of alleles per locus. If the number
of alleles per locus is low, extra alleles are more informative than extra loci.
MSEP was overall rather large. Especially when looking at scenarios that presently are
used in the studies of genetic diversity with 10-15 loci, it can be seen that it is virtually
impossible to distinguish even full sibs from half sibs. To be able to accurately distinguish
between non-inbred full sibs and half sibs (p < 0.05) the results suggest that at moderate
numbers of alleles per locus (5-10) at least 30-50 unlinked markers have to be used, which
confirms observations in similar studies of marker-based relationship estimates (Lynch
and Ritland 1999).
Table 1. Regression coefficients b of the regression of the population averages of $\hat f_{ij}$ on $f_{ij}$ and the
square root of the mean square error of prediction (MSEP) (a). Values of b and the MSEP were
calculated over 20 replicates

                        t/N_e = 0.4         t/N_e = 1.0
                        b       MSEP        b       MSEP
No. of markers (founder alleles)
  10                    0.972   0.058       1.020   0.079
  20                    0.986   0.034       1.002   0.068
  30                    0.998   0.025       1.000   0.058
  50                    0.999   0.021       0.998   0.041
  200                   1.010   0.007       1.008   0.012
No. of alleles (200 markers)
  2                     0.852   0.020       0.940   0.028
  5                     0.970   0.009       0.992   0.018
  10                    1.000   0.009       1.003   0.015
  20                    0.998   0.008       1.001   0.013

(a) MSEP = $\sqrt{\sum_{ij} (\hat f_{ij} - f_{ij})^2 / n}$, where n = 20 replicates
Estimates of kinship and genetic distances
In Table 3 the proportion of variance explained by regression of genetic distances and
similarity parameters on kinship, $R^2$, at time $t/N_e$ = 1 are given for cases with different
numbers of alleles in the founder population. All measures have an apparently strong
relationship with kinship. Only $F_{ST}$ shows a very weak relation with kinship when the
number of alleles is 2. This might be due to the combination of relatively large variance on
the estimator and low estimates of $F_{ST}$ due to the number of alleles per locus. Although
these strong relationships can be explained by the fact that all populations evolved similarly
(constant and equal $N_e$) it illustrates that genetic distance measures have a tendency to be
highly related (Hedrick 1974; Takezaki and Nei 1996).
The $R^2$ of both measures of S with kinship is consistently higher than those of genetic
distances. Note that the correlation of Nei's distance with kinship is reduced when founder
alleles are used. This is due to the non-linearity with $t/N_e$ of Nei's distance.
Table 2. Mean square errors of prediction (MSEP) of estimated kinship $\hat f$ per pair of animals
within a population. The probability of alleles being alike in state, but not identical by descent, s,
had a value based on the distribution of alleles in the founder population. $t/N_e$ is the time since
establishment of the founder population. The regression estimates were taken from data over the
entire history. Regression factors are from the regression $\hat f = b_0 + b_1 f + \text{error}$

                          No. markers used                                     Regression
No. alleles (a)  t/N_e    5      10     15     20     30     50     200       b_0     b_1    MSEP
2                0.4      0.260  0.179  0.147  0.130  0.108  0.086  0.050     -0.007  1.084  0.042
                 1.0      0.289  0.207  0.166  0.153  0.123  0.097  0.047
5                0.4      0.154  0.109  0.089  0.077  0.065  0.054  0.037     0.002   0.980  0.026
                 1.0      0.177  0.122  0.101  0.088  0.073  0.056  0.032
10               0.4      0.123  0.089  0.076  0.067  0.058  0.048  0.035     0.002   0.999  0.023
                 1.0      0.145  0.104  0.087  0.074  0.059  0.048  0.028
20               0.4      0.107  0.077  0.067  0.059  0.051  0.043  0.034     0.005   0.992  0.021
                 1.0      0.129  0.091  0.076  0.067  0.054  0.042  0.025
Founder          0.4      0.094  0.069  0.059  0.053  0.047  0.040  0.033     0.002   0.992  0.019
                 1.0      0.115  0.082  0.068  0.060  0.049  0.039  0.023

(a) Number of alleles per locus in the founder population. Alleles were assigned randomly with probability
(1/No. alleles), except in the case of founder alleles, where each individual received a unique pair of alleles

Table 3. Proportion of variance explained by the regression of average pairwise similarity $S_{xy}$,
population similarity $S_{ij}$, Nei's standard distance D, Reynolds distance $D_R$ or $F_{ST}$ (from allele
frequencies) at $t/N_e$ = 1 on actual average kinship (calculated from pedigree), $R^2$. Estimates of the
parameters were based on full genetic information (i.e. 200 markers)

                     Parameter
No. alleles/locus    S_xy    S_ij    D (a)   D_R (a)  F_ST
2                    0.944   0.959   0.881   0.870    0.041
5                    0.979   0.983   0.917   0.954    0.831
10                   0.984   0.987   0.905   0.965    0.899
20                   0.984   0.987   0.905   0.965    0.915
Founder              0.990   0.992   0.863   0.971    0.967

(a) Genetic distances were calculated between populations only

Looking over time the relationships between kinship and genetic distance become more
complicated. In Fig. 3a, b scatter plots are given of S and Nei's standard distance,
respectively, versus the true kinship. S was calculated in two alternative ways: averaging all
pairwise similarities, $S_{xy}$, and estimation from allele frequencies, $S_{ij}$. Results were very
similar so they are not presented separately. Both $S_{ij}$ and $S_{xy}$ were calculated from founder
alleles, so $S = \hat f$. The points in the scatter plots represent kinships and the statistics
mentioned above between populations at 10 intervals in time between $t/N_e$ = 0 and
$t/N_e$ = 1 for 20 replicates. The four groups of data points in Fig. 3a, b (from left to right)
correspond to the kinship/distance of population 1 and the cluster of populations
(2, 3, 4, 5), populations 2 and (3, 4, 5), 3 and (4, 5) and the kinship/distance between
populations 4 and 5. In Fig. 3b, each group of data points starts on the x-axis (distance
= 0), as this is the moment where population fission took place (D = 0). Over the next time
interval, the distances increase. The kinship between populations remains the same
however, resulting in a cloud of points directly above the previous ones. Looking at
Fig. 3b, it is clear a distance measure can be associated with any number of combinations of
kinship coefficients, making the interpretation of genetic distances in terms of genetic
diversity ambiguous. Figure 3b shows this relationship for Nei's standard distance, but was
similar for Reynold's distance and $F_{ST}$.

Fig. 3. Scatter plots of between population diversity estimators versus the true kinship. Five
populations were simulated according to Fig. 1. All information (all individuals and all 200 loci) was
included. For all measures founder alleles were assumed. (a) $\hat f$ based on S, (b) Nei's standard genetic
distance
The average kinship $f_{xy}$ between two populations x and y is an estimate of the time, or
rather $t/N_e$, between establishment of the founder populations and the time of divergence of
the two populations. It is approximately equal to inbreeding in the parent population at
time of divergence. After population fission $f_{xy}$ will remain constant, whereas x and y will
drift further apart, resulting in increasing distance estimates between population x and y,
which explains the differences between kinship and distance measures in Fig. 3.
Discussion
Kinship/similarity as measure for genetic diversity
In this article it is argued that average kinship is a good measure of genetic diversity.
Moreover, as can be seen from expression (5) most of the distance and diversity measures
involve terms that estimate kinship. Kinship or similarity indices can be used to assess
genetic diversity within and between populations. For conservation purposes kinship as a
measure of diversity has some properties with intuitive appeal:
(1) Within populations, kinships can generally only increase whereas diversity can only
decrease over time (ignoring mutation).
(2) After population fission, kinship between populations becomes constant very quickly
causing between population diversity to remain constant. The fact that kinships estimated
from allele frequencies remain constant can be seen from the following.
The similarity score for a locus between two populations A and B can be expressed as:
$$S_{AB} = \sum_{i=1}^{I} p_{A,i}\,p_{B,i}$$
$$S_{AB} = \sum_{i=1}^{I} (p_{AB,i} + \Delta p_{A,i})(p_{AB,i} + \Delta p_{B,i}) = \sum_{i=1}^{I} \left(p_{AB,i}^2 + \Delta p_{A,i}\,p_{AB,i} + \Delta p_{B,i}\,p_{AB,i} + \Delta p_{A,i}\,\Delta p_{B,i}\right)$$
and
$$E(S_{AB}) = \sum_{i=1}^{I} p_{AB,i}^2$$
where $p_{x,i}$ is the frequency of allele i in population x, $p_{AB,i}$ is the frequency of allele i in
the parent population of A and B and $\Delta p_{x,i}$ is the change in frequency in
population x since population fission. As the expectation of $\Delta p_{x,i}$ is equal to zero and
there is no covariance between $\Delta p_{A,i}$ and $\Delta p_{B,i}$, the expectation of the similarity score
between populations A and B is constant and equal to the similarity score within the
parent population, just prior to fission. Since the probabilities of alleles AIS, s, are not
expected to change either, the between population kinship is also expected to remain
constant after population fission.
(3) The coefficient of kinship f is defined as the probability that two randomly
sampled alleles drawn from two individuals are identical by descent, which implies that
(1 - f) is the probability they are not identical by descent. (1 - f) can therefore be interpreted
as an upper limit for genetic diversity.
(4) The coefficient of kinship is also involved in the variance of quantitative traits. In
Appendix C it is shown how the minimization of kinship will lead to conservation of
variance of quantitative traits.
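Property (2) can be checked numerically for a simple case. The sketch below assumes a single biallelic locus and one generation of drift in which populations A and B each draw a fixed number of gene copies independently (binomial sampling) from the same parent population; these assumptions and the function names belong to this example only. The expectations are computed exactly by summing over all sampling outcomes.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of drawing k copies of an allele with frequency p in n draws."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def expected_between_similarity(p_parent, n_copies):
    """Exact E(S_AB) after A and B sample n_copies gene copies independently."""
    total = 0.0
    for a in range(n_copies + 1):
        p_a = a / n_copies
        for b in range(n_copies + 1):
            p_b = b / n_copies
            s_ab = p_a * p_b + (1.0 - p_a) * (1.0 - p_b)
            total += binomial_pmf(a, n_copies, p_parent) * binomial_pmf(b, n_copies, p_parent) * s_ab
    return total


def expected_within_similarity(p_parent, n_copies):
    """Exact E(S_AA), with S_AA computed from the sampled allele frequencies."""
    total = 0.0
    for a in range(n_copies + 1):
        p_a = a / n_copies
        total += binomial_pmf(a, n_copies, p_parent) * (p_a ** 2 + (1.0 - p_a) ** 2)
    return total


parent_similarity = 0.3 ** 2 + 0.7 ** 2
between = expected_between_similarity(0.3, 20)
within = expected_within_similarity(0.3, 20)
```

Under these assumptions the expected between-population similarity equals the similarity in the parent population, whereas the expected within-population similarity exceeds it by 2pq/n, in line with properties (1) and (2).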
Between populations the marker-based estimates of f (including between a population
and itself) show relatively low MSEP (Table 1), and are useful as genetic diversity
measures. Between individuals the estimates of f suffer from relatively high MSEP
(Table 2). Using a reasonable number of markers (30-50) which are relatively
polymorphic (5-10 alleles per locus) it is possible to distinguish pairs of animals with low kinship
from pairs of animals with a high degree of kinship. Nevertheless, estimating between individual
kinships from markers is useful even with a low number of marker loci. Use
of these estimates to calculate between population kinships introduces fewer assumptions
about the population structure and implicitly accounts for structures within a population
(herds, for instance).
Estimates of relations between individuals have been developed by many authors
(Thompson 1975; Lynch 1988; Li et al. 1993; Lynch and Ritland 1999). Each of these
estimates has its merits but is not entirely suitable for the purposes that are described in
this article. Either they are not linear with Malécot's coefficient of kinship (Lynch 1988)
or can realistically only be applied within a population. Lynch and Ritland (1999) state
that there are problems with the sampling error of the similarity index used in this
article. However, the case cited in Lynch and Ritland corrects for alleles alike in state by
replacing s in Equation 3 by $J_0$, the expected homozygosity under Hardy-Weinberg
equilibrium. Although this is a good approximation for estimations of first- and second-order
relationships, it should be clear that this is not the desired method when assessing
genetic diversity. Using the expected homozygosity of a population spanning multiple
generations defines the founder population somewhere between the oldest and the
youngest generation in the population. When $J_0$ is used within populations a problem
occurs in that populations cannot be compared for their genetic diversity content.
Furthermore, inbreeding is not accounted for, although this is an important part of
genetic diversity within a population. In practice, the use of $J_0$ as the probability of AIS
leads to negative estimates of the kinship coefficient in cases where the common
ancestor(s) is (are) a member of the oldest generations and is not a matter of sampling
error alone.
All of the above authors and many others have concluded that a large amount of genetic marker data is required to obtain reliable estimates of between individual coefficients of kinship. If pedigree information exists from sources other than genetic marker data (e.g. herd books), it seems advisable that, once populations have been identified for conservation, the existing pedigree information is incorporated to facilitate the selection of individual contributors to a conservation plan or gene bank. This might be carried out by using Wright's (1968) F-statistics:
$$(1 - F_{IT}) = (1 - F_{IS})(1 - F_{ST})$$
where $F_{IT}$ is defined as the total kinship between two individuals within a population. $F_{IS}$ is the kinship between two individuals relative to the present population and can be extracted from the (limited) pedigree information. For $F_{ST}$, the average kinship within the population under study, estimated from genetic marker data (i.e. MEK), is substituted. This method removes a large part of the error of the estimates of kinships between individuals based on marker data only. If pedigree information does not exist, the MEKs can still be used to avoid selection of full sibs or half sibs as contributors.
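The combination of pedigree and marker information described above can be illustrated with a small numerical sketch. The function name `total_kinship` and the input values are hypothetical and only show how the relation between $F_{IT}$, $F_{IS}$ and $F_{ST}$ would be applied; they are not results of this study.

```python
def total_kinship(f_is: float, f_st: float) -> float:
    """Combine a pedigree-based kinship (F_IS, relative to the present
    population) with the marker-estimated average within-population
    kinship (F_ST) using (1 - F_IT) = (1 - F_IS)(1 - F_ST)."""
    return 1.0 - (1.0 - f_is) * (1.0 - f_st)


# Hypothetical inputs: two full sibs according to the herd book
# (F_IS = 0.25) in a population with a marker-estimated average
# kinship (MEK) of 0.10.
f_it = total_kinship(0.25, 0.10)
print(round(f_it, 3))  # 0.325
```

In this sketch the pair-specific information comes from the pedigree, while the markers only supply the population-level term.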
The strength of the presented method is that the same method is applied at every level, from breeds, populations and herds down to individuals, and that, as shown above, it can relatively easily incorporate existing pedigree information. Both MEKs and pedigree information are expressed as kinship coefficients and are therefore easily combined. The result is a comprehensive approach to assessing the genetic diversity that is maintained in a gene bank, which can thus be used to prioritize breeds or populations for genetic conservation.
In this study a genome was simulated consisting of a maximum of 200 autosomal, unlinked loci. In nature, linkage does of course occur and will have an influence on the accuracy with which $f$ is estimated. Accounting for linkage, however, is complicated and lies beyond the scope of this article.
Weitzman (1992) developed criteria which have to be fulfilled by proper measures of diversity (Thaon d'Arnoldi et al. 1998). These criteria are:
(1) The 'twin property', which means that the inclusion of a population identical to a population already in a set of conserved populations must not increase the diversity in the set. In the case of kinship, inclusion of such a population would increase the average kinship, i.e. diversity would be decreased.
(2) The total amount of diversity in a set of populations cannot increase when a population is removed from the set. It can be shown that the average kinship can decrease, i.e. diversity can increase, when a population is removed from the set. However, this can only happen when the between population kinships are (almost) as large as the within population kinships. The latter is not likely to occur in practice.
(3) Continuity in distance: if distances are slightly modified, the change in diversity is slight too. Average kinship is a continuous function, so any small change leads to a small difference in average kinship.
(4) Monotonicity in distance: if distances increase, diversity should increase also; if the kinship between two populations decreases, diversity will increase.
Thus the average kinship as a measure of diversity has some problems with the comparison of sets of unequal sizes, i.e. Weitzman's criteria 1 and 2. These problems do not seem to be very important in practical situations, where the number of populations in the genebank will often be limited and thus constant. The authors are in the process of modifying the average kinship criterion to a weighted average kinship, which should fulfil all of Weitzman's criteria.
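The behaviour of average kinship under criteria 1 and 2 can be checked with a small sketch. It assumes that every population contributes equally, so that the average kinship of a set is simply the mean of all entries of its kinship matrix; the function name and all kinship values are hypothetical.

```python
def average_kinship(kinship):
    """Mean of all entries of a square kinship matrix (within-population
    kinships on the diagonal), assuming equal contributions."""
    n = len(kinship)
    return sum(sum(row) for row in kinship) / (n * n)


# Criterion 1: populations A and B, then A, B and a twin A' of A.
two_pops = [[0.30, 0.10],
            [0.10, 0.30]]
with_twin = [[0.30, 0.10, 0.30],   # A
             [0.10, 0.30, 0.10],   # B
             [0.30, 0.10, 0.30]]   # A' has the same kinships as A
print(round(average_kinship(two_pops), 4))   # 0.2
print(round(average_kinship(with_twin), 4))  # 0.2111

# Criterion 2: the between-population kinship is as large as the
# within-population kinship of A, so removing B lowers the average.
close_pops = [[0.30, 0.30],
              [0.30, 0.50]]
only_a = [[0.30]]
print(round(average_kinship(close_pops), 4))  # 0.35
print(round(average_kinship(only_a), 4))      # 0.3
```

Adding the twin raises the average kinship, so diversity does not increase, whereas in the second case removing a population lowers the average kinship, which is the situation described under criterion 2.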
Kinship and genetic distances
Being proportional to time since divergence, genetic distances create the impression of increasing diversity between two populations, even when there is no change in the actual genetic diversity in terms of allelic diversity or coefficients of kinship. The average kinship within a population can be written as:
$$f_x = f_{xy} + \Delta f_x$$
That is, the within population kinship is the sum of the between population kinship (i.e. the kinship within the population just prior to fission, $f_{xy}$) and the increase in within population kinship since fission ($\Delta f_x$).
In terms of coefficients of kinship, a genetic distance between populations $x$ and $y$ can be written as:
$$d(x,y) = f_x + f_y - 2f_{xy} = \Delta f_x + \Delta f_y$$
This formula implies that the distance between two populations is determined by the increase in within population kinship after population fission. Although $f_{xy}$ stays constant over time, $f_x$ and $f_y$ increase over time, and this results in an increase of the distance between $x$ and $y$ for the same value of $f_{xy}$. However, an increase in within population kinship indicates an increase in homozygosity or inbreeding, causing loss of alleles and genetic variance.
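A short sketch can make this behaviour concrete. It assumes, purely for illustration, that within population kinship increases after fission as in an idealized random mating population of constant effective size (a textbook assumption, not a model taken from this article); the function names and parameter values are hypothetical.

```python
def within_kinship(f_xy: float, n_eff: int, generations: int) -> float:
    """Within-population kinship a number of generations after fission,
    assuming an idealized population of effective size n_eff."""
    return 1.0 - (1.0 - f_xy) * (1.0 - 1.0 / (2.0 * n_eff)) ** generations


def kinship_distance(f_x: float, f_y: float, f_xy: float) -> float:
    """d(x, y) = f_x + f_y - 2 f_xy, i.e. delta_f_x + delta_f_y."""
    return f_x + f_y - 2.0 * f_xy


f_xy = 0.10  # kinship just prior to fission; constant afterwards
for t in (0, 10, 50):
    f_x = within_kinship(f_xy, n_eff=50, generations=t)
    f_y = within_kinship(f_xy, n_eff=25, generations=t)
    d = kinship_distance(f_x, f_y, f_xy)
    print(t, round(f_x, 3), round(f_y, 3), round(d, 3))
```

The distance grows with the number of generations although the between population kinship is held fixed; the growth comes entirely from the increase in within population kinship.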
Considering a set of populations where all within population kinships are equal the
genetic distance between populations is now only determined by the between population
kinships. In such a case a larger distance indicates more genetic diversity, because the
between population kinship is smaller. Hence, a larger genetic distance is only related to a
larger diversity if the within population kinships are equal. If within population kinships
vary, a larger distance can even lead to lower diversity, as the following example illustrates.
Suppose there is a phylogenetic tree as given in Fig. 4. In this figure the similarity scores are given within and between breeds. Nei's genetic distances between (A,B), (A,C) and (B,C) are given in the table in Fig. 4. Since $S_{xy} = \sum_i p_{x,i}\,p_{y,i}$, Nei's distance can be calculated as $D = -\ln(I)$ with $I = S_{xy}/\sqrt{S_{xx}S_{yy}}$. A table of kinship coefficients is also given in Fig. 4. The kinships were calculated using formula (3) and assuming $s = 0.30$, that is, the similarity at the oldest fission in this set of breeds.
If two populations were chosen for conservation based on these distances, the choice would be the pair (A,B), as they have the largest distance between them and seem the furthest apart. However, both the within and the between population kinships are smaller (and consequently the conserved diversity is larger) when the pair (A,C) or (B,C) is chosen for conservation instead of (A,B). The robust method of Weitzman results in population C being the link element in the diversity tree, which implies that the loss of population C is less consequential for the diversity than the loss of any other element. Clearly, the loss of population C in the present example would yield the highest loss of diversity. Genetic distances are useful to picture genetic diversity, for example, in the form of phylogenetic trees. However, genetic distances increase with increasing levels of inbreeding within the populations, that is, as diversity decreases. Genetic distances will therefore conserve populations with the most different allele frequencies, while minimizing kinships attempts to conserve the founder population allele frequencies.

Fig. 4. Hypothetical phylogenetic tree of three breeds. The numbers in the figure are the similarities between (under nodes) and within breeds. The table in the figure gives Nei's genetic distances between the breeds, $D = -\ln(I)$ with $I = S_{xy}/\sqrt{S_{xx}S_{yy}}$, and kinship coefficient estimates between and within populations. The table shows that, even though the pair (A,B) has less diversity (higher between and within population coefficients of kinship), the distance between A and B is larger than the distances between them and C
Generally, measuring genetic diversity with genetic distances is a special case of
measuring genetic diversity with genetic similarity methods such as MEK, in which within
population diversity is assumed to be equal for all populations.
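The contrast between distances and kinships in this section can be reproduced with a small sketch. The similarity values below are hypothetical and are NOT the values of Fig. 4; the kinship correction $f = (S - s)/(1 - s)$ is the one implied by Equation B1 in Appendix B, and the function names are illustrative only.

```python
import math


def nei_distance(s_xy: float, s_xx: float, s_yy: float) -> float:
    """Nei's distance D = -ln(I), with I = S_xy / sqrt(S_xx * S_yy)."""
    return -math.log(s_xy / math.sqrt(s_xx * s_yy))


def kinship(similarity: float, s: float) -> float:
    """Similarity corrected for alleles alike in state: (S - s) / (1 - s)."""
    return (similarity - s) / (1.0 - s)


# Hypothetical similarities within and between three breeds.
S = {("A", "A"): 0.80, ("B", "B"): 0.80, ("C", "C"): 0.35,
     ("A", "B"): 0.40, ("A", "C"): 0.30, ("B", "C"): 0.30}
s = 0.30  # assumed similarity at the oldest fission

for x, y in (("A", "B"), ("A", "C"), ("B", "C")):
    d = nei_distance(S[(x, y)], S[(x, x)], S[(y, y)])
    f_between = kinship(S[(x, y)], s)
    f_pair = (kinship(S[(x, x)], s) + kinship(S[(y, y)], s) + 2 * f_between) / 4
    print(x, y, "D =", round(d, 3), "f_xy =", round(f_between, 3),
          "mean f =", round(f_pair, 3))
```

With these inputs the pair (A,B) has the largest distance but also the highest between and average kinship, so a choice based on distance alone would conserve the least diverse pair.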
Correction for alleles being alike in state
Estimation of kinships with genetic marker data is easiest under the assumption of founder alleles somewhere in the history of the population. Toro et al. (1998) used this assumption in their study of the use of marker information in the live conservation of a single breed. If the assumption of founder alleles is relaxed, the estimate of kinship needs to be corrected for the probability that two alleles are alike in state, $s$. When kinship or the number of alleles per locus is relatively small, the influence of the distribution of alleles in the founder population is considerable (Table 2). There is an advantage in using estimates of $s$ in that it makes weighting over loci possible, which reduces the variance of the estimator (Equation 4). Note that, since a single founding population is assumed, $s$ will be of equal value for all populations and individuals, and the ranking of pairs of individuals or populations is not affected by the assumed value of $s$.
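Why the ranking does not depend on the assumed value of $s$ can be made explicit. The following is a brief sketch that assumes the linear correction implied by Equation B1 (Appendix B) with one common value $s < 1$ for all pairs; the pair labels $ij$ and $kl$ are illustrative.

$$\hat{f}_{ij} = \frac{S_{ij} - s}{1 - s}, \qquad \hat{f}_{ij} - \hat{f}_{kl} = \frac{S_{ij} - S_{kl}}{1 - s}$$

Because $1 - s > 0$, the sign of $\hat{f}_{ij} - \hat{f}_{kl}$ equals the sign of $S_{ij} - S_{kl}$ for every admissible $s$. A different choice of $s$ therefore shifts and rescales all estimates but leaves their order unchanged.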
In a set of populations it can be assumed that $s$ is the value of the between population similarity of the populations descending from the oldest fission (i.e. $s$ equals the smallest between population kinship). In the population structure used in this study this would mean taking the average value of the between population similarity of population 1 and the cluster (2, 3, 4, 5) (see Fig. 1). This defines the generations with parents of 1 and 2 as the founder population. This method requires the fewest assumptions about the character of the founder population: information on the founder population can be inferred from the between population similarity of the two oldest populations or clusters. This seems to be the best approach to the question of founder population definition. It should be noted that the definition of a founder population is artificial. It is a convenient entity to specify more precisely what the relationships are and to minimize the prediction error of kinship estimates using Equation 4. For conservation purposes the estimate of $s$ need not be accurate, because the MEK will still be proportional to the true $f$. This will leave the outcome of a selection procedure of animals for a genebank unaffected, which has been verified in an example (results not shown).
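The choice of $s$ described above can be sketched in a few lines. The similarity matrix, the function names and the split into clusters are hypothetical, and the correction $(S - s)/(1 - s)$ is the one implied by Equation B1; this is an illustration under those assumptions rather than the procedure used for the results in this article.

```python
def estimate_s(similarity, cluster_a, cluster_b):
    """Average between-population similarity across the oldest fission,
    i.e. between the populations in cluster_a and those in cluster_b."""
    values = [similarity[i][j] for i in cluster_a for j in cluster_b]
    return sum(values) / len(values)


def marker_estimated_kinships(similarity, s):
    """Apply the correction (S - s) / (1 - s) to every entry."""
    return [[(value - s) / (1.0 - s) for value in row] for row in similarity]


# Hypothetical similarities for three populations; population 0 split
# off first, populations 1 and 2 separated later.
similarity = [[0.50, 0.31, 0.29],
              [0.31, 0.55, 0.40],
              [0.29, 0.40, 0.60]]

s = estimate_s(similarity, cluster_a=[0], cluster_b=[1, 2])
mek = marker_estimated_kinships(similarity, s)
for row in mek:
    print([round(value, 3) for value in row])
```

Because $s$ is an average across the oldest split, individual between population estimates across that split scatter around zero and can be slightly negative.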
In this study mutation was not accounted for. Mutation will bias information about kinships between and within populations and individuals. However, studies of the effect of mutation on genetic distances generally indicate that these effects will not disturb estimates very much, unless the number of generations and the population size are very large (Slatkin 1995; Nauta and Weissing 1996). In studies of breed formation, both the population size and the time since divergence are expected to be relatively small on an evolutionary scale, and therefore the influence of mutations is not expected to be of great importance.
Generally, when using marker information, it is recommended to use markers that are as polymorphic as possible (Bretting and Widerlechner 1995). The panel of microsatellite markers proposed by FAO for the study of genetic diversity in European cattle (as part of the MoDAD project) was chosen on the basis that the markers had to have at least four different alleles per locus (FAO 1998). Selection of highly polymorphic markers is equivalent to selection of markers with small $s$. Since the method presented in this article includes a correction for $s$, this selection of highly polymorphic markers is not expected to bias the kinship estimates. The marker loci used should, however, display more than two alleles per locus. Writing the estimate of the coefficient of kinship in Jacquard's notation for a locus with only two alleles in the founder population shows that this situation no longer yields an estimate of Malecot's kinship coefficient. This explains the poorer performance of the diversity measures in this article for situations in which only two alleles per locus were used.
Conclusion
Kinship coefficients appear to be of central importance in the definition and measurement of genetic diversity. As the results show, it is possible to obtain estimates of between population kinship with acceptably low MSEP. These estimates may be biased by the unknown $s$ (the probability that two alleles are alike in state but not identical by descent). However, since this bias is expected to be equal for all populations ($s$ being a function of the homozygosity in the founder population; see above), it will not affect the selection of populations for genetic conservation. The MEKs will allow us to identify those populations and individuals that have the least kinship and will therefore help to make optimal use of limited resources for genetic conservation. However, the MSEP of the between individual estimates is such that it is advisable to use existing pedigree information for the selection of individuals of a population that is to be conserved.
Acknowledgements
The authors would like to thank Pim Brascamp, Ab Groen and Kor Oldenbroek for their useful comments on the manuscript.
References
Bernardo, R., 1993: Estimation of coefficient of coancestry using molecular markers in maize. Theor. Appl. Gen. 88: 1055–1062.
Bretting, P. K.; Widerlechner, M. P., 1995: Genetic markers and horticultural germplasm management. Hort. Science 30: 1349–1356.
Crow, J. F.; Kimura, M., 1970: An Introduction to Population Genetics Theory. Harper & Row, New York, USA.
Eding, J. H.; Laval, G., 1999: Measuring the genetic uniqueness in livestock. In: Oldenbroek, J. K. (ed.), Genebanks and the Conservation of Farm Animals Genetic Resources. ID-DLO, Lelystad, the Netherlands.
FAO, 1998: Primary Guidelines for Development of National Farm Animal Genetic Resources Management Plans. FAO, Rome, Italy.
Falconer, D. S.; Mackay, T. F. C., 1996: Introduction to Quantitative Genetics. Longman House, Harlow, UK.
Frankham, R., 1994: Conservation of genetic diversity for animal improvement. In: Smith, C. et al. (eds), Proc. 5th World Congress on Genetics Applied to Livestock Production, Vol. 21. University of Guelph, Guelph, Canada. pp. 385–392.
Haig, S. M.; Ballou, J. D.; Derrickson, S. R., 1990: Management options for preserving genetic diversity: reintroduction of Guam rails to the wild. Conservat. Biol. 4: 290–300.
Hammond, K., 1994: Conservation of domestic animal diversity: global overview. In: Smith, C. et al. (eds), Proc. 5th World Congress on Genetics Applied to Livestock Production, Vol. 21. University of Guelph, Guelph, Canada. pp. 423–430.
Hedrick, P. W., 1974: Genetic similarity and distance: comments and comparisons. Evolution 29: 362–366.
Jacquard, A., 1974: The Genetic Structure of Populations. Springer-Verlag, New York, USA.
Jacquard, A., 1983: Heritability: one word, three concepts. Biometrics 39: 465–477.
Johnston, L. A.; Lacy, R. C., 1995: Genome resource banking for species conservation: selection of sperm donors. Cryobiology 32: 68–77.
Li, C. C.; Weeks, D. E.; Chakravarti, A., 1993: Similarity of DNA fingerprints due to chance and relatedness. Hum. Hered. 43: 45–52.
Lynch, M., 1988: Estimation of relatedness by DNA fingerprinting. Mol. Biol. Evol. 5: 584–599.
Lynch, M.; Walsh, B., 1998: Genetics and Analysis of Quantitative Traits. Sinauer, Sunderland, MA, USA.
Lynch, M.; Ritland, K., 1999: Estimation of pairwise relatedness with molecular markers. Genetics 152: 1753–1766.
Malecot, G., 1948: Les Mathématiques de l'Hérédité. Masson, Paris.
Moazami-Goudarzi, K.; Laloë, D.; Furet, J. P.; Grosclaude, F., 1997: Analysis of genetic relationships between 10 cattle breeds with 17 microsatellites. Anim. Genet. 28: 338–345.
Nagylaki, T., 1998: Fixation indices in subdivided populations. Genetics 148: 1325–1332.
Nauta, M. J.; Weissing, F. J., 1996: Constraints on allele size at microsatellite loci: implications for genetic differentiation. Genetics 143: 1021–1032.
Nei, M., 1972: Genetic distance between populations. Am. Nat. 106: 283–292.
Reynolds, J., 1983: Estimation of the coancestry coefficient: basis for a short-term genetic distance. Genetics 105: 767–779.
Ruane, J., 1999: A critical review of the value of genetic distance studies in conservation of animal genetic resources. J. Anim. Breed. Genet. 116: 317–323.
Slatkin, M., 1995: A measure of population subdivision based on microsatellite allele frequencies. Genetics 139: 457–462.
Takezaki, N.; Nei, M., 1996: Genetic distances and reconstruction of phylogenetic trees from microsatellite DNA. Genetics 144: 389–399.
Thaon d'Arnoldi, C.; Foulley, J.-L.; Ollivier, L., 1998: An overview of the Weitzman approach to diversity. Gen. Sel. Evol. 30: 149–161.
Thompson, E. A., 1975: The estimation of pairwise relationships. Ann. Hum. Genet. 39: 173–188.
Toro, M.; Silio, L.; Rodriganez, J.; Rodriguez, C., 1998: The use of molecular markers in conservation programmes of live animals. Gen. Sel. Evol. 30: 585–600.
Weitzman, M. L., 1992: On diversity. Quart. J. Econ. 107: 363–405.
de Wit, J.; Oldenbroek, J. K.; Van Keulen, H.; Zwart, D., 1995: Criteria for sustainable livestock production: a proposal for implementation. Agric. Ecosys. Environ. 53: 219–229.
Wright, S., 1968: Evolution and the Genetics of Populations, Vol. II. University of Chicago Press, London.
Zheng, Y. Q.; Lindgren, D.; Rosvall, O.; Westin, J., 1997: Combining genetic gain and diversity by considering average coancestry in clonal selection of Norway spruce. Theor. Appl. Genet. 95: 1312–1319.
Appendix A
The 15 states of identity defined by Jacquard are given in Fig. 5, condensed into nine condensed coefficients of identity (taken from Lynch and Walsh 1998). Note that these states of identity presuppose the existence of more than two alleles for a locus.
Ignoring alleles alike in state (AIS), Malecot's coefficient of kinship can be written in terms of these condensed identity coefficients as (Lynch and Walsh 1998):
$$f_{xy} = \Delta_1 + \tfrac{1}{2}(\Delta_3 + \Delta_5 + \Delta_7) + \tfrac{1}{4}\Delta_8$$
The similarity index $S_{xy}$ is defined as given in Table 4, together with the corresponding condensed identity coefficients. Assuming founder alleles and summing over all four possible values then gives:
$$S_{xy} = \Delta_1 + \tfrac{1}{2}(\Delta_3 + \Delta_5 + \Delta_7) + \tfrac{1}{4}\Delta_8 = f_{xy}$$
i.e. assuming founder alleles, $S_{xy}$ is an unbiased estimator of $f_{xy}$. Moreover, $S_{xy}$ will be linear with $f_{xy}$ as long as the number of alleles per locus is larger than two. When only two alleles per locus are assumed, $\Delta_8$ is undefined and $S_{xy}$ is no longer strictly linear with $f_{xy}$. Note that this situation is different from the situation where $\Delta_8$ equals 0, i.e. where more than two alleles were present in the founder population. In the latter case $S_{xy}$ is still linear with $f_{xy}$.
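The four similarity values of Table 4 follow directly from the definition of the similarity index as the probability that one allele drawn at random from each individual is alike in state. A minimal sketch, in which the function name and the representation of a genotype as a two-letter string are assumptions made for illustration:

```python
from itertools import product


def similarity_index(genotype_x: str, genotype_y: str) -> float:
    """Probability that an allele drawn at random from x and an allele
    drawn at random from y are alike in state, for one diploid locus.
    This is the average over the four possible allele comparisons."""
    matches = sum(a == b for a, b in product(genotype_x, genotype_y))
    return matches / 4.0


print(similarity_index("AA", "AA"))  # 1.0
print(similarity_index("AA", "AB"))  # 0.5
print(similarity_index("AB", "AB"))  # 0.5
print(similarity_index("AB", "BC"))  # 0.25
```

Averaging this quantity over loci would give a multi-locus similarity for a pair of individuals.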
Lynch and Ritland (1999) define a coefficient of relatedness, which should estimate twice the kinship coefficient of Malecot:
$$r_{xy} = \frac{\phi_{xy}}{2} + \Delta_{xy}$$
where $\phi_{xy}$ is the probability that one allele in $x$ is IBD with one allele in $y$, and $\Delta_{xy}$ is the probability that both alleles in $x$ are IBD with alleles in $y$. Lynch and Ritland do not account for inbreeding. This removes the probability of individuals being homozygous for alleles IBD. Rewriting $f_{xy}$ and $r_{xy}$ under these terms gives:
$$f_{xy} = \tfrac{1}{2}\Delta_7 + \tfrac{1}{4}\Delta_8 \quad\text{and}\quad r_{xy} = \Delta_7 + \tfrac{1}{2}\Delta_8$$
As can be seen from the above, the estimator of Lynch and Ritland agrees with Malecot's coefficient of kinship if inbreeding is non-existent (in that case $r_{xy} = 2f_{xy}$). However, if individuals are allowed to be homozygous for alleles IBD, i.e. if inbreeding does occur, the estimator presented by Lynch and Ritland can be expressed as:
$$r_{xy} = \Delta_1 + \Delta_3 + \Delta_7 + \tfrac{1}{2}(\Delta_5 + \Delta_8)$$
which no longer agrees with Malecot's coefficient of kinship.

Fig. 5. The nine condensed coefficients of identity for a locus in two individuals. Alleles that are identical by descent are connected by lines (taken from Lynch and Walsh 1998)
Table 4. The four possible values of the similarity index and their corresponding condensed coefficients of identity

Similarity    Value    Identity coefficient
AA – AA       1        $\Delta_1$
AA – AB       1/2      $\Delta_3 + \Delta_5$
AB – AB       1/2      $\Delta_7$
AB – BC       1/4      $\Delta_8$
Total                  $\Delta_1 + \tfrac{1}{2}(\Delta_3 + \Delta_5 + \Delta_7) + \tfrac{1}{4}\Delta_8$

Appendix B
As stated in the main text, the relation between $S$ and the kinship $f_{ij}$ between $i$ and $j$ can be written as:
$$E(S_{ij,l}) = p_{ij,l} = f_{ij} + (1 - f_{ij})\,s_l = s_l + (1 - s_l)\,f_{ij} \tag{B1}$$
where $S_{ij,l}$ is the similarity between two individuals for locus $l$ and $s_l$ is the probability of alleles of locus $l$ being alike in state.
This result leads to the variance of $\hat{f}$ in that
$$\operatorname{var}(\hat{f}_{ij}) = \frac{1}{(1 - s_l)^2}\operatorname{var}(S_{ij,l}) \tag{B2}$$
As $S$ is the probability that two random alleles drawn from two individuals are alike, the distribution of $S$ is binomial. The variance of $S$ between two individuals $i$ and $j$ for a locus $l$ is given as:
$$\operatorname{var}(S_{ij,l}) = p_{ij,l}(1 - p_{ij,l}) \tag{B3}$$
Substituting (B1) in (B3) yields:
$$\operatorname{var}(S_{ij,l}) = f_{ij}(1 - s_l) + s_l - \left[f_{ij}^2(1 - s_l)^2 + 2f_{ij}s_l(1 - s_l) + s_l^2\right] = f_{ij}(1 - s_l)(1 - 2s_l) + s_l(1 - s_l) - f_{ij}^2(1 - s_l)^2 \tag{B4}$$
Substitution of (B4) in (B2) gives:
$$\operatorname{var}(\hat{f}_{ij}) = \frac{f_{ij}(1 - s_l)(1 - 2s_l) + s_l(1 - s_l) - f_{ij}^2(1 - s_l)^2}{(1 - s_l)^2} = \frac{s_l + f_{ij}(1 - 2s_l) - f_{ij}^2(1 - s_l)}{1 - s_l} \tag{B5}$$
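The algebra of this appendix can be checked numerically. The sketch below evaluates the closed form of (B5) and compares it with the direct route through (B1) to (B3); the function names and the chosen values of $f$ and $s$ are hypothetical. The last lines additionally assume that estimates from unlinked loci with equal $s$ are independent and simply averaged, which is a simplification and not the weighting of Equation 4.

```python
import math


def var_kinship_estimate(f: float, s: float) -> float:
    """Single-locus variance of the kinship estimate, closed form (B5)."""
    return (s + f * (1.0 - 2.0 * s) - f * f * (1.0 - s)) / (1.0 - s)


def var_from_similarity(f: float, s: float) -> float:
    """Same quantity via (B1)-(B3): p(1 - p) / (1 - s)^2,
    with p = s + (1 - s) f."""
    p = s + (1.0 - s) * f
    return p * (1.0 - p) / (1.0 - s) ** 2


f = 0.25  # e.g. full sibs in a non-inbred population
for s in (0.1, 0.3, 0.5):
    print(s, round(var_kinship_estimate(f, s), 4),
          round(var_from_similarity(f, s), 4))

# Approximate standard error when averaging over independent loci
# (assumption: equal s per locus, simple unweighted mean).
for n_loci in (30, 200):
    print(n_loci, round(math.sqrt(var_kinship_estimate(f, 0.3) / n_loci), 3))
```

Both routes give the same variance, and the variance rises as $s$ increases, which is in line with the recommendation to use highly polymorphic markers.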
Appendix C
Suppose an animal $i$ has a breeding value $u_i$ for an (unspecified) trait. The total variance of breeding value $u_i$ equals the variance of the mean plus the variance of deviations within the population:
$$\operatorname{var}(u_i) = \operatorname{var}(\bar{u}) + \operatorname{var}(u_i - \bar{u}) \;\Rightarrow\; \operatorname{var}(u_i - \bar{u}) = \operatorname{var}(u_i) - \operatorname{var}(\bar{u})$$
The total amount of genetic diversity in a population is described by $\operatorname{var}(u_i - \bar{u})$, and it is this quantity that needs to be maximized. The total variance of the breeding value, $\operatorname{var}(u_i)$, is fixed and unknown and thus cannot be maximized. Therefore a conservation plan can only affect $\operatorname{var}(\bar{u})$. This last term can be interpreted as the variance of the average breeding value of all possible genebanks assembled from the population under study.
In matrix notation $\operatorname{var}(\bar{u})$ equals $\operatorname{var}(c'u/c'c)$, where $u$ is an $n \times 1$ vector containing the breeding values of the animals in the population and $c$ denotes a vector of ones and zeros indicating which individuals in the total population are selected for conservation.
Now
$$\operatorname{var}(c'u/n_{gb}) = c'\operatorname{var}(u)\,c/n_{gb}^2 = c'\sigma_u^2 A c/n_{gb}^2$$
where $A$ is the relationship matrix and $n_{gb} = c'c$ is the number of individuals in the genebank. Elements $a_{ij}$ of $A$ are the additive genetic relationships between individuals $i$ and $j$, and Malecot's coefficient of kinship is $f_{ij} = 0.5\,a_{ij}$. It can be seen that $\operatorname{var}(\bar{u})$ is proportional to $c'Ac/n_{gb}^2$, which is the average relationship (twice the average kinship) among the selected individuals. Hence it follows that maximization of genetic diversity in any quantitative trait implies minimization of average kinship.
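The quadratic form above can be evaluated for a small, hypothetical example. The relationship matrix, the selection vectors and the function name below are illustrative assumptions; the additive variance is set to 1 so that only the effect of relatedness is visible.

```python
def variance_of_genebank_mean(selection, relationship, sigma2_u=1.0):
    """var(c'u / n_gb) = sigma2_u * c'Ac / n_gb^2 for a 0/1 vector c."""
    n = len(selection)
    n_gb = sum(selection)
    quadratic_form = sum(selection[i] * relationship[i][j] * selection[j]
                         for i in range(n) for j in range(n))
    return sigma2_u * quadratic_form / n_gb ** 2


# Hypothetical additive relationships (a_ij = 2 f_ij) for four
# non-inbred animals: 0 and 1 are full sibs, 2 and 3 are unrelated
# to every other animal.
A = [[1.0, 0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

print(variance_of_genebank_mean([1, 1, 0, 0], A))  # two full sibs: 0.75
print(variance_of_genebank_mean([1, 0, 1, 0], A))  # two unrelated: 0.5
```

Selecting the unrelated pair gives the smaller variance of the genebank mean and therefore leaves the larger share of the total variance as diversity within the genebank.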
Authors' address: H. Eding (corresponding author, e-mail: j.h.eding@id.dlo.nl); T. H. E. Meuwissen, Institute for Animal Science and Health, Box 65, 8200 AB Lelystad, the Netherlands
... The concept of core sets was first proposed in the field of plant breeding and was defined as the minimum set of lines or types of a plant species that would still represent the genetic diversity of that species (Eding & Meuwissen, 2001). The aim of the core set model is to eliminate the genetic overlap between each of these lines. ...
... This was first defined by Malècot as the probability that two alleles, taken at random from two individuals, are identical by descent (Falconer & MacKay, 1996;Frankham et al., 2002;Malècot, 1948). The coefficient of kinship describes genetic diversity in terms of alleles (Caballero & Toro, 2002) and also in terms of quantitative genetic variation, without requiring a detailed knowledge of the genetic processes involved (Eding & Meuwissen, 2001). Minimizing the genetic overlap is equivalent to minimizing kinship in a set of breeds by adjusting contributions of each population or individual to the core set (Eding et al., 2002). ...
Article
Full-text available
The consequences of poor breed management and inbreeding can range from gradual declines in individual productivity to more serious fertility and mortality concerns. However, many small and closed groups, as well as larger unmanaged populations, are plagued by genetic regression, often due to a dearth in breeding support tools which are accessible and easy to use in supporting decision-making. To address this, we have developed a population management tool (BCAS, Breed Conservation and Management System) based on individual relatedness assessed using pedigree-based kinship, which offers breeding recommendations for such populations. Moreover, we demonstrate the success of this tool in 16 years of employment in a closed equine population native to the UK, most notably, the rate of inbreeding reducing from more than 3% per generation, to less than 0.5%, or that attributed to genetic drift, as assessed over the last 16 years of implementation. Furthermore, with adherence to this program, the long-term impact of poor management has been reversed and the genetic resource within the breed has grown from an effective population size of 20 in 1994 to more than 140 in 2020. The development and availability of our BCAS for breed management and selection establish a new paradigm for the successful maintenance of genetic resources in animal populations.
... conservation priorities. Several theoretical approaches to conservation have been published, such as the Weitzman (1992) approach based on genetic distances or minimizing marker-estimated kinships (Eding and Meuwissen, 2001;Caballero and Toro, 2002). However, the usefulness of the currently available algorithms is still a matter of debate (European Cattle Genetic Diversity Consortium, 2006;Toro et al., 2009). ...
Book
Full-text available
The Global Plan of Action for Animal Genetic Resources, adopted in 2007, is the first internationally agreed framework for the management of biodiversity in the livestock sector. It calls for the development of technical guidelines to support countries in their implementation efforts. Guidelines on the Preparation of national strategies and action plans for animal genetic resources were published by FAO in 2008 and are being complemented by a series of guideline publications addressing specific technical subjects. These guidelines on Molecular characterization of animal genetic resources address Strategic Priority Area 1 of the Global Plan of Action – “Characterization, Inventory and Monitoring of Trends and Associated Risks” and particularly complement the guidelines on Phenotypic characterization of animal genetic resources and Surveying and monitoring of animal genetic resources published in the same series. They have been endorsed by the Commission on Genetic Resources for Food and Agriculture. A short overview of progress in molecular characterization of animal genetic resources over the last two decades and prospects for the future is followed by a section that provides practical advice for researchers who wish to undertake a characterization study. Emphasis is given to the importance of obtaining high-quality and representative biological samples, yielding standardized data that may be integrated into analyses on an international scale. Appendices provide a glossary of technical terms; examples of questionnaires; an example of a simple material transfer agreement; a summary of software that can be used to analyse molecular data; and the standard International Society for Animal Genetics–FAO Advisory Group panels of microsatellite markers for nine common livestock species.
... our attempt to investigate genomic inbreeding coefficients is limited and the topic cannot be exhausted to those 10 estimators applied in our study. Other genomic inbreeding estimators exist (Nejati-Javaremi et al., 1997;Eding and Meuwissen, 2001), and recently Nani and VanRaden (2021) investigated the adjustment for the X-chromosome to better scale genomic to pedigree inbreeding coefficients and to account for differences between females and males. Moreover, measures of autozygosity and ROH classes provide additional knowledge on the form of homozygosity (i.e., distinguish IBS from IBD) and may be considered complementary to other genomic inbreeding estimators for organizing matings and controlling inbreeding. ...
Article
Full-text available
The objective of this study was to estimate inbreeding coefficients in Holstein dairy cattle using imputed SNPs data. A data set of 95,540 Italian Holstein dairy cows from the routine genomic evaluations of the Italian National Association of Holstein, Brown, and Jersey Breeders were analyzed, with 84,445 imputed SNP. Ten widely used genomic inbreeding estimators were tested, including 4 PLINK v1.9 estimators (F, FHAT1, FHAT2, FHAT3), 3 genomic relationship matrix (GRM)-based methods [VanRaden's first method with observed allele frequencies (FGRM) or with fixed frequencies at 0.5 (FGRM05), VanRaden's third method, allelic frequency free and pedigree regressed (FGRM2)], runs of homozygosity (ROH)-based estimators in a complete (FROH) and simplified version (FROH2), and proportion of homozygous SNP (FPH). Pairwise comparisons among them were made, including the comparison with traditional pedigree-based inbreeding coefficients (FPED). Our results showed variability among the genomic inbreeding estimators. Coefficients of FGRM and FHAT3 were >1, meaning that more variability has been lost than the variability that existed in the base population. Regarding the remaining ones, FGRM05, FROH, FROH2, and FPH provided coefficients within the [0,1] space and are considered comparable to FPED. Not comparable to FPED, yet with an interpretable value, can be considered the coefficients of F, FHAT2, and FGRM2. Estimators based on ROH had the highest correlation with pedigree-based coefficients (0.59–0.66), among all estimators tested. In this study, Spearman correlations were shown to possibly provide a clearer estimation of the strength of the relationship between estimators. We hypothesize that imputation might cause extreme genomic inbreeding values that deserves further investigation.
... While breed conservation is seen as the protection of rare breeds in developed countries (e.g. Windig et al., 2004Windig et al., , 2007, conservation in the context of developing countries can be appropriately defined as the rational use and protection of existing local genotypes from genetic introgression (Eding and Meuwissen, 2001). ...
Article
Full-text available
Recent improvements in genetic analysis and genotyping methods have resulted in a rapid expansion of the power of molecular markers to address genetic characterization. Microsatellites have emerged as the most popular and versatile marker type for biotechnological applications. Nevertheless, lack of clear vision of where to bring an impact (since the crossbreds were distributed all over the country) and lack of recording at all levels, especially at smallholder farms, ability to make informed decisions on using molecular approaches and creates the risk that some will use microsatellites without understanding the steps needed to evaluate the quality of a genetic data set. The goals of this paper are to provide an overview of the role of microsatellite marker, to encourage the use and consistent reporting of through microsatellite marker to ensure high-quality data and to suggest directions for future improvement of the breed conserving those at risk, with the aim of minimizing the loss of diversity among breeds.
... The genetic diversity has been defined in terms of molecular coancestry distances (Eding and Meuwissen, 2001). Global diversity (GD) and breed diversity are computed by averaging the corresponding values for all the within- or between-breed pairs of individuals. ...
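The averaging described in this snippet can be illustrated with a minimal sketch. It assumes that molecular coancestry between two individuals is scored as the probability that two alleles, one drawn at random from each individual, are identical in state, averaged over loci; the snippet itself does not state the scoring rule, and the function names `molecular_coancestry` and `mean_coancestry` are hypothetical.

```python
from itertools import combinations


def molecular_coancestry(geno_a, geno_b):
    """Average allele-sharing probability between two diploid individuals.

    Each genotype is a list with one (allele1, allele2) tuple per locus.
    Assumption: similarity is identity in state, scored over the four
    possible allele comparisons at each locus.
    """
    total = 0.0
    for (a1, a2), (b1, b2) in zip(geno_a, geno_b):
        matches = (a1 == b1) + (a1 == b2) + (a2 == b1) + (a2 == b2)
        total += matches / 4
    return total / len(geno_a)


def mean_coancestry(genotypes, breeds, breed_x, breed_y):
    """Mean coancestry over all pairs of distinct individuals with one
    member in breed_x and the other in breed_y (within-breed if equal)."""
    values = [
        molecular_coancestry(genotypes[i], genotypes[j])
        for i, j in combinations(range(len(genotypes)), 2)
        if {breeds[i], breeds[j]} == {breed_x, breed_y}
    ]
    if not values:
        raise ValueError("no pairs of individuals for these breeds")
    return sum(values) / len(values)


# Toy data: three individuals genotyped at a single locus.
genotypes = [[(1, 1)], [(1, 2)], [(2, 2)]]
breeds = ["A", "A", "B"]
within_a = mean_coancestry(genotypes, breeds, "A", "A")
between_ab = mean_coancestry(genotypes, breeds, "A", "B")
```

In the toy data, the within-breed average for breed A uses its single pair of individuals, and the between-breed average uses the two A-B pairs.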
Article
Full-text available
The preservation of genetic variability of autochthonous poultry breeds is crucial in global biodiversity. A recent report revealed small breed size and potential risk of extinction of all native Italian poultry breeds; therefore, a correct assessment of their genetic diversity is necessary for a suitable management of their preservation. In this work, we provided an overview of the contribution to poultry biodiversity of some Italian autochthonous breeds reared in conservation centers devoted to local biodiversity preservation. The level of genetic diversity, molecular kinship, inbreeding, contribution to overall genetic diversity, and rate of extinction of each breed were analyzed with a set of 14 microsatellite loci in 17 autochthonous chicken breeds. To evaluate genetic variability, total number (Na) and effective number (Ne) of alleles, observed (Ho) and expected (He) heterozygosity, and F (Wright’s inbreeding coefficient) index were surveyed. The contribution of each analyzed breed to genetic diversity of the whole dataset was assessed using MolKin3.0; global genetic diversity and allelic richness contributions were evaluated. All the investigated loci were polymorphic; 209 alleles were identified (94 of which were private alleles). The average number of alleles per locus was 3.62, and the effective number of alleles was 2.27. The Ne was lower in all breeds due to the presence of low-frequency alleles that can be easily lost by genetic drift, thus reducing the genetic variability of the breeds and increasing their risk of extinction. The global molecular kinship was 27%, the average breed molecular kinship was 53%, and the mean inbreeding rate 43%, with a self-coancestry of 78%. Wright’s statistical analysis showed a 41% excess of homozygotes due to breed genetic differences (34%) and to inbreeding within the breed (9%). Genetic variability analysis showed that 11 breeds were in endangered status.
The contribution to Italian poultry genetic diversity, estimated as global genetic diversity, ranged from 30.2 to 98.5%. In conclusion, the investigated breeds maintain a unique genetic pattern and play an important role in global Italian poultry biodiversity, providing a remarkable contribution to genetic variability.
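The abstract above contrasts the total number of alleles (Na) with the effective number (Ne) and notes that low-frequency alleles pull Ne down. A minimal sketch of the standard single-locus definitions, Ne = 1 / Σp² and He = 1 − Σp², shows the effect; the function name `allele_diversity` and the example frequencies are illustrative assumptions, not values from the study.

```python
def allele_diversity(freqs):
    """Return Na, Ne and He for one locus from its allele frequencies.

    Assumes the frequencies sum to 1. Ne is the effective number of
    alleles (1 / sum of squared frequencies) and He the expected
    heterozygosity under random mating (1 - sum of squared frequencies).
    """
    homozygosity = sum(p * p for p in freqs)
    return {
        "Na": sum(1 for p in freqs if p > 0),
        "Ne": 1.0 / homozygosity,
        "He": 1.0 - homozygosity,
    }


# Two loci with the same number of alleles but different evenness.
even = allele_diversity([0.5, 0.5])    # both alleles common
skewed = allele_diversity([0.9, 0.1])  # one rare allele
```

Both loci carry two alleles, but the skewed locus has an effective number of alleles close to 1, which is the pattern the abstract attributes to rare alleles at risk of loss through drift.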
... Additionally, different inbreeding coefficients (FIS, FROH) or parameters for the assessment of coancestry (e.g., kinship coefficients) were computed. This is consistent with the findings of Eding and Meuwissen [8] and Toro et al. [9], which emphasize that the assessment of marker-estimated kinships can provide information on genetic diversity within and between breeds, especially in the context of conservation. It is also evident from the present results that the frequency of assessed diversity parameters depends on the genotyping technique. ...
Article
Full-text available
Globally, many local farm animal breeds are threatened with extinction. However, these breeds contribute to the high amount of genetic diversity required to combat unforeseen future challenges of livestock production systems. To assess genetic diversity, various genotyping techniques have been developed. Based on the respective genomic information, different parameters, e.g., heterozygosity, allele frequencies and inbreeding coefficient, can be measured in order to reveal genetic diversity between and within breeds. The aim of the present work was to shed light on the use of genotyping techniques in the field of local farm animal breeds. Therefore, a total of 133 studies across the world that examined genetic diversity in local cattle, sheep, goat, chicken and pig breeds were reviewed. The results show that diversity of cattle was most often investigated with microsatellite use as the main technique. Furthermore, a large variety of diversity parameters that were calculated with different programs were identified. For 15% of the included studies, the used genotypes are publicly available, and, in 6%, phenotypes were recorded. In conclusion, the present results provide a comprehensive overview of the application of genotyping techniques in the field of local breeds. This can provide helpful insights to advance the conservation of breeds.
... The identity by state reflected in the molecular relationship rMij and the identity by descent (IBD) reflected in the pedigree relationships Aij have a well-known relationship that is periodically revisited (Li and Horvitz 1953; Eding and Meuwissen 2001; Powell, Visscher, and Goddard 2010; Toro, García-Cortés, and Legarra 2011). A formal derivation can be found in Cockerham (1969) (see also Toro, García-Cortés, and Legarra 2011). ...
Article
The procedures outlined by the Food and Agriculture Organisation of the United Nations (FAO) guidelines for managing small populations at risk are reviewed. These cover identification of breeds at risk, prioritising and deciding upon actions, managing in vivo populations at risk, and managing gene banks of cryoconserved material.
Article
The usefulness of molecular genetic markers as a tool for the conservation, characterisation and differentiation of domestic animal populations is shown in the following text, which summarises diverse applications to Iberian pigs.
Book
The overall goal of this study was to characterize the genetic diversity of the Vietnamese local chicken breeds and to identify population priorities for conservation. The specific aims were 1) to assess and explain the population genetic structure of the Vietnamese breeds, 2) to characterize the Vietnamese breeds in relation to the Chinese breeds and wild chickens, 3) to estimate conservation potentials for conservation priorities of the Vietnamese breeds, and 4) to define an optimal allocation of limited conservation funds to them.
Article
A new measure of the extent of population subdivision as inferred from allele frequencies at microsatellite loci is proposed and tested with computer simulations. This measure, called R(ST), is analogous to Wright's F(ST) in representing the proportion of variation between populations. It differs in taking explicit account of the mutation process at microsatellite loci, for which a generalized stepwise mutation model appears appropriate. Simulations of subdivided populations were carried out to test the performance of R(ST) and F(ST). It was found that, under the generalized stepwise mutation model, R(ST) provides relatively unbiased estimates of migration rates and times of population divergence while F(ST) tends to show too much population similarity, particularly when migration rates are low or divergence times are long [corrected].
Article
The coefficient of coancestry (fAB) between individuals A and B is the classical measure of genetic relationship. fAB is determined from pedigree records and is the probability that random alleles at the same locus in A and B are copies of the same ancestral allele or identical by descent (ibd). Recently, the proportion of molecular marker variants shared between A and B (SAB) has been used to measure genetic relationship. But SAB is an upwardly biased estimator of fAB, especially between distantly related lines. fAB, SAB, and adjusted (to remove bias) estimates of molecular marker similarity (fABM) were compared. RFLP banding patterns at 46 probe-restriction enzyme combinations were obtained for 23 maize inbred lines derived from the Iowa Stiff Stalk Synthetic (BSSS) maize (Zea mays L.) population, and for 4 non-BSSS lines. fABM was estimated as {Mathematical expression}, where δA (or δB) was the average proportion of RFLP variants shared between inbred A (or inbred B) and the non-BSSS lines. The average fAB among 253 pairwise combinations of BSSS lines was 0.212, whereas the average SAB was 0.397. The average fABM was 0.162, indicating that the upward bias in SAB was effectively removed. SAB and fAB were significantly different (α = 0.05) in 76.3% of the comparisons, whereas 24.9% of the fABM values differed significantly from fAB. The latter result suggests that selection and/or drift were present during inbred line development and that fAB may not be an accurate measure of the true proportion of ibd alleles between two lines. Cluster analyses based on SAB and fABM grouped lines according to pedigree, although several exceptions were noted. The presence of shared molecular marker variants between unrelated lines must be considered when setting SAB-based minimum distances for varietal protection. Under simplified conditions, more than 250 molecular marker loci are necessary to obtain sufficiently precise estimates of coefficient of coancestry using molecular markers.
Article
Introduction: There has been a veritable explosion of projects in recent years aiming to calculate genetic distances between domesticated breeds of animals and the number of such projects is still increasing. The extent of this can be appreciated by a glance at the recent proceedings of the 26th International Conference on Animal Genetics (ISAG 1998). All domesticated species are being targeted, using breeds from both developed and developing countries and projects are now almost exclusively based on microsatellite marker loci. Although the goal in a few cases is to provide insights into the history of animal domestication (see, for example, MacHugh et al. 1997; Lau et al. 1998), the most common justification for genetic distancing projects is their importance for helping the decision-makers to identify genetically unique breeds so that they may be prioritized for breed conservation purposes (e.g. Hall and Bradley 1995; Moazami-Guodarzi et al. 1997; Crawford and Littlejohn 1998). On a world-wide basis there are roughly 3000 breeds and breed varieties of the seven major mammalian species – cattle, pig, sheep, goat, horse, donkey and buffalo (FAO 1995). Of those with population data, 23% are either endangered or critical (FAO 1995). In addition, in the current century it is estimated that at least 600 breeds have been lost (Hall and Ruane 1993). On the poultry side, the picture is even worse as over half the breeds of the five major species (chicken, domestic duck, muscovy duck, goose and turkey) are thought to be endangered or critical (FAO 1995). At the same time, there is often a lack of even the most rudimentary information on many of these breeds. Basic phenotypic data, including approximate figures for population sizes, are currently available on only 50% of the world’s animal genetic resources (AGR) (Hammond 1998). There is therefore an urgent need to act now to prevent the rapid erosion of AGR. This is especially true for breeds in developing countries, where many will be lost without ever having been adequately characterized or studied (Köhler-Rollefson 1997). However, resources (both in terms of available manpower and finances) are limited in this area and appropriate use of these resources is therefore of vital importance. Given the large amount of current activity in the area of genetic distancing of domestic breeds, the aim of this article is to critically examine the value that genetic distance projects have for breed conservation.
Article
Population management programs recognize the importance of managing genetic diversity in species that are candidates for eventual reintroduction to natural habitats. The planned 1989 release of captive‐born Guam rails (Rallus owstoni), extinct in the wild since 1986, to the Northern Mariana island of Rota provides an opportunity to evaluate various management options for selecting breeders to produce young rails for release. Six options were compared to determine which one best replicated genetic diversity in the original captive founder population. Heterozygosity, allelic diversity, founder contribution, and founder genome equivalents were used as indicators of genetic diversity. Option 1: Randomly choose adults for breeding. Option 2: Choose the most fecund captive breeders. Option 3: Use allozyme data to choose parents that will produce the most genetically diverse chicks. Option 4: Choose pairs to equalize founder contribution in the population. Option 5: Choose pairs to maximize allelic diversity. Option 6: Choose pairs to maximize founder genome equivalents. Genetic management options based on pedigree analysis (#4, 5, 6) produced the most genetically diverse release populations for Rota. Managing founder genome equivalents produced a balance between equalizing founder contribution and maximizing allelic diversity, and provided the most genetically diverse population. Randomly selecting breeding pairs, choosing the best captive breeding stock, or managing by allozyme data resulted in substantially reduced genetic diversity. Results illustrate that some of the most common approaches to population management or population reintroduction may produce significant loss of genetic diversity, whereas certain genetic management options may actually increase genetic diversity over current population levels.
Article
 Genetic relationship within a population can be measured by average coancestry. This can also be expressed as an effective number which represents the relative genetic diversity of the population. The goal of breeding can be formulated to maximise genetic value minus average coancestry times a constant (the “penalty constant”). An iterative search algorithm can then be used to find the best selections for meeting this goal. Two such algorithms, one for a fixed number of selections and the other for a variable optimum number, were applied to select a mixture of field-tested Norway spruce clones with known parents. The results were compared with those from the conventional method of restricting parental contributions to the selected population as a means to control diversity. Coancestry-adjusted selection always yielded more gain than restricted selection at a given effective population size (except under circumstances where the methods were equivalent). Expressed another way, at any given level of gain, coancestry-adjusted selection maintained a larger effective population size than did restricted selection. The relative superiority of coancestry-adjusted selection declined when the effective population size approached the lowest value, that at which no penalty or restriction was applied. The method was extended by the second search algorithm to optimise the selected number of clones. The optimal number of clones can be rather large when diversity is heavily valued, but the reduction in genetic gain becomes large.
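The selection goal stated in this abstract, genetic value minus average coancestry times a penalty constant, can be sketched for a fixed number of selections. The sketch below is not the iterative search algorithm of the paper: it swaps in an exhaustive search over all subsets, which is only practical for a handful of candidates. It also assumes that average coancestry is taken over all ordered pairs including self-coancestries; the function names and the toy values are hypothetical.

```python
from itertools import combinations


def objective(selected, values, coancestry, penalty):
    """Mean genetic value minus penalty times average coancestry.

    Assumption: average coancestry is the mean of the coancestry matrix
    over all ordered pairs of selected candidates, self-pairs included.
    """
    n = len(selected)
    mean_value = sum(values[i] for i in selected) / n
    mean_coancestry = sum(
        coancestry[i][j] for i in selected for j in selected
    ) / (n * n)
    return mean_value - penalty * mean_coancestry


def best_subset(values, coancestry, n_select, penalty):
    """Exhaustive stand-in for an iterative search: try every subset."""
    return max(
        combinations(range(len(values)), n_select),
        key=lambda s: objective(s, values, coancestry, penalty),
    )


# Toy data: candidates 0 and 1 are full sibs with the highest values.
values = [10.0, 9.9, 9.0, 8.0]
coancestry = [
    [0.50, 0.25, 0.00, 0.00],
    [0.25, 0.50, 0.00, 0.00],
    [0.00, 0.00, 0.50, 0.00],
    [0.00, 0.00, 0.00, 0.50],
]
unpenalised = best_subset(values, coancestry, 2, penalty=0.0)
penalised = best_subset(values, coancestry, 2, penalty=10.0)
```

With no penalty the two sibs are chosen together; with a large penalty constant the second sib is replaced by an unrelated candidate of slightly lower value, which is the trade-off between gain and diversity that the abstract describes.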
Article
After discussing some general problems in measuring sustainability, an identification of measurable criteria for the major agroecological problems is proposed, derived from explicit issues of unsustainability. The proposed criteria are briefly discussed. Factors which might influence the effect of inclusion of livestock in an agricultural system on each criterion are also discussed. It is argued that identification of livestock-specific criteria is impossible because of the large heterogeneity of livestock production systems and the non-linear relation between livestock-specific criteria and agroecological criteria. Therefore, a system-specific analysis is needed to assess the overall effect of livestock inclusion in an agricultural system on each of the proposed general criteria for sustainability. These are: demand and supply of consumable livestock products; potential human population supporting capacity; land area utilized for agriculture; degree of equity in food distribution; variability of production; net annual soil losses; nutrient balances and losses; water availability and utilization; soil organic matter; fossil energy and drug utilization. Such a system-specific analysis will also allow formulation of measurable criteria for other objectives, and an assessment of trade-offs between the criteria. Recognition of such trade-offs, together with the reduced acceptability of external effects (both in time and space), might appear to be the most important notion of the sustainability concept.
Article
A measure of genetic distance (D) based on the identity of genes between populations is formulated. It is defined as D = -log_e I, where I is the normalized identity of genes between two populations. This genetic distance measures the accumulated allele differences per locus. If the rate of gene substitution per year is constant, it is linearly related to the divergence time between populations under sexual isolation. It is also linearly related to geographical distance or area in some migration models. Since D is a measure of the accumulated number of codon differences per locus, it can also be estimated from data on amino acid sequences in proteins even for a distantly related species. Thus, if enough data are available, genetic distance between any pair of organisms can be measured in terms of D. This measure is applicable to any kind of organism without regard to ploidy or mating scheme.
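The definition D = -log_e I can be made concrete with a short sketch. It assumes the usual form of the normalized identity, I = J_XY / sqrt(J_X · J_Y), where J_X, J_Y and J_XY are the sums of squared or cross-multiplied allele frequencies averaged over loci; the abstract itself gives only "normalized identity", and the function name and input layout are hypothetical.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Genetic distance D = -ln(I) between two populations.

    freqs_x and freqs_y are lists with one dict per locus, mapping each
    allele to its frequency in that population. Assumption: the
    normalized identity is I = J_XY / sqrt(J_X * J_Y), with each J
    averaged over loci.
    """
    j_x = j_y = j_xy = 0.0
    for p_x, p_y in zip(freqs_x, freqs_y):
        alleles = set(p_x) | set(p_y)
        j_x += sum(p_x.get(a, 0.0) ** 2 for a in alleles)
        j_y += sum(p_y.get(a, 0.0) ** 2 for a in alleles)
        j_xy += sum(p_x.get(a, 0.0) * p_y.get(a, 0.0) for a in alleles)
    n_loci = len(freqs_x)
    identity = (j_xy / n_loci) / math.sqrt((j_x / n_loci) * (j_y / n_loci))
    if identity == 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(identity)


# One locus: population X is polymorphic, population Y is fixed for A.
pop_x = [{"A": 0.5, "B": 0.5}]
pop_y = [{"A": 1.0}]
d_same = nei_standard_distance(pop_x, pop_x)
d_diff = nei_standard_distance(pop_x, pop_y)
```

Identical populations give a distance of zero, and the distance grows as the allele frequencies of the two populations diverge.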