Estimation of Pairwise Relatedness With Molecular Markers

September 1999
Genetics 152(4):1753-66

DOI:10.1093/genetics/152.4.1753

Authors:

Michael Lynch

University of Oregon

Kermit Ritland

University of British Columbia

Michael Lynch* and Kermit Ritland†

*Department of Biology, University of Oregon, Eugene, Oregon 97403 and †Department of Forest Sciences, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada

Manuscript received June 26, 1998

Accepted for publication April 19, 1999

ABSTRACT

Applications of quantitative genetics and conservation genetics often require measures of pairwise relationships between individuals, which, in the absence of known pedigree structure, can be estimated only by use of molecular markers. Here we introduce methods for the joint estimation of the two-gene and four-gene coefficients of relationship from data on codominant molecular markers in randomly mating populations. In a comparison with other published estimators of pairwise relatedness, we find these new "regression" estimators to be computationally simpler and to yield similar or lower sampling variances, particularly when many loci are used or when loci are hypervariable. Two examples are given in which the new estimators are applied to natural populations, one that reveals isolation-by-distance in an annual plant and the other that suggests a genetic basis for a coat color polymorphism in bears.

Corresponding author: Michael Lynch, Department of Biology, University of Oregon, Eugene, OR 97403. E-mail: mlynch@oregon.uoregon.edu

Genetics 152: 1753–1766 (August 1999)

Coefficients of relationship between pairs of individuals play a central role in many areas of genetics and behavioral ecology. For example, in quantitative genetics, the phenotypic resemblance of relatives, which forms the basis for the empirical estimation of components of genetic variance, is a direct function of the probability that individuals have one or two genes identical by descent at a locus. Given such probabilities, causal components of variance (such as the additive and dominance genetic variance) can be estimated from the phenotypic covariance (Falconer and Mackay 1996; Lynch and Walsh 1998). In studies of laboratory or domesticated populations, where investigators can be certain of the degrees of relationship among observed individuals, the application of conventional quantitative-genetic methodology is straightforward. Major uncertainties about the relationships among individuals from natural populations are the primary impediment to extending quantitative-genetic analysis to field studies, but Ritland (1989, 1996a) has suggested how this problem might be overcome by regressing pairwise measures of phenotypic similarity on pairwise estimates of relatedness obtained with molecular markers.

Pairwise measures of relatedness also play a role in the field of conservation genetics. For example, in captive breeding programs, substantial effort is being made to ensure that matings are minimized between close relatives to reduce the loss of genetic variation by random genetic drift. If the potential parents are derived directly from wild-caught stock or are descendants of individuals of unknown relationship, a relative ranking of degrees of relatedness can only be achieved through inferences with molecular markers (Avise 1995).

A third field of inquiry within which pairwise relatedness plays a significant role is the evolution of social behavior. Studies in this area are largely focused around Hamilton's (1964) theory of kin selection, which states that the evolutionary advantage of an altruistic act depends on whether the cost to the donor exceeds the benefit to the recipient multiplied by the relatedness between the two individuals. Because most such studies involve field populations where parentage is not directly observed, indirect inferences about relatedness must again be made with molecular markers.

In all of the above-mentioned applications of molecular markers, it is an implicit assumption that such markers provide reasonable, if not excellent, estimates of relatedness coefficients. Yet, there are few existing methods for the estimation of pairwise relatedness for which the statistical properties are well understood or well behaved. Several estimators have been developed for pairwise relatedness using the rather specialized data provided by DNA-fingerprint profiles (Lynch 1988; Li et al. 1993; Geyer and Thompson 1995). Following up on earlier work of Pamilo and Crozier (1982), Queller and Goodnight (1989) developed marker-based estimators for within-group relatedness, but these are of somewhat limited applicability in the estimation of pairwise relationship because of their poor behavior with diallelic loci. An efficient method-of-moments estimator, recently developed by Ritland (1996b), provides a basis for the joint estimation of identity-by-descent at both the genic and genotypic levels. Ritland's approach, which is based on a model involving joint probabilities of the two genotypes of a pair, can be quite complex computationally and is ill-behaved with some gene frequencies. Maximum-likelihood methods have
been developed by Thompson (1975, 1976, 1986) to test for specific types of relationship.

In this article, we introduce a simple method for obtaining unbiased estimates of pairwise relationship coefficients. Its simplicity arises from the use of a regression approach for inferring relationship: one individual of a pair serves as a "reference," and the probabilities of the locus-specific genotypes in the other "proband" individual are conditioned on those of the reference. Aside from its ease of application and unbiased nature, this method has two very useful features: it generates joint estimates of both the two- and four-gene coefficients of relatedness, and it yields simple expressions for the sampling variance of these coefficients. This latter feature provides a convenient means for optimizing the use of information derived from different loci. Following our derivation of the regression method, we compare its performance against that of other methods and then provide two examples of its application to studies of natural populations.

JOINT ESTIMATION OF TWO-GENE AND FOUR-GENE COEFFICIENTS

Throughout, we focus on the traditional definition of relatedness for individual pairs of diploid individuals, $r_{xy} = 2\Theta_{xy}$, where the coefficient of coancestry, $\Theta_{xy}$, is the probability that, for any autosomal locus, a random gene taken from individual $x$ is identical by descent with a random gene taken from individual $y$. For monozygotic twins (and clonemates), $r_{xy} = 1$; for parent-offspring and full-sib relationships, $r_{xy} = 0.5$; and for second- and third-order relationships, $r_{xy}$ is equal, respectively, to 0.25 and 0.125.

The relatedness coefficient for two individuals ($x$ and $y$) is a linear function of two "higher-order" coefficients,

$$r_{xy} = \frac{\phi_{xy}}{2} + \Delta_{xy}. \tag{1}$$

If we consider all four genes possessed by two individuals at a locus, $\phi_{xy}$ is the probability that a single gene in $x$ is identical by descent with one in $y$, and $\Delta_{xy}$ is the probability that each of the two genes in $x$ is identical by descent with one in $y$. For parents and offspring, $\phi_{xy} = 1$ and $\Delta_{xy} = 0$; for full sibs, $\phi_{xy} = 0.5$ and $\Delta_{xy} = 0.25$; and for half sibs, $\phi_{xy} = 0.5$ and $\Delta_{xy} = 0$, which by Equation 1 gives the second-order value $r_{xy} = 0.25$. For many applications, such a subdivision of $r_{xy}$ is unnecessary, but in quantitative genetics, a knowledge of the higher-order coefficient $\Delta_{xy}$ is desirable because the expected genetic covariance between individuals is defined to be

$$\sigma_G(x, y) = r_{xy}\,\sigma_A^2 + \Delta_{xy}\,\sigma_D^2,$$

where $\sigma_A^2$ and $\sigma_D^2$ are the additive and dominance components of genetic variance for a quantitative trait. This expression assumes a random-mating population, which we also assume throughout. Inbreeding introduces the need for additional higher-order coefficients of relatedness and genetic components of variance (Cockerham 1971; Jacquard 1974). Higher-order terms must also be added to the previous expression when epistatic sources of genetic variance are present, but provided the population is randomly mating, no relationship coefficients are required beyond $r_{xy}$ and $\Delta_{xy}$ (Kempthorne 1954; Lynch and Walsh 1998).

In the following analyses, we focus on the estimation of $r_{xy}$ and $\Delta_{xy}$, as these are the relationship coefficients that are of primary practical utility. Our computer simulations showed that estimates of $\phi_{xy}$ have much higher sampling variance than those of $r_{xy}$ and $\Delta_{xy}$, enough so that the accurate measurement of $\phi_{xy}$ is beyond reach unless very large numbers of informative loci can be assayed. This large sampling variance does not carry over greatly to estimates of the composite measure $r_{xy}$, because there is also a very large negative sampling covariance between the two component coefficients, $\phi_{xy}$ and $\Delta_{xy}$.

Genotypic probabilities: There are two fundamental ways to set up a model for the genotypic probabilities in a pair of individuals. The first approach, adopted by Ritland (1996b), specifies the joint probability of both genotypes. The second approach, adopted here, specifies the conditional genotypic probability of a proband individual $y$, given the genotype of the reference individual $x$. We refer to these two approaches as "correlation" and "regression" methods in the sense that they are symmetrical vs. asymmetrical measures. Both approaches allow the joint estimation of $r_{xy}$, $\phi_{xy}$, and $\Delta_{xy}$, but as we will see, correlation and regression estimators differ substantially in terms of complexity and statistical properties. It is important to note that our use of the terms correlation and regression refers to the underlying statistical model and not to the estimators themselves. The estimators developed here and in Ritland (1996b) are more properly termed "method-of-moments" estimators.

Consider a single locus with $n$ alleles, and let $x$ be the reference individual (with alleles $a$ and $b$) and $y$ be the proband individual (with alleles $c$ and $d$). The conditional probabilities for the $n(n + 1)/2$ possible genotypes in $y$ can be expressed as a function of $\phi_{xy}$, $\Delta_{xy}$, and the known allele frequencies,

$$P(y = cd \mid x = ab) = P_0(cd)\,(1 - \phi_{xy} - \Delta_{xy}) + P_1(cd \mid ab)\,\phi_{xy} + P_2(cd \mid ab)\,\Delta_{xy}, \tag{2}$$

where $P_0(cd)$ is the Hardy-Weinberg probability of genotype $cd$, and $P_1(cd \mid ab)$ and $P_2(cd \mid ab)$ denote the probabilities of genotype $cd$ in $y$ given genotype $ab$ in $x$, the first being conditional on the two individuals having one gene identical by descent and the second being conditional on two genes being identical by descent.

Regression estimators: Equation 2 provides the foundation for the regression-based estimators that we now
explore. To illustrate the general approach, we first derive estimators conditioned on the observation of a homozygote reference genotype. In this straightforward case, two probabilities are informative about $x$'s relationship with individual $y$: $P(ii \mid ii)$ and $P(i\cdot \mid ii)$, the conditional probabilities that the two individuals have two vs. one pair of genes identical in state at the locus, with a dot denoting any allele other than $i$. The probability of no genes identical in state, $P(\cdot\cdot \mid ii)$, provides no additional information, as it simply equals $[1 - P(ii \mid ii) - P(i\cdot \mid ii)]$. Letting $p_i$ be the frequency of the $i$th allele, from Equation 2,

$$P(ii \mid ii) = p_i^2 + p_i(1 - p_i)\,\phi_{xy} + (1 - p_i^2)\,\Delta_{xy} \tag{3a}$$

$$P(i\cdot \mid ii) = 2p_i(1 - p_i) + (1 - p_i)(1 - 2p_i)\,\phi_{xy} - 2p_i(1 - p_i)\,\Delta_{xy}. \tag{3b}$$

Assuming that we know the allele frequency $p_i$ in advance, these two equations can be rearranged to yield estimators for the two unknown relationship coefficients,

$$\hat{\phi}_{xy} = \frac{(1 + p_i)\,\hat{P}(i\cdot \mid ii) + 2p_i\,\hat{P}(ii \mid ii) - 2p_i}{(1 - p_i)^2} \tag{4a}$$

$$\hat{\Delta}_{xy} = \frac{p_i^2 - p_i\,\hat{P}(i\cdot \mid ii) + (1 - 2p_i)\,\hat{P}(ii \mid ii)}{(1 - p_i)^2}, \tag{4b}$$

and from Equation 1,

$$\hat{r}_{xy} = \frac{\hat{P}(i\cdot \mid ii) + 2\hat{P}(ii \mid ii) - 2p_i}{2(1 - p_i)}. \tag{4c}$$

Throughout, we use a ∧ to distinguish an estimator from its parametric value. For any pair of observed individuals, the two probabilities necessary for the solution of these equations, $\hat{P}(i\cdot \mid ii)$ and $\hat{P}(ii \mid ii)$, are estimated as 0/1 variables, with 1's being given to observed two-genotype combinations and 0's being given to unobserved combinations. (Both probabilities are 0 if the proband has no alleles in common with the reference.) Thus, for example, when individual $y$ contains 2, 1, and 0 $i$ alleles, the estimate $\hat{r}_{xy}$ is 1, $(1 - 2p_i)/[2(1 - p_i)]$, and $-p_i/(1 - p_i)$, respectively.

The appendix provides a parallel set of results for heterozygotes at diallelic and multiallelic loci. Diallelic heterozygous reference individuals introduce no new problems, but with multiallelic loci, there are six classes of conditional probabilities for heterozygous reference individuals. In the latter case then, the number of observed 0/1 variables exceeds the number of unknowns ($\phi$ and $\Delta$). To deal with this situation, we provide a weighted least-squares approximation.

A general estimator, which covers all three cases, is best described by introducing "indicator variables" for the sharing of pairs of alleles (as opposed to more complex patterns of sharing as used earlier). As before, let the reference individual have alleles $a$ and $b$ and the proband individual alleles $c$ and $d$. If the reference individual is homozygous, $S_{ab} = 1$, while if it is heterozygous, $S_{ab} = 0$. Likewise, if allele $a$ from the reference individual is the same as allele $c$ from the proband, $S_{ac} = 1$, while $S_{ac} = 0$ if it is different. In total, there are six $S$'s corresponding to the six ways of choosing two objects without replacement from a pool of four objects. Letting $p_a$ and $p_b$ be the frequencies of alleles $a$ and $b$ in the population, the fully general expressions for the two coefficients of primary interest are

$$\hat{r}_{xy} = \frac{p_a(S_{bc} + S_{bd}) + p_b(S_{ac} + S_{ad}) - 4p_a p_b}{(1 + S_{ab})(p_a + p_b) - 4p_a p_b} \tag{5a}$$

$$\hat{\Delta}_{xy} = \frac{2p_a p_b - p_a(S_{bc} + S_{bd}) - p_b(S_{ac} + S_{ad}) + (S_{ac}S_{bd}) + (S_{ad}S_{bc})}{(1 + S_{ab})(1 - p_a - p_b) + 2p_a p_b}. \tag{5b}$$

In actual practice, there is no particular reason to use one member of a pair of individuals as the reference as opposed to the other member. Thus, the reciprocal estimates $\hat{r}_{xy}$ and $\hat{r}_{yx}$, etc., can be arithmetically averaged to further refine the pairwise relationship estimates for the pair of individuals $x$ and $y$. In all of the following analyses, we rely on such reciprocal estimates, as the arithmetic average of the two reciprocal estimates generally has a lower statistical variance than a single estimate. In principle, the root of the product of the two reciprocal estimates could be used, but this leads to undefined estimates in the event that one is negative.

Multilocus estimates: Estimates of relatedness are usually based on data from multiple loci. Under the assumption that the marker loci are unlinked, the locus-specific estimates are independent. However, any averaging of the locus-specific estimates to obtain overall estimates of $r_{xy}$ and $\Delta_{xy}$ should account for the dramatic among-locus differences of sampling variance that can arise from both differences in reference genotypes (e.g., common homozygote vs. rare heterozygote) and in levels of variation (loci with more alleles being more informative).

Let $w_{r,x}(\ell)$ and $w_{\Delta,x}(\ell)$ denote the weights to be used for the $\ell$th locus in the overall estimates of $r_{xy}$ and $\Delta_{xy}$ and let $W_{r,x}$ and $W_{\Delta,x}$ be the sums of the weights over all $L$ loci. The composite estimates of the relationship coefficients for $x$ and $y$ are then

$$\hat{r}_{xy} = \frac{1}{W_{r,x}} \sum_{\ell=1}^{L} w_{r,x}(\ell)\,\hat{r}_{xy}(\ell) \tag{6a}$$

$$\hat{\Delta}_{xy} = \frac{1}{W_{\Delta,x}} \sum_{\ell=1}^{L} w_{\Delta,x}(\ell)\,\hat{\Delta}_{xy}(\ell). \tag{6b}$$

With statistically independent marker loci, the locus-specific weights that minimize the sampling variance of the overall estimates $\hat{\phi}_{xy}$ and $\hat{\Delta}_{xy}$ are simply the inverses of the sampling variances of the locus-specific estimates. As noted in the appendix, we cannot be very certain of the numerical values of the weights because they are functions of the parameters that we are trying to estimate, but approximations can be obtained by simply assuming that $x$ and $y$ are unrelated. The locus-specific weights are then given by the inverses of the sampling variances of estimates of the relatedness coefficients for nonrelatives conditional on the genotype in $x$. General expressions for the weights are given by

$$w_{r,x}(\ell) = \frac{1}{\mathrm{Var}[\hat{r}_{xy}(\ell)]} = \frac{(1 + S_{ab})(p_a + p_b) - 4p_a p_b}{2p_a p_b} \tag{7a}$$

$$w_{\Delta,x}(\ell) = \frac{1}{\mathrm{Var}[\hat{\Delta}_{xy}(\ell)]} = \frac{(1 + S_{ab})(1 - p_a - p_b) + 2p_a p_b}{2p_a p_b} \tag{7b}$$

with $S_{ab}$ equal to 1 when $x$ is homozygous and equal to 0 when $x$ is heterozygous.

Properties of the regression estimators: Extensive computer simulations demonstrated that the regression estimators given above are essentially unbiased, regardless of the numbers of loci or the values of $\phi$ and $\Delta$. Thus, the primary issues of interest are the magnitudes of the sampling variances of the estimators and their sensitivity to the degree of actual relationship and to the allele-frequency distribution.

We obtained estimates of the sampling variances of the regression estimators by Monte Carlo simulation, assuming gene frequencies were known without error and assuming a random mating population with unlinked marker loci. Reference genotypes were drawn randomly according to their Hardy-Weinberg frequencies, and the genotypes of the paired individuals were then obtained from the conditional genotype distributions given the reference genotype and the particular relationship. For multiallelic loci, two types of allele-frequency distributions were considered: uniform distributions, in which the frequencies of each of the $n$ alleles per locus were equal to $1/n$, and "triangular" distributions, in which the frequencies of alleles followed the proportions 1, 2, ..., $n$. In all of the following figures, we report the single-locus sampling variances of the relationship coefficients. For analyses involving multiple loci with identical allele frequencies, the sampling variance of multilocus estimates can be obtained by dividing the plotted values by the number of loci ($L$).

A special property of the regression estimator is that the expected single-locus sampling variance declines with increasing numbers of unlinked loci, down to an asymptotic value (Figure 1). This dependence on number of loci arises with the regression estimator because the estimation variances (the weights) differ among alternative reference genotypes at the same locus (for example, a reference genotype having rarer alleles gives estimates with lower variance). By contrast, the correlation estimator of Ritland (1996b) is not conditioned upon observed genotype, and its variance only depends on the distribution of gene frequencies in the population. Although Figure 1 details the influence of the number of loci on the variance of the regression estimator, for the remaining analyses we focus on the situation in which 10 informative loci have been sampled. At that point, the lower asymptotic value of the single-locus sampling variance is closely approximated in most situations, and 10 loci is a good approximation of the sampling scheme employed in many empirical studies, with diallelic loci corresponding to isozymes and multiallelic loci corresponding to microsatellites.

For diallelic loci, the asymptotic sampling variance per locus for $\hat{r}$ is equal to 1 in the case of nonrelatives and somewhat lower for related individuals (even though nonoptimal weights are employed with relatives; Figure 1). With allele frequencies approaching 0.5, the optimal weights of all reference genotypes approach equality regardless of the degree of relationship, because all alleles are then equally informative. Thus, the asymptotic sampling variances near allele frequencies of 0.5 are the best that one could expect to achieve even if the correct weights were used. Because even with close relatives, the sampling variance is never less than about 0.4 per locus, these results imply that with a large number of loci, the expected standard error of $\hat{r}$ is generally on the order $1/\sqrt{L}$ when diallelic loci are assayed, somewhat greater if loci with extreme allele frequencies are included, and slightly less with close relatives.

As in the case of $\hat{r}$, the single-locus sampling variance of $\hat{\Delta}$ depends on the number of loci sampled, but the sensitivity to this is reduced at moderate allele frequencies (Figure 1). For all degrees of relationship, the asymptotic single-locus sampling variance for $\hat{\Delta}$ declines as allele frequencies become more equitable (Figure 1). It can exceed 10 when allele frequencies are extreme and is never much less than 1 with any type of relationship. Thus, as in the case of $\hat{r}$, with diallelic loci, the best that one can ever expect to achieve with the regression estimator is a multilocus standard error of $\hat{\Delta}$ equal to $1/\sqrt{L}$.

In principle, an increase in the number of alleles per locus should reduce the sampling variance of relatedness estimates, because alleles that are identical in state will be more reliable as indicators of identity by descent. For nonrelated individuals, the asymptotic single-locus sampling variance of $\hat{r}$ is very close to $1/(n - 1)$, regardless of the form of the allele-frequency distribution (Figure 2). With parents and offspring, the sampling variance is up to 50% less than this, while with other types of relatives it is somewhat higher when alleles with low frequency are common. Again, with an even allele-frequency distribution, all reference genotypes are equally informative regardless of the degree of relationship, so the results for this case can be viewed as the minimum sampling variance that one can expect to achieve with the regression estimator; except in the case of parents and offspring, a standard error of $\hat{r}$ less than about $1/\sqrt{L(n - 1)}$ is not achievable. Relative to the situation with $\hat{r}$, the rate of reduction in the asymptotic sampling
variance of $\hat{\Delta}$ with increasing $n$ is more rapid (Figure 2). For nonrelatives, the asymptotic single-locus variance closely approximates $2/[n(n - 1)]$ regardless of the form of the allele-frequency distribution.

Figure 1. Single-locus sampling variances for estimates of pairwise $r$ and $\Delta$ for the range of possible gene frequencies at diallelic loci. For each gene frequency (in increments of 0.01) and degree of relationship, random pairs of multilocus genotypes were obtained by Monte Carlo simulation for 32,000 individuals. For each pair of individuals, the two reciprocal weighted estimates were obtained and then averaged to obtain the pairwise estimates. Solid lines, large dashes, medium dashes, and short dashes denote estimates based on 1, 5, 10, and 25 loci, respectively.

COMPARISON WITH OTHER ESTIMATORS

As noted above, for applications in quantitative genetics, there is a need for separate estimates of $r_{xy}$ and $\Delta_{xy}$ because the additive genetic covariance between individuals is a function of the composite measure $r_{xy}$ whereas the dominance genetic covariance is a function of $\Delta_{xy}$. However, for situations in which one can be reasonably certain that the dominance genetic variance for a trait is negligible, or when one can be certain that collateral relatives (e.g., pairs of individuals, such as full sibs and double first cousins, that share paternal and maternal genes) are absent, $\Delta_{xy}$ can be ignored. In addition, in many applications in conservation genetics and behavioral ecology, the composite estimate $r_{xy}$ may provide all the information that is needed. Four additional estimators of $r_{xy}$, all of which are unbiased, have been previously described.

A simple estimator based on the sharing of alleles,
proposed by Lynch (1988) for analyses employing DNA fingerprint profiles, can be generalized to any set of codominant markers. The following expression includes the slight modification suggested by Li et al. (1993). Define the similarity index, $S_{xy}$, to be the average fraction of genes at a locus in a reference individual (here either $x$ or $y$) for which there is another gene in the proband that is identical in state. Thus, $S_{xy} = 1$ when ($x = ii$, $y = ii$) or ($x = ij$, $y = ij$), $S_{xy} = 0.75$ when ($x = ii$, $y = ij$) or vice versa, $S_{xy} = 0.5$ when ($x = ij$, $y = ik$), and $S_{xy} = 0$ when ($x = ij$, $y = kl$). A single-locus estimator for $r_{xy}$ is then

$$\hat{r}_{xy} = \frac{S_{xy} - S_0}{1 - S_0}, \tag{8}$$

where $S_0 = \sum_{i=1}^{n} p_i^2(2 - p_i)$ is the expected value of $S$ at the locus for unrelated individuals in a random-mating population. This simple estimator derives from the principle that if two individuals are related to degree $r_{xy}$, the expected fraction of genes that they have identical in state is the sum of the fractions shared because of identity-by-descent and because of identity-in-state (but not identity-by-descent), $E(S_{xy}) = r_{xy} + (1 - r_{xy})S_0$. Note that unlike the weighted regression estimator described above, Equation 8 does not return estimates of $r_{xy} > 1$. However, like the weighted regression estimator, Equation 8 does generate negative estimates whenever the observed $S_{xy}$ is less than $S_0$ because of sampling error. In the following, Equation 8 is referred to as the similarity-index estimator.

Figure 2. Single-locus sampling variances for $r$ and $\Delta$ as a function of number of alleles at loci with uniform and triangular allele-frequency distributions. Results are given for nonrelatives (NR), half sibs (HS), full sibs (FS), and parents and offspring (PO). The plotted values were obtained from Monte Carlo simulations of 10 loci (all with the same allele-frequency profile) for 32,000 pairs of individuals. Sampling variances of multilocus estimates of $r$ and $\Delta$ are obtained by dividing the plotted values by the number of loci, keeping in mind that somewhat higher values are expected if fewer than 10 loci are observed.

Like Equation 8, Ritland's (1996b) method-of-moments estimator for $r_{xy}$ considers the joint distribution of both genotypes in a symmetrical way. The differing information provided by alternative alleles is incorporated by considering the incidence of each of the $n$ possible alleles at the locus. The observed data are summarized as an array of $n$ similarities, where the $i$th element ($S_i$) is equal to 0.0 (at most, one of the individuals contains allele $i$), 0.25 (both individuals contain a single $i$ allele), 0.5 (one individual contains two and the other individual one $i$ alleles), or 1.0 (both individuals are $ii$ homozygotes). Estimates of $r_{xy}$ derived for each allele are combined into a single estimate for the locus by using weights that assume zero relationship (as with the weighted regression estimators derived above),

$$\hat{r}_{xy} = \frac{2}{n - 1} \sum_{i=1}^{n} \frac{S_i - p_i^2}{p_i}. \tag{9}$$
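The two symmetric estimators above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' software: the function names, the representation of a genotype as a tuple of two allele indices, and the helper `unrelated_moments` (which enumerates every ordered pair of Hardy-Weinberg genotypes for unrelated individuals) are assumptions introduced here.

```python
from itertools import combinations_with_replacement


def similarity_index_r(x, y, freqs):
    """Equation 8 for one locus: r = (S_xy - S_0) / (1 - S_0)."""
    def matched_fraction(ref, other):
        # Fraction of the two genes in `ref` with an identical-in-state gene in `other`.
        return sum(g in other for g in ref) / 2.0

    # Average over the two possible choices of reference individual.
    s_xy = 0.5 * (matched_fraction(x, y) + matched_fraction(y, x))
    s_0 = sum(p * p * (2.0 - p) for p in freqs)
    return (s_xy - s_0) / (1.0 - s_0)


def ritland_r(x, y, freqs):
    """Equation 9 for one locus: r = 2/(n-1) * sum_i (S_i - p_i^2) / p_i."""
    n = len(freqs)
    total = 0.0
    for i, p in enumerate(freqs):
        # S_i takes the values 0, 0.25, 0.5, or 1 described in the text.
        s_i = (x.count(i) / 2.0) * (y.count(i) / 2.0)
        total += (s_i - p * p) / p
    return 2.0 * total / (n - 1)


def unrelated_moments(estimator, freqs):
    """Exact mean and variance of a single-locus estimator for unrelated pairs."""
    genotypes = []
    for i, j in combinations_with_replacement(range(len(freqs)), 2):
        prob = freqs[i] ** 2 if i == j else 2.0 * freqs[i] * freqs[j]
        genotypes.append(((i, j), prob))
    mean = second = 0.0
    for gx, px in genotypes:
        for gy, py in genotypes:
            est = estimator(gx, gy, freqs)
            mean += px * py * est
            second += px * py * est * est
    return mean, second - mean * mean
```

Under these assumptions, `unrelated_moments(ritland_r, [0.2, 0.3, 0.5])` returns a mean of essentially zero and a variance of 1/(n − 1) = 0.5, and the similarity-index estimator likewise has mean zero for unrelated pairs.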

[Note that the $r_{xy}$ in this article is twice that defined in the Ritland (1996b) article.]

A simpler estimator, also based upon the joint distribution of genotypes, was described by Ritland (1996b) and earlier workers (Li and Horvitz 1953; Weir 1996, Equation 2.28), primarily in relation to estimating inbreeding coefficients. Defining an alternative similarity index such that $S'_{xy} = 1$ when ($x = ii$, $y = ii$), $S'_{xy} = 0.5$ when ($x = ij$, $y = ij$) or ($x = ii$, $y = ij$), $S'_{xy} = 0.25$ when ($x = ij$, $y = ik$), and $S'_{xy} = 0$ when ($x = ij$, $y = kl$), then

$$\hat{r}_{xy} = \frac{2(S'_{xy} - J_0)}{1 - J_0}, \tag{10}$$

where $J_0 = \sum_{i=1}^{n} p_i^2$ is the expected homozygosity at the locus. Equation 10 is equivalent to an unweighted correlation estimator. Because our analyses showed it to be uniformly worse in terms of sampling variance than all of the estimators presented here, we do not consider it any further.
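For completeness, Equation 10 can be written as a short function. This is a sketch under the assumption that a genotype is a tuple of two allele indices and `freqs` lists the allele frequencies at the locus; the function name is hypothetical. It computes $S'_{xy}$ as the fraction of the four between-individual gene pairs that are identical in state, which reproduces the values 1, 0.5, 0.25, and 0 listed above.

```python
def unweighted_correlation_r(x, y, freqs):
    """Equation 10 for one locus: r = 2 (S'_xy - J_0) / (1 - J_0)."""
    # S'_xy: chance that a random gene from x and a random gene from y
    # are identical in state (four gene pairs in total).
    s_prime = sum(gx == gy for gx in x for gy in y) / 4.0
    # J_0: expected homozygosity at the locus.
    j_0 = sum(p * p for p in freqs)
    return 2.0 * (s_prime - j_0) / (1.0 - j_0)
```

For two alleles at frequency 0.5, for example, a pair of identical homozygotes gives an estimate of 2 and a pair of heterozygotes gives 0.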

Finally, we note Queller and Goodnight's (1989) estimator of $r_{xy}$. Although their index is primarily designed for estimating the average degree of relatedness within groups of individuals, it can be expressed in terms of the same parameters that we employ with our Equations 5a and 5b to obtain a pairwise estimator for individuals $x$ and $y$,

$$\hat{r}_{xy} = \frac{0.5(S_{ac} + S_{ad} + S_{bc} + S_{bd}) - p_a - p_b}{1 + S_{ab} - p_a - p_b}. \tag{11}$$

This equation has limited utility with diallelic loci: if individual $x$ is a heterozygote, then $S_{ab} = 0$ and Equation 11 is undefined because $p_a + p_b = 1$. Therefore, in the following analyses, we consider Equation 11 only in the context of multiallelic loci.
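The diallelic failure case is easy to see in code. The following is a minimal sketch of Equation 11 with $x$ as the reference, again assuming genotypes are tuples of allele indices; the function name, the `tol` parameter, and the choice to return `None` when the denominator vanishes are assumptions made for illustration.

```python
def queller_goodnight_r(x, y, freqs, tol=1e-12):
    """Equation 11 with x as reference (alleles a, b) and y as proband (alleles c, d)."""
    a, b = x
    c, d = y
    s_ab = 1.0 if a == b else 0.0
    # S_ac + S_ad + S_bc + S_bd: number of reference/proband gene pairs identical in state.
    shared = sum(1.0 for g, h in ((a, c), (a, d), (b, c), (b, d)) if g == h)
    numerator = 0.5 * shared - freqs[a] - freqs[b]
    denominator = 1.0 + s_ab - freqs[a] - freqs[b]
    if abs(denominator) < tol:
        # Undefined, e.g. a heterozygous reference at a diallelic locus (p_a + p_b = 1).
        return None
    return numerator / denominator
```

With two alleles, `queller_goodnight_r((0, 1), (0, 1), [0.5, 0.5])` returns `None`, whereas the same pair of genotypes at a three-allele locus gives a finite estimate.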

In comparing the performance of these alternative methods for estimating $r_{xy}$ to that of the regression estimator, we evaluated their single-locus sampling variances analytically by considering the joint probabilities of all genotypes of pairs of individuals, conditional on the degree of relationship and the allele-frequency distribution. With these alternative methods, the weights depend only on the allele-frequency distribution in the population, not on the genotypes of the reference and proband individuals. Thus, with multiple marker loci all with the same allele frequencies, the multilocus sampling variances are simply the single-locus values divided by the number of loci. When loci have different allele-frequency distributions, as is usually the case in practice, weighted multilocus estimates can be obtained by weighting the locus-specific estimates by the inverses of their sampling variance.

Figure 3. Single-locus sampling variances for estimates of $r$ derived with the regression method (R), the correlation method (C), and the similarity-index method (S) for diallelic loci. The results for the regression method apply to analyses based on 10 loci and were obtained by Monte Carlo simulations; additional loci yield slightly lower values. The results for the correlation and similarity-index methods are exact solutions based on expected genotype combinations.

For diallelic loci, the correlation estimator yields a sampling variance per locus equal to one in the case of nonrelatives regardless of the allele frequency (Figure 3). As noted above, the regression estimator asymptotically approaches this same level of efficiency for nonrelatives, but the similarity-index method has higher sampling variance. On the other hand, for close relatives, compared to the correlation estimator, the regression and similarity-index methods yield more accurate estimates of $r$ over the full range of allele frequencies at
1760 M. Lynch and K. Ritland

Figure 4.—Single-locus sam-

pling variances for estimates of r

for multiallelic loci, derived with

the regression method (R), the

correlation method (C), the simi-

larity-index method (S), and the

Queller-Goodnight method (Q)

for uniform and triangular allele-

frequency distrubutions. The re-

sults for the regression method

apply to analyses based on 10 loci

andwere obtainedbyMonte Carlo

simulations; additional loci yield

slightly lower values. The results

forthe correlationandthe similar-

ity-index methods are exact solu-

tions based on expected genotype

combinations.

diallelic loci, with the latter actually outperforming the with any estimator of distant relationships. For related

individuals,the regression and similarity-indexmethodsformer in the case of parent-offspring pairs.

A multiallelic perspective yields further insight into yield very similar sampling variances of rprovided there

are at least three alleles per locus, while the correlationthe relative efﬁciencies of the four techniques. With a

uniform distribution of three or more alleles per locus, and Queller-Goodnight estimators are again less efﬁ-

cient. For the two superior methods, the single-locusthe single-locus sampling variance for r

ˆis essentially 1/

(n21) with nonrelatives regardless of the method sampling variance of estimates of r

ˆasymptotically ap-

proaches 0.14 with increasing allele number with full(Figure 4). Thus, because an even allele-frequency dis-

tribution provides the greatest power of inference, this sibs, and very slowly approaches 0 with parents and

offspring.seems to be the best that one can expect to achieve
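The inverse-variance weighting of locus-specific estimates mentioned in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, loci are treated as independent, and the per-locus variances are taken from the approximation 1/(n − 1) quoted above for nonrelatives at loci with n equally frequent alleles. In practice the variances would come from whatever source the analyst trusts for the loci at hand.

```python
def nonrelative_variance_r(n_alleles):
    """Approximate single-locus sampling variance of r-hat for nonrelatives,
    assuming n equally frequent alleles (the 1/(n - 1) value quoted in the text)."""
    if n_alleles < 2:
        raise ValueError("a polymorphic locus needs at least two alleles")
    return 1.0 / (n_alleles - 1)


def combine_loci(estimates, variances):
    """Inverse-variance weighted multilocus estimate.

    Returns the weighted mean and, assuming independent loci with known
    variances, the sampling variance of that mean (1 / sum of weights).
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need one variance per locus-specific estimate")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    weighted_mean = sum(w * e for w, e in zip(weights, estimates)) / total
    return weighted_mean, 1.0 / total


# Hypothetical locus-specific estimates from loci with 2, 4, and 8 alleles.
locus_estimates = [0.4, 0.1, 0.3]
locus_variances = [nonrelative_variance_r(n) for n in (2, 4, 8)]
r_multilocus, var_multilocus = combine_loci(locus_estimates, locus_variances)
print(r_multilocus, var_multilocus)
```

With equal variances the function reduces to the simple mean, and the returned variance is the single-locus value divided by the number of loci, as stated in the text.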


TABLE 1

Sampling variance properties of Δ

                                        Number of alleles
Relationship        Method       2        4        6       12

Uniform frequencies
Nonrelatives        R          0.999    0.168    0.067    0.015
                    C          1.000    0.166    0.067    0.017
Half sibs           R          1.011    0.269    0.142    0.056
                    C          1.004    0.272    0.144    0.056
Full sibs           R          0.949    0.423    0.324    0.248
                    C          0.948    0.440    0.336    0.256
Parent-offspring    R          0.989    0.368    0.219    0.096
                    C          1.008    0.376    0.220    0.096

Triangular frequencies
Nonrelatives        R          1.070    0.182    0.074    0.016
                    C          1.000    0.166    0.067    0.017
Half sibs           R          1.276    0.329    0.179    0.074
                    C          1.240    0.360    0.240    0.080
Full sibs           R          1.362    0.605    0.486    0.396
                    C          1.480    1.000    0.960    0.880
Parent-offspring    R          1.471    0.479    0.294    0.136
                    C          1.520    0.640    0.640    0.280

Values are given for the single-locus sampling variances. R and C denote the regression and correlation estimators, respectively. The regression estimates are based on Monte Carlo simulations of 10 loci per pair of individuals.
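As a reading aid, the nonrelative rows of Table 1 can be set beside the minimum standard errors quoted in the Discussion for unrelated pairs assayed at L loci with n alleles each: 2√{(n + 4)/[Ln(n − 1)]} for φ, √{2/[Ln(n − 1)]} for Δ, and √{1/[L(n − 1)]} for r. The Python sketch below evaluates these expressions and squares the Δ bound for a single locus so that it can be compared with the simulated variances. The function name and the dictionary holding the table row are illustrative choices, not part of the original analysis.

```python
import math


def min_standard_errors(n_loci, n_alleles):
    """Lower bounds on the standard errors of (phi, Delta, r) estimates for
    unrelated pairs, using the expressions quoted in the Discussion."""
    L, n = n_loci, n_alleles
    se_phi = 2.0 * math.sqrt((n + 4) / (L * n * (n - 1)))
    se_delta = math.sqrt(2.0 / (L * n * (n - 1)))
    se_r = math.sqrt(1.0 / (L * (n - 1)))
    return se_phi, se_delta, se_r


# Table 1: nonrelatives, uniform frequencies, regression estimator (single locus).
TABLE1_NONREL_UNIFORM_R = {2: 0.999, 4: 0.168, 6: 0.067, 12: 0.015}

for n, simulated in TABLE1_NONREL_UNIFORM_R.items():
    variance_bound = min_standard_errors(1, n)[1] ** 2   # Delta, one locus
    print(f"n={n:2d}  simulated={simulated:.3f}  bound={variance_bound:.3f}")
```

Because the bounds scale as 1/√L, the same function gives the multilocus limits by passing the number of loci as the first argument.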

With a triangular allele-frequency distribution, the regression and correlation methods again yield essentially identical results with nonrelatives, while the similarity-index and Queller-Goodnight methods have somewhat higher sampling variances. However, with related individuals, the similarity-index method is again the superior of the four methods, and the correlation and Queller-Goodnight estimators generally yield the highest sampling variance. By use of either the regression or similarity-index methods, up to a 50% reduction in the standard error of r̂ can be achieved.

The only other marker-based method for the estimation of Δ is the correlation-based estimator of Ritland (1996b), which is quite complex algebraically. Results in Table 1 show that the much simpler regression estimator presented above yields essentially the same asymptotic sampling variances as the correlation method when the allele-frequency distribution is uniform. With triangular allele-frequency distributions, the results are also very similar for nonrelatives, but with related individuals, the regression estimator yields more precise estimates, with the reduction in sampling variance approaching 50% with close relatives.

Thompson (1975, 1986) has extensively investigated the use of maximum likelihood for inferring pairwise relationship. The likelihood method allows one to take an entirely different approach for genealogical inference. For example, Thompson discusses the power of likelihood to distinguish among major types of relationships, defined as family (parent-offspring, full sibs), close (half sibs, uncle, etc.), remote (cousin, etc.), and unrelated. This approach to inferring genealogical "relationship" is fundamentally different from our approach to estimating "relatedness," which is a nondiscrete numerical parameter defined in terms of probabilities of identity-by-descent. Nevertheless, we have considered the possibility of using likelihood methods to estimate "relatedness" under our regression framework. Using notation developed earlier, the likelihood of data from one locus is the probability

P(y = cd | x = ab) = p (2 − S )(2 − S ) · [(1 − 2φ + 2(φ ) · ((S + (S )/4 )/2]   (12)

and the multilocus likelihood is the product of Equation 12 over loci. This expression can be used for estimating relatedness by solving for the values of r and Δ that maximize Equation 12, given the data.

Using computer simulations, we examined the behavior of such maximum-likelihood estimation of relatedness by a standard numerical method (Newton-Raphson iteration). Convergence to a maximum was confirmed both by noting that the likelihood increased over iterations and converged and by comparing the iterative solutions to likelihood functions of the same data mapped by brute force. The results, and those discussed by Ritland (1996b), suggest that the potential for using maximum likelihood for estimating relatedness is limited.
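For readers who want to see what mapping such a likelihood "by brute force" can look like, the following Python sketch evaluates a multilocus log-likelihood on a grid of (φ, Δ) values. Several things here are assumptions of the sketch rather than the method used in the simulations: the single-locus probability is written as the standard mixture over identity-by-descent states under random mating (no alleles shared, one shared, two shared) and is not claimed to reproduce the algebraic form of Equation 12; r is obtained as φ/2 + Δ; the search is restricted to the feasible region φ, Δ ≥ 0 and φ + Δ ≤ 1, so it cannot return the negative estimates discussed in the text; and all function names are hypothetical.

```python
import math


def genotype_prob(x, y, freqs, phi, delta):
    """P(y | x) for unordered genotypes x = (a, b) and y = (c, d).

    Assumed IBD mixture: with probability 1 - phi - delta the pair shares no
    alleles by descent, with probability phi one allele, with delta both.
    """
    (a, b), (c, d) = x, y
    same = lambda u, v: 1.0 if u == v else 0.0
    orderings = 1.0 if c == d else 2.0           # a heterozygote arises two ways
    p0 = orderings * freqs[c] * freqs[d]
    p1 = orderings * (freqs[d] * (same(a, c) + same(b, c))
                      + freqs[c] * (same(a, d) + same(b, d))) / 4.0
    p2 = orderings * (same(a, c) * same(b, d) + same(a, d) * same(b, c)) / 2.0
    return max(0.0, 1.0 - phi - delta) * p0 + phi * p1 + delta * p2


def ml_grid(pairs, freqs_by_locus, steps=20):
    """Grid search for the (phi, delta) pair with the highest log-likelihood.

    pairs: one (x, y) genotype pair per locus; freqs_by_locus: one dict of
    allele frequencies per locus. Loci are treated as independent.
    """
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            phi, delta = i / steps, j / steps
            loglik = 0.0
            for (x, y), freqs in zip(pairs, freqs_by_locus):
                p = genotype_prob(x, y, freqs, phi, delta)
                loglik += math.log(p) if p > 0.0 else -math.inf
            if best is None or loglik > best["loglik"]:
                best = {"phi": phi, "delta": delta,
                        "r": phi / 2.0 + delta, "loglik": loglik}
    return best


# Hypothetical data: six loci with the same allele frequencies, where both
# individuals carry the same heterozygous genotype at every locus.
freqs = {"A": 0.5, "B": 0.3, "C": 0.2}
pairs = [(("A", "B"), ("A", "B"))] * 6
print(ml_grid(pairs, [freqs] * 6))
```

A grid of this kind is slow but makes the shape of the likelihood surface visible, which is the role the brute-force mapping plays in the convergence check described above.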

The problem is fundamentally due to the fact that the ideal properties of likelihood are asymptotic or apply to "large" sample sizes. The number of loci usually available for pairwise estimation is inherently small, too small for likelihood to avoid substantial problems with bias (usually negative) and extremely large sampling variance. For example, for the case of zero true relatedness, the average estimate of r is on the order of −1.0 or less when 40 or fewer loci are sampled, and the sampling variance is two to three orders of magnitude beyond that shown for the alternative estimators in Figures 3 and 4. Interestingly, we found that there is an approximate sample size (number of loci) above which the maximum-likelihood estimators become "stable" or show approximately the predicted asymptotic variance. However, this sample size is large. For the maximum-likelihood estimator of r, at low true relatedness, stability occurs at ~70 diallelic loci (p = 0.5). The maximum-likelihood estimator of Δ exhibits similar behavior, although it begins to stabilize when ~30 loci have been sampled. Thus, while the maximum-likelihood approach may provide a useful means for comparing alternative degrees of relationship by likelihood-ratio tests, its applicability for estimating pairwise relatedness coefficients appears to be limited unless one has the luxury of a very large number of polymorphic markers.

EXAMPLE APPLICATIONS

As examples of how estimators of pairwise relatedness can be used in population studies and how they behave with actual data, we consider two applications. First, as part of a study of isolation-by-distance and field heritabilities in the common monkeyflower (Mimulus guttatus), 300 plants were randomly selected along an 84-m transect through a meadow adjacent to Indian Valley Reservoir in Clear Lake County, California (this was the "meadow" transect of Ritland and Ritland 1996). Extracts were obtained from corollas and assayed for 10 polymorphic isozyme loci. Eight loci were diallelic, 1 was triallelic, and the other had four alleles. Using the regression estimator, relatedness was estimated for pairs of plants separated by up to 4 m (with gene frequencies estimated from the entire sample). The estimates of pairwise relatedness from this data set show considerable scatter, with some being >+1 and many <0 (Figure 5). Such behavior is in accordance with the results presented above, which highlight the large sampling variance expected for estimates based upon relatively few marker loci. Because of this large variance, significant inferences can be made only from groups of pairwise relatedness estimates or from correlations of these estimates with other quantities such as similarity for a quantitative trait (Ritland 1996a). In this particular application, there is a negative regression of relatedness on distance (Figure 5) as expected under isolation-by-distance. Relatedness decreased ~50% over the span from 0 to 4 m, with the average value for adjacent plants being 0.21, nearly the level of relatedness expected between half sibs (0.25).

Figure 5. Estimates of pairwise relatedness in the common monkeyflower plotted as a function of distance. The estimated slope of the linear regression is −0.037/m (0.005) and the estimated intercept is 0.21 (0.01). The standard errors (in parentheses) were obtained by bootstrapping over individuals, with comparisons between identical individuals being excluded.

A second application of relatedness estimates derives from work (D. Marshall and K. Ritland, unpublished results) with a white phase (termed Kermodism) of the black bear, which is found in low to moderate (10%) frequency along the north coast of British Columbia and adjacent islands. The genetic basis of the coat color polymorphism is unknown. During late summer 1997, nearly 900 bear hair samples were collected from five islands and the adjacent mainland of northern coastal British Columbia. DNA was extracted from hairs with roots and assayed for 8 highly polymorphic microsatellite loci using the primers developed by Paetkau et al. (1995). The number of alleles per locus ranged from 7 to 17, with a mean of 10.4, and locus-specific heterozygosities ranged from 0.72 to 0.85, with a mean of 0.79. After factoring out the multiple samples for individual bears, a total of 89 distinct genotypes were found in the regions where Kermodism was of significant frequency (17 on Gribbel Island, 13 on Hawksbury Island, 38 on Princess Royal Island, and 21 at Terrace [mainland BC]). Bear hair color was also recorded in these samples. Estimates of pairwise relatedness were found within each of these four regions, using the pooled samples to estimate gene frequencies. All pairs of individuals were then classified into two groups: pairs sharing coat color (both white or both black, of which there were 614 pairings) and pairs not sharing coat colors (one black, one white, involving 156 pairings). A comparison of the frequency distribution of r̂ for these two groups


(Figure 6) shows an excess of relatedness among bears sharing coat colors (r̄ = 0.057 compared to 0.039 for unlike colors), suggesting a genetic basis for the variation in this character. However, bootstrap resampling indicated that this difference of means is not significant (the excess being present in only 88 highly variable microsatellite loci, the statistical error of relatedness is considerably less than that experienced with isozyme markers in the previous study). Further inferences about the mode of inheritance of Kermodism are given in Ritland (1999).

Figure 6. Distributions of estimates of pairwise relatedness among bears not sharing the same coat color and among bears sharing the same coat color.

DISCUSSION

Estimation of relatedness with molecular markers is a statistically demanding enterprise. On the positive side, all of the estimators described above (except maximum-likelihood) are essentially unbiased in the sense that they return estimates that are on average identical to their expected values. Errors in estimates of population allele frequencies, which were not incorporated into our simulations, can introduce bias, but the effects of error in gene-frequency estimation will generally be trivial (of order 1/N when N individuals are censused for gene frequency) compared to the additional sampling errors that arise in the estimation of relatedness, provided the number of individuals sampled exceeds 100 or so (Ritland 1996a,b). Moreover, this source of bias can be simply removed by omitting the pair of interest from the estimate of allele frequency (Queller and Goodnight 1989), although pathological behavior will occur in the rare event that marker alleles are unique to particular individuals, as this would lead to population gene-frequency estimates of zero. In addition, the sampling variance of the relationship coefficients owing to uncertain allele frequencies can, in principle, be obtained by resampling procedures.

The high sampling variance of estimates of relatedness arises in part because of variance in identity-by-descent among loci and in part because of variance in identity-in-state for alleles that are not identical by descent. These sources of sampling error are fundamental consequences of Mendelian segregation, and no amount of statistical finesse can eliminate them. In the actual estimation of relatedness, however, further sampling error is introduced by error in inference. With the regression and correlation estimators, for example, large standard errors result because the estimates of relationship coefficients derived from single loci commonly fall outside of the true domain of (0, 1). Although estimators can be designed to ensure that all estimates lie in the range of true possibilities (e.g., Thompson 1976), all such estimators necessarily return biased estimates, and the magnitude of the bias depends on the actual degree of relationship. Thus, while negative single-locus estimates of relationship coefficients may seem to be an undesirable feature, it is precisely this feature that ensures that the estimators proposed above will be unbiased.

Our results suggest that the relative advantages of the alternative estimators of relatedness depend on several factors. These include the number of loci, the allele-frequency distribution, the degree of actual relationship, and the coefficient estimated (r vs. Δ). In general, molecular-marker approaches that yield many alleles and loci tend to favor use of the regression estimators proposed in this article over the correlation estimators presented by Ritland (1996b). With small numbers of diallelic loci with extreme allele frequencies, the correlation method is more efficient than the regression method, but the regression estimators are more efficient in almost all other cases. In addition, the simplicity of the regression estimators lends itself to easier programming and more stability of estimates under uneven allele-frequency distributions. The simplicity of the regression-based approach is underscored by our ability to obtain an analytical solution for Δ̂ with this method. By contrast, the correlation approach of Ritland (1996b) requires, for a locus with n alleles, the inversion of a matrix of size n(n + 5)/2, which is 12 × 12 at the minimum with multiallelic loci and beyond analytical solution. Moreover, unlike the correlation estimator for Δ, the regression estimator for this coefficient is well behaved over the full range of allele frequencies.

As noted above, some simple statements can be made concerning the minimum sampling variance that one can expect to achieve in the estimation of relationship coefficients. For pairs of unrelated or distantly related individuals assayed at L loci, each containing n alleles, the standard errors of the estimates of φ (details leading up to this result are not shown), Δ, and r will be no less than 2√{(n + 4)/[Ln(n − 1)]}, √{2/[Ln(n − 1)]}, and √{1/[L(n − 1)]}, respectively. For diallelic loci, a common situation with allozymes, these limits take on values of 3.5/√L, 1/√L, and 1/√L. With large numbers of alleles, as can be achieved with microsatellite loci, the limits asymptotically approach √(4/Ln), √(2/Ln²), and √(1/Ln). Fortunately, the two coefficients with the lowest sampling error, r and Δ, are the ones that have the greatest practical utility.

One of the limitations of both the regression and correlation methods for estimating relatedness is the use of weights that assume zero relationship. The best weights are a function of the actual relationship, but this is an unknown. Nevertheless, the use of approximate but incorrect weights yields more precise estimates than the use of unweighted estimators, because differences in the information content of alleles with different frequencies are at least partially taken into account. One might think that estimates obtained with the null weights could be improved upon by subsequently refining the weights, using the previous estimates of relatedness in the calculation of the weights. These revised weights could then give a second round of weighted estimates, and the whole process could be repeated again until a suitable degree of convergence to final estimates is achieved. However, simulations by us and by Ritland (1996b) indicated that, even with large numbers of loci, this iterative approach has little promise. Bias is introduced, and with the weights being as noisy as they are, the weights themselves are often wildly unrealistic.

Generally speaking, our results show that attempts to estimate relatedness with molecular markers can be greatly improved upon by working with multiallelic loci, with the most dramatic gains in efficiency occurring with loci with relatively even distributions of allele frequencies. Because the sampling variance of r̂ is inversely proportional to Ln, it is clear that roughly the same amount of efficiency is gained by working with loci with twice the number of alleles as by doubling the number of loci. For Δ, the sampling variance is inversely proportional to Ln², so a much greater gain can be achieved by increasing numbers of alleles as opposed to numbers of loci. Thus, an early investment in a search for informative loci (those with a large number of alleles, with roughly equal frequencies) can be quite advantageous in the long term. These recommendations assume that at least 10 or so loci are sampled, because with fewer loci, the tradeoff involving r favors more loci over more alleles per locus.

The results presented above indicate that even with fairly large numbers of loci, standard errors of relationship coefficients will rarely be <0.1/√L and often will be somewhat >1/√L, so in general one cannot expect to use markers to make precise statements about differences in relatedness between particular pairs of individuals. However, with enough effort applied to the right kinds of loci, it may be possible to reduce the sampling variance to the extent that Ritland's (1996a) quantitative-genetic technique can be applied to natural populations. Ritland's (1996a) method provides a means of estimating the additive and dominance components of genetic variance for quantitative traits (and covariance between traits) in the field by regressing measures of phenotypic similarity on the relatedness coefficients r̂ and Δ̂. Aside from the physical labor involved, one of the greatest difficulties with this technique is the need to eliminate the sampling variance from the total observed variance of relatedness to estimate the actual variance in relatedness. The problem is by no means trivial, as can be seen in Ritland and Ritland's (1996) first application of the technique with the monkeyflower (Mimulus). With eight assayed loci, the estimates of r derived by the correlation method ranged from −3 to +5, with approximately a third of all observed values being negative. The actual variance of relatedness was estimated to be on the order of only 0.04. Thus, almost all of the observed variance in r̂ was due to sampling error. Such results clearly highlight the practical need for molecular and statistical methodologies for minimizing the sampling variance of relatedness.

We thank John Kelley for helpful comments. This work was supported by National Institutes of Health grant GM-36827 and National Science Foundation grant DEB-9629775 to M.L., and by a Natural Sciences and Engineering Research Council/Industry Research Chair in population genetics held by K.R.

LITERATURE CITED

Avise, J. C., 1995 Molecular Markers, Natural History and Evolution. Chapman and Hall, New York.

Cockerham, C. C., 1971 Higher order probability functions of identity of alleles by descent. Genetics 69: 235–246.

Falconer, D. S., and T. F. C. Mackay, 1996 Introduction to Quantitative Genetics, Ed. 4. Longman, Harlow, United Kingdom.

Geyer, C. J., and E. A. Thompson, 1995 A new approach to the joint estimation of relationship from DNA fingerprint data, pp. 245–260 in Population Management for Survival and Recovery, edited by J. D. Ballou, M. Gilpin and T. J. Foose. Columbia University Press, New York.

Hamilton, W. D., 1964 The genetical evolution of social behaviour: I and II. J. Theor. Biol. 7: 1–52.

Jacquard, A., 1974 The Genetic Structure of Populations. Springer, Berlin.

Kempthorne, O., 1954 The correlation between relatives in a random mating population. Proc. R. Soc. Lond. Ser. B 143: 103–113.

Li, C. C., and D. G. Horvitz, 1953 Some methods of estimating the inbreeding coefficient. Am. J. Hum. Genet. 5: 107–117.

Li, C. C., D. E. Weeks and A. Chakravarti, 1993 Similarity of DNA fingerprints due to chance and relatedness. Hum. Hered. 43: 45–52.

Lynch, M., 1988 Estimation of relatedness by DNA fingerprinting. Mol. Biol. Evol. 5: 584–599.

Lynch, M., and B. Milligan, 1994 Analysis of population genetic structure with RAPD markers. Mol. Ecol. 3: 91–99.

Lynch, M., and J. B. Walsh, 1998 Genetics and Analysis of Quantitative Traits. Sinauer Associates, Sunderland, MA.

Paetkau, D., W. Calvert, I. Stirling and C. Strobeck, 1995 Microsatellite analysis of population structure in Canadian polar bears. Mol. Ecol. 4: 347–354.

Pamilo, P., and R. H. Crozier, 1982 Measuring genetic relatedness in natural populations: methodology. Theor. Popul. Biol. 21: 171–193.

Queller, D. C., and K. F. Goodnight, 1989 Estimating relatedness using molecular markers. Evolution 43: 258–275.


Ritland, K., 1989 Marker genes and the inference of quantitative genetic parameters in the field, pp. 183–201 in Population Genetics, Plant Breeding and Gene Conservation, edited by A. H. D. Brown, M. T. Clegg, A. L. Kahler and B. S. Weir. Sinauer Associates, Sunderland, MA.

Ritland, K., 1996a A marker-based method for inferences about quantitative inheritance in natural populations. Evolution 50: 1062–1073.

Ritland, K., 1996b Estimators for pairwise relatedness and inbreeding coefficients. Genet. Res. 67: 175–186.

Ritland, K., 1999 Detecting inheritance with inferred relatedness in nature, in Adaptive Genetic Variation in the Wild, edited by T. Mousseau. Oxford University Press, Oxford (in press).

Ritland, K., and C. Ritland, 1996 Inferences about quantitative inheritance based on natural population structure in the yellow monkeyflower, Mimulus guttatus. Evolution 50: 1074–1082.

Thompson, E. A., 1975 The estimation of pairwise relationships. Ann. Hum. Genet. 39: 173–188.

Thompson, E. A., 1976 A restriction on the space of genetic relationships. Ann. Hum. Genet. 40: 201–204.

Thompson, E. A., 1986 Pedigree Analysis in Human Genetics. The Johns Hopkins University Press, Baltimore.

Weir, B. S., 1996 Genetic Data Analysis II. Sinauer Associates, Sunderland, MA.

Communicating editor: A. H. D. Brown

APPENDIX

Provided there are only two alleles at the locus in the population, the approach provided in the text for a homozygous reference genotype can also be applied to the case in which the reference genotype is a heterozygote for alleles i and j. The conditional probabilities of observing proband genotypes, given a heterozygous reference genotype, are

\[ P(ii \mid ij) = p_i^2 + p_i (0.5 - p_i)\,\phi - p_i^2\,\Delta \tag{A1a} \]

\[ P(jj \mid ij) = p_j^2 + p_j (0.5 - p_j)\,\phi - p_j^2\,\Delta. \tag{A1b} \]

The third probability, P(ij|ij), is omitted, as only two of the three probabilities are needed for a sufficient statistic because the three probabilities sum to unity.

Equating these probabilities to their estimates and rearranging, estimators for the coefficients of relationship are obtained as

\[ \hat{\phi} = \frac{2\,[\,q^2 \hat{P}(ii \mid ij) - p^2 \hat{P}(jj \mid ij)\,]}{pq\,(q - p)} \tag{A2a} \]

\[ \hat{\Delta} = 1 - \frac{\hat{P}(ii \mid ij)}{p} - \frac{\hat{P}(jj \mid ij)}{q}, \tag{A2b} \]

wherein, to emphasize that these equations apply only to diallelic loci, we have dropped the subscript i, letting p = p_i and q = 1 − p. From Equation 1,

\[ \hat{r} = 1 + \frac{\hat{P}(ii \mid ij) - \hat{P}(jj \mid ij)}{q - p}. \tag{A2c} \]

When gene frequencies are exactly equal, a reference heterozygote at a diallelic locus yields undefined estimates for φ and r.

If there are more than two alleles in the population, there are six possible proband genotype categories conditioned on observing the heterozygous reference genotype ij. The conditional probabilities include P(ii|ij) and P(jj|ij) as given in Equations A1a and A1b plus four others:

\[ P(ij \mid ij) = 2 p_i p_j + [\,0.5 (p_i + p_j) - 2 p_i p_j\,]\,\phi + (1 - 2 p_i p_j)\,\Delta \tag{A3a} \]

\[ P(i\cdot \mid ij) = 2 p_i (1 - p_i - p_j) + (1 - p_i - p_j)(0.5 - 2 p_i)\,\phi - 2 p_i (1 - p_i - p_j)\,\Delta \tag{A3b} \]

\[ P(j\cdot \mid ij) = 2 p_j (1 - p_i - p_j) + (1 - p_i - p_j)(0.5 - 2 p_j)\,\phi - 2 p_j (1 - p_i - p_j)\,\Delta \tag{A3c} \]

\[ P(\cdot\cdot \mid ij) = (1 - p_i - p_j)^2\,(1 - \phi - \Delta). \tag{A3d} \]

Thus, with multiallelic loci, heterozygous reference individuals generate the obvious difficulty of there being more equations than unknowns.

Linear regression provides a data-fitting procedure for obtaining estimators for φ, Δ, and r in this case. The six probabilities can be assembled into an array,

\[ \mathbf{P} = \begin{pmatrix} P(ii \mid ij) \\ P(jj \mid ij) \\ P(ij \mid ij) \\ P(i\cdot \mid ij) \\ P(j\cdot \mid ij) \\ P(\cdot\cdot \mid ij) \end{pmatrix}. \]

For any pair of individuals, the observed data vector (P̂) will always contain a single one for the observed two-genotype combination with all other elements being equal to zero. The linear model then becomes

\[ \hat{\mathbf{P}} = \mathbf{a} + \mathbf{M} \begin{pmatrix} \phi \\ \Delta \end{pmatrix} + \mathbf{e}, \tag{A4} \]

where the matrix M has two columns that contain the coefficients for φ and Δ, respectively, a is a column vector containing the remaining constants (functions only of gene frequencies), and e is a vector of residuals with expectation zero. The elements of M and a are obtained directly from Equations A1a and A1b and A3a–A3d.

If the elements of the observation vector P̂ were independent and identical in distribution, ordinary least-squares analysis could be used to obtain estimates of the relationship coefficients with minimum sampling variance. However, because all of the elements of the observation vector are constrained to sum to 1, such conditions are obviously violated. Although the failure to fully account for the structure of the data in the P vector does not cause the estimates of the coefficients of relationship to be biased, it does elevate the sampling variance. Unfortunately, the variance-covariance structure necessary to generate the optimal weights for a more powerful generalized least-squares framework depends on the unknown parameters φ and Δ. To obtain approximate weights, we rely on Ritland's (1996b) argument that, in the absence of prior information on


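the relationship of x and y, it is reasonable to start with the assumption that φ = Δ = 0.

Using the optimal weights given by Equation 4b of Ritland (1996b), we were able to obtain analytical solutions for the weighted least-squares estimators of φ and Δ using an equation solver program. These are

\[ \hat{\phi} = \frac{4 p_i p_j (2 - p_i - p_j)\,[\,1 - \hat{P}(ij \mid ij)\,] - 2 (1 - 2 p_i p_j)\,[\,p_j \hat{P}(i \mid ij) + p_i \hat{P}(j \mid ij)\,]}{(1 - p_i - p_j + 2 p_i p_j)(4 p_i p_j - p_i - p_j)} \tag{A5a} \]

\[ \hat{\Delta} = \frac{(1 - p_i - p_j)\,\hat{P}(ij \mid ij) - p_j \hat{P}(i \mid ij) - p_i \hat{P}(j \mid ij) + 2 p_i p_j}{1 - p_i - p_j + 2 p_i p_j} \tag{A5b} \]

and from Equation 1

\[ \hat{r} = \frac{p_j \hat{P}(i \mid ij) + p_i \hat{P}(j \mid ij) + (p_i + p_j)\,\hat{P}(ij \mid ij) - 4 p_i p_j}{p_i + p_j - 4 p_i p_j}, \tag{A5c} \]

where P̂(i|ij) = P̂(i·|ij) + 2P̂(ii|ij) and P̂(j|ij) = P̂(j·|ij) + 2P̂(jj|ij). When there are only two alleles, Equations A5a–A5c reduce to the diallelic-locus estimates (A2a–A2c).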
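A numerical check can make the appendix algebra easier to follow. The Python sketch below codes one reading of the six conditional probabilities (A1a, A1b, A3a–A3d) and of the heterozygote estimators (A5a–A5c), together with the diallelic forms (A2a–A2c). It rests on two assumptions that should be kept in mind: Equation 1 is taken to be r = φ/2 + Δ, and the closed forms are this sketch's reconstruction of the printed equations, chosen so that the probabilities sum to one and the multiallelic estimators reduce to the diallelic ones. All function and variable names are hypothetical.

```python
def proband_probs(pi, pj, phi, delta):
    """Conditional probabilities of the six proband genotype classes given a
    heterozygous reference ij (sketch of A1a, A1b, A3a-A3d)."""
    rest = 1.0 - pi - pj                      # alleles other than i and j
    return {
        "ii": pi**2 + pi * (0.5 - pi) * phi - pi**2 * delta,
        "jj": pj**2 + pj * (0.5 - pj) * phi - pj**2 * delta,
        "ij": 2*pi*pj + (0.5*(pi + pj) - 2*pi*pj) * phi + (1 - 2*pi*pj) * delta,
        "i.": 2*pi*rest + rest * (0.5 - 2*pi) * phi - 2*pi*rest * delta,
        "j.": 2*pj*rest + rest * (0.5 - 2*pj) * phi - 2*pj*rest * delta,
        "..": rest**2 * (1.0 - phi - delta),
    }


def heterozygote_estimators(pi, pj, P):
    """Sketch of A5a-A5c; P may hold probabilities or a 0/1 indicator vector."""
    s, t = pi + pj, pi * pj
    P_i = P["i."] + 2.0 * P["ii"]             # P-hat(i|ij)
    P_j = P["j."] + 2.0 * P["jj"]             # P-hat(j|ij)
    A = pj * P_i + pi * P_j
    delta = ((1.0 - s) * P["ij"] - A + 2.0*t) / (1.0 - s + 2.0*t)
    r = (A + s * P["ij"] - 4.0*t) / (s - 4.0*t)
    phi = ((4.0*t*(2.0 - s)*(1.0 - P["ij"]) - 2.0*(1.0 - 2.0*t)*A)
           / ((1.0 - s + 2.0*t) * (4.0*t - s)))
    return phi, delta, r


def diallelic_estimators(p, P_ii, P_jj):
    """Sketch of A2a-A2c for a heterozygous reference at a two-allele locus."""
    q = 1.0 - p
    phi = 2.0 * (q**2 * P_ii - p**2 * P_jj) / (p * q * (q - p))
    delta = 1.0 - P_ii / p - P_jj / q
    r = 1.0 + (P_ii - P_jj) / (q - p)
    return phi, delta, r


# Full sibs (phi = 0.5, Delta = 0.25, hence r = 0.5), with p_i = 0.3 and p_j = 0.2.
expected = proband_probs(0.3, 0.2, phi=0.5, delta=0.25)
print(sum(expected.values()))
print(heterozygote_estimators(0.3, 0.2, expected))
```

Because the estimators are linear in the observation vector, evaluating them at the expected probabilities gives their expectations; in this example they should return the input values of φ, Δ, and r up to floating-point rounding, which is one way to see the unbiasedness discussed in the text. With p_i = p_j = 0.5 at a diallelic locus the denominators vanish, matching the undefined case noted after Equation A2c.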

Advancing Selective Breeding in Leopard Coral Grouper (P. leopardus) through Development of a High-Throughput Image-Based Growth Trait

Article

Full-text available

Jun 2024

City divided: Unveiling family ties and genetic structuring of coyotes in Seattle

Article

Jun 2024

Linear barriers pose significant challenges for wildlife gene flow, impacting species persistence, adaptation, and evolution. While numerous studies have examined the effects of linear barriers (e.g., fences and roadways) on partitioning urban and non‐urban areas, understanding their influence on gene flow within cities remains limited. Here, we investigated the impact of linear barriers on coyote ( Canis latrans ) population structure in Seattle, Washington, where major barriers (i.e., interstate highways and bodies of water) divide the city into distinct quadrants. Just under 1000 scats were collected to obtain genetic data between January 2021 and December 2022, allowing us to identify 73 individual coyotes. Notably, private allele analysis underscored limited interbreeding among quadrants. When comparing one quadrant to each other, there were up to 16 private alleles within a single quadrant, representing nearly 22% of the population allelic diversity. Our analysis revealed weak isolation by distance, and despite being a highly mobile species, genetic structuring was apparent between quadrants even with extremely short geographic distance between individual coyotes, implying that Interstate 5 and the Ship Canal act as major barriers. This study uses coyotes as a model species for understanding urban gene flow and its consequences in cities, a crucial component for bolstering conservation of rarer species and developing wildlife friendly cities.

Spatial population genetic structure of Caquetaia kraussii (Steindachner, 1878) evidenced by species-specific microsatellite loci in the middle and low basin of the Cauca River, Colombia

Article

Full-text available

Jun 2024
PLOS ONE

The adaptative responses and divergent evolution shown in the environments habited by the Cichlidae family allow to understand different biological properties, including fish genetic diversity and structure studies. In a zone that has been historically submitted to different anthropogenic pressures, this study assessed the genetic diversity and population structure of cichlid Caquetaia kraussii, a sedentary species with parental care that has a significant ecological role for its contribution to redistribution and maintenance of sedimentologic processes in its distribution area. This study developed de novo 16 highly polymorphic species-specific microsatellite loci that allowed the estimation of the genetic diversity and differentiation in 319 individuals from natural populations in the area influenced by the Ituango hydroelectric project in the Colombian Cauca River. Caquetaia kraussii exhibits high genetic diversity levels (Ho: 0.562–0.885; He: 0.583–0.884) in relation to the average neotropical cichlids and a three group-spatial structure: two natural groups upstream and downstream the Nechí River mouth, and one group of individuals with high relatedness degree, possibly independently formed by founder effect in the dam zone. The three genetic groups show recent bottlenecks, but only the two natural groups have effective population size that suggest their long-term permanence. The information generated is relevant not only for management programs and species conservation purposes, but also for broadening the available knowledge on the factors influencing neotropical cichlids population genetics.

Analisis Sidik Jari DNA Kalus In Vitro Kelapa Sawit Menggunakan Marka Simple Sequence Repeats (SSR) DNA Fingerprinting Analysis of Oil Palm In Vitro Calli Using Simple Sequence Repeats (SSR) Markers

Article

Full-text available

Jun 2022

Propagating elite oil palm (Elaeis guineensis Jacq.) planting material through in vitro culture requires considerable time and advanced techniques. Early detection of culture stability would facilitate culture selection and maintenance. This research aimed to analyze the DNA fingerprints of explants and their calli. The calli comprised embryogenic and non-embryonic calli that had been subcultured three times. DNA of explants and calli was isolated with the DNeasy® Plant Mini Kit (Qiagen) and the Genomic DNA Mini Kit (Plant) (Geneaid). DNA was amplified by SSR-PCR using 16 SSR markers, which can be bulked into two groups to save analysis costs. The 16 markers produced identical electropherograms for the explants and their calli. The relatedness coefficient indicated that each explant and its calli were genetically identical (r = 1). The markers were informative, with an average PIC of 0.48, and can be used for DNA fingerprinting analysis of oil palm in vitro cultures. Keywords: explant; callus; oil palm; SSR markers; DNA fingerprinting

Questioning inbreeding: Could outbreeding affect productivity in the North African catfish in Thailand?

Article

May 2024
PLOS ONE

The North African catfish (Clarias gariepinus) is a significant species in aquaculture, which is crucial for ensuring food and nutrition security. Their high adaptability to diverse environments has led to an increase in the number of farms that are available for their production. However, long-term closed breeding adversely affects their reproductive performance, leading to a decrease in production efficiency. This is possibly caused by inbreeding depression. To investigate the root cause of this issue, the genetic diversity of captive North African catfish populations was assessed in this study. Microsatellite genotyping and mitochondrial DNA D-loop sequencing were applied to 136 catfish specimens, collected from three populations captured for breeding in Thailand. Interestingly, extremely low inbreeding coefficients were obtained within each population, and distinct genetic diversity was observed among the three populations, indicating that their genetic origins are markedly different. This suggests that outbreeding depression by genetic admixture among currently captured populations of different origins may account for the low productivity of the North African catfish in Thailand. Genetic improvement of the North African catfish populations is required by introducing new populations whose origins are clearly known. This strategy should be systematically integrated into breeding programs to establish an ideal founder stock for selective breeding.

Genetic Variation in the Pallas’s Cat (Otocolobus manul) in Zoo-Managed and Wild Populations

Article

Apr 2024
Diversity

The Pallas’s cat (Otocolobus manul) is one of the most understudied taxa in the Felidae family. The species is currently assessed as being of “Least Concern” in the IUCN Red List, but this assessment is based on incomplete data. Additional ecological and genetic information is necessary for the long-term in situ and ex situ conservation of this species. We identified 29 microsatellite loci with sufficient diversity to enable studies into the individual identification, population structure, and phylogeography of Pallas’s cats. These microsatellites were genotyped on six wild Pallas’s cats from the Tibet Autonomous Region and Mongolia and ten cats from a United States zoo-managed population that originated in Russia and Mongolia. Additionally, we examined diversity in a 91 bp segment of the mitochondrial 12S ribosomal RNA (MT-RNR1) locus and a hypoxia-related gene, endothelial PAS domain protein 1 (EPAS1). Based on the microsatellite and MT-RNR1 loci, we established that the Pallas’s cat displays moderate genetic diversity. Intriguingly, we found that the Pallas’s cats had one unique nonsynonymous substitution in EPAS1 not present in snow leopards (Panthera uncia) or domestic cats (Felis catus). The analysis of the zoo-managed population indicated reduced genetic diversity compared to wild individuals. The genetic information from this study is a valuable resource for future research into and the conservation of the Pallas’s cat.

The challenge of incorporating ex situ strategies for jaguar conservation

Article

Apr 2024

The loss of biodiversity is an ongoing process and existing efforts to halt it are based on different conservation strategies. The ‘One Plan approach’ introduced by The International Union for Conservation of Nature proposes to consider all populations of a species under a unified management plan. In this work we follow this premise in order to unify in situ and ex situ management of one of the most critically endangered mammals in Argentina, the jaguar (Panthera onca). We assessed pedigrees of captive animals, finding that 44.93% of the reported relatedness was erroneous according to molecular data. Captive individuals formed a distinct genetic cluster. The three remaining locations for jaguars in Argentina constitute two genetic groups, the Atlantic Forest and the Chaco–Yungas clusters. Genetic variability is low compared with other populations of the species in the Americas and it is not significantly different between wild and captive populations in Argentina. These findings demonstrate that genetic studies aiming to include captive individuals into conservation management are very valuable, and should incorporate several parameters such as mean individual relatedness, individual inbreeding, rare and private alleles, and mitochondrial haplotypes. Finally, we discuss two ongoing ex situ management actions and postulate the need for genetic monitoring of the breeding and release of animals.

A Glimpse into the Genetic Heritage of the Olive Tree in Malta

Article

Mar 2024

The genetic diversity of the ancient autochthonous olive trees on the Maltese islands and the relationship with the wild forms growing in marginal areas of the island (57 samples), as well as with the most widespread cultivars in the Mediterranean region (150 references), were investigated by genetic analysis with 10 SSR markers. The analysis revealed a high genetic diversity of Maltese germplasm, totaling 84 alleles and a Shannon information index (I) of 1.08. All samples from the upper and the lower part of the crown of the Bidni trees belonged to the same genotype, suggesting that there was no secondary top-grafting of the branches. The Bidni trees showed close relationships with the local wild germplasm, suggesting that the oleaster population played a role in the selection of the Bidni variety. Genetic similarities were also found between Maltese cultivars and several Italian varieties including accessions putatively resistant to the bacterium Xylella fastidiosa, which has recently emerged in the Apulia region (Italy) and has caused severe epidemics on olive trees over the last decade.

Dispersal of Common Shrews (Sorex araneus L.): The Dream and “An Accident”

Article

Mar 2023
Ekologiya

Nikolay Alexander Shchipanov

Understanding the processes that affect dispersal distance is essential from the perspectives of ecology and evolution. Dispersal distances may depend on environmental and demographic factors and on the motivation of an individual. Effective dispersal results in the distribution of related genotypes in space. The distribution of pairwise distances between related common shrews (sibs and half-sibs) is characterized by a nonrandom increase in the number of relatives at distances up to 200 m. Aggregations of relatives are formed by the subset of individuals that disperse in a random direction to the nearest available home range (“straight-line search”). The distribution of all distances between relatives (up to 1200 m) is satisfactorily approximated by the straight-line search model and is not consistent with the “spiral search” model as it is; however, the best match can be achieved by combining these two search types. The latter model variant (“mixed search”) assumes that the population includes animals with different personal traits: “superficial” and “thorough” explorers. Thorough explorers search for a vacant territory employing the spiral search strategy and correspond to “dreamers” in the model describing the movement and habitat selection strategy (MHSS). If vacant territories are in deficit and the environment is favorable, dreamers move over long distances and become randomly distributed in space: a random dispersion of related genotypes was recorded at distances from 200 to 1200 m. Therefore, searches for a dream territory in combination with a shortage of vacant territories (an accident) result in a random dispersal of related genotypes within a radius of at least 1200 m. 
The combination of temporal aggregations of relatives and the dispersal of related genotypes over a vast area explain well the previously discovered combination of an excess of homozygous alleles and a high allelic diversity.

Conservation genetics of Roosevelt elk: Population isolation and reduced diversity

Article

Apr 2024
CAN J ZOOL

Species reintroductions have the potential to cause genetic bottleneck events resulting in increased genetic drift, increased inbreeding, and reduced genetic diversity creating negative fitness consequences for populations. Roosevelt elk (Cervus canadensis roosevelti Erxleben 1777) are ‘at risk’ in British Columbia (BC), Canada. Once widespread along the west coast, Roosevelt elk were likely extirpated from the mainland by 1900 and experienced a substantial population bottleneck on Vancouver Island at that time, and again in the 1950s. Reintroduced to the mainland from Vancouver Island in the 1980s, this re-established population became the source for subsequent mainland translocations. To understand the effects of reintroduction strategy on genetic diversity, we analyzed genetic variation in 355 Roosevelt elk from Vancouver Island and mainland BC. Using mitochondrial DNA and 10 microsatellite loci, molecular analyses showed overall reduced genetic diversity relative to other extant elk populations, genetic isolation of the southern Vancouver Island population, and increased genetic drift among reintroduced herds. Four reintroduced populations were found to have increased levels of inbreeding. Results of this study contribute to our knowledge of reintroduction biology and can be used to guide continued conservation and management of at-risk species.

Detecting Inheritance with Inferred Relatedness in Nature

Chapter

Jan 2000

Kermit Ritland

Two of the great mysteries of biology yet to be explored concern the distribution and abundance of genetic variation in natural populations and the genetic architecture of complex traits. These are tied together by their relationship to natural selection and evolutionary history, and some of the keys to disclosing these secrets lie in the study of wild organisms in their natural environments. This book, featuring a superb selection of papers from leading authors, summarizes the state of current understanding about the extent of genetic variation within wild populations and the ways to monitor such variation. It proposes that a fundamental objective of evolutionary ecology is to predict organism, population, community, and ecosystem response to environmental change. In fact, the overall theme of the papers centers around the expression of genetic variation and how it is shaped by the action of natural selection in the natural environment. Patterns of adaptation in the past and the genetic basis of traits likely to be under selection in a dynamically changing environment are discussed, along with a wide variety of techniques to test for genetic variation and its consequences, ranging from classical demography to the use of molecular markers. This book is perfect for professionals and graduate students in genetics, biology, ecology, conservation biology, and evolution.

Article

Mar 1989
EVOLUTION

A new method is described for estimating genetic relatedness from genetic markers such as protein polymorphisms. It is based on Grafen's (1985) relatedness coefficient and is most easily interpreted in terms of identity by descent rather than as a genetic regression. It has several advantages over methods currently in use: it eliminates a downward bias for small sample sizes; it improves estimation of relatedness for subsets of population samples; and it allows estimation of relatedness for a single group or for a single pair of individuals. Individual estimates of relatedness tend to be highly variable but, in aggregate, can still be very useful as data for nonparametric tests. Such tests allow testing for differences in relatedness between two samples or for correlating individual relatedness values with another variable.

INFERENCES ABOUT QUANTITATIVE INHERITANCE BASED ON NATURAL POPULATION STRUCTURE IN THE YELLOW MONKEYFLOWER, MIMULUS GUTTATUS

Article

Jun 1996
EVOLUTION

We used a nonmanipulative, marker-based method to study quantitative genetic inheritance in two habitats of a common monkeyflower population. The method involved regressing quantitative trait similarity on marker-estimated relatedness between individuals sampled in the field. We sampled 300 adult plants from each of two transects, one along a stream habitat and another through a meadow habitat. For each plant we measured 10 quantitative characters and assayed 10 polymorphic isozyme loci. In the meadow habitat, relatedness of plants within 1 m was moderate (r = 0.125, corresponding to half-sibs) as was actual variance of relatedness (Vr = 0.044). Significant heritabilities of 50-70% were found for corolla width and the fitness characters of flower number and plant weight. Genetic correlations were strongly positive, but sharing of environmental effects within 1 m was weak. In the stream habitat, levels of relatedness were lower and similar heritabilities were indicated. To detect dominance variance and the correlation of phenotypes due to shared inbreeding, we also estimated higher-order coefficients of relationship and inbreeding, but these did not significantly differ from zero. Laboratory-based estimates of heritability in the field were lower than the marker-based estimates, indicating that natural heritabilities and genetic correlations may be stronger than indicated by controlled studies.

A MARKER-BASED METHOD FOR INFERENCES ABOUT QUANTITATIVE INHERITANCE IN NATURAL POPULATIONS

Article

Jun 1996
EVOLUTION

Kermit Ritland

A marker-based method for studying quantitative genetic characters in natural populations is presented and evaluated. The method involves regressing quantitative trait similarity on marker-estimated relatedness between individuals. A procedure is first given for estimating the narrow sense heritability and additive genetic correlations among traits, incorporating shared environments. Estimation of the actual variance of relatedness is required for heritability, but not for genetic correlations. The approach is then extended to include isolation by distance of environments, dominance, and shared levels of inbreeding. Investigations of statistical properties show that good estimates do not require great marker polymorphism, but rather require significant variation of actual relatedness; optimal allocation generally favors sampling many individuals at the expense of assaying fewer marker loci; when relatedness declines with physical distance, it is optimal to restrict comparisons to within a certain distance; the power to estimate shared environments and inbreeding effects is reasonable, but estimates of dominance variance may be difficult under certain patterns of relationship; and any linkage of markers to quantitative trait loci does not cause significant problems. This marker-based method makes possible studies with long-lived organisms or with organisms difficult to culture, and opens the possibility that quantitative trait expression in natural environments can be analyzed in an unmanipulative way.
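The regression described in this abstract can be illustrated with a minimal sketch. It assumes, as a simplification not spelled out in the abstract itself, the model Z = 2·r·h² + e, where Z is a standardized trait cross-product for a pair of individuals and r is on the scale used in the abstracts above (half-sibs = 0.125). The function names and the use of population (divide-by-n) moments are hypothetical choices for illustration; shared environments, dominance, and inbreeding terms are ignored.

```python
from statistics import fmean


def trait_similarity(y_i, y_j, mean, variance):
    """Standardized cross-product of two individuals' trait values."""
    return (y_i - mean) * (y_j - mean) / variance


def covariance(xs, ys):
    """Population covariance (divides by n) of two equal-length sequences."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def heritability_sketch(z, r_hat, var_r_actual):
    """Illustrative h^2 under the ASSUMED model Z = 2 * r * h^2 + e.

    z            -- trait similarities, one per pair
    r_hat        -- marker-estimated relatedness, one per pair
    var_r_actual -- actual (not estimated) variance of relatedness,
                    which must be supplied separately
    """
    if var_r_actual <= 0:
        raise ValueError("actual variance of relatedness must be positive")
    return covariance(z, r_hat) / (2.0 * var_r_actual)
```

The denominator uses the actual variance of relatedness rather than the variance of the noisy marker estimates, which is why the abstract notes that this quantity must be estimated for heritability but not for genetic correlations.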

Pedigree Analysis in Human Genetics

Article

Dec 1986

Measuring genetic relatedness in natural populations: Methodology

Article

Apr 1982

The estimation of relatedness within social groups, such as the colonies of a population of social insects, is an important field for evaluating hypotheses concerning the evolution and maintenance of social behaviour. The methodology of this estimation from genetic data in the absence of pedigree information has been poorly understood; we develop this methodology for b, the regression coefficient of relatedness, and discuss its applications. Both b and G (the pedigree coefficient of relatedness) are potentially asymmetric coefficients, whereas φ, r, and FST are necessarily symmetric. We develop an estimator for b suitable for small samples, and also one for standard deviation, and examine the properties of both using sampling simulations. The b estimator returns values slightly below E(b), and the standard deviation estimator yields conservative confidence intervals. A comparative study of b and FST shows that, given the same set of data, b is estimated with greater reliability than is FST. As is the case for FST, b can be used to examine population structure at various levels, and b possesses the advantage of an estimator for its standard error, which can also be used to test for heterogeneity among the loci surveyed. The actual numbers of identical genes held in common by interacting individuals, and not simply their proportions, need to be considered in using coefficients of relatedness in inclusive fitness calculations. This necessity is handled by the weighted coefficients of relatedness, G′ and b′, which have been referred to in the literature as r (as have most relatedness measures).

Estimators for pairwise relatedness and individual inbreeding coefficients

Article

Apr 1996
GENET RES

Kermit Ritland

Method-of-moments estimators (MMEs) for the two-gene coefficients of relationship and inbreeding, and for the four-gene Cotterman coefficients, are described. These estimators, which use co-dominant genetic markers, are most appropriate for estimating pairwise relatedness or individual inbreeding coefficients, as opposed to their mean values in a group. This is because, compared to the maximum likelihood estimate (MLE), they show reduced small-sample bias and lack distributional assumptions. The ‘efficient’ MME is an optimally weighted average of estimates given by each allele at each locus. Generally, weights must be computed numerically, but if true coefficients are assumed zero, simplified estimators are obtained whose relative efficiencies are quite high. Population gene frequency is assumed to be assayed in a larger, ‘reference population’ sample, and the biases introduced by small reference samples and/or genetic drift of the reference population are discussed. Individual-level estimates of relatedness or inbreeding, while displaying high variance, are useful in several applications as a covariate in population studies.

The Genetic Structure of Populations

Book

Jan 1974

Albert Jacquard

A restriction on the space of genetic relationships. Ann Hum Genet 40: 201-204

Article

Dec 1976

Elizabeth A Thompson

It was pointed out by Trustrum (1961) that even for non-inbred pairs of relatives it is possible for all four cross-parental kinship coefficients to be non-zero, and hence that the expression often assumed for the correlation between such relatives is not completely general. Van Aarde (1975) has recently made the same comment. We derive a restriction on the space of attainable Cotterman coefficients for a relationship between two arbitrary non-inbred relatives. This restriction implies that the form of the expression for the correlation is in fact general, although the components cannot always be interpreted as parental kinships.
