Derivation of Inbreeding equations

F, the Inbreeding coefficient = the probability that the two alleles at a locus
are identical by descent (autozygous): exact genetic copies of a DNA
sequence in a common ancestor.
Alleles may be identical by allelic state, and not identical
by descent. Theory and equations should be applied only
to alleles that are identical by descent.
If it is assumed [see note below] that observed allelic variants arise
only once, identity by allelic state is the same as
identity by descent. Then, homozygosity
at any locus indicates identity by descent.
F is also the proportion of the population that is inbred at any locus:
the fraction of individuals with two alleles identical by
descent.

What is the effect of inbreeding on genotype proportions?

In the absence of inbreeding, with p = f(A) and q = f(B) = 1 - p, expected
    f(AA) = p^2
    f(AB) = 2pq
    f(BB) = q^2

In the presence of inbreeding,

f(AA) = (1 - F)(p^2) + (F)(p)(1)
      = p^2 - Fp^2 + Fp
      = p^2 + Fp(1 - p)
      = p^2 + Fpq
    fraction (1 - F) of population not inbred:
        expected frequency of AA homozygotes among these = p^2
    fraction (F) of population inbred:
        fraction p of these individuals have A allele
        If inbred, other allele must also be A, with probability = 1

f(AB) = (1 - F)(2pq) + (F)(0)
      = 2pq - 2Fpq
      = 2pq(1 - F)
    fraction (1 - F) of population not inbred:
        the expected frequency of AB heterozygotes among these is 2pq
    fraction (F) of population inbred:
        among these there are no heterozygotes, since the two alleles
        of a heterozygote are by definition not identical

f(BB) = (1 - F)(q^2) + (F)(q)(1)
      = q^2 - Fq^2 + Fq
      = q^2 + Fq(1 - q)
      = q^2 + Fpq
    Follow the same logic as for f(AA) above, applied to the B allele
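
The three results above can be checked numerically. What follows is a minimal Python sketch, not part of the original notes: the function name genotype_frequencies and the example values (p = 0.7; F = 0, 0.25, 1) are assumptions chosen only for illustration.

```python
def genotype_frequencies(p, F):
    """Expected genotype frequencies at a two-allele locus under inbreeding.

    p is f(A), q = 1 - p is f(B), F is the inbreeding coefficient.
    The name and signature are illustrative assumptions, not from the notes.
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= F <= 1.0):
        raise ValueError("p and F must lie between 0 and 1")
    q = 1.0 - p
    f_AA = p * p + F * p * q          # p^2 + Fpq
    f_AB = 2.0 * p * q * (1.0 - F)    # 2pq(1 - F)
    f_BB = q * q + F * p * q          # q^2 + Fpq
    return f_AA, f_AB, f_BB


if __name__ == "__main__":
    # Example values are hypothetical: no inbreeding, partial, complete.
    for F in (0.0, 0.25, 1.0):
        f_AA, f_AB, f_BB = genotype_frequencies(0.7, F)
        print(f"F = {F}: f(AA) = {f_AA:.4f}, f(AB) = {f_AB:.4f}, f(BB) = {f_BB:.4f}")
```

Whatever the value of F, the three frequencies sum to 1 and the allele frequency f(AA) + f(AB)/2 remains p: inbreeding redistributes alleles among genotypes without changing allele frequencies. With F = 0 the Hardy-Weinberg proportions are recovered, and with F = 1 there are no heterozygotes.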

Historical note: Classical versus Balanced views on genetic variation
in natural populations

The so-called Classical School argued
that only a small fraction of 1% of loci were polymorphic (with
more than one allele), and that such alternative alleles as
existed were typically rare. Such alleles were most obvious in
genetic diseases caused by homozygosity for a deleterious
recessive allele (aa), where the a allele arose by
new mutation. Under such circumstances, the assumption of the
equivalence of allelic state and identity by descent in the above
calculations is justified. The Hardy-Weinberg expectation of
homozygosity for a rare recessive allele is extremely low: recall
that if f(a) = 0.001 then f(aa) = 10^-6.
However, if deleterious alleles arrive in a small population by
chance migration of a single family that includes
multiple Aa heterozygotes, so as to increase f(a),
marriages among relatives in subsequent generations may dramatically
increase f(aa) [see calculations]. Studies of particular medico-genetic
anomalies in small, closed populations seemed to support this.
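
To give a sense of the magnitudes involved, the sketch below compares the Hardy-Weinberg expectation for f(aa) with the expectation q^2 + Fpq from the derivation above. It is an illustration under stated assumptions, not a calculation taken from these notes: the function name is hypothetical, F = 1/16 is the standard coefficient for offspring of first cousins, and f(a) = 0.01 is an arbitrary stand-in for an allele frequency raised by a founding family.

```python
def recessive_homozygote_frequency(q, F=0.0):
    """Expected f(aa) = q^2 + Fpq, where q = f(a) and p = 1 - q.

    The function name is an illustrative assumption, not from the notes.
    """
    p = 1.0 - q
    return q * q + F * p * q


if __name__ == "__main__":
    F_first_cousins = 1.0 / 16.0  # offspring of first-cousin marriages
    # q = 0.001 is the value used in the notes; q = 0.01 is hypothetical.
    for q in (0.001, 0.01):
        hw = recessive_homozygote_frequency(q)
        inbred = recessive_homozygote_frequency(q, F_first_cousins)
        print(f"f(a) = {q}: Hardy-Weinberg f(aa) = {hw:.3g}, "
              f"with F = 1/16 f(aa) = {inbred:.3g}, ratio = {inbred / hw:.1f}")
```

With f(a) = 0.001 and F = 1/16, the expected f(aa) is about 6.3 x 10^-5, roughly 63 times the Hardy-Weinberg value of 10^-6. The relative increase is largest when the allele is rare, because the added term Fpq is nearly proportional to q while the Hardy-Weinberg term is proportional to q^2.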
The alternative Balanced School argued
that a considerably larger fraction of loci, on the order of
several percent, were polymorphic, because the alternative alleles
at a locus were beneficial to a population or species when
maintained in heterozygous genotypes (Aa). Such
variation was evident in natural populations, where morphological
variation could be shown to follow Mendelian rules. Different
allelic variants were adaptive in different places and at
different times, and perhaps in different tissues. The altitudinal
and seasonal variation in chromosome types of wild Drosophila,
documented in classical studies, was argued to be due to different
alleles brought together over multiple loci. In such
circumstances, inbreeding might be beneficial to a
species, as a means of creating multiple homozygous genotypes (AA,
A'A', A"A", aa, etc.) that may be adaptive
for novel environments. Studies of natural populations seemed to
support this.
The introduction of protein electrophoretic
data to studies of humans and other species starting in 1966
showed much more variation than even the Balanced School had
anticipated [see lecture notes]. Rather than resolving the
argument in favor of the Balanced interpretation, the ground
shifted to an argument as to whether the observed variation (amino
acid substitutions resulting in protein charge differences) was
adaptively significant (Selectionists vs
Neutralists). The ground has
shifted again with the widespread availability of DNA sequence
data, which shows directly the relative proportions of
third-position "silent" mutations versus substitution
mutations that alter amino acid sequences, which may or may not
alter function. The essential
argument, whether any particular class of observable genetic
variation is adaptive, and if so how, remains after more than 100
years.