Hardy-Weinberg model
Contributor(s)
Written 2001-02 Robert Kalmes, Jean-Loup Huret
Genetics, Dept Medical Information, UMR 8125 CNRS, University of Poitiers, CHU Poitiers Hospital, F-86021 Poitiers, France (JLH)
I. The intuitive approach

II. The Hardy-Weinberg equilibrium
 2.1 For an autosomal, diallele, codominant gene
 2.2 Exercise

III. The HW law
 3.1 Demonstration of the law
 3.2 Exercises
 3.3 Consequences of the law
  3.3.1 What is the allele frequency in the n+1 generation?
  3.3.2 What is the genotype frequency in the n+1 generation?
  3.3.3 Example

IV. Extension of the HW law to other gene situations
 4.1 To an autosomal, triallele, codominant gene
 4.2 To an autosomal, diallele, non codominant gene
 4.3 To an autosomal, triallele, non codominant gene
  4.3.1 Bernstein's equation
 4.4 To a heterosomal (=gonosomic) gene
  4.4.1 Y Chromosome
  4.4.2 X Chromosome

V. Summary - Consequences of the HW law
I. The intuitive approach
The Hardy-Weinberg law can be used under some circumstances to calculate genotype frequencies from allele frequencies. Let A1 and A2 be two alleles at the same locus,
 p is the frequency of allele A1, 0 ≤ p ≤ 1
 q is the frequency of allele A2, 0 ≤ q ≤ 1 and p + q = 1
where the distribution of allele frequencies is the same in men and women, i.e.:
men (p, q) women (p, q)

If they procreate: (p + q)^{2} = p^{2} + 2pq + q^{2} = 1, where:
 p^{2} = frequency of the A1A1 genotype (homozygote)
 2pq = frequency of the A1A2 genotype (heterozygote)
 q^{2} = frequency of the A2A2 genotype (homozygote)
These frequencies remain constant in successive generations.
Example: autosomal recessive inheritance with alleles A and a, and allele frequencies p and q:

frequency of the genotypes:
 AA = p^{2}
 Aa = 2pq
 aa = q^{2}

and of the phenotypes (written in square brackets):
 [A] = p^{2} + 2pq
 [a] = q^{2}
Example: phenylketonuria (autosomal recessive), for which the deleterious allele has a frequency of 1/100, so q = 1/100.
Therefore,
 the frequency of this disease is q^{2} = 1/10 000,
 and the frequency of heterozygotes is 2pq = 2 x 99/100 x 1/100 ≈ 2/100.
Note that there are a lot of heterozygotes: 1/50, about two hundred times more than there are individuals suffering from the condition.
For a rare disease, p is very close to 1, and the frequency of the heterozygotes is approximately 2q.
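The phenylketonuria figures can be checked with a few lines of Python. This is only a minimal sketch that assumes the population is in Hardy-Weinberg proportions; the function name `recessive_genotype_frequencies` is a hypothetical helper, not something taken from the original course.

```python
def recessive_genotype_frequencies(q):
    """Return the (AA, Aa, aa) frequencies for a recessive allele of frequency q.

    Assumes Hardy-Weinberg proportions: p^2, 2pq, q^2 with p = 1 - q.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


q = 1 / 100  # frequency of the deleterious allele, as in the example
unaffected_homozygotes, carriers, affected = recessive_genotype_frequencies(q)

print(f"affected (q^2):  {affected:.6f}")
print(f"carriers (2pq):  {carriers:.6f}")
print(f"rare-allele approximation 2q: {2 * q:.6f}")
print(f"carriers per affected individual: {carriers / affected:.0f}")
```

The exact ratio of carriers to affected individuals is 2p/q, which is why it approaches 2/q (here about two hundred) when the allele is rare.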
We use these equations implicitly, in formal genetics and in the genetics of pooled populations, usually without considering whether, and under what conditions, they are applicable.
II. The Hardy-Weinberg equilibrium
The Hardy-Weinberg equilibrium, which is also known as the panmictic equilibrium, was discovered at the beginning of the 20th century by several researchers, notably by Hardy, a mathematician, and Weinberg, a physician.
The Hardy-Weinberg equilibrium is the central theoretical model in population genetics. The concept of equilibrium in the Hardy-Weinberg model is subject to the following hypotheses/conditions:
 The population is panmictic (couples form randomly (panmixia), and their gametes encounter each other randomly (pangamy)).
 The population is "infinite" (very large: to minimize differences due to sampling).
 There must be no selection, mutation or migration (no allele loss or gain).
 Successive generations are discrete (no crosses between different generations).
Under these circumstances, the genetic diversity of the population is maintained and tends towards a stable equilibrium of the genotype distribution.
2.1 For an autosomal, diallele, codominant gene
Let:
 The frequencies of genotypes F(G) be called D, H, and R, with 0 ≤ D, H, R ≤ 1 and D + H + R = 1
 The frequencies of alleles F(A) be called p and q, with 0 ≤ p, q ≤ 1 and p + q = 1
Genotypes | A1A1 | A1A2 | A2A2 |
Number of subjects | DN | HN | RN | (total number N)
Frequencies F(G) | D | H | R | with D + H + R = 1
Allele frequencies F(A):
 of A1: D + H/2 = p
 of A2: R + H/2 = q, with p + q = 1
NOTES
 The genotype frequencies F(G) can always be used to calculate the allele frequencies F(A).
 F(A) contains less information than F(G)
 if p = 0: allele is lost; if p = 1: allele is fixed.

First demonstration that p = D + H/2, by counting the alleles:
 size of the population = N, so the number of alleles = 2N
 p = number of A1 / total number = (2DN + HN) / 2N = D + H/2
 similarly for A2:
 q = number of A2 / total number = (2RN + HN) / 2N = R + H/2 (note the symmetry between p and q)

Second demonstration, by calculating the probabilities:
 probability of drawing A1 = drawing an A1A1 individual, then drawing A1 from A1A1: D x 1
  or drawing an A1A2 individual, then drawing A1 from A1A2: H x 1/2
 sum: P(A1) = D + H/2
 similarly for A2.
2.2 Exercise
Let:
Phenotypes | [A1] | [A1A2] | [A2] |
Genotypes | A1A1 | A1A2 | A2A2 |
Number of subjects | 167 | 280 | 109 | total N: 556
F(P) = F(G) | 167/556 | 280/556 | 109/556 |
i.e. | D = 0.300 | H = 0.504 | R = 0.196 | check: Σ(D, H, R) = 1
F(A) = F(gametes):
 p = D + H/2 = (167 + 280/2) / 556, or 0.300 + 0.504/2 = 0.552
 q = R + H/2 = (109 + 280/2) / 556, or 0.196 + 0.504/2 = 0.448
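The allele-counting rule p = D + H/2 used in this exercise can be sketched in Python as follows. The function name `allele_frequencies` is a hypothetical helper introduced for illustration; the counts are those of the exercise.

```python
def allele_frequencies(n_11, n_12, n_22):
    """Allele frequencies (p, q) from genotype counts, by counting alleles.

    n_11, n_12, n_22 are the numbers of A1A1, A1A2 and A2A2 individuals.
    Each individual carries two alleles, hence the 2 * total denominator.
    No Hardy-Weinberg assumption is needed for this step.
    """
    total = n_11 + n_12 + n_22
    if total <= 0:
        raise ValueError("at least one individual is required")
    p = (2 * n_11 + n_12) / (2 * total)
    q = (2 * n_22 + n_12) / (2 * total)
    return p, q


p, q = allele_frequencies(167, 280, 109)
print(f"p = {p:.3f}, q = {q:.3f}, p + q = {p + q:.3f}")
```

Working from the raw counts rather than from the rounded D, H and R avoids accumulating rounding error.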
III. The HW law
In a population consisting of an infinite number of individuals (i.e. a very large population), which is panmictic (matings occur randomly), and in the absence of mutation and selection, the genotype frequencies are given by the expansion of (p+q)^{2}, p and q being the allele frequencies.
The figure shows the correspondence between the allele frequency q of a and the genotype frequencies in the case of two alleles in a panmictic system. The highest frequency of heterozygotes, H, is then reached when p = q and H = 2pq = 0.50. In contrast, when one of the alleles is rare (i.e. q is very small), virtually all the subjects who have this allele are heterozygotes.
3.1 Demonstration of the law
Let A be an autosomal gene that is found in a population in two allele forms, A1 and A2 (with the same frequencies in both sexes, of course). As there is codominance, 3 genotypes can be distinguished. According to the hypotheses/conditions of Hardy-Weinberg (HW), the individuals of generation n + 1 are assumed to be the descendants of the random union of a male gamete and a female gamete.
Consequently, if, in generation n, the probability of drawing an A1 allele is p, then that of producing an A1A1 zygote after fertilization is p x p = p^{2} and similarly, for A2, that of producing an A2A2 zygote is q x q = q^{2}. The probability of producing a heterozygote is pq + pq = 2pq. Finally, p^{2} + 2pq + q^{2} = (p+q)^{2} = 1
Genotype | A1A1 | A1A2 | A2A2
Frequency under HW | D = p^{2} | H = 2pq | R = q^{2}

Random union of gametes:
 | A1 (p) | A2 (q)
A1 (p) | A1A1 (p^{2}) | A1A2 (pq)
A2 (q) | A1A2 (pq) | A2A2 (q^{2})
 (The allele frequencies can only be used to calculate the genotype frequencies if they are subject to HW)
 the allele frequencies remain the same from one generation to another
 the genotype frequencies remain the same from one generation to another
3.2 Exercises

Exercise: Show that, in the absence of panmixia, two populations with the same allele frequencies can have different genotype frequencies (this shows that information is lost in going from genotype frequencies to allele frequencies).
Example: for p = q = 0.5
Answer:
 if H = 0: p = D + H/2 = 0.5, so D = 0.5, H = 0, R = 0.5
 if H = 1: D = R = 0, so D = 0, H = 1, R = 0

Exercise: calculate the genotype and allele frequencies and the numbers predicted by HW (theoretical numbers of individuals), then check whether the population is consistent with HW:
Genotype | AA | AB | BB | Total
Number | 1787 | 3039 | 1303 | N = 6129
 | DN | HN | RN |
Answer:
 F(A) = (1787 + 3039/2) / 6129 = 0.54 = p
 F(B) = (1303 + 3039/2) / 6129 = 0.46 = q, and Σ(p, q) = 1

Genotype frequencies predicted by HW:
 AA: p^{2} = (0.54)^{2} = 0.2916
 AB: 2pq = 2 x 0.54 x 0.46 = 0.4968
 BB: q^{2} = (0.46)^{2} = 0.2116
Numbers predicted by HW:
 AA: p^{2}N = 0.2916 x 6129 = 1787.2
 AB: 2pqN = 0.4968 x 6129 = 3044.9
 BB: q^{2}N = 0.2116 x 6129 = 1296.9
Confirmation:
 χ^{2} = Σ (Oi - Ci)^{2} / Ci = (1787 - 1787.2)^{2}/1787.2 + (3039 - 3044.9)^{2}/3044.9 + (1303 - 1296.9)^{2}/1296.9 ≈ 0.04, which is not significant (NS)
 so the population is consistent with HW.
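The comparison of observed and predicted numbers can be sketched in Python. This is an illustrative sketch, not the authors' procedure: the function name `hw_chi_square` is hypothetical, and the code assumes one degree of freedom (three genotype classes, minus one, minus one allele frequency estimated from the data) with the usual 5% critical value of 3.841. Because it keeps p unrounded, its predicted numbers differ slightly from those obtained with p = 0.54.

```python
def hw_chi_square(n_11, n_12, n_22):
    """Chi-square statistic comparing observed genotype counts with HW expectations.

    Returns (chi2, expected), where expected holds the predicted numbers of the
    three genotypes computed from the allele frequency estimated from the data.
    """
    observed = (n_11, n_12, n_22)
    total = sum(observed)
    p = (2 * n_11 + n_12) / (2 * total)  # allele counting, no rounding
    q = 1.0 - p
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


CRITICAL_VALUE_1DF_5PCT = 3.841  # assumption: 1 degree of freedom, 5% level

chi2, expected = hw_chi_square(1787, 3039, 1303)
print("predicted numbers:", [round(e, 1) for e in expected])
print(f"chi-square = {chi2:.3f}")
print("significant" if chi2 > CRITICAL_VALUE_1DF_5PCT else "not significant (NS)")
```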

3.3 Consequences of the law
Change in HW across the generations (demonstration that the frequencies are invariable). In a population subject to HW, an equilibrium involving the distribution of the genotype frequencies is reached after a single reproductive cycle.
Consider a population in generation n.
3.3.1 What is the allele frequency in the n+1 generation?
Generation n:
Genotype | A1A1 | A1A2 | A2A2
Frequency | p^{2} | 2pq | q^{2}
Generation n + 1:
 F(A1) = D + H/2 = p^{2} + 1/2 (2pq) = p (p+q) = p
 F(A2) = R + H/2 = q^{2} + 1/2 (2pq) = q (p+q) = q
So there is no change in allele frequencies:
 in the n generation, we have p and q
 in the n+1 generation, we have p and q
3.3.2 What is the genotype frequency in the n+1 generation?
Proportion of A1A1 offspring from each type of mating (generation n+1):
Female \ Male | A1A1 (p^{2}) | A1A2 (2pq) | A2A2 (q^{2})
A1A1 (p^{2}) | all A1A1 | 1/2 A1A1 | no A1A1
A1A2 (2pq) | 1/2 A1A1 | 1/4 A1A1 | no A1A1
A2A2 (q^{2}) | no A1A1 | no A1A1 | no A1A1
Frequency of (A1A1) in generation n+1 = (p^{2})^{2} + 1/2 (2pq.p^{2}) + 1/2 (p^{2}.2pq) + 1/4 (2pq)^{2}
= p^{4} + p^{3}q + p^{3}q + p^{2}q^{2} = p^{2} (p^{2} + 2pq + q^{2}) = p^{2}
The frequency of the (A1A1) genotype does not change between generation n and generation n+1 (same demonstration for the (A2A2 ) and (A1A2) genotypes).
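The same bookkeeping over all nine mating types can be sketched in Python with exact fractions, which also covers the A1A2 and A2A2 genotypes. The names below (`next_generation`, `TRANSMIT_A1`) and the value p = 3/10 are hypothetical choices for illustration; random mating between discrete generations is assumed.

```python
from fractions import Fraction
from itertools import product

GENOTYPES = ("A1A1", "A1A2", "A2A2")
# Probability that a parent of each genotype transmits allele A1.
TRANSMIT_A1 = {"A1A1": Fraction(1), "A1A2": Fraction(1, 2), "A2A2": Fraction(0)}


def next_generation(freqs):
    """Genotype frequencies after one round of random mating.

    freqs maps each genotype to its frequency, assumed equal in both sexes.
    Every mother x father pair is weighted by the product of their frequencies.
    """
    offspring = dict.fromkeys(GENOTYPES, Fraction(0))
    for mother, father in product(GENOTYPES, repeat=2):
        pair = freqs[mother] * freqs[father]
        m, f = TRANSMIT_A1[mother], TRANSMIT_A1[father]
        offspring["A1A1"] += pair * m * f
        offspring["A1A2"] += pair * (m * (1 - f) + (1 - m) * f)
        offspring["A2A2"] += pair * (1 - m) * (1 - f)
    return offspring


p = Fraction(3, 10)  # hypothetical allele frequency
q = 1 - p
hw = {"A1A1": p * p, "A1A2": 2 * p * q, "A2A2": q * q}
print(next_generation(hw) == hw)  # HW proportions are left unchanged

# A population with no heterozygotes reaches HW proportions in one generation.
start = {"A1A1": Fraction(1, 2), "A1A2": Fraction(0), "A2A2": Fraction(1, 2)}
print(next_generation(start))
```

Using `Fraction` keeps the comparison exact, so the equality test is not disturbed by floating-point rounding.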
The genotype structure no longer undergoes any further changes once the population reaches the Hardy-Weinberg equilibrium.
In very many examples, the frequencies seen in natural populations are consistent with those predicted by the Hardy-Weinberg law.
3.3.3 Example
The MN human blood groups.
Group | MM | MN | NN |
Number | 1787 | 3039 | 1303 | Total, N = 6129
Frequency of M = (1787 + 3039/2)/6129 = 0.540 = p
Frequency of N = (1303 + 3039/2)/6129 = 0.460 = q
Predicted proportion of MM = p^{2} = (0.540)^{2} = 0.2916
Predicted proportion of MN = 2pq = 2(0.540)(0.460) = 0.4968
Predicted proportion of NN = q^{2} = (0.460)^{2} = 0.2116
Numbers predicted by Hardy-Weinberg:
 for MM = p^{2}N = 0.2916 x 6129 = 1787.2
 for MN = 2pqN = 0.4968 x 6129 = 3044.9
 for NN = q^{2}N = 0.2116 x 6129 = 1296.9
In the present situation, there is no need to do
IV. Extension of the HW law to other gene situations
4.1 To an autosomal, triallele, codominant gene
3 alleles A1, A2, A3
with frequencies F(A1) = p, F(A2) = q, F(A3) = r
there will be 6 genotypes:
Genotype | A1A1 | A1A2 | A1A3 | A2A2 | A2A3 | A3A3
Genotype frequency according to HW | p^{2} | 2pq | 2pr | q^{2} | 2qr | r^{2}

Random union of gametes:
 | A1 (p) | A2 (q) | A3 (r)
A1 (p) | p^{2} | pq | pr
A2 (q) | pq | q^{2} | qr
A3 (r) | pr | qr | r^{2}
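The gamete table generalizes to any number of alleles: a homozygote has the squared frequency of its allele and a heterozygote has twice the product of its two allele frequencies. A minimal Python sketch, in which the function name `hw_genotype_frequencies` and the allele frequencies 0.5, 0.3 and 0.2 are hypothetical:

```python
from itertools import combinations_with_replacement


def hw_genotype_frequencies(allele_freqs):
    """Hardy-Weinberg genotype frequencies for any number of codominant alleles.

    allele_freqs maps allele names to frequencies that must sum to 1.
    Returns a dict mapping unordered genotypes such as "A1A2" to frequencies.
    """
    if abs(sum(allele_freqs.values()) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    genotypes = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        product = allele_freqs[a] * allele_freqs[b]
        # A heterozygote arises in two ways (a from either parent), hence 2 * product.
        genotypes[a + b] = product if a == b else 2 * product
    return genotypes


freqs = hw_genotype_frequencies({"A1": 0.5, "A2": 0.3, "A3": 0.2})
for genotype, frequency in freqs.items():
    print(f"{genotype}: {frequency:.2f}")
print(f"sum = {sum(freqs.values()):.2f}")
```

With k alleles the function returns k(k+1)/2 genotypes, i.e. 6 for the three alleles considered here.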
4.2 To an autosomal, diallele, non codominant gene
A is dominant over a, which is recessive; in this case the genotypes (AA) and (Aa) cannot be distinguished within the population. Only the individuals with the phenotype [A], who number N1, will be distinguishable from the individuals with the phenotype [a], who number N2.
Genotypes | AA, Aa | aa | Total
Phenotypes | [A] | [a] |
Number | N1 | N2 | N
Frequency | 1 - q^{2} | q^{2} |
with q^{2} = N2/N = N2 / (N1 + N2)
and the frequency of allele a = F(a) = (q^{2})^{1/2} = (N2/(N1 + N2))^{1/2}
This is a method commonly used in human genetics to calculate the frequency of rare, recessive genes.
Frequencies of homozygotes and heterozygotes for rare recessive human genes.
Gene | Incidence in population q^{2} | Frequency of allele q | Frequency of heterozygotes 2pq
Albinism | 1/22 500 | 1/150 | 1/75
Phenylketonuria | 1/10 000 | 1/100 | 1/50
Mucopolysaccharidosis | 1/90 000 | 1/300 | 1/150
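The square-root method behind this table can be sketched in Python. The function name `recessive_allele_and_carriers` is a hypothetical helper, and the calculation is only valid if the population is assumed to be in Hardy-Weinberg proportions.

```python
from math import sqrt


def recessive_allele_and_carriers(incidence):
    """Estimate q and 2pq from the incidence q^2 of a recessive condition.

    Assumes Hardy-Weinberg proportions, so that incidence = q^2.
    """
    if not 0.0 <= incidence <= 1.0:
        raise ValueError("incidence must lie between 0 and 1")
    q = sqrt(incidence)
    p = 1.0 - q
    return q, 2.0 * p * q


# Incidences expressed as fractions; 1/90 000 is the value consistent with q = 1/300.
conditions = [
    ("Albinism", 1 / 22_500),
    ("Phenylketonuria", 1 / 10_000),
    ("Mucopolysaccharidosis", 1 / 90_000),
]
for name, incidence in conditions:
    q, carriers = recessive_allele_and_carriers(incidence)
    print(f"{name}: q = 1/{1 / q:.0f}, 2pq = 1/{1 / carriers:.1f}")
```

The carrier frequencies in the table use the rare-allele approximation 2pq ≈ 2q; the exact values computed here are slightly lower.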
4.3 To an autosomal, triallele, non codominant gene
Example: the ABO blood group system. Although the human (ABO) blood group system is often taken to be a simple example of polyallelism, it is in fact a relatively complex situation combining the codominance of A and B, the presence of a null O allele and the dominance of A and B over O.
If we take
 p to designate the frequency of allele A
 q to designate the frequency of allele B
 r to designate the frequency of allele O
differing genotype and phenotype frequencies are found by applying the Hardy-Weinberg law.
Phenotype | Genotype | Genotype frequency | Phenotype frequency
[A] | (AA) | p^{2} | p^{2} + 2pr
 | (AO) | 2pr |
[B] | (BB) | q^{2} | q^{2} + 2qr
 | (BO) | 2qr |
[AB] | (AB) | 2pq | 2pq
[O] | (OO) | r^{2} | r^{2}
Using:
 p^{2} + 2pr + r^{2} = (p+r)^{2}
 q^{2} + 2qr + r^{2} = (q+r)^{2}
hence:
 F[A] + F[O] = (p+r)^{2}
 F[B] + F[O] = (q+r)^{2} and F[O] = r^{2}
4.3.1 Bernstein's equation (1930)
Bernstein's equation (1930) simplifies the calculations:
p = 1 - (F[B] + F[O])^{1/2}
q = 1 - (F[A] + F[O])^{1/2}
r = (F[O])^{1/2}
Then, if p + q + r ≠ 1, correct by the deviation D = 1 - (p + q + r):
 corrected p = p (1 + D/2)
 corrected q = q (1 + D/2)
 corrected r = (r + D/2) (1 + D/2)
Example:
Group | A | B | O | AB
Number | 9123 | 2987 | 7725 | 1269
Frequency | 0.4323 | 0.1415 | 0.3660 | 0.0601
p = 1 - (0.3660 + 0.1415)^{1/2} = 0.2876
q = 1 - (0.3660 + 0.4323)^{1/2} = 0.1065
r = (0.3660)^{1/2} = 0.6050
p + q + r = 0.9991, so D = 0.0009 and, after correction: p = 0.2877, q = 0.1065, r = 0.6057.
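Bernstein's estimates and the correction by the deviation D can be sketched in Python. The function name `bernstein_abo` is a hypothetical helper; the code works from the raw counts, so its last decimal may differ from the values obtained with frequencies rounded to four places.

```python
from math import sqrt


def bernstein_abo(n_a, n_b, n_o, n_ab):
    """Bernstein's estimates of the ABO allele frequencies from phenotype counts.

    Returns (raw, corrected), each a (p, q, r) tuple for alleles A, B and O.
    The correction uses the deviation D = 1 - (p + q + r).
    """
    total = n_a + n_b + n_o + n_ab
    f_a, f_b, f_o = n_a / total, n_b / total, n_o / total
    p = 1.0 - sqrt(f_b + f_o)
    q = 1.0 - sqrt(f_a + f_o)
    r = sqrt(f_o)
    d = 1.0 - (p + q + r)
    corrected = (p * (1 + d / 2), q * (1 + d / 2), (r + d / 2) * (1 + d / 2))
    return (p, q, r), corrected


raw, corrected = bernstein_abo(9123, 2987, 7725, 1269)
print("raw:      ", [round(x, 4) for x in raw], "sum =", round(sum(raw), 4))
print("corrected:", [round(x, 4) for x in corrected], "sum =", round(sum(corrected), 4))
```

After correction the three frequencies sum to 1 - D^{2}/4, which is indistinguishable from 1 when D is small.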
4.4 To a heterosomal (=gonosomic) gene
4.4.1 Y Chromosome
The alleles have frequencies p and q in XY subjects only, and are transmitted exclusively to male descendants.
4.4.2 X Chromosome
Female | XA1XA1 | XA1XA2 | XA2XA2
Frequency | p^{2} | 2pq | q^{2}
Male | XA1/Y | XA2/Y
Frequency | p | q
Let the frequency of the allele of frequency q be q_{x} in men and q_{xx} in women:
 the X chromosome of the boys (in generation n) has been transmitted from the mothers (generation n-1), so q_{x}^{(n)} = q_{xx}^{(n-1)}
 the X chromosome carrying the allele in the daughters has:
  1/2 chance of coming from their father,
  1/2 chance of coming from their mother
 so q_{xx}^{(n)} = (q_{x}^{(n-1)} + q_{xx}^{(n-1)})/2
 the frequency of the allele in men = the frequency in women in the previous generation
 the frequency of the allele in women = mean of the frequencies in the 2 sexes in the previous generation.
* calculation of the difference in allele frequencies between the 2 sexes:
q_{x}^{(n)} - q_{xx}^{(n)} = q_{xx}^{(n-1)} - (q_{x}^{(n-1)})/2 - (q_{xx}^{(n-1)})/2 = -1/2 (q_{x}^{(n-1)} - q_{xx}^{(n-1)})
so q_{x}^{(n)} - q_{xx}^{(n)} = (-1/2)^{n} (q_{x}^{(0)} - q_{xx}^{(0)}): this tends towards zero in 8 to 10 generations
* mean frequency q:
1/3 of the X chromosomes belong to men, 2/3 to women: q = 1/3 q_{x}^{(n)} + 2/3 q_{xx}^{(n)}
the mean frequency is invariable (expand q^{(1)} in terms of q^{(0)} to find q^{(1)} = q^{(0)})
at equilibrium, q_{x}^{(e)} = q_{xx}^{(e)} = q^{(e)}
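The invariance of the mean frequency can be written out in one step by substituting the two recurrence relations for men and women. In the sketch below, the bar notation for the mean frequency is introduced only for this derivation.

```latex
\begin{align*}
\bar{q}^{(n)} &= \tfrac{1}{3}\, q_{x}^{(n)} + \tfrac{2}{3}\, q_{xx}^{(n)} \\
              &= \tfrac{1}{3}\, q_{xx}^{(n-1)}
                 + \tfrac{2}{3} \cdot \tfrac{1}{2}\left( q_{x}^{(n-1)} + q_{xx}^{(n-1)} \right) \\
              &= \tfrac{1}{3}\, q_{x}^{(n-1)} + \tfrac{2}{3}\, q_{xx}^{(n-1)}
               = \bar{q}^{(n-1)}
\end{align*}
```

Since the difference between the sexes vanishes while the mean stays fixed, both frequencies converge to this mean.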
* exercise: Generation G0 consists of 100% normal men and 100% colorblind women; calculate the frequencies of the allele in each sex up to G6.
* answer:
G0: normal men (X/Y) x colorblind women (XDXD)
Generation | q_{x} (men) | q_{xx} (women)
G0 | 0.00 | 1.00
G1 | 1.00 | 0.50
G2 | 0.50 | 0.75
G3 | 0.75 | 0.63
G4 | 0.63 | 0.69
G5 | 0.69 | 0.66
G6 | 0.66 | 0.67
Therefore:
For a sex-linked locus, the Hardy-Weinberg equilibrium is reached asymptotically after 8 to 10 generations, whereas it is reached after 1 generation for an autosomal locus.
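The approach to equilibrium at an X-linked locus can be followed generation by generation with a short Python sketch. The function name `x_linked_frequencies` is a hypothetical helper; it applies the two recurrence relations given above (sons receive their X from their mother, daughters one X from each parent).

```python
def x_linked_frequencies(q_male, q_female, generations):
    """Allele frequencies in men and women over successive generations.

    Returns a list of (q_male, q_female) pairs, starting with generation 0.
    """
    history = [(q_male, q_female)]
    for _ in range(generations):
        # Men inherit the mothers' frequency; women inherit the mean of both sexes.
        q_male, q_female = q_female, (q_male + q_female) / 2
        history.append((q_male, q_female))
    return history


# Generation 0: all men normal (q = 0), all women colorblind (q = 1).
for generation, (q_m, q_f) in enumerate(x_linked_frequencies(0.0, 1.0, 6)):
    mean = (q_m + 2 * q_f) / 3  # one third of the X chromosomes are in men
    print(f"G{generation}: men {q_m:.4f}  women {q_f:.4f}  mean {mean:.4f}")
```

The frequencies in the two sexes oscillate around the constant mean of 2/3, with the gap halving and changing sign at each generation.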
V. Summary - Consequences of the HW law
 Regardless of whether we are in a situation subject to HW or not, the genotype frequencies (D, H, R) can be used to calculate the allele frequencies (p,q), from : p = D + H/2, q = R + H/2.
 Whereas, if and only if we are subject to HW, the genotype frequencies can be calculated from the allele frequencies, from D = p^{2}, H = 2pq, R = q^{2}.
 The dominance relationships between alleles have no effect on the change in allele frequencies (although they do affect how difficult the exercises are!)
 The allele frequencies remain stable over time; and so do the genotype frequencies.
 The random Mendelian segregation of the chromosomes preserves the genetic variability of populations.
 Since "evolution" is defined as a change in allele frequencies, an ideal diploid population would not evolve.
 It is only violations of the properties of an ideal population that allow the evolutionary process to take place.

The practical approach to a problem is always the same:
 From the Observed Numbers, calculate the Observed Genotype Frequencies;
 Calculate the Allele Frequencies: p = D + Σ Hi/2, q = ...
 If we assume that the population is subject to HW, then D = p^{2}, H = 2pq, etc.: we calculate the Theoretical Genotype Frequencies according to HW;
 From the Calculated Genotype Frequencies, calculate the Calculated Numbers;
 Compare the Observed Numbers with the Calculated Numbers:
 χ^{2} = Σ (Oi - Ci)^{2}/Ci
If χ^{2} is significant, the population is not in accordance with HW; possible reasons include:
 Consanguinity?
 Selection?
 Mutations?
Citation
Kalmes R, Huret JL
Atlas of Genetics and Cytogenetics in Oncology and Haematology 2001-02-01
Hardy-Weinberg model
Online version: http://atlasgeneticsoncology.org/teaching/30076/hardyweinbergmodel