How best to distinguish selection on discrete loci from the infinitesimal model?
Nick Barton
Longshanks
Frank Chan, Layla Hiramitsu (Tübingen); Campbell Rolian (Calgary); Stefanie Belohlavy (IST); bioRxiv
Two replicates of ~30 mice: within-family selection for the longest tibia
Use a composite trait $\log(T M^{-0.57})$
2 Vienna Feb 2019.nb
Some 10kb windows show strong allele frequency change: $z = 2\arcsin\sqrt{p}$; $\Delta z^2$ in 10kb windows
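A minimal sketch of the transform and the windowed statistic, assuming that $\Delta z^2$ is averaged over the SNP in each 10kb window (the slide does not say how SNP are combined within a window). The function names and the toy frequencies are hypothetical, not data from the experiment.

```python
import math

def arcsine_z(p):
    """Arcsine square-root transform z = 2*arcsin(sqrt(p)) of a frequency p in [0, 1]."""
    return 2.0 * math.asin(math.sqrt(p))

def window_delta_z2(positions, p_start, p_end, window=10_000):
    """Mean squared change in z for each window of `window` base pairs.

    positions: SNP coordinates in bp; p_start, p_end: allele frequencies
    before and after selection. Returns {window_index: mean (delta z)^2}.
    Averaging within a window is an assumption of this sketch.
    """
    sums, counts = {}, {}
    for pos, p0, p1 in zip(positions, p_start, p_end):
        w = pos // window
        dz = arcsine_z(p1) - arcsine_z(p0)
        sums[w] = sums.get(w, 0.0) + dz * dz
        counts[w] = counts.get(w, 0) + 1
    return {w: sums[w] / counts[w] for w in sums}

# Hypothetical toy data: three SNP in window 0 change a lot, two in window 1 barely move.
positions = [1_200, 4_800, 9_900, 12_500, 17_300]
p_start = [0.20, 0.25, 0.18, 0.50, 0.55]
p_end = [0.80, 0.85, 0.83, 0.52, 0.50]
dz2 = window_delta_z2(positions, p_start, p_end)
```

The transform is used because, under drift alone, the variance of the change in $z$ is roughly independent of the starting frequency, so windows with different initial frequencies can be compared on one scale.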
Motivation
There is a rapid, consistent response to selection.
We know the selection, the pedigree, the sequence …
Can we find the causal alleles?
A small experiment, but it represents larger populations, selected for a longer time.
Outline
The infinitesimal as the null model
Variation in SNP and haplotype frequencies
Estimating effects of candidate loci on fitness and trait
The infinitesimal with linkage
In this experiment, the pedigree is fixed, and so chromosomes evolve independently.
How much does infinitesimal selection affect allele frequencies?
Even strong selection has little effect.
The diffusion approximation works well.
Infinitesimal selection produces a slight excess of sweeps.
SNP are carried on haplotype blocks
Simulate, conditioning on the pedigree, and the observed heritability (assuming additivity)
SNP are thrown down onto the haplotype blocks
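As a rough illustration of conditioning on a pedigree, the sketch below drops founder haplotypes down a fixed pedigree and counts the copies of each founder haplotype in one window. It is a simplified stand-in, not the simulation used in the talk: it is neutral (no infinitesimal selection and no use of the heritability), sexes are ignored, the crossover probability `r` between adjacent windows is constant, and the toy pedigree is hypothetical.

```python
import random
from collections import Counter

def make_gamete(parent, r, rng):
    """Return one recombinant gamete from a diploid parent.

    parent: pair of haplotypes, each a list of founder labels, one per window.
    r: crossover probability between adjacent windows (assumed constant).
    """
    cur, other = parent if rng.random() < 0.5 else (parent[1], parent[0])
    out = [cur[0]]
    for w in range(1, len(cur)):
        if rng.random() < r:
            cur, other = other, cur  # crossover: switch template haplotype
        out.append(cur[w])
    return out

def gene_drop(pedigree, n_windows, r, rng):
    """Drop founder haplotypes down a fixed pedigree (neutral sketch).

    pedigree: list of (individual, mother, father) in birth order;
    founders have mother = father = None and get two unique labels each.
    """
    genomes = {}
    next_label = 0
    for ind, mother, father in pedigree:
        if mother is None:
            genomes[ind] = ([next_label] * n_windows, [next_label + 1] * n_windows)
            next_label += 2
        else:
            genomes[ind] = (make_gamete(genomes[mother], r, rng),
                            make_gamete(genomes[father], r, rng))
    return genomes

def haplotype_counts(genomes, individuals, window):
    """Count the copies k_i of each founder haplotype at one window."""
    return Counter(hap[window] for ind in individuals for hap in genomes[ind])

# Hypothetical toy pedigree: 4 founders, then 2 generations of 4 offspring each.
rng = random.Random(1)
pedigree = [(i, None, None) for i in range(4)]
prev = [0, 1, 2, 3]
for _ in range(2):
    new = []
    for _ in range(4):
        ind = len(pedigree)
        mother, father = rng.sample(prev, 2)  # two distinct parents
        pedigree.append((ind, mother, father))
        new.append(ind)
    prev = new

genomes = gene_drop(pedigree, n_windows=50, r=0.02, rng=rng)
k = haplotype_counts(genomes, prev, window=25)
```

Because each founder haplotype carries its own label, a SNP can afterwards be "thrown down" by choosing which founder labels carry it; its final frequency then follows directly from the counts `k`.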
Variance in SNP frequency is inflated
(grey/black: old/new data; colours: replicate simulations)
[Figure: one panel per chromosome, LS1 chrom 1 to LS1 chrom 20; axis labels not recovered]
Variation in SNP frequency is inflated by LD in the base population
In any window, we have $k_i$ copies of the $i$-th haplotype. Variation in SNP frequencies reflects the $k_i$.
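$$\langle \Delta p^2 \rangle = \frac{1}{n_S} \sum_{i=1}^{n_S} \Delta p_i^2 \tag{1}$$

$$\langle \Delta p^2 \rangle = \frac{j}{n_T^2}\,\mathrm{var}[k]\left(1 - \frac{j-1}{n_0-1}\right), \quad \text{where } j/n_0 \text{ is the initial SNP frequency} \tag{2}$$

$$\mathrm{var}\langle \Delta p^2 \rangle = \frac{1}{n_S^2}\left[\sum_{i=1}^{n_S} \mathrm{var}\left(\Delta p_i^2\right) + \sum_{\substack{i,j=1 \\ i \neq j}}^{n_S} \mathrm{cov}\left(\Delta p_i^2, \Delta p_j^2\right)\right] \tag{3}$$

$\mathrm{var}\langle \Delta p^2 \rangle$ depends on the moments of $k_i$ and is inflated by LD in the base population. With a large number of SNP, $\mathrm{var}\langle \Delta p^2 \rangle \sim \mathrm{cov}\left(\Delta p_i^2, \Delta p_j^2\right)$, which increases with $D^2$.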
e.g. $n_0 = 32$, $k_i = \{10, 4, 2, 1, 1, 1, 1, 0, 0, \ldots\}$, $p_0 = 0.5$ (figure not recovered)
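The expected squared frequency change in equation (2) can be checked numerically. The sketch below assumes that $n_T$ is the total number of haplotype copies in the window ($\sum_i k_i$), that $\mathrm{var}[k]$ is the population variance of the $k_i$ (divisor $n_0$), and that a SNP sits on $j$ founder haplotypes chosen at random. The count vector pads the listed values with zeros up to $n_0 = 32$, which is an assumption about the elided entries.

```python
import random

def expected_dp2(k, j):
    """Equation (2): expected (delta p)^2 for a SNP on j of the n0 founder haplotypes."""
    n0, n_t = len(k), sum(k)
    mean_k = n_t / n0
    var_k = sum((x - mean_k) ** 2 for x in k) / n0  # population variance (assumed)
    return j / n_t**2 * var_k * (1 - (j - 1) / (n0 - 1))

def simulated_dp2(k, j, reps, rng):
    """Monte Carlo estimate: place the SNP on j random founder haplotypes."""
    n0, n_t = len(k), sum(k)
    total = 0.0
    for _ in range(reps):
        carriers = rng.sample(range(n0), j)
        dp = sum(k[i] for i in carriers) / n_t - j / n0
        total += dp * dp
    return total / reps

# Counts from the example; trailing zeros are an assumption about the elided entries.
k = [10, 4, 2, 1, 1, 1, 1] + [0] * 25
j = 16  # initial frequency p0 = j / n0 = 0.5
theory = expected_dp2(k, j)
sim = simulated_dp2(k, j, 20_000, random.Random(0))
```

Under these assumptions the formula gives a mean squared change of about 0.072, that is, a root-mean-square frequency change of roughly 0.27 from the unequal haplotype counts alone.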
Is the candidate on chrom. 5 significant?
Pairs of simulations, starting from the same founder genomes, give outlier $\Delta z^2$ that overlap the signal from LS1 (red) but not LS2 (orange).
Based on SNP frequencies, the signal is marginally significant.
Three sources of variation in SNP frequencies
Variation due to LD amongst SNP can be strong:
This source of error can be eliminated by working with haplotypes
How strong is selection?
Alleles in the candidate region on chrom. 5 sweep from $p = 0.178 \to 0.833,\ 0.981$ in LS1, LS2

$$\Rightarrow\ s \sim \frac{1}{t}\log\!\left(\frac{p_{17}}{q_{17}}\,\frac{q_0}{p_0}\right) \sim 0.25 \quad \text{(cf. Taus et al., 2017)}$$
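The estimate can be reproduced from the quoted frequencies. This sketch assumes $t = 17$ generations (read off the subscript in $p_{17}$) and applies the logit-difference estimator to each replicate separately; how the two replicates were combined into the single quoted value is not stated, so the simple average here is an assumption.

```python
import math

def logit(p):
    """Log odds of an allele frequency p."""
    return math.log(p / (1.0 - p))

def selection_coefficient(p0, pt, t):
    """s ~ (1/t) * log((pt/qt) * (q0/p0)): change in log odds per generation."""
    return (logit(pt) - logit(p0)) / t

p0 = 0.178
t = 17  # assumed number of generations, from the subscript in p17
s_ls1 = selection_coefficient(p0, 0.833, t)  # roughly 0.18
s_ls2 = selection_coefficient(p0, 0.981, t)  # roughly 0.32
s_mean = (s_ls1 + s_ls2) / 2                 # close to the quoted 0.25
```

The estimator follows from the fact that, under genic selection, the log odds $\log(p/q)$ increase by approximately $s$ each generation.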
How large an effect on the trait?
Simulate an additive allele, effect $A$; 40 replicates; $s = 0.41\,A/\sqrt{V_e}$ (left)
The mean and sd from infinitesimal simulations (dots) fit with a single-locus WF model, $N_e \sim 44$ (red)
[Figure panels; axis labels: $A/\sqrt{V_e}$, $s$, $p_{17}$]
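A single-locus Wright-Fisher model of the kind referred to above can be sketched as follows. Several details are assumptions of this sketch rather than the talk's implementation: fitnesses are taken as $e^s : 1$ so that the log odds change by exactly $s$ per generation before drift, $N_e = 44$ is interpreted as $2N_e = 88$ gene copies, the run length is 17 generations, and the effect size uses the central estimate $A = 0.59$ in units of $\sqrt{V_e}$.

```python
import math
import random
import statistics

def select(p, s):
    """Deterministic frequency change under genic selection, fitnesses e^s : 1 (assumed form)."""
    w = math.exp(s)
    return p * w / (p * w + 1.0 - p)

def wf_final_frequency(p0, s, n_e, generations, rng):
    """One Wright-Fisher replicate: selection, then binomial sampling of 2*n_e gene copies."""
    n = 2 * n_e
    p = p0
    for _ in range(generations):
        p_sel = select(p, s)
        copies = sum(rng.random() < p_sel for _ in range(n))
        p = copies / n
    return p

def replicate_summary(p0, s, n_e, generations, reps, rng):
    """Mean and sd of the final frequency over independent replicates."""
    finals = [wf_final_frequency(p0, s, n_e, generations, rng) for _ in range(reps)]
    return statistics.mean(finals), statistics.stdev(finals)

rng = random.Random(2019)
a_scaled = 0.59        # A in units of sqrt(Ve), central estimate quoted in the talk
s = 0.41 * a_scaled    # s = 0.41 * A / sqrt(Ve)
mean_p, sd_p = replicate_summary(0.178, s, 44, 17, 40, rng)
```

Repeating `replicate_summary` over a grid of `a_scaled` values would give a curve of mean and sd of $p_{17}$ against effect size, which is the sort of comparison the figure makes between the infinitesimal simulations and the single-locus fit.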
The locus on chromosome 5 has effect $A = 0.59\sqrt{V_e}$ ($0.32\sqrt{V_e}$ to $0.87\sqrt{V_e}$). This single locus is responsible for ~9.4% (3.6% to 15.5%) of the response.
Summary
(but: selection was within families; the map is long)