Seeking Signatures of Hybridization by Approximate Bayesian Computation



SLIDE 1

Seeking Signatures of Hybridization by Approximate Bayesian Computation

Michael Woodhams with Barbara Holland

SLIDE 2

Data Simulation 2 Base Stats Analysis New summary stats ABC Simulation 1 Results

SLIDE 3

Simulator

Diagram labels: Simulator; inputs: Random params, Fixed params; output: Set of Gene Trees (one set for each input)

SLIDE 4

Simulator

Diagram labels: Simulator; inputs: Random params, Fixed params; output: Set of Gene Trees (one set for each input)

Simulator stages (diagram labels): Hybrid Network, Lineage Trees, Gene Trees; steps: Resolve hybridizations, Simulate coalescence
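The "Simulate coalescence" step is not spelled out on the slide. As a minimal sketch, assuming a standard Kingman-style coalescent in which every pair of lineages merges at a constant rate, the following shows how lineages entering one branch of a lineage tree might be merged over the length of that branch. The function name `coalesce_within_branch` and its parameters are hypothetical and are not taken from the authors' simulator.

```python
import random


def coalesce_within_branch(lineages, rate, duration, rng):
    """Merge lineages at random while they share one branch.

    lineages: list of leaf labels or nested tuples (already merged subtrees).
    rate:     assumed coalescence rate per pair of lineages (must be > 0).
    duration: length of the branch, in the same time units as 1/rate.
    Returns the lineages that leave the top of the branch.
    """
    lineages = list(lineages)
    elapsed = 0.0
    while len(lineages) > 1:
        k = len(lineages)
        # With k lineages there are k*(k-1)/2 pairs, each merging at `rate`.
        total_rate = rate * k * (k - 1) / 2
        elapsed += rng.expovariate(total_rate)
        if elapsed > duration:
            break  # the branch ended before the next merger
        # Pick the pair that coalesces uniformly at random.
        i, j = sorted(rng.sample(range(k), 2))
        merged = (lineages[i], lineages[j])
        del lineages[j]  # delete the larger index first
        del lineages[i]
        lineages.append(merged)
    return lineages


if __name__ == "__main__":
    rng = random.Random(0)
    print(coalesce_within_branch(["a1", "a2", "a3", "a4"], 1.0, 0.5, rng))
```

A short branch or a low rate leaves several lineages uncoalesced at the top of the branch, which is what lets gene trees disagree with the lineage tree.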

SLIDE 5

Simulator

Diagram labels: Simulator; inputs: Random params, Fixed params; output: Set of Gene Trees (one set for each input)

#NEXUS
begin hybridseq;
  epochs = ();
  speciation rate = (1);
  hybridization rate = (0.1);
  introgression rate = (0);
  hybridization function = step;
  hybridization threshold = 100;
  hybridization distribution = (0.5,1);
  minimum hybridizations = 3;
  coalesce = true;
  halt time = 100;
  [ halt taxa = 23;]
  halt hybrid = 100;
  [ number random trees = 1070;]
end;
begin ABC;
  iterations = 50000;
  reduce hybridizations to = HYBR(0,3);
  coalescence rate = COAL(1,20);
  ...
end;
begin trees;
  ...

Simulator stages (diagram labels): Hybrid Network, Lineage Trees, Gene Trees; steps: Resolve hybridizations, Simulate coalescence

SLIDE 6

Coalescence Hybridization

3 x 5 x

(we hope that other sources of phylogenetic error will behave like coalescence)

SLIDE 7

Base Stats

TE: Tree Entropy. Entropy of gene tree topologies.

QE: Quartet Entropy. Sum over quadruples of taxa of the entropy of how that quadruple resolves into quartets.

SI: Split Incompatibility. Sum over pairs of gene trees of their Robinson-Foulds distance. Equivalently, the number of incompatible pairs of splits from the gene trees.

SI-k: Threshold Split Incompatibility. Like SI, but subtract k from the number of times each split occurs.

RSk: Rare Splits. The number of splits occurring k or fewer times.

DC: Distance to Consensus. The sum over gene trees of the Robinson-Foulds distance to the majority-rule consensus tree.

TS: Total Splits. The number of distinct splits in the gene trees.

TC: Total Cherries. The number of distinct cherries in the gene trees.

SPR, NNI distances would be ideal, but too computationally expensive. Suggestions welcome.
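Several of these statistics reduce to counting once each gene tree is stored as a set of splits. The following is a minimal sketch under that assumption: a tree is a `frozenset` of splits, and each split is a `frozenset` holding one canonical side of the bipartition. All function names are hypothetical, and the Robinson-Foulds distance is taken here as the unnormalized size of the symmetric difference of the two split sets.

```python
from collections import Counter
from itertools import combinations
from math import log


def tree_entropy(trees):
    """TE: Shannon entropy (natural log) of the topology frequencies."""
    n = len(trees)
    return -sum((c / n) * log(c / n) for c in Counter(trees).values())


def split_counts(trees):
    """Number of gene trees in which each split occurs."""
    return Counter(split for tree in trees for split in tree)


def total_splits(trees):
    """TS: number of distinct splits across all gene trees."""
    return len(split_counts(trees))


def rare_splits(trees, k):
    """RSk: number of splits occurring k or fewer times."""
    return sum(1 for c in split_counts(trees).values() if c <= k)


def rf_distance(tree_a, tree_b):
    """Unnormalized Robinson-Foulds distance: splits in exactly one tree."""
    return len(tree_a ^ tree_b)


def pairwise_rf_sum(trees):
    """SI in its first form: sum of RF distances over pairs of gene trees."""
    return sum(rf_distance(a, b) for a, b in combinations(trees, 2))


def distance_to_consensus(trees):
    """DC: sum of RF distances to the majority-rule consensus split set."""
    n = len(trees)
    consensus = frozenset(s for s, c in split_counts(trees).items() if c > n / 2)
    return sum(rf_distance(tree, consensus) for tree in trees)


if __name__ == "__main__":
    # Four-taxon toy data: each unrooted tree has one nontrivial split,
    # written as the side that contains taxon A.
    ab = frozenset({"A", "B"})
    ac = frozenset({"A", "C"})
    trees = [frozenset({ab}), frozenset({ab}), frozenset({ac}), frozenset({ab})]
    print(tree_entropy(trees), total_splits(trees), rare_splits(trees, 1))
    print(pairwise_rf_sum(trees), distance_to_consensus(trees))
```

The pairwise sum is quadratic in the number of gene trees, which is manageable for about a thousand genes; QE, SI-k and TC are left out of the sketch.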

SLIDE 8

Base Stats

  • Coalescence rate and hybridization rate: high, medium, low, tiny

Legend: no hybrid, two hybrid; ▲ fast coalescence, ● slow coalescence

SLIDE 9

ABC Overview

Diagram labels: Random Parameters; Simulation; Summary Stats; Data; Close enough?; Analyse Parameters

SLIDE 10

ABC Overview

Diagram labels: Random Parameters; Simulation; Summary Stats; Data; Close enough?; Analyse Parameters

Open questions: Which summary stats? How close? Randomized over what range?
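The loop in this overview can be sketched as plain rejection ABC: draw random parameters, simulate, summarize, and keep the parameters when the summary is close enough to that of the data. The sketch below uses a toy coin-flip simulator in place of the gene tree simulator. Every name in it (`abc_rejection`, `sample_prior`, `simulate`, `summarize`, `distance`) is hypothetical, and the uniform prior and tolerance are illustrative choices, not the talk's settings.

```python
import random


def abc_rejection(observed_stats, sample_prior, simulate, summarize,
                  distance, tolerance, iterations, rng):
    """Return the parameter draws whose simulated summary is close enough."""
    accepted = []
    for _ in range(iterations):
        params = sample_prior(rng)            # Random Parameters
        simulated = simulate(params, rng)     # Simulation
        stats = summarize(simulated)          # Summary Stats
        if distance(stats, observed_stats) <= tolerance:  # Close enough?
            accepted.append(params)
    return accepted                           # input to Analyse Parameters


# Toy stand-ins: the unknown parameter is the heads probability of a coin.
def sample_prior(rng):
    return rng.uniform(0.0, 1.0)


def simulate(p, rng):
    return [rng.random() < p for _ in range(100)]


def summarize(flips):
    return sum(flips)  # number of heads


def distance(a, b):
    return abs(a - b)


if __name__ == "__main__":
    rng = random.Random(1)
    observed_heads = 70  # summary of the "data"
    posterior = abc_rejection(observed_heads, sample_prior, simulate,
                              summarize, distance, 5, 2000, rng)
    print(len(posterior), sum(posterior) / len(posterior))
```

The three open questions correspond to `summarize`, `tolerance` and `sample_prior`: a looser tolerance accepts more draws but blurs the posterior, and a prior range that is too narrow can exclude the true value.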

SLIDE 11

Summary Stats

Semi-automatic ABC: Fearnhead & Prangle, JRStatS B, 74 419-474 (2012)

Diagram labels: Random Parameters; Simulation; Simulated Data; Fit parameters

SLIDE 12

Summary Stats

Semi-automatic ABC: Fearnhead & Prangle, JRStatS B, 74 419-474 (2012)

Generic diagram labels: Random Parameters; Simulation; Simulated Data; Fit parameters

Applied here: Hybridization, coalescence; Simulation; Gene Trees; Base stats; Fit parameters

Fitted hybridization, coalescence = summary statistics for ABC
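The idea on this slide can be sketched as a regression: from pilot simulations with known parameters, fit each parameter as a linear function of the base stats, then use the fitted value as the summary statistic in ABC. The following is a minimal sketch, assuming ordinary least squares with an intercept and one parameter at a time. The helper names are hypothetical, and the published method allows more general regressors than the raw base stats.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_summary(stats_rows, param_values):
    """Least-squares fit of param ~ b0 + sum_j b_j * stat_j.

    stats_rows:   one sequence of base stats per pilot simulation.
    param_values: the known parameter used for each pilot simulation.
    Returns [b0, b1, ...] from the normal equations (X^T X) b = X^T y.
    """
    design = [[1.0] + list(row) for row in stats_rows]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, param_values))
           for i in range(p)]
    return solve_linear_system(xtx, xty)


def summary_statistic(coeffs, stats_row):
    """Fitted parameter value, used as one summary statistic for ABC."""
    return coeffs[0] + sum(b * s for b, s in zip(coeffs[1:], stats_row))


if __name__ == "__main__":
    # Toy pilot runs: two base stats per run and the parameter behind each.
    stats = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
    params = [1 + 2 * s1 - 3 * s2 for s1, s2 in stats]
    coeffs = fit_linear_summary(stats, params)
    print(coeffs, summary_statistic(coeffs, (3, 2)))
```

Running the fit once for hybridization and once for coalescence gives two summary statistics, however many base stats go in.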

SLIDE 13

Summary Stats

Coloured by Hybridization Number Coloured by Coalescence Rate Principal Component Analysis

(red = slow coalescence = randomized trees)

SLIDE 14

Data

Inferring ancient divergences...: Salichos & Rokas, Nature, 497 327-331 (2013)

Yeast: 23 taxa, 1070 genes

SLIDE 15

Results

Hybr = 0 has p=0.25

SLIDE 16

Data

Inferring ancient divergences...: Salichos & Rokas, Nature, 497 327-331 (2013)

Vertebrates: 18 taxa, 1087 genes

SLIDE 17

Results

Hybr > 0 has p=0.23

SLIDE 18

Data

Inferring ancient divergences...: Salichos & Rokas, Nature, 497 327-331 (2013)

Metazoa: 21 taxa, 225 genes

SLIDE 19

Results