
SLIDE 1

Assessing Phylogenetic Hypotheses and Phylogenetic Data

We use numerical phylogenetic methods because most data include potentially misleading evidence of relationships

We should not be content with constructing phylogenetic hypotheses but should also assess what ‘confidence’ we can place in our hypotheses

This is not always simple! (but do not despair!)

SLIDE 2

Assessing Data Quality

We expect (or hope) our data will be well structured and contain strong phylogenetic signal

We can test this using randomization tests of explicit null hypotheses

The behaviour, or some measure of the quality, of our real data is contrasted with that of comparable but phylogenetically uninformative data generated by randomization of the data

SLIDE 3

Random Permutation

Random permutation reduces any correlation among characters to that expected by chance alone

It preserves the number of taxa, characters and character states in each character (and the theoretical maximum and minimum tree lengths)

Original structured data with strong correlations among characters

‘TAXA’  ‘CHARACTERS’
        1 2 3 4 5 6 7 8
R-P     R P R P R P R P
A-E     A E A E A E A E
N-R     N R N R N R N R
D-M     D M D M D M D M
O-U     O U O U O U O U
M-T     M T M T M T M T
L-E     L E L E L E L E
Y-D     Y D Y D Y D Y D

Randomly permuted data with any correlation among characters due to chance

‘TAXA’  ‘CHARACTERS’
        1 2 3 4 5 6 7 8
R-P     N U D E R T O U
A-E     R E A P L E A D
N-R     M R M M A D N P
D-M     L T R E Y M D R
O-U     D E Y U D E Y M
M-T     O M O T O U L T
L-E     Y D N D M P M E
Y-D     A P L R N R R E
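A minimal Python sketch of this kind of permutation, assuming the matrix is held as a list of rows (taxa) with one state per character. The function name `permute_characters` and the toy matrix are illustrative and do not come from any phylogenetics package:

```python
import random

def permute_characters(matrix, rng=None):
    """Shuffle the states of each character (column) independently among taxa.

    `matrix` is a list of rows, one per taxon. The number of taxa, the number
    of characters and the states present in each character are preserved.
    """
    rng = rng or random.Random()
    n_taxa, n_chars = len(matrix), len(matrix[0])
    columns = []
    for j in range(n_chars):
        column = [matrix[i][j] for i in range(n_taxa)]
        rng.shuffle(column)  # in-place shuffle of one character
        columns.append(column)
    # Transpose the shuffled columns back into rows (taxa)
    return [[columns[j][i] for j in range(n_chars)] for i in range(n_taxa)]

# Toy matrix with perfectly correlated characters (illustrative only)
structured = [list(row) for row in (
    "RPRPRPRP", "AEAEAEAE", "NRNRNRNR", "DMDMDMDM",
    "OUOUOUOU", "MTMTMTMT", "LELELELE", "YDYDYDYD",
)]
permuted = permute_characters(structured, random.Random(1))
for row in permuted:
    print(" ".join(row))
```

Because each column is shuffled separately, every character keeps its own set of states while its agreement with other characters is left to chance.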

SLIDE 4

Matrix Randomization Tests

Compare some measure of data quality/hierarchical structure for the real and many randomly permuted data sets

This allows us to define a test statistic for the null hypothesis that the real data are no better structured than randomly permuted and phylogenetically uninformative data

A permutation tail probability (PTP) is the proportion of data sets with as good or better measure of quality than the real data
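The PTP definition can be sketched in a few lines of Python. The function name and the tree lengths used here are hypothetical, and the measure is assumed to be one where smaller values are better (such as tree length) unless stated otherwise:

```python
def permutation_tail_probability(real_value, permuted_values, smaller_is_better=True):
    """Proportion of permuted data sets scoring as well as or better than the real data."""
    if smaller_is_better:
        as_good = sum(1 for v in permuted_values if v <= real_value)
    else:
        as_good = sum(1 for v in permuted_values if v >= real_value)
    return as_good / len(permuted_values)

# Hypothetical tree lengths: one real data set and five permuted data sets
real_length = 618
permuted_lengths = [792, 801, 779, 610, 788]
ptp = permutation_tail_probability(real_length, permuted_lengths)
print(ptp)  # 0.2
```

Some implementations also count the real data set among the permutations, which keeps the PTP above zero; this sketch follows the simpler definition given above.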

SLIDE 5

Structure of Randomization Tests

Reject the null hypothesis if, for example, fewer than 5% of random permutations have as good or better measure than the real data

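[Figure: frequency distribution of the measure of data quality (e.g. tree length, ML, pairwise incompatibilities) for randomly permuted data, running from GOOD to BAD, with a 95% cutoff. Real data beyond the cutoff on the GOOD side: PASS TEST (reject null hypothesis). Otherwise: FAIL TEST]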

SLIDE 6

Matrix Randomization Tests

Measures of data quality include:
1. Tree length for most parsimonious trees - the shorter the tree length the better the data (PAUP*)

2. Numbers of pairwise incompatibilities between characters (pairs of incongruent characters) - the fewer character conflicts the better the data

3. Skewness of the distribution of tree lengths (PAUP)
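As an illustration of the second measure, here is a small Python sketch that counts pairwise incompatibilities. It assumes binary characters with no missing data, for which two characters conflict when all four combinations of states occur among the taxa; the function name is illustrative:

```python
from itertools import combinations

def count_pairwise_incompatibilities(matrix):
    """Count pairs of binary characters showing all four state combinations."""
    n_chars = len(matrix[0])
    count = 0
    for i, j in combinations(range(n_chars), 2):
        observed = {(row[i], row[j]) for row in matrix}
        if len(observed) == 4:  # 00, 01, 10 and 11 all present
            count += 1
    return count

compatible = ["00", "00", "11", "11"]    # 4 taxa, 2 characters, no conflict
conflicting = ["00", "01", "10", "11"]   # all four combinations occur
print(count_pairwise_incompatibilities(compatible))   # 0
print(count_pairwise_incompatibilities(conflicting))  # 1
```

Multistate characters or missing entries would need a more general compatibility check than this one.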

SLIDE 7

Matrix Randomization Tests

Ciliate SSUrDNA - strict consensus trees

Real data: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tracheloraphis, Spirostomum, Gruberia, Euplotes, Tetrahymena

1 MPT, L = 618, CI = 0.696, RI = 0.714, PTP = 0.01, PC-PTP = 0.001 - significantly non random

Randomly permuted: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Tracheloraphis, Spirostomum, Euplotes, Gruberia

3 MPTs, L = 792, CI = 0.543, RI = 0.272, PTP = 0.68, PC-PTP = 0.737 - not significantly different from random

Min = 430 Max = 927

SLIDE 8

Skewness of Tree Length Distributions

Studies with random (and phylogenetically uninformative) data showed that the distribution of tree lengths tends to be normal

In contrast, phylogenetically informative data are expected to have a strongly skewed distribution with few shortest trees and few trees nearly as short

[Figure: two frequency distributions of tree lengths (x-axis: tree length, y-axis: number of trees), each with the shortest tree marked]

SLIDE 9

Skewness of Tree Length Distributions

Skewness of tree length distributions can be used as a measure of data quality in randomization tests

It is measured with the G1 statistic in PAUP

Significance cut-offs for data sets of up to eight taxa have been published based on randomly generated data (rather than randomly permuted data)

PAUP does not perform the more direct randomization test
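For reference, a skewness statistic of this kind can be computed from a table of tree-length frequencies. The sketch below uses the usual moment definition g1 = m3 / m2^1.5; whether PAUP applies exactly this formula (for example, without a small-sample correction) is an assumption, and the frequency tables are made up:

```python
def g1_skewness(frequencies):
    """Skewness g1 = m3 / m2**1.5 from a {tree_length: number_of_trees} table."""
    n = sum(frequencies.values())
    mean = sum(length * count for length, count in frequencies.items()) / n
    m2 = sum(count * (length - mean) ** 2 for length, count in frequencies.items()) / n
    m3 = sum(count * (length - mean) ** 3 for length, count in frequencies.items()) / n
    return m3 / m2 ** 1.5

# Made-up distributions of tree lengths
symmetric = {10: 1, 11: 4, 12: 6, 13: 4, 14: 1}
left_skewed = {10: 1, 11: 1, 12: 2, 13: 6, 14: 10}
print(g1_skewness(symmetric))    # 0.0 for a symmetric distribution
print(g1_skewness(left_skewed))  # negative: long tail towards short trees
```

A strongly negative g1 corresponds to the pattern described for informative data: few very short trees and a long tail on the short side of the distribution.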

SLIDE 10

Skewness - example

Frequency distribution of tree lengths

RANDOMLY PERMUTED DATA g1=-0.100478

[Histogram: tree lengths 792 to 867. Frequencies rise from 3 trees at length 792 to a peak of 5130 trees at length 830, then fall to 2 trees at length 867; roughly symmetric]

REAL DATA Ciliate SSUrDNA g1=-0.951947

[Histogram: tree lengths 722 to 900 shown. Frequencies rise slowly from 72 trees at length 722 to a peak of 1861 trees at length 853, then fall steeply to 4 trees at length 900; long tail towards shorter trees]

SLIDE 11

Matrix Randomization Tests - use and limitations

Can detect very poor data - that provides no good basis for phylogenetic inferences (throw it away!)

However, only very little structure may be needed to reject the null hypothesis (passing test ≠ great data)

Doesn’t indicate location of this structure (more discerning tests are possible)

In the skewness test, significance levels for G1 have been determined for small numbers of taxa only so that this test remains of limited use

SLIDE 12

Assessing Phylogenetic Hypotheses - groups on trees

Several methods have been proposed that attach numerical values to internal branches in trees that are intended to provide some measure of the strength of support for those branches and the corresponding groups

These methods include:

character resampling methods - the bootstrap and jackknife

decay analyses

additional randomization tests

SLIDE 13

Bootstrapping (non-parametric)

Bootstrapping is a modern statistical technique that uses computer-intensive random resampling of data to determine sampling error or confidence intervals for some estimated parameter

SLIDE 14

Bootstrapping (non-parametric)

Characters are resampled with replacement to create many bootstrap replicate data sets

Each bootstrap replicate data set is analysed (e.g. with parsimony, distance, ML)

Agreement among the resulting trees is summarized with a majority-rule consensus tree

Frequency of occurrence of groups, bootstrap proportions (BPs), is a measure of support for those groups

Additional information is given in partition tables
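The resampling and tallying steps can be sketched in Python as follows. This is only an outline under stated assumptions: `analyse` stands for whatever tree-building method is used, and the `toy_analyse` function supplied here is a placeholder that is NOT a phylogenetic method; all names are illustrative:

```python
import random
from collections import Counter

def bootstrap_replicate(matrix, rng):
    """Resample characters (columns) with replacement, keeping the original size."""
    n_chars = len(matrix[0])
    picks = [rng.randrange(n_chars) for _ in range(n_chars)]
    return [[row[j] for j in picks] for row in matrix]

def bootstrap_proportions(matrix, analyse, n_replicates, rng):
    """Frequency (in %) with which each group appears across replicate analyses.

    `analyse` is any function mapping a data matrix to a set of groups,
    each group being a frozenset of taxon indices.
    """
    counts = Counter()
    for _ in range(n_replicates):
        counts.update(set(analyse(bootstrap_replicate(matrix, rng))))
    return {group: 100.0 * c / n_replicates for group, c in counts.items()}

def toy_analyse(matrix):
    """Placeholder, NOT a phylogenetic method: groups of taxa sharing a state
    that differs from the last row (treated here as the outgroup)."""
    outgroup = matrix[-1]
    groups = set()
    for j in range(len(outgroup)):
        members = frozenset(i for i, row in enumerate(matrix[:-1]) if row[j] != outgroup[j])
        if len(members) >= 2:
            groups.add(members)
    return groups

data = ["RRYYYYYY", "RRYYYYYY", "YYYYYRRR", "YYRRRRRR", "RRRRRRRR"]  # toy matrix
bps = bootstrap_proportions(data, toy_analyse, 200, random.Random(0))
for group, bp in sorted(bps.items(), key=lambda item: -item[1]):
    print(sorted(group), round(bp, 1))
```

In a real analysis `analyse` would run a parsimony, distance or ML search on each replicate and return the groups present in the resulting tree.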

SLIDE 15

Bootstrapping

Original data matrix

Taxa    Characters
        1 2 3 4 5 6 7 8
A       R R Y Y Y Y Y Y
B       R R Y Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y R R R R R R
Outgp   R R R R R R R R

Randomly resample characters from the original data with replacement to build many bootstrap replicate data sets of the same size as the original - analyse each replicate data set

Resampled data matrix

Taxa    Characters
        1 2 2 5 5 6 6 8
A       R R R Y Y Y Y Y
B       R R R Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y Y R R R R R
Outgp   R R R R R R R R

Summarise the results of multiple analyses with a majority-rule consensus tree

Bootstrap proportions (BPs) are the frequencies with which groups are encountered in analyses of replicate data sets

[Figure: trees of A, B, C, D and Outgroup with the supporting characters marked, for the original matrix (characters 1-8) and the resampled matrix (characters 1 2 2 5 5 6 6 8), and the majority-rule consensus tree with BPs of 96% and 66%]

SLIDE 16

Bootstrapping - an example

Ciliate SSUrDNA - parsimony bootstrap

Partition Table

123456789  Freq
.**......  100.00
...**....  100.00
.....**..  100.00
...****..  100.00
...******   95.50
.......**   84.33
...****.*   11.83
...*****.    3.83
.*******.    2.50
.**....*.    1.00
.**.....*    1.00

Majority-rule consensus

Taxa: Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Euplotes (8), Tetrahymena (9), Loxodes (4), Tracheloraphis (5), Spirostomum (6), Gruberia (7)

BPs shown on the consensus tree: 100 96 84 100 100 100

SLIDE 17

Bootstrapping - random data

Randomly permuted data - parsimony bootstrap

Partition Table

123456789  Freq
.*****.**  71.17
..**.....  58.87
....*..*.  26.43
.*......*  25.67
.***.*.**  23.83
...*...*.  21.00
.*..**.**  18.50
.....*..*  16.00
.*...*..*  15.67
.***....*  13.17
....**.**  12.67
....**.*.  12.00
..*...*..  12.00
.**..*..*  11.00
.*...*...  10.80
.....*.**  10.50
.***.....  10.00

Majority-rule consensus (with minority components)

Taxa: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Spirostomum, Tetrahymena, Euplotes, Tracheloraphis, Gruberia

BPs shown on the tree: 71 26 16 59 16 21

Majority-rule consensus

Taxa: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tracheloraphis, Spirostomum, Euplotes, Tetrahymena, Gruberia

BPs shown on the tree: 71 59

SLIDE 18

Bootstrap - interpretation

Bootstrapping was introduced as a way of establishing confidence intervals for phylogenies

This interpretation of bootstrap proportions (BPs) depends on the assumption that the original data are a random sample from a much larger set of independent and identically distributed data

However, several things complicate this interpretation

Perhaps the assumptions are unreasonable - making any statistical interpretation of BPs invalid

Some theoretical work indicates that BPs are very conservative, and may underestimate confidence - problem increases with numbers of taxa

BPs can be high for incongruent relationships in separate analyses - and can therefore be misleading (misleading data -> misleading BPs)

With parsimony, BPs may be highly affected by inclusion or exclusion of only a few characters

SLIDE 19

Bootstrap - interpretation

Bootstrapping is a very valuable and widely used technique - it (or some suitable alternative) is demanded by some journals, but it may require a pragmatic interpretation:

BPs depend on two aspects of the support for a group - the numbers of characters supporting a group and the level of support for incongruent groups

BPs thus provide an index of the relative support for groups provided by a set of data under whatever interpretation of the data (method of analysis) is used

SLIDE 20

Bootstrap - interpretation

High BPs (e.g. > 85%) are indicative of strong ‘signal’ in the data

Provided we have no evidence of strong misleading signal (e.g. base composition biases, great differences in branch lengths) high BPs are likely to reflect strong phylogenetic signal

Low BPs need not mean the relationship is false, only that it is poorly supported

Bootstrapping can be viewed as a way of exploring the robustness of phylogenetic inferences to perturbations in the balance of supporting and conflicting evidence for groups

SLIDE 21

Jackknifing

Jackknifing is very similar to bootstrapping and differs only in the character resampling strategy

Some proportion of characters (e.g. 50%) are randomly selected and deleted

Replicate data sets are analysed and the results summarised with a majority-rule consensus tree

Jackknifing and bootstrapping tend to produce broadly similar results and have similar interpretations
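The jackknife resampling step differs from the bootstrap only in how characters are chosen, which a short Python sketch makes concrete. The function name, the default deletion fraction and the toy matrix are illustrative assumptions:

```python
import random

def jackknife_replicate(matrix, rng, delete_fraction=0.5):
    """Delete a random fraction of characters (columns) without replacement."""
    n_chars = len(matrix[0])
    n_keep = n_chars - int(round(delete_fraction * n_chars))
    kept = sorted(rng.sample(range(n_chars), n_keep))  # each character kept at most once
    return kept, [[row[j] for j in kept] for row in matrix]

data = ["RRYYYYYY", "RRYYYYYY", "YYYYYRRR", "YYRRRRRR", "RRRRRRRR"]  # toy matrix
kept, replicate = jackknife_replicate(data, random.Random(0))
print(kept)
for row in replicate:
    print("".join(row))
```

Unlike a bootstrap replicate, a jackknife replicate is smaller than the original matrix and never contains the same character twice.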

SLIDE 22

Decay analysis

In parsimony analysis, a way to assess support for a group is to see if the group occurs in slightly less parsimonious trees also

The length difference between the shortest trees including the group and the shortest trees that exclude the group (the extra steps required to overturn a group) is the decay index or Bremer support

Total support (for a tree) is the sum of all clade decay indices - this has been advocated as a measure for an as yet unavailable matrix randomization test
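The arithmetic behind decay indices and total support can be sketched in Python. The sketch assumes we already hold a sample of trees with their lengths and clades; the result is only exact if that sample contains the shortest tree lacking each clade. All tree lengths and clades below are hypothetical:

```python
def decay_indices(trees):
    """Decay index for each clade found in all of the shortest trees.

    `trees` is a list of (length, clades) pairs, where `clades` is a set of
    frozensets of taxon names.
    """
    best = min(length for length, _ in trees)
    shortest = [clades for length, clades in trees if length == best]
    strict = set.intersection(*(set(c) for c in shortest))
    result = {}
    for clade in strict:
        lacking = [length for length, clades in trees if clade not in clades]
        if lacking:  # skip clades never contradicted in the sample
            result[clade] = min(lacking) - best
    return result

AB, ABC, AC = frozenset("AB"), frozenset("ABC"), frozenset("AC")
# Hypothetical tree lengths and clades
trees = [
    (20, {AB, ABC}),
    (22, {AC, ABC}),
    (25, {AB, frozenset("ABD")}),
    (27, {frozenset("BD"), frozenset("BCD")}),
]
di = decay_indices(trees)
total_support = sum(di.values())
print(di[AB], di[ABC], total_support)  # 2 5 7
```

Here clade AB is overturned by a tree only 2 steps longer, whereas ABC needs 5 extra steps, so ABC has the stronger relative support.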

SLIDE 23

Decay analysis - example

[Figure: trees with decay indices on branches]

Ciliate SSUrDNA data: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tracheloraphis, Spirostomum, Gruberia, Euplotes, Tetrahymena

Randomly permuted data: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Tracheloraphis, Spirostomum, Euplotes, Gruberia

Decay indices shown on the trees: +27 +15 +8 +3 +1 +1 +45 +7 +10

SLIDE 24

Decay analyses - in practice

Decay indices for each clade can be determined by:

saving increasingly less parsimonious trees and producing corresponding strict component consensus trees until the consensus is completely unresolved

analyses using reverse topological constraints to determine shortest trees that lack each clade

the Autodecay or TreeRot programs (in conjunction with PAUP)

SLIDE 25

Decay indices - interpretation

Generally, the higher the decay index the better the relative support for a group

Like BPs, decay indices may be misleading if the data are misleading

Unlike BPs, decay indices are not scaled (0-100) and it is less clear what is an acceptable decay index

Magnitudes of decay indices and BPs are generally correlated (i.e. they tend to agree)

Only groups found in all most parsimonious trees have decay indices > zero

SLIDE 26

Trees are typically complex - they can be thought of as sets of less complex relationships

[Figure: tree of taxa A B C D E]

Clades: AB ABC DE

Resolved triplets: (AB)C (AB)D (AB)E (AC)D (AC)E (BC)D (BC)E (DE)A (DE)B (DE)C

Resolved quartets: ABCD ACDE ABDE BCDE ABCE
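To make the decomposition concrete, the following Python sketch lists the resolved triplets implied by a set of clades. It assumes a rooted tree described by its clades, and the function name is illustrative:

```python
from itertools import combinations

def resolved_triplets(taxa, clades):
    """For every three taxa, report the pair that some clade separates from the third."""
    triplets = []
    for trio in combinations(sorted(taxa), 3):
        for clade in clades:
            inside = [t for t in trio if t in clade]
            if len(inside) == 2:
                outside = next(t for t in trio if t not in clade)
                triplets.append(f"({inside[0]}{inside[1]}){outside}")
                break  # one resolution per trio is enough
    return triplets

clades = [set("AB"), set("ABC"), set("DE")]
triplets = resolved_triplets("ABCDE", clades)
print(len(triplets))  # 10
print(" ".join(triplets))
```

With five taxa there are ten possible triplets, and this fully resolved tree resolves all of them, compared with only three clades.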

SLIDE 27

Extending Support Measures

The same measures (BP, JP & DI) that are used for clades/splits can also be determined for triplets and quartets

This provides a lot more information because there are more triplets/quartets than there are clades

Furthermore....

SLIDE 28

The Decay Theorem

The DI of a hypothesis of relationships is equal to the lowest DI of the resolved triplets that the hypothesis entails

This applies equally to BPs and JPs as well as DIs

Thus a phylogenetic chain is no stronger than its weakest link!

And measures of clade support may give a very incomplete picture of the distribution of support
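Stated as code, the theorem is simply a minimum over the entailed triplets. The triplet decay indices in this Python sketch are hypothetical, and the function name is illustrative:

```python
def support_for_hypothesis(triplet_support, entailed_triplets):
    """Support for a hypothesis = lowest support among the triplets it entails."""
    return min(triplet_support[t] for t in entailed_triplets)

# Hypothetical decay indices for the triplets entailed by clade AB
# in a tree of taxa A-E
triplet_di = {"(AB)C": 2, "(AB)D": 5, "(AB)E": 6}
clade_ab_di = support_for_hypothesis(triplet_di, ["(AB)C", "(AB)D", "(AB)E"])
print(clade_ab_di)  # 2
```

The clade value of 2 hides the fact that A and B are much more firmly united relative to D and E than relative to C.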

SLIDE 29

Extensions

Double decay analysis is the determination of decay indices for all relationships - gives a more comprehensive but potentially very complicated summary of support

Majority-rule reduced consensus provides a similarly more comprehensive/complicated summary of bootstrap/jackknife proportions

Leaf stability provides support values for the phylogenetic position of particular leaves

SLIDE 30

Bootstrapping with Reduced Consensus

[Figure: bootstrap consensus trees for taxa A-J, with and without taxon X. Support values shown: 50.5 50.5 50.5 50.5 50.5 100 100 100 100 99 99 98 98]

Data matrix

A 1111100000
B 0111100000
C 0011100000
D 0001100000
E 0000100000
F 0000010000
G 0000011000
H 0000011100
I 0000011110
J 0000011111
X 1111111111

SLIDE 31

Bootstrapping

[Figure: bootstrap consensus trees for taxa A-J, with and without taxon X. Support values shown: 50.5 50.5 50.5 50.5 50.5 100 100 100 100 99 99 98 98]

SLIDE 32

Leaf Stability

Leaf stability is the average of supports of the triplets/quartets containing the leaf

[Figure: tree of Acanthostega, Ichthyostega, Greererpeton, Crassigyrinus, Eucritta, Whatcheeria, Gephyrostegus, Balanerpeton, Dendrerpeton, Proterogyrinus, Pholiderpeton, Megalocephalus, Loxomma and Baphetes. Values shown: 94 100 59 84 (98) (98) (69) (53) (54) (58) (49) (64) (64) (66) (66) (67) (67) (67) 100 95]
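Read literally, the definition of leaf stability is an average over the triplets that contain the leaf. The Python sketch below follows that reading; published leaf stability measures come in several variants, so this is an assumption, and the support values are hypothetical:

```python
def leaf_stability(leaf, triplet_support):
    """Mean support of all triplets that contain `leaf`."""
    values = [s for taxa, s in triplet_support.items() if leaf in taxa]
    return sum(values) / len(values)

# Hypothetical bootstrap support (%) for the resolved triplets of four taxa,
# keyed by the three taxa involved
triplet_bp = {
    frozenset("ABC"): 100,
    frozenset("ABD"): 100,
    frozenset("ACD"): 60,
    frozenset("BCD"): 60,
}
print(round(leaf_stability("A", triplet_bp), 1))
print(round(leaf_stability("C", triplet_bp), 1))
```

In this toy case leaf C takes part in more weakly supported triplets than leaf A, so its position is the less stable of the two.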

SLIDE 33

PTP tests of groups

A number of randomization tests have been proposed for evaluating particular groups rather than entire data matrices by testing null hypotheses regarding the level of support they receive from the data

Randomisation can be of the data or the group

These methods have not become widely used both because they are not readily performed and because their properties are still under investigation

One type, the topology-dependent PTP test, is included in PAUP* but has serious problems

SLIDE 34

Comparing competing phylogenetic hypotheses - tests of two trees

Particularly useful techniques are those designed to allow evaluation of alternative phylogenetic hypotheses

Several such tests allow us to determine if one tree is statistically significantly worse than another:

Winning sites test, Templeton test, Kishino-Hasegawa test, Shimodaira-Hasegawa test, parametric bootstrapping

SLIDE 35

Tests of two trees

All these tests are of the null hypothesis that the differences between two trees (A and B) are no greater than expected from sampling error

The simplest ‘winning sites’ test sums the number of sites supporting tree A over tree B and vice versa (those having fewer steps on, and better fit to, one of the trees)

Under the null hypothesis characters are equally likely to support tree A or tree B and a binomial distribution gives the probability of the observed difference in numbers of winning sites
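A minimal Python sketch of the winning sites test, assuming per-site step counts on each tree are already available. Sites that fit both trees equally well are ignored, and a two-sided exact binomial probability is reported; the function name is illustrative:

```python
from math import comb

def winning_sites_test(steps_a, steps_b):
    """Return (sites favouring A, sites favouring B, two-sided binomial p-value)."""
    wins_a = sum(1 for a, b in zip(steps_a, steps_b) if a < b)  # fewer steps on A
    wins_b = sum(1 for a, b in zip(steps_a, steps_b) if b < a)  # fewer steps on B
    n = wins_a + wins_b
    if n == 0:
        return wins_a, wins_b, 1.0
    k = max(wins_a, wins_b)
    # P(X >= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2 * tail)

# Hypothetical step counts for ten sites on trees A and B
steps_on_a = [1, 1, 2, 1, 1, 2, 1, 1, 1, 3]
steps_on_b = [2, 2, 2, 2, 1, 3, 2, 2, 2, 2]
print(winning_sites_test(steps_on_a, steps_on_b))
```

The test only counts which tree each site favours, not by how much, which is the limitation the Templeton test addresses.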

SLIDE 36

The Templeton test

Templeton’s test is a non-parametric Wilcoxon signed ranks test of the differences in fits of characters to two trees

It is like the ‘winning sites’ test but also takes into account the magnitudes of differences in the support of characters for the two trees
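The signed ranks calculation can be sketched in Python as below. This is a simplified version under stated assumptions: characters with no difference are dropped, tied differences receive average ranks, and the p-value uses a normal approximation without a tie correction, which is only reasonable when many characters differ. The function name is illustrative:

```python
from math import erfc, sqrt

def templeton_test(steps_a, steps_b):
    """Wilcoxon signed ranks test on per-character step differences.

    Returns (z, two-sided p-value) from a normal approximation.
    """
    diffs = [a - b for a, b in zip(steps_a, steps_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        mean_rank = (i + j) / 2 + 1  # average rank for a block of ties
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    z = (w_plus - mean) / sqrt(variance)
    return z, erfc(abs(z) / sqrt(2))

# Hypothetical step counts for eight characters on two trees
z, p = templeton_test([3, 2, 4, 1, 2, 5, 2, 3], [2, 2, 2, 1, 3, 2, 1, 2])
print(round(z, 3), round(p, 3))
```

For small numbers of differing characters, exact signed ranks tables are more appropriate than the normal approximation used here.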

SLIDE 37

Templeton’s test - an example

[Figure: tree of Seymouriadae, Diadectomorpha, Synapsida, Parareptilia, Captorhinidae, Paleothyris, Claudiosaurus, Younginiformes, Archosauromorpha, Lepidosauriformes, Placodus, Eosauropterygia and Araeoscelidia, with positions labelled 2 and 1]

Recent studies of the relationships of turtles using morphological data have produced very different results, with turtles grouping either within the parareptiles (H1) or within the diapsids (H2), the result depending on the morphologist

This suggests there may be:

problems with the data

special problems with turtles

weak support for turtle relationships

Parsimony analysis of the most recent data favoured H2

However, analyses constrained by H1 produced trees that required only 3 extra steps (<1% tree length)

The Templeton test was used to evaluate the trees and showed that the slightly longer H1 tree found in the constrained analyses was not significantly worse than the unconstrained H2 tree

The morphological data do not allow choice between H1 and H2

SLIDE 38

Kishino-Hasegawa test

The Kishino-Hasegawa test is similar in using differences in the support provided by individual sites for two trees to determine if the overall differences between the trees are significantly greater than expected from random sampling error

It is a parametric test that depends on assumptions that the characters are independent and identically distributed (the same assumptions underlying the statistical interpretation of bootstrapping)

It can be used with parsimony and maximum likelihood - implemented in PHYLIP and PAUP*

SLIDE 39

Kishino-Hasegawa test

Under the null hypothesis the mean of the differences in parsimony steps or likelihoods for each site is expected to be zero, and the distribution normal

From observed differences we calculate a standard deviation

If the difference between trees (tree lengths or likelihoods) is attributable to sampling error, then characters will randomly support tree A or B and the total difference will be close to zero

The observed difference is significantly greater than zero if it is greater than 1.96 standard deviations

This allows us to reject the null hypothesis and declare the sub-optimal tree significantly worse than the optimal tree (p < 0.05)

[Figure: distribution of step/likelihood differences at each site, with sites favouring tree A on one side and sites favouring tree B on the other, centred on the expected mean]
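The calculation can be sketched in Python from per-site scores on two trees. The sketch assumes scores where smaller is better (steps, or negative log-likelihoods), estimates the variance of the total difference as n/(n-1) times the sum of squared deviations of the site differences, and uses a two-sided normal p-value; the function name and data are illustrative:

```python
from math import erfc, sqrt

def kh_test(site_scores_a, site_scores_b):
    """Return (total difference, its standard deviation, z, two-sided p-value)."""
    diffs = [b - a for a, b in zip(site_scores_a, site_scores_b)]
    n = len(diffs)
    total = sum(diffs)
    mean = total / n
    # Variance of the summed difference, estimated from the site differences
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    sd = sqrt(variance)
    if sd == 0:
        raise ValueError("all site differences are identical; the test is undefined")
    z = total / sd
    return total, sd, z, erfc(abs(z) / sqrt(2))

# Hypothetical per-site step counts: tree B needs an extra step at 8 of 10 sites
steps_a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
steps_b = [2, 2, 2, 1, 2, 2, 1, 2, 2, 2]
total, sd, z, p = kh_test(steps_a, steps_b)
print(total, round(sd, 3), round(z, 2), p < 0.05)
```

A total difference several standard deviations from zero, as here, leads to rejection of the null hypothesis that the two trees fit the data equally well.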

SLIDE 40

Kishino-Hasegawa test - an example

Ciliate SSUrDNA Maximum likelihood tree

Ochromonas Symbiodinium Prorocentrum Sarcocystis Theileria Plagiopyla n Plagiopyla f Trimyema c Trimyema s Cyclidium p Cyclidium g Cyclidium l Glaucoma Colpodinium Tetrahymena Paramecium Discophrya Trithigmostoma Opisthonecta Colpoda Dasytrichia Entodinium Spathidium Loxophylum Homalozoon Metopus c Metopus p Stylonychia Onychodromous Oxytrichia Loxodes Tracheloraphis Spirostomum Gruberia Blepharisma

anaerobic ciliates with hydrogenosomes

Parsimonious character optimization of the presence and absence of hydrogenosomes suggests four separate origins of hydrogenosomes within the ciliates

Questions

how reliable is this result?

in particular how well supported is the idea of multiple origins?

how many origins can we confidently infer?

SLIDE 41

Ochromonas Symbiodinium Prorocentrum Sarcocystis Theileria Plagiopyla n Plagiopyla f Trimyema c Trimyema s Glaucoma Colpodinium Tetrahymena Paramecium Cyclidium p Cyclidium g Cyclidium l Discophryal Trithigmostoma Opisthonecta Dasytrichia Entodinium Spathidium Homalozoon Loxophylum Metopus c Metopus p Stylonychia Onychodromous Oxytrichia Colpoda Loxodes Tracheloraphis Spirostomum Gruberia Blepharisma 81-86 99-100 95-100 96-100 100 100 80-50 100 69-78 18-0 41-30 46-26 100-99 100 100 100 100 69-99 78-99 89-91 35-17 11-0 15-0 100 83-82 53-45 100 100 42 67-99 50-53 100-98

Kishino-Hasegawa test - an example

Ciliate SSUrDNA data

Most parsimonious tree

Parsimony analysis yields a very similar tree - in particular, parsimonious character optimization indicates four separate origins of hydrogenosomes within ciliates

Decay indices and BPs for parsimony and distance analyses indicate relative support for clades

Differences between the ML, MP and distance trees generally reflect the less well supported relationships

11 7 3 10 26 18 63 3 3 1 3 3 3 3 48 27 33 75 6 7 12 3 23 4 27 5 45-72 56 3 3 17

SLIDE 42

Kishino-Hasegawa test - example

Ochromonas Symbiodinium Prorocentrum Sarcocystis Theileria Plagiopyla n Plagiopyla f Trimyema c Trimyema s Cyclidium p Cyclidium g Cyclidium l Dasytrichia Entodinium Loxophylum Homalozoon Spathidium Metopus c Metopus p Loxodes Tracheloraphis Spirostomum Gruberia Blepharisma Discophrya Trithigmostoma Stylonychia Onychodromous Oxytrichia Colpoda Paramecium Glaucoma Colpodinium Tetrahymena Opisthonecta Ochromonas Symbiodinium Prorocentrum Sarcocystis Theileria Plagiopyla n Plagiopyla f Trimyema c Trimyema s Cyclidium p Cyclidium g Cyclidium l Homalozoon Spathidium Dasytrichia Entodinium Loxophylum Metopus c Metopus p Loxodes Tracheloraphis Spirostomum Gruberia Blepharisma Discophrya Trithigmostoma Stylonychia Onychodromous Oxytrichia Colpoda Paramecium Glaucoma Colpodinium Tetrahymena Opisthonecta

Two examples of the topological constraint trees

Parsimony analyses with topological constraints were used to find the shortest trees that forced hydrogenosomal ciliate lineages together and thereby reduced the number of separate origins of hydrogenosomes

Each of the constrained parsimony trees was compared to the ML tree and the Kishino-Hasegawa test used to determine which of these trees were significantly worse than the ML tree

SLIDE 43

Kishino-Hasegawa test

Test summary and results - origins of ciliate hydrogenosomes (simplified)

No.      Constraint        Extra   Difference   Significantly
Origins  tree              Steps   and SD       worse?
4        ML                +10     -            -
4        MP                -       13 ± 18      No
3        (cp,pt)           +13     21 ± 22      No
3        (cp,rc)           +113    337 ± 40     Yes
3        (cp,m)            +47     147 ± 36     Yes
3        (pt,rc)           +96     279 ± 38     Yes
3        (pt,m)            +22     68 ± 29      Yes
3        (rc,m)            +63     190 ± 34     Yes
2        (pt,cp,rc)        +123    432 ± 40     Yes
2        (pt,rc,m)         +100    353 ± 43     Yes
2        (pt,cp,m)         +40     140 ± 37     Yes
2        (cp,rc,m)         +124    466 ± 49     Yes
2        (pt,cp)(rc,m)     +77     222 ± 39     Yes
2        (pt,m)(rc,cp)     +131    442 ± 48     Yes
2        (pt,rc)(cp,m)     +140    414 ± 50     Yes
1        (pt,cp,m,rc)      +131    515 ± 49     Yes

Constrained analyses were used to find most parsimonious trees with fewer than four separate origins of hydrogenosomes

Tested against ML tree

Trees with 2 or 1 origin are all significantly worse than the ML tree

We can confidently conclude that there have been at least three separate origins of hydrogenosomes within the sampled ciliates
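The reported verdicts can be checked against the rule of thumb that a difference is significant when it exceeds 1.96 standard deviations. The Python sketch below assumes that the "Difference and SD" column gives the difference ± one standard deviation; the values are copied from the summary table and the function name is illustrative:

```python
def exceeds_threshold(difference, sd, threshold=1.96):
    """True if the difference is more than `threshold` standard deviations from zero."""
    return difference / sd > threshold

# (constraint tree, difference, SD, reported as significantly worse?)
results = [
    ("MP", 13, 18, False), ("(cp,pt)", 21, 22, False),
    ("(cp,rc)", 337, 40, True), ("(cp,m)", 147, 36, True),
    ("(pt,rc)", 279, 38, True), ("(pt,m)", 68, 29, True),
    ("(rc,m)", 190, 34, True), ("(pt,cp,rc)", 432, 40, True),
    ("(pt,rc,m)", 353, 43, True), ("(pt,cp,m)", 140, 37, True),
    ("(cp,rc,m)", 466, 49, True), ("(pt,cp)(rc,m)", 222, 39, True),
    ("(pt,m)(rc,cp)", 442, 48, True), ("(pt,rc)(cp,m)", 414, 50, True),
    ("(pt,cp,m,rc)", 515, 49, True),
]
for name, diff, sd, reported in results:
    verdict = "Yes" if exceeds_threshold(diff, sd) else "No"
    print(f"{name:15s} {diff / sd:5.2f} SD  {verdict}")
```

Under that assumption every computed verdict agrees with the reported one; the closest case is (pt,m), at about 2.3 standard deviations.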

SLIDE 44

Shimodaira-Hasegawa Test

To be statistically valid, the Kishino-Hasegawa test should be of trees that are selected a priori

However, most applications have used trees selected a posteriori on the basis of the phylogenetic analysis

Where we test the ‘best’ tree against some other tree the KH test will be biased towards rejection of the null hypothesis

The SH test is a similar but more statistically correct technique in these circumstances and should be preferred

SLIDE 45

Taxonomic Congruence

Trees inferred from different data sets (different genes, morphology) should agree if they are accurate

Congruence between trees is best explained by their accuracy

Congruence can be investigated using consensus (and supertree) methods

Incongruence requires further work to explain or resolve disagreements

SLIDE 46

Reliability of Phylogenetic Methods

Phylogenetic methods (e.g. parsimony, distance, ML) can also be evaluated in terms of their general performance, particularly their:

consistency - approach the truth with more data

efficiency - how quickly (how much data)

robustness - how sensitive to violations of assumptions

Studies of these properties can be analytical or by simulation

SLIDE 47

Reliability of Phylogenetic Methods

There have been many arguments that ML methods are best because they have desirable statistical properties, such as consistency

However, ML does not always have these properties

– if the model is wrong/inadequate (fortunately this is testable to some extent)

– properties not yet demonstrated for complex inference problems such as phylogenetic trees

SLIDE 48

Reliability of Phylogenetic Methods

“Simulations show that ML methods generally outperform distance and parsimony methods over a broad range of realistic conditions”

Whelan et al. 2001 Trends in Genetics 17:262-272

Most simulations are very (unrealistically) simple

– few taxa (typically just four)

– few parameters (standard models - JC, K2P etc)

SLIDE 49

Reliability of Phylogenetic Methods

Simulations with four taxa have shown:

Model based methods - distance and maximum likelihood - perform well when the model is accurate (not surprising!)

Violations of assumptions can lead to inconsistency for all methods (a Felsenstein zone) when branch lengths or rates are highly unequal

Maximum likelihood methods are quite robust to violations of model assumptions

Weighting can improve the performance of parsimony (reduce the size of the Felsenstein zone)

SLIDE 50

Reliability of Phylogenetic Methods

However:
Generalising from four taxon simulations may be dangerous as conclusions may not hold for more complex cases

A few large scale simulations (many taxa) have suggested that parsimony can be very accurate and efficient

Most methods are accurate in correctly recovering known phylogenies produced in laboratory studies

More study of methods is needed to help in choice of