Signatures of a population bottleneck can be localised along a recombining chromosome

20/06/2003

Céline Becquet

Bioinformatics and Modelling, INSA of Lyon
Institute for Cell, Animal and Population Biology, University of Edinburgh, Scotland, UK

Tutor ICAPB: Prof. Nick H. Barton
Co-Tutor ICAPB: Dr. Peter Andolfatto
Tutor INSA: Dr. Guillaume Beslon


Céline BECQUET ICAPB, Edinburgh, 09/02 – 06/03 Signatures of a population bottleneck can be localised along a recombining chromosome

ABSTRACT

Most statistical tests proposed to detect selection are sensitive to demographic factors such as changes in population size. Unlike the localised effect of strong selection, demographic factors are expected to have a similar effect on the whole genome. While this is generally true, we show that signatures of a population bottleneck can be more localised. We characterise spatial patterns of variability across a recombining chromosome that has experienced a recent and strong population bottleneck event. Interestingly, a bottleneck in the presence of recombination results in increased heterogeneity in variability patterns along a chromosome, reminiscent of the effects of selection. Since changes in population size may be common events in the history of natural populations, our results have implications for the interpretation of genome-wide scans of variability in Drosophila and humans.


Content

ABSTRACT
Content
A Year-Internship in ICAPB, Edinburgh, Scotland, UK
Scientific report

1. INTRODUCTION
1.1. DISTINGUISHING DEMOGRAPHY AND SELECTION
1.2. STATISTICAL TESTS PROPOSED
1.3. DATA FOR DROSOPHILA AND HUMANS
1.4. INTEREST OF DETECTING SIGNATURE OF A BOTTLENECK

2. MATERIALS AND METHODS
2.1. BOTTLENECK SIMULATIONS
2.1.1. Drosophila populations
2.1.2. Human populations
2.2. STATISTICAL TESTS
2.2.1. Levels of variability
2.2.2. Frequency spectrum
a) TAJIMA’s (1989-a) D
b) FAY and WU’s (2000) H
2.2.3. Linkage disequilibrium
a) High frequency haplotypes (HUDSON et al. 1994, VIEIRA and CHARLESWORTH 2000)
b) Number of haplotypes (STROBECK 1987)

3. RESULTS
3.1. DROSOPHILA POPULATION PARAMETERS
3.2. HUMAN POPULATION PARAMETERS
3.3. LARGE SURVEY REGIONS IN DROSOPHILA
3.3.1. Pattern of variability
3.3.2. Increase of the linkage disequilibrium
3.4. LARGE SURVEY REGIONS IN HUMANS
3.4.1. Pattern of variability
3.4.2. Increase of the linkage disequilibrium

4. DISCUSSION
4.1. INTERPRETATION OF SIGNIFICANT TESTS OF FREQUENCY SPECTRUM
4.1.1. TAJIMA’s D
4.1.2. FAY and WU’s H test
4.2. INTERPRETATION OF SIGNIFICANT TESTS OF THE LEVEL OF POLYMORPHISM
4.3. HAPLOTYPE TESTS AND LINKAGE DISEQUILIBRIUM

5. CONCLUSION AND PROSPECTS

Epilogue

ACKNOWLEDGMENT
LITERATURE CITED
SUPPLEMENTS
TABLE 2. SUPPLEMENT A
TABLE 2. SUPPLEMENT B
FIGURE 1. PATTERN OF VARIABILITY ALONG LARGE REGIONS IN DROSOPHILA AND HUMANS FOR S
FIGURE 1. CONTINUED. PATTERN OF VARIABILITY FOR K
FIGURE 1. CONTINUED. PATTERN OF VARIABILITY FOR fMFH
FIGURE 1. CONTINUED. PATTERN OF VARIABILITY FOR D
FIGURE 1. CONTINUED. PATTERN OF VARIABILITY FOR H


A Year-Internship in ICAPB, Edinburgh, Scotland, UK

Nine months ago, I arrived in Edinburgh, Scotland, as ready as one can be to work for a year in an almost completely new field, in an unknown city and, to make everything even easier, surrounded by total strangers speaking English with a funny accent (a truly unknown language). The part of the university where I spent most of my working hours is located in the south of the city, just ten minutes’ walk from where I was living. The campus is mostly dedicated to Science, the buildings either brand new or made of old stone and separated by lovely courtyards (which could be appreciated when the weather allowed). The Institute of Cell, Animal and Population Biology (ICAPB) is the first building one encounters when arriving from the city. In the General Office, the two secretaries are eager to help you find your way (and later help you with all the administrative problems one can experience as an exchange student).

I was looking for Prof. Nick Barton, who had accepted me into his lab for the year, and it was in the north wing, on the first floor, that I found the messy little room where six students were already hard at work. A door at the end of this room led to the office of my future tutor, where my first unforgettable visit took place: Prof. Barton welcomed me by talking very fast about his ideas for the project and drawing dozens of incomprehensible graphs. After what seemed like hours, I was desperate: I had not understood a single word except for the conclusion: get yourself an idea of what people do here. Nick introduced me to many people that first day, whose names and projects were forgotten within minutes. I was advised to read several books and papers that would provide me with a basic understanding of quantitative and population genetics, which I definitely needed in order to communicate with my new colleagues. I installed myself at the desk that had been freed for me and started to get acquainted with the other students. I was quickly relieved to discover that the new Ph.D. student, with whom I would have to share an Internet connection, had been as lost as I was during our tutor’s first speech. I also attended the M.Sc. in Quantitative Genetics and Genome Analysis courses from October to December, thus meeting more students who were more or less new to the field and, therefore, more or less lost during the heavy lectures.

I easily became accustomed to the way of life on the ICAPB first floor. Every morning at 11am, coffee break. From Monday to Wednesday, a one-hour seminar on different areas of quantitative and population genetics; needless to say, I learned a lot during these sessions. On Thursdays at 1pm, Genetics Journal Club, in which a student or professor presents a recent controversial or revolutionary paper on a subject of his/her choice, preferably one that is not the speaker’s speciality. And every other Friday, Happy Hour, a drinking event organised by a different lab each time.

For the first few weeks, I did what I had been told: I read the recommended literature and met the first floor population (apart from the Scorpions, Arabidopsis, Drosophila…). Meanwhile, Nick realised that my theoretical knowledge was still a bit limited for the projects he had thought about and suggested that I ask my biologist colleagues for data analysis or simulation projects. So I shyly went into each office, asking people about their current projects and begging for anything that I could do to help. At this point, I would like to thank Prof. Brian Charlesworth for suggesting a project, which I finally turned down after I met the person who would become my supervisor and collaborator for the project I present in this report. Dr. Peter Andolfatto is interested in the application of gene-genealogy approaches to understanding the major determinants of genome variability patterns, and had a lot of ideas for natural and simulation data analysis projects involving programming, but no time to work on them. Most importantly, we spoke the same language: he managed to explain things to me at my level (reaching my level of understanding was quite an achievement at that time), told me about his projects and helped me understand the concepts I was still having problems with.

We decided to work together on the subject of distinguishing between evolutionary models through their effects on sequence variability. At first the project was meant to look at whether adaptation limits the power of purifying selection in highly recombining regions of the Drosophila genome. However, my preliminary work showed that this project was too ambitious, and that some of the required data were lacking. That is how I started working on the project about detecting local signatures of a population bottleneck along a recombining chromosome. I started by reading papers related to the subject, which discussed the statistics we wanted to use, and began programming a draft functional nucleus that performed these tests. Meanwhile, we tried to define sets of parameters for modelling bottlenecks that were relevant to Drosophila and human populations. I then tried to add functions to perform the analysis of the resulting simulations. My coding abilities and rather messy thinking pathways managed, quite logically, to create a bug in my program. The Christmas holidays arrived just in time, giving me a (to my mind) deserved break.

January saw me writing the report due at the end of the first semester. As I had been “a bit” too ready to rush into the practical side of my work, seldom thinking about or reviewing it, this report helped me to define precisely what I was doing, and quickly became a list of everything I had learned during the first part of my internship, together with the project planning.
Now knowing what I needed to do, and using the existing functions, I rewrote a clean program with the helpful advice of Hedi Soula. The first part of the project investigated the rejection probabilities of our tests of interest for a single locus. Our analysis of the simulations with bottleneck models for Drosophila very quickly gave interesting and unexpected results. We then worked on the second part of the project and simulated a larger genomic segment with our bottleneck models, which gave even more interesting results. Of course, all of this was not without trial and error: correcting statistical functions, little bugs, mistakes in the simulation parameters… everything that makes an internship a learning experience (of course, it also drives you crazy). Anyway, we finally reached the point where we could confirm our results by doing similar simulations with bottlenecks modelled for human populations.

And that is how, after eight months of intensive work, very few days off, no holidays, but fortunately lots of relaxing (usually in the pub - I was in Scotland, remember), we are now at the point of writing a striking paper, which we hope will give scientists working in the field of quantitative, evolutionary and population genetics a new perspective on, and a way of working with, the statistics we studied. The following report deals with the serious matter: the scientific work that Peter and I did during these months of collaboration.


Scientific report

1. INTRODUCTION

1.1. DISTINGUISHING DEMOGRAPHY AND SELECTION

Natural populations harbour enormous levels of genetic variability, and the signatures of an organism’s evolutionary history lie hidden within this variability. The patterns of nucleotide variation between populations and species can be used to elucidate the functions encoded by the genomic sequence. Because a functional gene might be subject to selection, detecting the genomic regions subject to selection enables a better understanding of the processes of genome evolution under various population genetic forces.

Historically, the importance of mutation, natural selection and migration (i.e. gene flow) has been emphasised in population genetics. But in his neutral theory, KIMURA (1968, 1983) challenged the notion that natural selection is the most important force in evolution. He argued that neutral mutations are the source of all variation and that recombination determines the extent of association among polymorphic mutations (linkage disequilibrium, LD), and in his theory of genetic drift he described the role of random events in determining a mutation’s history in the population, from its origin to either loss or fixation.

However, while most DNA variation within species may be neutral, natural selection may still have an important role in shaping it. For instance, the fixation of a favourable mutation reduces the genetic variation in surrounding regions (a phenomenon called 'hitchhiking' or a 'selective sweep', MAYNARD SMITH and HAIGH, 1974). Most of the closely linked neutral variants are lost, but those variants on the same chromosome as the favourable mutation will increase in frequency. So, at the level of DNA where there is linkage (i.e. closely linked markers that are unlikely to be separated by a crossing over event and hence have a greater probability of being inherited together), directional natural selection on functional DNA sequence variation contributes to the genetic drift of closely linked sequences and thus increases LD around the selected locus. Because of this characteristic localised reduction of variability, it is possible to find functional genomic regions.

This ability to detect natural selection is useful for the study of the history of populations and, for humans in particular, for medicine. But the task is made especially difficult by the fact that many different evolutionary processes affect genetic variation in a similar way to selection. For instance, low levels of polymorphism within a region can be explained not only by hitchhiking due to positive selection on a linked beneficial allele, but also simply by a low local mutation rate. In addition, the variability of coding regions tends to be lowest, due to selection against deleterious alleles (i.e. negative or purifying selection). Because coding regions generate functional protein products that can be the targets of natural selection, nucleotide variants near deleterious mutations are removed through indirect selection ('background selection', CHARLESWORTH et al. 1993).

Another important process affecting natural genetic variability is the demographic history of a population. Genetic drift involves the stochastic process of transmitting alleles from one generation to the next. In a large population this will not have much effect in each generation: the random nature of the process will tend to average out. In a small population
the effect could be rapid and significant. Moreover, any rapid reduction of the population size tends to “select” by chance a few genomes that become the founder genomes of the new population. Thus genetic variability in the population can be sharply reduced (bottleneck effect). A severe bottleneck is expected to cause a similar average reduction of genetic variability across the whole genome, because genomic segments are inherited en bloc from generation to generation and thus share a single genealogical history. However, in the presence of recombination, bottlenecks may produce patterns of variability along a chromosome that, by chance, mimic the localised effects of directional or negative selection. Recombination events juxtapose neighbouring chromosomal segments that have different histories, which disrupts the correlation of genealogical histories along the chromosome. Thus, as each independent region follows its own mutational history, the homogeneity is lost, which could lead to a localised reduction of variability such as that expected under selection.

1.2. STATISTICAL TESTS PROPOSED

To identify genomic regions in which selection might be operating, numerous tests of neutrality have been developed. They use patterns of genetic variability to detect departures from the standard neutral model (SNM), which assumes that mutations are neutral and that genes are sampled from a randomly mating (panmictic) population of constant size. Several tests have been designed to extract the information encoded in a single locus. Such tests can look at the level of polymorphism (KREITMAN and HUDSON 1991) or focus on the structure of haplotypes (i.e. sets of closely linked genetic markers present on one chromosome which tend to be inherited together and are not easily separable by recombination; see STROBECK 1987, HUDSON et al. 1994, VIEIRA and CHARLESWORTH 2000 and DEPAULIS and VEUILLE 1998). These tests can also consider the frequency spectrum of mutations (see TAJIMA 1989-a, FU and LI 1993 and FAY and WU 2000).

In addition, several multi-locus tests have been proposed to detect genomic regions subject to selection. Examples include the HKA test (HUDSON, KREITMAN, and AGUADE 1987), which compares the divergence between species and the within-species variability at several independent loci. KIM and STEPHAN (2002) explicitly model genetic linkage between surveyed and selected regions and develop a maximum-likelihood method based on independent loci to examine the significance of a local reduction of genetic variability and to estimate the strength of directional selection. SCHLÖTTERER’s (2002) lnRV focuses on differences in levels of variability between two populations.

Demography, by affecting natural variation, affects these tests just as selection does. However, because the signature of demography is genome-wide while selection has localised effects, considering data from multiple loci should, in theory, enable those tests to distinguish between the two processes.

1.3. DATA FOR DROSOPHILA AND HUMANS

The data from Drosophila and humans suggest that both species originated in Africa and differentiated into African and non-African populations. Specifically, non-African populations have less diversity and higher LD. These observations suggest an “out of Africa” bottleneck. In particular, in humans, a bottleneck (i.e. a severe reduction in population size) is thought to be associated with the emergence of modern humans (~200,000 years ago). In addition, some particular human populations have remained small (e.g. hunter-gatherers) while others generally recovered in size and have recently experienced exponential growth (i.e. after the
development of agriculture, ~10,000 years ago) (see EXCOFFIER and SCHNEIDER 1999), leading to interesting differences in patterns of variability among human populations.

1.4. INTEREST OF DETECTING SIGNATURE OF A BOTTLENECK

Numerous examples of departure from neutrality have been studied in both Drosophila species and humans, and several investigators have claimed that the observation of a significant reduction of variability indicates selection. But even the tests specifically designed to detect selective sweeps (e.g. FAY and WU 2000) can be influenced by population structure (PRZEWORSKI 2002), and may also be sensitive to other departures from demographic stability.

A solution to overcome this problem has been to consider the heterogeneity of patterns of variability as an argument for selection. This is relevant because selection has a localised effect and, specifically, in the presence of recombination, strong directional selection is predicted to produce a characteristic “valley” of reduced variability around the selected site (see KIM and STEPHAN 2002). In contrast, demography tends to affect genetic variability uniformly throughout the genome. However, a severe population bottleneck is also predicted to reduce genetic variability and, in particular, in the presence of recombination, may produce patterns along a chromosome that, by chance, mimic the effects of localised selection.

In our study, we want to measure the extent to which the heterogeneity of the pattern of variability is affected by a severe reduction of population size. We first explicitly model bottlenecks with “best guess” parameters for Drosophila melanogaster, D. simulans and humans. We then characterise the patterns of variability across recombining chromosomes that have experienced our population bottleneck models and try to see how the results might affect the interpretation of the statistical tests.


2. MATERIALS AND METHODS

2.1. BOTTLENECK SIMULATIONS

Since many bottleneck models are possible, we focus on two highly simplified types of bottlenecks: a “simple step” change in population size, or a simple step followed by an instant recovery (“step-recovery”). In the former, an ancestral population of N0 individuals instantaneously crashes to Nb individuals at time Tb in the past. The reduced population size can also represent the harmonic mean of the population size over a period of severe periodic bottlenecks starting Tb generations ago. In the latter, an ancestral population of N0 individuals instantaneously crashes to Nb individuals for T generations, at time Tb in the past, then instantly recovers to N0. Parameters for these two classes of models are set such that they produce the same average reduction in variability in the derived population. In principle, these two models are equivalent in their effects on variability, especially when their starting times are recent.

TABLE 1. Parameter values used in bottleneck simulations

Bottleneck          Nb      Tb        T
Drosophila (a)
  (I)               50      2000      10
  (II)              50      120000    10
  (III)             50      2000      50
  (IV)              50      120000    50
Humans
  (V) (a)           10      600       15
  (VI) (b)          900     600       600
  (VII) (a)         10      2500      10
  (VIII) (b)        2800    2500      2500

See text for explanation.
(a) Parameters for step-recovery bottleneck
(b) Parameters for simple step bottleneck
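The two model classes can be illustrated with a small Python sketch of the population size as a function of time before the present. This is only an illustration, not the simulation code used in the study: the function name `population_size` and the reading that the reduced size holds from Tb down to Tb - T generations ago are assumptions. Under that reading, a simple step bottleneck is the special case T = Tb, which matches rows (VI) and (VIII) of Table 1. The ancestral sizes N0 are those given in sections 2.1.1 and 2.1.2.

```python
def population_size(t, n0, nb, tb, duration):
    """Return the population size t generations before the present.

    Assumed reading of the models: the crash happens tb generations ago and
    the reduced size nb persists for `duration` generations (forward in
    time), i.e. from tb down to tb - duration generations ago.
    A simple step model is the special case duration == tb.
    """
    if t < 0:
        raise ValueError("t must be non-negative (generations before present)")
    if duration > tb:
        raise ValueError("the bottleneck cannot last beyond the present")
    recovery = tb - duration  # generations ago at which the size returns to n0
    if recovery <= t < tb:
        return nb
    return n0


# Model (I) of Table 1: step-recovery, Drosophila (N0 = 5,000,000).
model_I = dict(n0=5_000_000, nb=50, tb=2000, duration=10)
# Model (VI) of Table 1: simple step, humans (N0 = 12,000).
model_VI = dict(n0=12_000, nb=900, tb=600, duration=600)

sizes_I = [population_size(t, **model_I) for t in (0, 1989, 1990, 1999, 2000)]
sizes_VI = [population_size(t, **model_VI) for t in (0, 599, 600)]
```

Writing both classes as one function makes the relation between them explicit: only the duration of the reduced phase differs.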

The simulations with the standard neutral model (SNM) were run using HUDSON’s (2002) program, which generates independent replicate samples assuming a constant panmictic population size, an infinite-sites mutation model and a neutral coalescent approximation to the WRIGHT–FISHER model. Simulations are based on the parameter θ = 4NµL, where N is the diploid effective population size, µ is the sex-averaged mutation rate per base pair and per generation, and L is the sequence length in base pairs. Alternatively, we specify the number of segregating sites (S), in which case each independent replicate sample will have S observed segregating sites. A simulation under a finite-sites recombination model requires the recombination rate along the sequence, ρ = 4NrL, where r is the sex-averaged recombination rate per base pair and per generation. To simulate bottlenecks, the parameters of interest are the number of intervals of population size change and, for each interval, a triplet of additional parameters summarising the reduced population size (Nb), the starting time (Tb) and the length of the bottleneck (T).
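To make the role of θ concrete, the following Python sketch simulates the number of segregating sites S for one sample under the SNM without recombination. It is not HUDSON’s program and does not reproduce its interface. It is a minimal illustration of the neutral coalescent with infinite-sites mutation, and the function names and the seed are assumptions. Time is scaled in units of 4N generations so that mutations accumulate at rate θ per unit of branch length.

```python
import random


def poisson(rng, mean):
    """Draw a Poisson variate by counting unit-rate arrivals in [0, mean]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_segregating_sites(n, theta, rng):
    """Simulate S for one sample of n sequences under the standard neutral model.

    Time is measured in units of 4N generations, so that while k lineages
    remain the waiting time to the next coalescence is exponential with
    rate k(k-1), and mutations fall on the tree at rate theta per unit of
    branch length.
    """
    total_branch_length = 0.0
    for k in range(n, 1, -1):
        waiting_time = rng.expovariate(k * (k - 1))
        total_branch_length += k * waiting_time
    return poisson(rng, theta * total_branch_length)


# Example: n = 15 chromosomes and theta = 3 (illustrative values).
rng = random.Random(2003)
replicates = [simulate_segregating_sites(15, 3.0, rng) for _ in range(20000)]
mean_s = sum(replicates) / len(replicates)
# Expected value under the SNM: theta times the harmonic number of n - 1.
expected_s = 3.0 * sum(1.0 / i for i in range(1, 15))
```

Averaged over many replicates, the simulated S should approach θ multiplied by the sum of 1/i for i from 1 to n - 1, which is a convenient sanity check for any such simulator.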


2.1.1. Drosophila populations

We applied the different models of bottlenecks listed in Table 1 to an ancestral population at neutral equilibrium with 5,000,000 individuals (similar to WALL et al. 2002). The timings of these events are consistent with those proposed for the dispersal of D. melanogaster from Africa (models (II) and (IV), 10-15 kya ~ 120,000 generations, LACHAISE et al. 1988) or the founding of North American populations (models (I) and (III), < 400 ya ~ 2,000 generations, LACHAISE et al. 1988), respectively. The extent to which variability is reduced in these bottlenecks is based on limited data suggesting that the X chromosome of D. melanogaster is 2-fold less diverse in non-African populations relative to central African populations (inversion polymorphisms complicate the interpretation of the pattern for autosomal genes, ANDOLFATTO 2001). In D. simulans, autosomal diversity may be about 25% lower for non-African compared to African populations and potentially even more reduced on the X chromosome (ANDOLFATTO 2001; BEGUN and WHITLEY 2000). For the bottleneck scenarios described in Table 1, the average variability (measured as WATTERSON’s (1975) θW, see Statistical tests below) in the derived (post-bottleneck) population is reduced by either 15% (models (I) and (II)) or 50% (models (III) and (IV)) compared to the ancestral population.

We set the population mutation rate to θ = 4NµL = 15 for a length of 500 recombining base pairs for single locus tests (µ = 1.5 × 10^-9, WALL et al. 2002). To model a larger chromosome segment, we set θ = 800 over 40,000 recombining base pairs, and the analysis of the diversity was performed on 500 base pair windows with a step size of 50 base pairs. We set the population recombination rate to ρ = 4NrL = 3θ (ANDOLFATTO and PRZEWORSKI 2000) and to ρ = 15θ (when we consider regions with a high recombination rate and µ = 1.5 × 10^-9), and consider a sample size of n = 15 chromosomes.

2.1.2. Human populations

Here, we assume a neutral equilibrium population of 12,000 individuals (WALL 2003). The timing of the bottleneck reflects a population size contraction of non-African populations sometime between the emergence of modern humans about 200,000 years ago and the population expansion that followed the introduction of agriculture about 10,000 years ago. In Table 1, the timings of the bottlenecks for humans correspond to 12,000 years ago (models (V) and (VI)) and 50,000 years ago (models (VII) and (VIII)) respectively, assuming 20 years per generation. The extent to which variability has been reduced is based on data from ten non-coding autosomal regions sampled in one African and two non-African populations of humans (FRISSE et al. 2000). For the step-recovery models ((V) and (VII)) and simple step models ((VI) and (VIII)) in Table 1, variability is reduced by 35% in the derived population.

We set the population mutation rate to θ = 3 for a length of 2500 recombining base pairs (µ = 2.5 × 10^-8, NACHMAN and CROWELL 2000-a). To model a larger chromosome segment, we set θ = 240 over 200,000 recombining base pairs, and the analysis was performed on 2500 base pair windows with a step size of 250 base pairs. We set ρ = θ (PRZEWORSKI, personal communication) and consider n = 15 chromosomes.
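As a worked check of the scaling θ = 4NµL and of the sliding-window layout, the short Python sketch below recomputes three of the θ values quoted above from N, µ and L, and lists the window start positions for the large regions. The helper names are hypothetical, and the choice to keep only windows that fit entirely inside the region is an assumption, since the text does not say how the region ends are handled.

```python
def scaled_rate(n_e, per_bp_rate, length_bp):
    """Return 4 * N * rate * L, the form shared by theta and rho."""
    return 4 * n_e * per_bp_rate * length_bp


def window_starts(region_bp, window_bp, step_bp):
    """Start coordinates (0-based) of every full window inside the region."""
    return list(range(0, region_bp - window_bp + 1, step_bp))


# Single-locus theta for Drosophila: N = 5,000,000, mu = 1.5e-9, L = 500 bp.
theta_drosophila_locus = scaled_rate(5_000_000, 1.5e-9, 500)
# Human values: N = 12,000, mu = 2.5e-8, L = 2500 bp and 200,000 bp.
theta_human_locus = scaled_rate(12_000, 2.5e-8, 2500)
theta_human_region = scaled_rate(12_000, 2.5e-8, 200_000)

# Sliding windows: 500 bp every 50 bp over 40 kb, 2500 bp every 250 bp over 200 kb.
drosophila_windows = window_starts(40_000, 500, 50)
human_windows = window_starts(200_000, 2500, 250)
```

Under the full-window assumption, both species are analysed with the same number of overlapping windows, because the human window, step and region are all five times the Drosophila values.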


2.2. STATISTICAL TESTS

We employed several commonly used tests of neutrality that focus on aspects of the data such as the level of polymorphism, the frequency spectrum and linkage disequilibrium. These tests are used to detect selection, but can also be sensitive to departures from demographic stability, and so could detect our modelled bottlenecks. The critical values of the level of diversity were computed from simulations based on θ, itself computed from the average level of polymorphism of the simulation with bottleneck. The critical values of all the other tests described below were computed from simulations under the SNM for all possible values of S. All statistical tests are performed assuming no intragenic recombination, considering 15 chromosomes and 10,000 repetitions. Note that this is conservative, since the bottleneck simulations have recombination and recombination reduces LD (see Bottleneck simulations, part 2.1.). The critical values are computed considering a correlated block of 500 or 2500 bp (for Drosophila and humans respectively, no recombination) where only neutral mutation events create the variability. In contrast, in the simulations with the equilibrium population and bottleneck models, recombination events might have broken the block, generating new variation. Thus, when there is recombination, the tests of neutrality are more conservative, as more variation is generated and fewer regions show significantly low variability.

2.2.1. Levels of variability

We test the level of polymorphism using KREITMAN and HUDSON’s (1991) test: Prob(S | θ, n), where S is the number of segregating sites, θ is the expected number of differences between a pair of sequences and n is the sample size. Note that this test requires that the ancestral variability θ is known, which is generally not the case when it is applied to natural data. In the presence of a bottleneck, the probability of having S segregating sites given θ and n is expected to be lower than under the SNM. We use this as a one-tailed test and are interested in Prob(S < Scrit | θ, n), where Prob(Scrit | θ, n) ≤ 0.05 for the simulations with the SNM.

2.2.2. Frequency spectrum

To test the frequency spectrum of a set of sequences, one needs to know the ancestral nucleotide at each segregating site. When the test is used on natural data, the sequence of an outgroup is considered to be the ancestral sequence, enabling us to distinguish the variant from the ancestral nucleotides.

a) TAJIMA’s (1989-a) D

We employ TAJIMA’s (1989-a) D, which measures the normalised difference between θW (WATTERSON 1975) and π (TAJIMA 1983).

$$\theta_W = \frac{S}{\sum_{i=1}^{n-1} \frac{1}{i}} \qquad (1)$$

$$\pi = \frac{n}{n-1} \sum_{i=1}^{S} 2\, p_i (1 - p_i) \qquad (2)$$

where n is the sample size, S the observed number of segregating sites and p_i is the frequency of a variant at the ith segregating site (so that 2p_i(1 − p_i) is the heterozygosity at that site). Under the SNM, the two estimators of θ, π and θW, are unbiased, so the mean of D is close to zero (E(D) = 0).
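As an illustration, equations (1) and (2), and the normalised difference D built from them, can be computed from a small set of haplotypes as sketched below. This is not the code used in the study: the 0/1 string encoding, the function names and the example data are assumptions made for illustration, and the variance constants are the standard ones from TAJIMA (1989-a), which are not spelled out in this report.

```python
import math


def site_frequencies(haplotypes):
    """Frequency of the '1' allele at each segregating site of equal-length 0/1 strings."""
    n = len(haplotypes)
    freqs = []
    for column in zip(*haplotypes):
        count = column.count("1")
        if 0 < count < n:  # keep segregating sites only
            freqs.append(count / n)
    return freqs


def watterson_theta(num_segregating, n):
    """Equation (1): S divided by the harmonic sum over 1..n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    return num_segregating / a1


def nucleotide_diversity(freqs, n):
    """Equation (2): n/(n-1) times the summed heterozygosity 2p(1-p)."""
    return n / (n - 1) * sum(2.0 * p * (1.0 - p) for p in freqs)


def tajima_d(haplotypes):
    """Tajima's D for n >= 4 sequences; None when there are no segregating sites."""
    n = len(haplotypes)
    freqs = site_frequencies(haplotypes)
    s = len(freqs)
    if s == 0:
        return None
    # Standard constants for the variance of (pi - theta_W), assumed from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    difference = nucleotide_diversity(freqs, n) - watterson_theta(s, n)
    return difference / math.sqrt(variance)


if __name__ == "__main__":
    example = ["001", "101", "010", "000"]  # hypothetical sample of four haplotypes
    print(tajima_d(example))
```

In the analysis described here, the value of D for a window would then be compared with the critical values D5% and D95% obtained from simulations under the SNM.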

$$D = \frac{\pi - \theta_W}{\sqrt{\mathrm{var}(\pi - \theta_W)}} \qquad (3)$$

A negative value of D indicates an excess of rare mutations, while a positive D indicates an excess of intermediate-frequency variants. After a population bottleneck, E(D) can be positive, negative, or zero depending on the time since the bottleneck and on its severity. In our analysis, we use the two one-tailed tests Prob(D ≤ D5% | S, n) and Prob(D ≥ D95% | S, n), where D5% and D95% are the critical values giving 5% rejection in the negative and positive directions respectively, computed from the simulations with the SNM.

b) FAY and WU's (2000) H

This test has been designed to have high power to detect positive selection.

$$H = \pi - \theta_H \qquad (4)$$

where θH (FAY and WU 2000) is a measure of diversity weighted by the homozygosity of the derived variants, as opposed to the frequency of the ancestral variants:

$$\theta_H = \sum_{i=1}^{n-1} \frac{2\, S_i\, i^2}{n(n-1)} \qquad (5)$$

where S_i is the number of derived variants found i times in a sample of n chromosomes. Under neutrality and the infinite-site model, θH is another unbiased estimator of θ and so E(H) = 0. The H-test is considered to be highly conservative in the presence of growth, as population growth tends to produce an excess of low-frequency variants. However, in the presence of population structure, highly unequal sampling from different populations can also lead to a significant H (PRZEWORSKI 2002). Here, we investigate how it responds to changes in population size by using the one-tailed test Prob(H ≤ Hcrit | S, n), where Prob(Hcrit | S, n) ≤ 0.05 is computed from simulations with the SNM.

2.2.3. Linkage disequilibrium

a) High frequency haplotypes (HUDSON et al. 1994, VIEIRA and CHARLESWORTH 2000)

The statistical test we use for the frequency of the most frequent haplotype (fMFH) is similar to the tests for high-frequency haplotypes used by HUDSON et al. (1994) and VIEIRA and CHARLESWORTH (2000) to detect selection. Because selection (positive, negative or balancing) can create strong haplotype structure, fMFH is expected to be higher in a region that has experienced selection than in a neutral region. A recent bottleneck may have the same effect on the haplotype distribution of a population, as it tends to reduce haplotype diversity, so we test Prob(fMFH > fMFH(crit) | S, n), where Prob(fMFH(crit) | S, n) ≥ 0.95 for the simulations with the SNM.

b) Number of haplotypes (STROBECK 1987)

STROBECK (1987) proposed a test of the SNM based on the number of distinct haplotypes (K) (see also FU 1996). The value of K is expected to be lower than under the SNM if a population has experienced a recent bottleneck (MARUYAMA and FUERST 1985-a) or periodic reductions in population size (MARUYAMA and FUERST 1985-b). We therefore use a one-tailed test Prob(K < Kcrit | S, n), where Prob(Kcrit | S, n) ≤ 0.05 is computed from simulations with the SNM. We also note the behaviour of the minimum number of recombination events, RM (HUDSON and KAPLAN 1985). WALL (2000) has proposed an estimate of ρ based on K and RM. Thus, based on the behaviour of these statistics, we may infer how a bottleneck affects the ρ estimate of WALL (2000).
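FAY and WU's H and the two haplotype statistics of this section can be sketched in a few lines. The code below is a hedged illustration rather than the implementation used for the simulations: it assumes haplotypes are strings in which '1' marks the derived variant (ancestral states being inferred from an outgroup), and the function names are hypothetical. Here π is written in its equivalent unfolded-spectrum form, π = Σ 2 S_i i (n − i) / (n(n − 1)), so that H uses the same S_i counts as θH.

```python
from collections import Counter


def derived_count_spectrum(haplotypes):
    """Map i -> S_i, the number of sites where the derived variant ('1') occurs i times."""
    n = len(haplotypes)
    spectrum = Counter()
    for column in zip(*haplotypes):
        i = column.count("1")
        if 0 < i < n:  # ignore sites that are not segregating in the sample
            spectrum[i] += 1
    return spectrum


def fay_wu_h(haplotypes):
    """H = pi - theta_H, with both terms computed from the S_i counts."""
    n = len(haplotypes)
    spectrum = derived_count_spectrum(haplotypes)
    denom = n * (n - 1)
    pi = sum(2.0 * s_i * i * (n - i) for i, s_i in spectrum.items()) / denom
    theta_h = sum(2.0 * s_i * i * i for i, s_i in spectrum.items()) / denom
    return pi - theta_h


def haplotype_statistics(haplotypes):
    """Return (K, fMFH): number of distinct haplotypes and frequency of the commonest one."""
    counts = Counter(haplotypes)
    k = len(counts)
    f_mfh = max(counts.values()) / len(haplotypes)
    return k, f_mfh


if __name__ == "__main__":
    sample = ["001", "001", "010", "000"]  # hypothetical sample of four haplotypes
    print(fay_wu_h(sample))
    print(haplotype_statistics(sample))
```

A negative H reflects an excess of high-frequency derived variants, while a low K and a high fMFH both reflect reduced haplotype diversity, which is the signal the one-tailed tests above look for.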


3. RESULTS

TABLE 2
Rejection probability for single loci under step-recovery bottleneck models for Drosophila (ρ = 3θ).

Bottleneck              θ/θ0   S               K               fMFH            D                        H
                               (µ, reject)     (µ, reject)     (µ, reject)     (µ, reject 5%, 95%)      (µ, reject)
(I) Tb = 2000 ga        0.82   40.1, 0.005     8.2, 0.15       0.28, 0.07      0.41, 0.001, 0.03        -1.28, 0.02
(II) Tb = 120,000 ga    0.83   40.7, 0.005     10.1, 0.02      0.23, 0.03      0.36, 0.001, 0.03        -1.19, 0.02
Eqb. Pop. (a)           0.81   39.3, 0.002     12.6, <0.0001   0.16, 0.001     -0.01, 0.001, 0.001      0.00, 0.01
(III) Tb = 2000 ga      0.50   24.3, 0.07      3.5, 0.96       0.55, 0.56      0.88, 0.04, 0.35         -3.94, 0.21
(IV) Tb = 120,000 ga    0.52   25.4, 0.06      5.9, 0.39       0.47, 0.36      0.69, 0.04, 0.27         -3.77, 0.19
Eqb. Pop. (a)           0.50   24.5, 0.01      11.2, 0.0002    0.20, 0.002     -0.02, 0.01, 0.01        -0.01, 0.01
Ancestral (b)           1.00   48.8, -         13.1, <0.0001   0.14, 0.0006    -0.01, 0.0003, 0.0008    0.05, 0.004

All the simulations consider ρ = 3θ, 15 chromosomes and 10,000 repetitions. The Roman numerals refer to the bottleneck models described in Table 1. µ and reject are the mean and the rejection probability.

(a) Simulations with the SNM based on θ = 12 or 7.5 (for 0.8 and 0.5 variability reduction respectively) with recombination ρ = 3θ.

(b) Simulations with the SNM based on θ = 15, with recombination ρ = 3θ.
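As a rough cross-check of the "Ancestral" row, the mean number of segregating sites can be reproduced with a very small coalescent simulation. The sketch below is only an illustration under stated assumptions, not the simulation program used in this study: it ignores recombination (which does not change the expected value of S), measures time in units of 2N generations so that mutations fall on the genealogy at rate θ/2 per unit of branch length, and uses hypothetical function names. It also shows one way to read a lower 5% critical value Scrit from the simulated distribution, in the spirit of the one-tailed test of part 2.2.1.

```python
import random


def expected_segregating_sites(theta, n):
    """Neutral expectation E(S) = theta * sum_{i=1}^{n-1} 1/i."""
    return theta * sum(1.0 / i for i in range(1, n))


def poisson(mean, rng):
    """Poisson draw: count unit-rate exponential arrivals falling before `mean`."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_segregating_sites(theta, n, rng):
    """One draw of S under the SNM without recombination."""
    total_branch_length = 0.0
    for k in range(n, 1, -1):
        # While k lineages remain, the waiting time to the next coalescence is
        # exponential with rate k(k-1)/2 (time in units of 2N generations).
        waiting_time = rng.expovariate(k * (k - 1) / 2.0)
        total_branch_length += k * waiting_time
    return poisson(theta / 2.0 * total_branch_length, rng)


def lower_critical_value(draws, alpha=0.05):
    """Largest c such that the fraction of draws strictly below c is at most alpha."""
    ordered = sorted(draws)
    return ordered[int(alpha * len(ordered))]


if __name__ == "__main__":
    rng = random.Random(2003)  # arbitrary seed, for reproducibility only
    draws = [simulate_segregating_sites(15.0, 15, rng) for _ in range(10_000)]
    print("expected S :", round(expected_segregating_sites(15.0, 15), 1))
    print("simulated S:", round(sum(draws) / len(draws), 1))
    print("S_crit (5%):", lower_critical_value(draws))
```

For θ = 15 and n = 15 the neutral expectation is 15 × 3.25 ≈ 48.8, which matches the mean S reported for the ancestral population in Table 2. The same expectation with θ = 12 or 7.5 gives about 39.0 and 24.4, close to the equilibrium-population rows.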

3.1. DROSOPHILA POPULATION PARAMETERS

In Table 2, we present the effect of the bottleneck models on statistical tests applied to short sequenced regions in Drosophila. We have modelled severe (θ/θ0 = 50%) and less severe (83%) bottlenecks associated with expansion from Africa (~120,000 ga) and colonisation of non-African regions (~2000 ga). We also model an equilibrium population of the same size as the derived population for comparison. The apparent robustness of some tests to even large departures from an equilibrium population of constant size reflects how conservative these tests are (PRZEWORSKI et al. 2001). The reason is that statistical tests are most often employed, as they are here, assuming no recombination. For example, for S, the proportion of significant tests is well below the 5% level for each of the equilibrium populations modelled, since ρ = 3θ in the modelled populations. Remarkably, while the variance of both S and TAJIMA's D increases under a bottleneck relative to an equilibrium population, this rarely results in significant tests for the S statistic or for the negative tail of TAJIMA's test (see Materials and Methods, part 2.2). The bottom line is that recent bottlenecks in a species' history are unlikely to result in a significantly negative TAJIMA's D or a marked deficiency of segregating sites (and thus rejection by KREITMAN and HUDSON's (1991) test). Very similar results were recovered by assuming that ρ = 15θ, or under a simple bottleneck in which the population loses the same amount of variability but never recovers in size (see Materials and Methods, part 2.1, and Supplement II). These details probably do not matter because both bottlenecks modelled here are so recent relative to the effective population size of the species (0.0004N0 and 0.024N0 generations, respectively).


The statistical tests based on haplotype structure appeared to be the most sensitive. While the most sensitive overall was the number of haplotypes, K (STROBECK 1987), the frequency of the most frequent haplotype (fMFH), analogous to the haplotype test proposed by HUDSON et al. (1994), also had considerable power. In contrast to the negative tail, the positive tail of D shows considerable power to detect our most severe bottlenecks (Nb/T = 1), which result in positive mean values of D. This is not surprising because, following a reduction in population size, rare mutations are lost more readily than common mutations (NEI et al. 1975), and transient positive D values are expected (TAJIMA 1989-b). Also, the mean of H is negative under a bottleneck, and FAY and WU's H test is sensitive to the assumption of no recombination. Surprisingly, this statistic has some power to detect drastic bottlenecks (see Table 2, most severe bottlenecks (III) and (IV)), and thus, like other statistical tests, it is not robust to assumptions about population history (see also PRZEWORSKI 2002, LAZZARO and CLARK 2003).

TABLE 3
Rejection probability for single loci under different bottleneck models for humans.

Bottleneck                θ/θ0   S               K               fMFH            D                        H
                                 (µ, reject)     (µ, reject)     (µ, reject)     (µ, reject 5%, 95%)      (µ, reject)
(V) Tb = 600 ga (a)       0.65   6.2, 0.09       3.8, 0.10       0.55, 0.08      0.42, 0.03, 0.16         -0.58, 0.12
(VI) Tb = 600 ga (b)      0.65   6.1, 0.09       3.6, 0.12       0.55, 0.08      0.56, 0.03, 0.19         -0.57, 0.12
(VII) Tb = 2500 ga (a)    0.64   6.2, 0.08       4.3, 0.05       0.58, 0.07      -0.02, 0.06, 0.09        -0.55, 0.11
(VIII) Tb = 2500 ga (b)   0.65   6.2, 0.08       3.9, 0.09       0.55, 0.07      0.40, 0.03, 0.15         -0.58, 0.11
Eqb. Pop. (c)             0.65   6.3, 0.04       5.1, 0.01       0.49, 0.02      -0.05, 0.03, 0.04        0.02, 0.05
Ancestral (d)             1.00   9.8, -          6.5, 0.01       0.39, 0.02      -0.04, 0.03, 0.04        0.01, 0.04

All the simulations consider ρ = θ, 15 chromosomes and 10,000 repetitions. The Roman numerals refer to the bottleneck models described in Table 1. µ and reject are the mean and the rejection probability.

(a) Corresponds to the step-recovery bottlenecks (V) and (VII) of Table 1.
(b) Corresponds to the simple step bottlenecks (VI) and (VIII) of Table 1.
(c) Simulations with the SNM based on θ = 3 × 0.65, with recombination ρ = θ.
(d) Simulations with the SNM based on θ = 3, with recombination ρ = θ.

3.2. HUMAN POPULATION PARAMETERS

Table 3 presents the effect of some bottleneck models on statistical tests applied to short sequenced regions in humans. The probability of rejecting the SNM increases about two-fold for most of the statistics under the different bottleneck models. However, even if all the tests respond to our models, they have little power to detect the bottlenecks chosen for humans compared to those for Drosophila (Table 2). All the tests appear slightly more sensitive to recent reductions in population size, but the difference is not significant. Our results for human populations are consistent with the fact that our two bottleneck models (simple step and step-recovery) are equivalent when the bottleneck is recent. In contrast, the observations for older bottlenecks suggest that the history of a population after a bottleneck is important. If a population bounces back to its original size, TAJIMA's D might be close to zero. However, if the average population size stays low (or fluctuates with a low harmonic mean, see Materials and Methods, part 2.1), then TAJIMA's D could be positive.


3.3. LARGE SURVEY REGIONS IN DROSOPHILA

Directional selection is expected to have localised effects on the genome (see MAYNARD-SMITH and HAIGH 1974; KAPLAN et al. 1989; FAY and WU 2000; KIM and STEPHAN 2000; PRZEWORSKI 2002). But theory tells us that closely linked genomic regions will have correlated genealogical histories even under neutrality (HUDSON 1983). This difficulty has been overcome either by comparing unlinked (or effectively unlinked) loci or by explicitly modelling linkage between surveyed regions (KIM and STEPHAN 2000). In contrast, population history is expected to affect the entire genome in a similar manner. If we observe localised rejections of the neutral model in genome-wide scans, can we conclude that selection has been operating? To address this question, we modelled large segments of the genome (approx. 40 kb) undergoing bottlenecks like those described in Table 1, and asked how statistical tests behaved spatially across such sequences. We thus asked not only about the number of unusual regions observed in the genome segment modelled, but also about the distribution of sizes of these regions.

3.3.1. Pattern of variability

Figures 1a, 1b and 1c (see Supplements III to VII) show three examples of patterns of variability along large segments simulated under step-recovery bottleneck models (I) and (IV) with ρ = 3θ, and (IV) with ρ = 15θ, respectively. These graphs were chosen from 10 samples randomly selected from the simulations with a bottleneck. The lower graphs show the worst pattern (more and larger regions rejecting the SNM) with the equilibrium population of the same size as the derived population for comparison. First, one can observe considerable heterogeneity across a sequence. More precisely, the regions in which the tests significantly reject the SNM are not uniformly distributed across the simulated genomic sequences, as one might expect from the statement “similar average effect on the genome”. Moreover, the standard single-locus tests suggest that, for independent loci, tests are in some cases conservative. For instance, the negative tail of TAJIMA's D rejects the SNM with probability p = 0.04 (Table 2, bottleneck (IV); p = 0.01 for the equilibrium population). However, Figure 1b for the statistic D (Supplement VI) shows that the rejection probability says nothing about the distribution of the regions where the SNM is rejected. Similarly, K is conservative under a severe bottleneck (p = 0.02 for bottleneck (IV) and p < 0.0001 for the equilibrium population in Supplement II), while one can observe a 40 kb region with too few haplotypes along its whole length (Figure 1c for K, Supplement IV).
This clearly shows that even tests that are conservative for independent loci can detect large regions rejecting the SNM under bottleneck models. Depending on the patterns observed in the three graphs, the conclusion about the event influencing the genetic variability might differ. The graphs (a) and (c) for the statistic S all show patterns with very few rejecting regions, similar to the patterns observed for the equilibrium population; thus the observed segments appear to be neutral. This means that there are only small effects on the variance and mean of S. The graphs of the other bottleneck modelled (b) show very different patterns of variability for this statistic. The upper graph has no rejection, the middle plot has large regions of very low variability, and the lower one shows a pattern of low variability along the whole 40 kb segment. These observations would lead us to infer selection in the middle and lower graphs. Similarly, the graphs (a) and (c) for the statistics K, fMFH and FAY and WU's H display very different patterns, which could lead us to infer a localised selective sweep in the middle graphs and selection spread along the whole 40 kb segment in the lower plots. In contrast, the graphs for the more severe bottleneck with less recombination (b) display, for these three statistics, very similar patterns of numerous large regions of too low variability, showing what one can expect to observe under a bottleneck. These observations show that a mild demographic reduction in the population history (a) does not affect independent recombining segments homogeneously, but can by chance mimic the pattern of a selective sweep. Similarly, a more severe bottleneck applied to genetic sequences subject to many recombination events (c) can display patterns expected under selection. The graphs for TAJIMA's D display neutral genomic segments (a) or, for (b) and (c), patterns that seem to show selection in the two lower plots. Thus, despite the fact that the tests for negative TAJIMA's D and deficiency of segregating sites are relatively robust to our modelled bottlenecks, a genome screen with these tests may detect selection where in fact a severe bottleneck has occurred. Also, the positive tail of TAJIMA's D, which is powerful for detecting the most severe bottleneck (see Table 2), displays patterns that lack the homogeneity that would suggest that demography is the cause of the excess variability detected by the test.

3.3.2. Increase of linkage disequilibrium

REICH et al. (2002) showed that the human genome contains sizeable regions (stretching over tens of thousands of base pairs) that have intrinsically high and low rates of sequence variation, and showed that the primary determinant of these patterns is shared genealogical history. By measuring the average distance over which genealogical histories are typically preserved, it is possible to estimate the average extent of correlation among variants (linkage disequilibrium). The size of correlated segments can be computed from the approximation (6) (OHTA and KIMURA 1971 and WEIR and HILL):

$$E(r^2) \approx \frac{1}{2 + 4NrL} \qquad (6)$$

For the Drosophila parameters (see Materials and Methods, part 2.1) and E(r²) = 0.1, the size of regions with correlated genealogical histories is expected to be between 20 and 100 bp (for ρ = 15θ and 3θ respectively) under the SNM. The graphs for the equilibrium population in Figures 1a, 1b and 1c (lower graphs in Supplements III to VII) show regions rejecting the SNM with sizes in this range. However, the more severe bottlenecks can create patterns with regions of up to 40 kb sharing the same genealogical history for the haplotype and frequency spectrum tests (see Figures 1b and 1c, Supplements IV to VII). When a population has experienced a bottleneck, the probability of finding regions rejecting the SNM is significantly larger than under the SNM (for all the tests except the negative tail of TAJIMA's D; two-sample Kolmogorov-Smirnov test: KS < 0.0001). Moreover, conditioning on detecting at least one unusual region in a sample of 40 kb sequences, the size of the region under a bottleneck model is significantly larger (KS < 0.0001 for all the graphs where the test was applicable). Figures 2a and 2b summarise these observations and plot, for the six statistics, the probability of finding an unusual region of a given size along a 40 kb recombining chromosome that has experienced an old and severe bottleneck (model (IV)) and under the corresponding null model, for ρ = 3θ and ρ = 15θ respectively. They show for all the tests that the probability of detecting large regions is significantly increased when the population has experienced a bottleneck. Even the negative tail of TAJIMA's D, which is conservative (see Table 2 and Supplements I and II), shows larger regions sharing genealogical histories under the bottleneck models. This is an expected result, as bottlenecks increase the variance of the statistics (except the haplotype tests). These results confirm the substantial excess of linkage disequilibrium (LD) that a bottleneck can create across genomic sequences.
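The comparison of region sizes described above can be sketched as follows. This is an assumption-laden illustration rather than the procedure actually used: a "region" is taken to be a run of consecutive windows that all reject the SNM, its size is taken to span from the start of the first window to the end of the last, the window and step sizes in the example are hypothetical, and the p-value uses the standard large-sample approximation for the two-sample Kolmogorov-Smirnov statistic.

```python
import bisect
import math


def rejecting_region_sizes(rejects, window, step):
    """Sizes (bp) of maximal runs of consecutive rejecting windows."""
    sizes, run = [], 0
    for flag in rejects:
        if flag:
            run += 1
        elif run:
            sizes.append((run - 1) * step + window)
            run = 0
    if run:
        sizes.append((run - 1) * step + window)
    return sizes


def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between the two empirical distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        largest = max(largest, abs(cdf_a - cdf_b))
    return largest


def ks_p_value(statistic, size_a, size_b, terms=100):
    """Asymptotic two-sided p-value (Kolmogorov distribution), clamped to [0, 1]."""
    lam = statistic * math.sqrt(size_a * size_b / (size_a + size_b))
    if lam < 0.2:
        # The alternating series converges slowly here and the p-value is ~1 anyway.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


if __name__ == "__main__":
    # Hypothetical per-window test outcomes along one simulated segment.
    flags = [False, True, True, False, True]
    print(rejecting_region_sizes(flags, window=500, step=50))
```

In an analysis of this kind, the two samples passed to `ks_statistic` would be the region sizes collected under the bottleneck model and under the corresponding null model.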


3.4. LARGE SURVEY REGIONS IN HUMANS

We modelled large segments of the genome (approx. 200 kb) undergoing the bottlenecks described in Table 1.

3.4.1. Pattern of variability

Figures 1d and 1e (see Supplements III to VII) display the pattern of variability for the five statistics for the step-recovery bottleneck (recent (V) and older (VII) respectively). For all the statistics, the three graphs with bottlenecks show very different patterns of variability, thus mimicking what one expects to find under selection. The only exception is graph (e) for the number of haplotypes K, which displays relatively homogeneous patterns, with numerous small regions rejecting the null model scattered along the whole 200 kb segment. We do not show the results for the simple step model, but in the four bottlenecks we modelled for humans, this is the only case in which the 10 selected samples display these homogeneous patterns.

3.4.2. Increase of the linkage disequilibrium

From approximation (6) and the parameters defined for humans (see Materials and Methods, part 2.1), we expect the length of sequences sharing the same history to be about 5000 bp. The graphs for equilibrium populations (lower graphs) show rejecting regions with sizes consistent with this value. However, one can observe much larger regions, even up to 150 kb, homogeneously rejecting the SNM in Figure 1d for S (Supplement III). The excess of linkage disequilibrium due to a severe bottleneck can be very strong. As for the Drosophila parameters, we find significantly more regions rejecting the SNM (KS < 0.0001 for all tests except the negative tail of TAJIMA's D) and, conditioning on the region rejecting the SNM, the probability that its size is large is significantly higher when the population has experienced a bottleneck than when the population size has remained constant (KS < 0.0001 for all statistics). Figure 2c summarises these results by displaying, for the six statistics, the probability of finding an unusual region of a given size along a 200 kb recombining chromosome under bottleneck model (VI).

A bottleneck creates a population in which only a few genealogies are shared, thus smoothing the effect of recombination during the period of small population size. Recombination usually creates variability by breaking the correlation between loci. However, when the population size is small (smaller ρ = 4NrL), recombination events do not systematically induce variability (recombination between identical genomes). Thus, the breakdown of LD across the chromosome is not as effective as under the SNM. This explains how a bottleneck increases the extent to which genealogical histories are shared.

4. DISCUSSION

4.1. INTERPRETATION OF SIGNIFICANT TESTS OF FREQUENCY SPECTRUM

4.1.1. TAJIMA's D

Our simulations show that, in Drosophila, positing a recent and drastic bottleneck predicts that derived populations should have a more positive TAJIMA's D than the ancestral population (Table 2 and Supplements I and II). In Table 3, the human values of D are closer to those expected under neutrality (old step-recovery bottleneck (VII)), but tend to be positive as well. In contrast, the data from Drosophila usually do not show positive values of D (PRZEWORSKI et al. 2001). This could be explained by inadequate population sampling, which may mask the positive values of D. However, WALL et al. (2002) pointed out that TAJIMA's D is more positive on the X chromosome than on the autosomes for D. simulans, and this may be better explained by the demographic history of the species than by selection. The X chromosome is affected differently from the autosomes if the population has experienced a bottleneck: because it has a smaller effective population size, the timing of the bottleneck would seem more recent. Also, more negative values of D are found in the African populations (HARR et al. 2002). If ancestral populations of Drosophila have experienced long-term growth, this may mask the evidence for bottlenecks in the derived populations. Note that our models might show positive values of D because our bottleneck models for Drosophila are recent relative to the effective population size of the species (only 0.024N0 for the oldest model). Thus the fact that our models do not fit the real demographic events of this species may also explain the differences with the natural data.

D is also not generally positive in humans (FRISSE et al. 2001), but this could be due to inadequate population sampling, which may mask the positive values of D (PTAK and PRZEWORSKI 2002). The comparison between African and non-African populations also shows more negative D for the ancestral populations than for the derived ones (FRISSE et al. 2001, PLUZHNIKOV et al. 2002). GILAD and LANCET (2003) found positive values of D for data from a Pygmy population, while a sample defined as Caucasian shows values close to neutral. The Pygmies have a hunter-gatherer culture and remain a relatively small population. The value of D for this population is consistent with our results for the old simple step bottleneck model (see bottleneck (VIII), Table 3). In contrast, the results for the Caucasian population, which derives from an agricultural culture that may have experienced recent exponential growth, are similar to those found for our old step-recovery model (see bottleneck (VII), Table 3). Thus these observations may well be due to differences in the demographic history of the compared populations.

4.1.2. FAY and WU's H test

Since it was designed specifically to detect selective sweeps (FAY and WU 2000 and OTTO 2000), FAY and WU's H test has been used on loci of interest to provide evidence for their adaptive functionality. We find, however, that this statistic is generally negative under bottlenecks and that the test is not conservative when the demographic history of a population is unknown. In Drosophila, numerous comparisons between African and non-African populations show significant differences at loci thought to be subject to selection, such as Acp26Aa (FAY and WU 2000), desat2 (TAKAHASHI et al.) and janus-ocnus (PARSCH et al. 2001). However, these observations cannot be interpreted as a unique signature of selection. Studies of ten human non-coding autosomal regions found more significant FAY and WU's H tests (4 loci of 10) in the non-African populations, while the African populations fit the neutral model (FRISSE et al. 2001, HAMBLIN et al. 2002). Also, GILAD and LANCET (2003) found that H at human olfactory genes is significantly more negative in non-African populations than in African populations. These two studies were interpreted as evidence for selection. Our study shows that these signatures of selection could well be due to a bottleneck or population structure (PRZEWORSKI 2002).

4.2. INTERPRETATION OF SIGNIFICANT TESTS OF THE LEVEL OF POLYMORPHISM

We measure the level of variability at single loci with KREITMAN and HUDSON's (1991) test, but one expects the HKA test, which considers the divergence between species and compares the within-species level of polymorphism at multiple loci, to be affected similarly by demographic changes. In natural populations of Drosophila, there are few examples of loci showing significant HKA tests, such as Pgd in D. melanogaster (BEGUN and AQUADRO 1994) and runt in D. simulans (LABATE et al. 1999). In general, however, the HKA test fails to reject the SNM in Drosophila. Similarly, in humans, the gene Dmd shows a deficiency of segregating sites in the non-African populations (NACHMAN and CROWELL 2000-b), but the HKA test does not detect departures from neutrality at human loci. If selection is a rare phenomenon, the fact that so few loci show significant HKA tests in both species is understandable. Our S statistic has poor power to detect our bottleneck models for both species, suggesting that a bottleneck would have to be very severe indeed to produce rejections by the HKA test (HUDSON, KREITMAN, and AGUADE 1987) and similar tests (Tables 2 and 3, and Supplements I and II). So another explanation of the observations in natural populations of Drosophila and humans could be that they have experienced relatively mild bottlenecks, because these observations are consistent with the lack of power we found for our test of the level of polymorphism in the presence of bottlenecks.

4.3. HAPLOTYPE TESTS AND LINKAGE DISEQUILIBRIUM

In contrast to the test of the level of polymorphism, the haplotype tests tend to be fairly sensitive to our bottleneck models in Drosophila; too few distinct haplotypes and strong haplotype structure can be observed after a drastic or recent reduction in population size. The haplotype tests were developed because directional selection tends to reduce heterozygosity and increase haplotype structure. But, under neutrality, closely linked regions have correlated histories (HUDSON 1983). Researchers have been using heterogeneity as an argument for selection. However, our results (Figure 1, Supplements III to VII, and Figure 2) suggest that bottlenecks, by increasing linkage disequilibrium (decrease of RM; those results are not shown but are consistent with those of WALL et al. 2002), exaggerate the correlation, leading to larger-scale heterogeneity among regions than expected under the neutral model. Note that, even for tests with low power to detect our bottleneck models (level of polymorphism and negative tail of D), conditioning on finding a region rejecting the SNM, its size is expected to be larger in the presence of a bottleneck than under the SNM (Figure 1a, Supplement III). This creates a problem for using the heterogeneity argument to detect selection, as the observation of heterogeneity along a sequence may be consistent with demography.

In natural populations, recent data suggest a deficiency of haplotypes and spatial heterogeneity in non-African relative to African populations in Drosophila (HUDSON et al. 1994, BEGUN and AQUADRO 1995, HUDSON et al. 1997, VEUILLE et al. 1998, ANDOLFATTO et al. 1999, MOUSSET et al. 2003, and others) and in humans (KAYSER et al. 2003, SCHNEIDER et al. 2002). Also, multi-locus studies of selection produce data showing wells of diversity on chromosomes of Drosophila (NURMINSKY et al. 2001, HARR et al. 2002) and humans (PATIL et al. 2001, SCHLÖTTERER 2002, and SABETI et al. 2002). These numerous examples of heterogeneity cannot be interpreted as a unique signature of selection and may be explained by severe or recent bottlenecks.

5. CONCLUSION AND PROSPECTS

This study stresses the importance of demographic history in shaping patterns of genome variability. The biggest issue is that a population's history is usually unknown. We have studied bottleneck models because changes in population size may be common in the history of most species. However, the effect of other departures from demographic stability should also be studied, for instance by modelling population structure (e.g. extinction-recolonisation). Departures from the SNM make it difficult to find evidence for selection, and thus to identify genomic regions experiencing adaptive evolution. Specific methods need to be developed to better distinguish population size changes from selection. Combining several tests of neutrality that look at uncorrelated information in the genomic data may be one way to do so. Unfortunately, the five tests we studied tend to be significant in the same regions under the bottleneck models. An alternative approach was pursued by LAZZARO and CLARK (2003), who compared a set of candidate genes to random genes. The mean values of the statistic are significantly different for the two sets of genes, which is unlikely under a demographic change. This could be a good method to distinguish selection from demography.


Epilogue

This internship has been a wonderful experience. I was immersed in the domain of quantitative and population genetics with amazing speed. At ICAPB, all my colleagues provided me with a favourable learning environment and were eager to answer even the most basic questions about this field of study. This enabled me to rapidly gain confidence with the new concepts and methods I had to deal with during my project.

In addition to the scientific achievements of my time in Edinburgh, my professional future has been deeply changed. More precisely, for the next three years I will undertake a PhD at Brown University, Providence, Rhode Island, USA, under the supervision of Molly Przeworski. The subject itself will be determined later, but it will certainly be a continuation of the project presented in this report. This PhD opportunity is the direct result of my presence and work at ICAPB this year.

For all this I can never be grateful enough to all the people who made this project possible and, indeed, a total success. In particular, the INSA of Lyon and the department of Bioinformatics and Modelling have been ready from the start to help me realise this year-long internship. I really hope that future classes will still have the option to gain experience in a year-long project abroad during the fifth year of their engineering degree, because it appears to me to be a valuable opportunity to experience professional life.

To conclude, I have had a great time this year and have learned a lot which, I am convinced, has prepared me for academic research in quantitative and population genetics.


ACKNOWLEDGEMENT

First, I would like to thank Prof. Nick Barton for his invitation to work in his Population Genetics group. He gave me the wonderful opportunity to come to work in Edinburgh, and especially in the prestigious Institute of Cell, Animal and Population Biology (ICAPB), University of Edinburgh, UK. Throughout the internship, he was always available to discuss my hypotheses and directions.

I also thank Dr. Peter Andolfatto, who was my collaborator and supervisor for my project. He explained quantitative genetics to me: hitchhiking, the coalescent process, linkage disequilibrium and the other notions required to start my work on the project. Each time I had unclear ideas or a problem in my program, he was available to help me solve it. He also provided me with the guidance I needed to organise my work. But most of all, he had the patience and skill to handle me despite my unsettled moods and variable willingness to work.

I thank Dr. Molly Przeworski for her guidance and her comments on the project. Thanks also to the secretaries of ICAPB, who were always ready to provide immediate help in solving the many technical and administrative problems I had at the beginning of my training period.

I particularly thank all my colleagues in Nick Barton's group and in the surrounding offices, Alex Kalinka, Tim Sands, Toby Johnson, Jelle Zuidema, Angus Davison, Penny Haddrill and Andy Gardner, for our discussions about our projects, for their help and friendship and, for some of them, their proof-reading of parts of this report. I also thank all the teams of ICAPB for their kindness. I particularly thank Xulio Maside for his help in finding Windows programs and solutions in such a Mac environment. I thank all the MSc students for helping me have a nice time in Edinburgh by discovering with me the entertaining parts of this beautiful city.

I particularly thank Hedi Soula for helping me rewrite a clearer and more workable program, and Guillaume Beslon for his friendly supervision from Lyon. Finally, I thank Jean-Michel Fayard and all my teachers for the help, advice and support they provided me during the internship.


LITERATURE CITED

ANDOLFATTO, P., 2001 Contrasting patterns of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18: 279–290.
ANDOLFATTO, P., and M. PRZEWORSKI, 2000 A genome-wide departure from the standard neutral model in natural population of Drosophila. Genetics 156: 257–268.
ANDOLFATTO, P., J.D. WALL and M. KREITMAN, 1999 Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153: 1297–1311.
BEGUN, D.J., and C.F. AQUADRO, 1994 Evolutionary inferences from DNA variation at the 6-Phosphogluconate dehydrogenase locus in natural populations of Drosophila: selection and geographic differentiation. Genetics 136: 155–171.
BEGUN, D.J., and C.F. AQUADRO, 1995 Molecular variation at the vermilion locus in geographically diverse populations of Drosophila melanogaster and D. simulans. Genetics 140: 1019–1032.
BEGUN, D.J., and P. WHITLEY, 2000 Reduced X-linked nucleotide polymorphism in Drosophila simulans. Proc. Natl. Acad. Sci. USA 97: 5960–5965.
CHARLESWORTH, B., M.T. MORGAN and D. CHARLESWORTH, 1993 The effect of deleterious mutations on neutral molecular variation. Genetics 134: 1289–1303.
DEPAULIS, F., and M. VEUILLE, 1998 Neutrality tests based on the distribution of haplotypes under an infinite-site model. Mol. Biol. Evol. 15: 1788–1790.
DEPAULIS, F., L. BRAZIER and M. VEUILLE, 1999 Selective sweep at the Drosophila melanogaster suppressor of Hairless locus and its association with the In(2L)t inversion polymorphism. Genetics 152: 1017–1024.
EXCOFFIER, L., and S. SCHNEIDER, 1999 Why hunter-gatherer populations do not show signs of pleistocene demographic expansions. Proc. Natl. Acad. Sci. USA 96: 10597–10602.
FAY, J.C., and C.-I. WU, 2000 Hitchhiking under positive Darwinian selection. Genetics 155: 1405–1413.
FRISSE, L., R.R. HUDSON, A. BARTOSZEWICZ, J.D. WALL, J. DONFACK and A. DI RIENZO, 2001 Gene conversion and different population histories may explain the contrast between polymorphism and linkage disequilibrium levels. Am. J. Hum. Genet. 69: 831–843.
FU, Y.-X., 1996 New statistical tests of neutrality for DNA samples from a population. Genetics 143: 557–570.
FU, Y.-X., and W.H. LI, 1993 Statistical tests of neutrality of mutations. Genetics 133: 693–709.
GILAD, Y., and D. LANCET, 2003 Population differences in the human functional olfactory repertoire. Mol. Biol. Evol. 20: 307–314.
HARR, B., M. KAUER and C. SCHLÖTTERER, 2002 Hitchhiking mapping: A population-based finemapping strategy for adaptive mutations in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 99: 12949–12954.
HAMBLIN, M.T., E.E. THOMPSON and A. DI RIENZO, 2002 Complex signatures of natural selection at the Duffy blood group locus. Am. J. Hum. Genet. 70: 369–383.
HUDSON, R.R., 1983 Properties of a neutral allele model with intragenic recombination. Theor. Popul. Biol. 23: 183–201.
HUDSON, R.R., 2002 Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18: 337–338.

HUDSON, R.R., and N.L. KAPLAN, 1985 Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111: 147–164.
HUDSON, R.R., M. KREITMAN and M. AGUADE, 1987 A test of neutral molecular evolution based on nucleotide data. Genetics 116: 153–159.
HUDSON, R.R., K. BAILEY, D. SKARECKY, J. KWIATOWSKY and F.J. AYALA, 1994 Evidence for a positive selection in the Superoxide Dismutase (Sod) region of Drosophila melanogaster. Genetics 136: 1329–1340.
HUDSON, R.R., A.G. SAEZ and F.J. AYALA, 1997 DNA variation at the Sod locus of Drosophila melanogaster: An unfolding story of natural selection. Proc. Natl. Acad. Sci. USA 94: 7725–7729.
KAPLAN, N.L., R.R. HUDSON and C.H. LANGLEY, 1989 The "hitchhiking effect" revisited. Genetics 123: 887–899.
KAYSER, M., S. BRAUER and M. STONEKING, 2003 A genome scan to detect candidate regions influenced by local natural selection in human populations. Mol. Biol. Evol. 20: 893–900.
KIM, Y., and W. STEPHAN, 2002 Detecting a local signature of genetic hitchhiking along a recombining chromosome. Genetics 160: 765–777.
KIMURA, M., 1968 Evolutionary rate at the molecular level. Nature 217: 624–626.
KIMURA, M., 1983 The neutral theory of molecular evolution. Cambridge University Press, Cambridge, UK.
KREITMAN, M., and R.R. HUDSON, 1991 Inferring the evolutionary histories of the Adh and Adh-dup loci in Drosophila melanogaster from patterns of polymorphism and divergence. Genetics 127: 565–582.
LABATE, J.A., C.H. BIERMANN and W.F. EANES, 1999 Nucleotide variation at the runt locus in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 16: 724–731.
LACHAISE, D., M.L. CARIOU, J.R. DAVID, F. LEMEUNIER and L. TSACAS, 1988 The origin and dispersal of the Drosophila melanogaster subgroup: a speculative paleobiogeographic essay. Evol. Biol. 22: 159–225.
LAZZARO, B.P., and A.G. CLARK, 2003 Molecular population genetics of inducible antibacterial peptide genes in Drosophila melanogaster. Mol. Biol. Evol. 20: 914–923.
MARUYAMA, T., and P.A. FUERST, 1985-a Population bottleneck and nonequilibrium models in population genetics. II. Number of alleles in a small population that was formed by a recent bottleneck. Genetics 111: 675–689.
MARUYAMA, T., and P.A. FUERST, 1985-b Population bottleneck and nonequilibrium models in population genetics. III. Genic homozygosity in population which experience periodic bottlenecks. Genetics 111: 691–703.
MAYNARD SMITH, J., and J. HAIGH, 1974 The hitch-hiking effect of a favorable gene. Genet. Res. 23: 23–35.
MOUSSET, S., L. BRAZIER, M.-L. CARIOU, F. CHARTOIS, F. DEPAULIS and M. VEUILLE, 2003 Evidence of a high rate of selective sweeps in African Drosophila melanogaster. Genetics 163: 599–609.
NACHMAN, M.W., and S.L. CROWELL, 2000-a Estimate of the mutation rate per nucleotide in humans. Genetics 156: 297–304.
NACHMAN, M.W., and S.L. CROWELL, 2000-b Contrasting evolutionary histories of two introns of the duchenne muscular dystrophy gene, dmd, in humans. Genetics 155: 1855–1864.
NEI, M., T. MARUYAMA and R. CHAKRABORTY, 1975 The bottleneck effect and genetic variability in populations. Evolution 29: 1–10.
NURMINSKY, D., D. DE AGUIAR, C.D. BUSTAMANTE and D.L. HARTL, 2001 Chromosomal effects of rapid gene evolution in Drosophila melanogaster. Science 291: 128–130.
OHTA, T., and M. KIMURA, 1971 Linkage disequilibrium between two segregating nucleotide sites under the steady flux of mutations in a finite population. Genetics 68: 571–580.
OTTO, S.P., 2000 Detecting the form of selection from DNA sequence data. Trends Genet. 16: 526–529.
PARSCH, J., C.D. MEIKLEJOHN and D.L. HARTL, 2001 Patterns of DNA sequence variation suggest the recent action of positive selection in the janus-ocnus region of Drosophila simulans. Genetics 159: 647–657.
PATIL, N., A.J. BERNO, D.A. HINDS, W.A. BARRETT, J.M. DOSHI, C.R. HACKER, C.R. KAUTZER, D.H. LEE, C. MARJORIBANKS, D.P. MCDONOUGH, B.T.N. NGUYEN, M.C. NORRIS, J.B. SHEEHAN, N. SHEN, D. STERN, R.P. STOKOWSKI, D.J. THOMAS, M.O. TRULSON, K.R. VYAS, K.A. FRAZER, S.P.A. FODOR and D.R. COX, 2001 Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science 294: 1719–1723.
PLUZHNIKOV, A., A. DI RIENZO and R.R. HUDSON, 2002 Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161: 1209–1218.
PRZEWORSKI, M., 2002 The signature of positive selection at randomly chosen loci. Genetics 160: 1179–1189.
PRZEWORSKI, M., J.D. WALL and P. ANDOLFATTO, 2001 Recombination and the frequency spectrum in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18: 291–298.
PTAK, S.E., and M. PRZEWORSKI, 2002 Evidence for population growth in humans is confounded by fine-scale population structure. Trends Genet. 18: 559–563.
REICH, D.E., S.F. SCHAFFNER, M.J. DALY, G. MCVEAN, J.C. MULLIKIN, J.M. HIGGINS, D.J. RICHTER, E.S. LANDER and D. ALTSHULER, 2002 Human genome sequence variation and the influence of gene history, mutation and recombination. Nat. Genet. 32: 135–142.
SABETI, P.C., D.E. REICH, J.M. HIGGINS, H.Z.P. LEVINE, D.J. RICHTER, S.F. SCHAFFNER, S.B. GABRIEL, J.V. PLATKO, N.J. PATTERSON, G.J. MCDONALD, H.C. ACKERMAN, S.J. CAMPBELL, D. ALTSHULER, R. COOPER, D. KWIATKOWSKI, R. WARD and E.S. LANDER, 2002 Detecting recent positive selection in the human genome from haplotype structure. Nature 419: 832–837.
SCHLÖTTERER, C., 2002 A microsatellite-based multilocus screen for the identification of local selective sweeps. Genetics 160: 753–763.
SCHNEIDER, J.A., et al., 2002 Non-neutral evolution revealed by comparison of gene-based DNA sequence diversity in humans and chimpanzees. Am. J. Hum. Genet. 71 (supplement): abstract 1149.
STROBECK, C., 1987 Average number of nucleotide differences in a sample from a single subpopulation: a test for population subdivision. Genetics 117: 149–153.
TAJIMA, F., 1983 Evolutionary relationship of the DNA sequences in finite populations. Genetics 105: 437–460.
TAJIMA, F., 1989-a Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585–595.
TAJIMA, F., 1989-b The effect of change in population size on DNA polymorphism. Genetics 123: 597–601.
TAKAHASHI, A., S.C. TSAUR, J.A. COYNE and C.I. WU, 2001 The nucleotide changes governing cuticular hydrocarbon variation and their evolution in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 98: 3920–3925.
VEUILLE, M., V. BENASSI, S. AULARD and F. DEPAULIS, 1998 Allele-specific population structure of Drosophila melanogaster alcohol dehydrogenase at the molecular level. Genetics 149: 971–981.

VIEIRA, J., and B. CHARLESWORTH, 2000 Evidence for selection at the fused locus of Drosophila virilis. Genetics 155: 1701–1709.
WALL, J.D., 2000 A comparison of estimators of the population recombination rate. Mol. Biol. Evol. 17: 156–163.
WALL, J.D., 2003 Estimating ancestral population sizes and divergence times. Genetics 163: 395–404.
WALL, J.D., P. ANDOLFATTO and M. PRZEWORSKI, 2002 Testing models of selection and demography in Drosophila simulans. Genetics 162: 203–216.
WATTERSON, G.A., 1975 On the number of segregating sites. Theor. Popul. Biol. 7: 256–276.
[...] finite populations. Theor. Appl. Genet. 38: 473–485.
WEIR, B.S., and W.G. HILL, 1986 Nonuniform recombination within the human β-globin gene cluster. Am. J. Hum. Genet. 38: 776–778.

TABLE 2.Supplement A Simple step bottleneck models for Drosophila (ρ = 3θ).

Bottleneck θ/θo S K fMFH D H (µ, reject) (µ, reject) (µ, reject) (µ, reject 5%, 95%) (µ, reject) (I) Tb = 2000 ga 0.82 40.2, 0.0002 8.2, 0.16 0.27, 0.07 0.42, 0.0004, 0.03

  • 1.17, 0.02

(II) Tb = 120,000 ga 0.85 41.3, 0.0002 9.4, 0.05 0.24, 0.03 0.36, 0.0004, 0.02

  • 1.12, 0.02
  • Eqb. Pop.a

0.80 39.3, 0.002 12.6,< 0.0001 0.16, 0.001

  • 0.01, 0.001, 0.001

0.00, 0.01 (III) Tb = 2000 ga 0.50 24.3, 0.25 3.5, 0.96 0.55, 0.56 0.88, 0.05, 0.36

  • 3.94, 0.21

(IV) Tb = 120,000 ga 0.51 24.9, 0.22 4.4, 0.78 0.50, 0.42 0.85, 0.04, 0.33

  • 3.76, 0.19
  • Eqb. Pop.a

0.50 24.5, 0.01 11.2, 0.0002 0.20, 0.002

  • 0.02, 0.01, 0.01
  • 0.01, 0.01

Ancestralb 1.00 48.8, - 13.1,< 0.0001 0.14, 0.0006

  • 0.01, 0.0003, 0.0008

0.05, 0.004 All the simulations consider ρ = 3θ, 15 chromosomes and 10,000 repetitions.. The bottleneck modelled are the simple step bottlenecks corresponding to the step-recovery bottlenecks of Table1 (Roman numbers), with Nb=2000 and 120,000 (0.8 variability reduction) and 10,000 and 600,000 (0.5 variability reduction). µ and reject are the mean and rejection probability.

a Simulations with the SNM based on θ = 12 or 7.5 (for 0.8 and 0.5 variability reduction

respectively) with recombination ρ = 3θ.

b Simulations with the SNM based on θ = 15, with recombination ρ = 3θ.
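The mean S values of the equilibrium and ancestral rows can be checked against the standard neutral expectation E[S] = θ a_n, with a_n = 1 + 1/2 + ... + 1/(n-1) (WATTERSON 1975). The sketch below assumes that the S column is the number of segregating sites and that θ is the per-locus value given in the footnotes; the function names are illustrative and are not taken from the simulation program used in this report.

```python
def harmonic_a(n):
    """a_n = sum_{i=1}^{n-1} 1/i for a sample of n chromosomes."""
    return sum(1.0 / i for i in range(1, n))


def expected_segregating_sites(theta, n):
    """Expected number of segregating sites under the SNM: theta * a_n."""
    return theta * harmonic_a(n)


n_chromosomes = 15
for theta in (15.0, 12.0, 7.5):
    e_s = expected_segregating_sites(theta, n_chromosomes)
    print(f"theta = {theta:4.1f}  E[S] = {e_s:.1f}")
# theta = 15.0  E[S] = 48.8
# theta = 12.0  E[S] = 39.0
# theta =  7.5  E[S] = 24.4
```

These expectations are close to the simulated means reported for the ancestral (48.8) and equilibrium (about 39 and 24.5) populations.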


TABLE 2.Supplement B Step-recovery bottleneck models for Drosophila (ρ = 15θ).

Bottleneck θ/θo S K fMFH D H (µ, reject) (µ, reject) (µ, reject) (µ, reject 5%, 95%) (µ, reject) (I) Tb = 2000 ga 0.83 40.3, 0.0002 8.8, 0.07 0.25, 0.04 0.42,< 0.0001, 0.01

  • 1.25, 0.01

(II) Tb = 120,000 ga 0.84 40.8, 0.0001 13.4,< 0.0001 0.14, 0.0006 0.36,< 0.0001, 0.004

  • 1.30, 0.01
  • Eqb. Pop.a

0.80 39.0, 0.0001 14.3,< 0.0001 0.10,<0.0001 0.00,< 0.0001, 0.0002

  • 0.01, 0.001

(III) Tb = 2000 ga 0.50 24.4, 0.03 3.7, 0.96 0.54, 0.54 0.90, 0.04, 0.32

  • 3.83, 0.18

(IV) Tb = 120,000 ga 0.52 25.2, 0.03 9.7, 0.02 0.33, 0.12 0.71, 0.03, 0.21

  • 3.63, 0.17
  • Eqb. Pop.a

0.50 24.4, 0.002 13.6,< 0.0001 0.13,< 0.0001 0.00, 0.0005, 0.0009

  • 0.01, 0.01

Ancestralb 1.00 48.8, - 14.5,< 0.0001 0.10,< 0.0001 0.0031,< 0.0001, 0.0001

  • 0.01, 0.0007

All the simulations consider ρ = 15θ, 15 chromosomes and 10,000 repetitions. The Roman numbers refer to the bottleneck models described in Table 1. µ and reject are the mean and rejection probability.

a Simulations with the SNM based on θ = 12 or 7.5 (for 0.8 and 0.5 variability reduction

respectively) with recombination ρ = 15θ.

b Simulations with the SNM based on θ = 15, with recombination ρ = 3θ.
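For readers unfamiliar with the D column, the sketch below computes Tajima's D (TAJIMA 1989-a) from a small sample of aligned haplotypes coded as strings of 0 and 1. It is a generic implementation of the published formula, not the program used for the simulations in this report; the function name `tajimas_d` and the toy samples are illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotype strings.

    Returns None when no site is segregating. The variance term is zero
    for fewer than four sequences, so at least four are required.
    """
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(haplotypes[0])
    # S: number of segregating (polymorphic) sites
    seg = sum(1 for j in range(length) if len({h[j] for h in haplotypes}) > 1)
    if seg == 0:
        return None
    # pi: mean number of pairwise differences
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))


# Toy samples: only singletons (D < 0), then intermediate frequencies (D > 0)
print(tajimas_d(["1000", "0100", "0010", "0001"]))
print(tajimas_d(["11", "11", "00", "00"]))
```

An excess of intermediate-frequency variants gives positive D, which is the direction of the mean D values reported for the bottleneck models in the tables, whereas an excess of rare variants gives negative D.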

Supplement III

Supplement IV

Supplement V

Supplement VI

Supplement VII