Hierarchical and multi-criteria optimization methodology modeled on RNA selection (Hierarchische und mehrkriterielle Optimierungssystematik nach dem Vorbild der RNA-Selektion)



slide-1
SLIDE 1
slide-2
SLIDE 2

Hierarchical and multi-criteria optimization methodology modeled on RNA selection

Peter Schuster
Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien

BBAW Studiengruppe: Strukturbildung und Innovation (study group on structure formation and innovation)
Berlin, 21.–22.11.2003

slide-3
SLIDE 3

[Figure: chemical formula of an RNA chain from the 5'-end to the 3'-end with nucleotide bases N1 to N4, where N_k = A, U, G, or C, next to a folded example sequence of about 70 nucleotides]

Definition of RNA structure

slide-4
SLIDE 4

[Figure: three panels, each drawn from the 5'-end to the 3'-end: sequence, secondary structure, symbolic notation]

Sequence: GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA

A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
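
The equivalence between a symbolic string and the structure graph can be illustrated with a small sketch. It assumes that the symbolic notation is the usual parentheses (dot-bracket) string, in which matching parentheses mark a base pair and a dot marks an unpaired nucleotide. The function name `parse_dot_bracket` and the example string are hypothetical.

```python
def parse_dot_bracket(structure: str) -> list[tuple[int, int]]:
    """Return the base pairs (i, j), 0-based with i < j, encoded by a dot-bracket string."""
    stack = []   # positions of '(' that still wait for a partner
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Hypothetical hairpin: a stack of four base pairs closing a loop of three nucleotides.
print(parse_dot_bracket("((((...))))"))
```

The list of pairs is exactly the edge list of the conventional structure graph (apart from the backbone), which is why the string and the drawing carry the same information.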
slide-5
SLIDE 5

[Figure: conformations of one sequence (5'-end to 3'-end) ordered by free energy $\Delta G$: the minimum of free energy S0 and the suboptimal conformations S1 to S9]

The minimum free energy structures on a discrete space of conformations

slide-6
SLIDE 6

[Figure: barrier tree of the conformations of one sequence, with basins S0 and S1. Three notions of structure are marked on the free energy axis: minimum free energy structure (T = 0 K, t → ∞), suboptimal structures S0 to S10 (T > 0 K, t → ∞), and kinetic structures (T > 0 K, t finite)]

Different notions of RNA structure including suboptimal conformations

slide-7
SLIDE 7

[Figure: free energy $\Delta G$ along a "reaction coordinate" between two local minima $S_k$ and $S_l$ that are separated by a saddle point $T_{lk}$, and the corresponding "barrier tree"]

Definition of a 'barrier tree'
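
A barrier tree joins two local minima at the height of the lowest saddle point that any path between them has to cross. The sketch below computes that saddle height on a toy landscape with a minimax variant of Dijkstra's algorithm. The landscape, its energies, and the function name `saddle_energy` are hypothetical and are not taken from the slides.

```python
import heapq


def saddle_energy(energy, neighbors, start, goal):
    """Lowest possible maximum energy along any path from start to goal."""
    best = {start: energy[start]}
    heap = [(energy[start], start)]
    while heap:
        height, node = heapq.heappop(heap)
        if node == goal:
            return height
        if height > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt in neighbors[node]:
            new_height = max(height, energy[nxt])
            if new_height < best.get(nxt, float("inf")):
                best[nxt] = new_height
                heapq.heappush(heap, (new_height, nxt))
    raise ValueError("goal is not reachable from start")


# Hypothetical landscape: two routes connect the minima S0 and S1.
energy = {"S0": 0.0, "a": 5.1, "b": 2.0, "c": 3.3, "S1": 1.0}
neighbors = {
    "S0": ["a", "c"],
    "a": ["S0", "b"],
    "b": ["a", "S1"],
    "c": ["S0", "S1"],
    "S1": ["b", "c"],
}
saddle = saddle_energy(energy, neighbors, "S0", "S1")
barrier_from_s1 = saddle - energy["S1"]  # height S1 must climb to reach S0
print(saddle, barrier_from_s1)
```

In a full barrier tree this merge height is computed for every pair of neighboring basins; the sketch covers only one pair.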

slide-8
SLIDE 8

[Figure: barrier tree of the energy landscape with the two basins S0 and S1. Labels: kinetic folding (finite folding time), suboptimal structures S0 to S10]

A typical energy landscape of a sequence with two (meta)stable conformations

slide-9
SLIDE 9

Kinetics of RNA refolding between a long-lived metastable conformation and the minimum free energy structure

slide-10
SLIDE 10

Minimal hairpin loop size: $n_{lp} \geq 3$
Minimal stack length: $n_{st} \geq 2$

Recursion formula for the number of acceptable RNA secondary structures
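
The recursion itself did not survive in this transcript. As an illustration only, the sketch below counts secondary structures with the classical recursion that enforces a minimal hairpin loop size: the last nucleotide is either unpaired or paired with an earlier position, which splits the chain into an enclosed and a preceding segment. It does not enforce the minimal stack length, so its numbers are upper bounds on the count of acceptable structures and it should not be read as the formula shown on the slide. The name `count_structures` is hypothetical.

```python
def count_structures(n: int, min_loop: int = 3) -> int:
    """Number of secondary structures on n nucleotides with hairpin loops >= min_loop.

    Assumption: only the hairpin-loop constraint is enforced; stacks of any
    length (including isolated base pairs) are counted.
    """
    counts = [1] * (n + 1)  # counts[0] = 1: the empty chain has one structure
    for length in range(1, n + 1):
        total = counts[length - 1]  # last nucleotide unpaired
        # Last nucleotide paired with position k (1-based); the pair encloses
        # length - k - 1 nucleotides, which must be at least min_loop.
        for k in range(1, length - min_loop):
            total += counts[k - 1] * counts[length - k - 1]
        counts[length] = total
    return counts[n]


for n in (5, 6, 10):
    print(n, count_structures(n))
```

For n = 5 the two structures are the open chain and a single pair closing a loop of three, which matches a count by hand.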

slide-11
SLIDE 11

Computed numbers of minimum free energy structures over different nucleotide alphabets

P. Schuster, Molecular insights into evolution of phenotypes. In: J. Crutchfield & P. Schuster, Evolutionary Dynamics. Oxford University Press, New York 2003, pp. 163-215.

slide-12
SLIDE 12

$S_k = \psi(I_{\cdot})$

$f_k = f(S_k)$

[Figure: sequence space → structure space → real numbers]

Mapping from sequence space into structure space and into function

slide-13
SLIDE 13

[Figure: example sequences (5'-end to 3'-end) and their minimum free energy secondary structures]

GGCGCGCCCGGCGCC
GUAUCGAAAUACGUAGCGUAUGGGGAUGCUGGACGGUCCCAUCGGUACUCCA
UGGUUACGCGUUGGGGUAACGAAGAUUCCGAGAGGAGUUUAGUGACUAGAGG

Folding of RNA sequences into secondary structures of minimal free energy, $\Delta G_0^{300}$

slide-14
SLIDE 14

Hamming distance: $d_H(S_1, S_2) = 4$

(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \leq d_H(S_1, S_2) + d_H(S_2, S_3)$

The Hamming distance between structures in parentheses notation forms a metric in structure space
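
A minimal sketch of this distance, assuming structures of equal length written as parentheses (dot-bracket) strings: the Hamming distance is the number of positions at which the two strings differ. The three example structures are hypothetical and were chosen so that the metric properties can be checked directly.

```python
def structure_hamming(s1: str, s2: str) -> int:
    """Number of positions at which two equal-length structure strings differ."""
    if len(s1) != len(s2):
        raise ValueError("structures must have the same length")
    return sum(a != b for a, b in zip(s1, s2))


# Hypothetical hairpins of chain length 11.
s1 = "((((...))))"
s2 = "(((.....)))"
s3 = ".((.....))."

print(structure_hamming(s1, s1))  # identity
print(structure_hamming(s1, s2), structure_hamming(s2, s1))  # symmetry
print(structure_hamming(s1, s3), structure_hamming(s1, s2) + structure_hamming(s2, s3))  # triangle inequality
```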

slide-15
SLIDE 15

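[Figure: structures with their replication rate constants $f_0, f_1, \ldots, f_7$]

Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$

$\Delta d_S^{(k)} = d_H(S_k, S_\tau)$

Evaluation of RNA secondary structures yields replication rate constants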
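
The evaluation step can be sketched as follows, reading the rate law as $f_k = \gamma / (\alpha + \Delta d_S^{(k)})$ with $\Delta d_S^{(k)}$ the Hamming distance between structure $S_k$ and a target structure $S_\tau$. The parameter values `alpha = 1.0` and `gamma = 1.0`, the function name, and the example structures are assumptions for illustration, not values from the slides.

```python
def replication_rate(structure: str, target: str,
                     alpha: float = 1.0, gamma: float = 1.0) -> float:
    """Rate constant f = gamma / (alpha + d_H(structure, target)).

    alpha and gamma are hypothetical defaults; alpha > 0 keeps the rate
    finite when the target structure is reached.
    """
    if len(structure) != len(target):
        raise ValueError("structures must have the same length")
    distance = sum(a != b for a, b in zip(structure, target))
    return gamma / (alpha + distance)


target = "((((...))))"
print(replication_rate(target, target))         # distance 0: fastest replication
print(replication_rate(".((.....)).", target))  # distance 4: slower replication
```

The closer a structure is to the target, the larger its rate constant, so selection in the reactor pushes the population toward the target.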

slide-16
SLIDE 16

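[Figure: flow reactor with inflow of stock solution and outflow of reaction mixture]

Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$

$\Delta d_S^{(k)} = d_H(S_k, S_\tau)$

Selection constraint: the number of RNA molecules is controlled by the flow, $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$

The flow reactor as a device for studies of evolution in vitro and in silico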
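
The selection constraint can be illustrated with a deliberately simplified sketch: each replication event is followed by one dilution event, so the population size stays exactly constant instead of fluctuating around its mean, and copying is error-free. This is a Moran-type toy process under those assumptions, not the reactor simulation behind the slides; the two molecule types and their rate constants are hypothetical.

```python
import random


def flow_reactor_step(population, rate, rng):
    """Replicate one molecule (chosen in proportion to its rate), then dilute one."""
    weights = [rate(molecule) for molecule in population]
    parent = rng.choices(population, weights=weights, k=1)[0]
    population.append(parent)                       # error-free copy (assumption)
    population.pop(rng.randrange(len(population)))  # outflow removes a random molecule


# Hypothetical two-type population with rate constants 1.0 and 2.0.
rates = {"slow": 1.0, "fast": 2.0}
rng = random.Random(0)
population = ["slow"] * 50 + ["fast"] * 50
for _ in range(2000):
    flow_reactor_step(population, rates.get, rng)

print(len(population), population.count("fast"))
```

Because dilution is blind to the rate constant while replication is not, the faster replicator tends to take over; adding mutation to the copy step would turn this into a search process.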

slide-17
SLIDE 17

[Figure: concentration plotted over sequence space, showing the master sequence, the mutant cloud, and "off-the-cloud" mutations]

The molecular quasispecies in sequence space

slide-18
SLIDE 18

[Figure: cycle of three steps acting on the sequences $I_1, \ldots, I_n, I_{n+1}$ with rate constants $f_1, \ldots, f_n, f_{n+1}$: mutation (mutation matrix $Q$ with elements $Q_{lj}$), genotype-phenotype mapping $S_l = \psi(I_l)$, and evaluation of the phenotype $f_l = f(S_l)$]

Evolutionary dynamics including molecular phenotypes

slide-19
SLIDE 19

In silico optimization in the flow reactor: Trajectory (biologists' view)

[Figure: evolutionary trajectory. x-axis: time (arbitrary units), 0 to 1250; y-axis: average distance from initial structure, 0 to 50]

slide-20
SLIDE 20

In silico optimization in the flow reactor: Trajectory (physicists' view)

[Figure: evolutionary trajectory. x-axis: time (arbitrary units), 0 to 1250; y-axis: average structure distance to target $\Delta d_S$, 0 to 50]

slide-21
SLIDE 21

Movies of optimization trajectories over the AUGC and the GC alphabet

slide-22
SLIDE 22

[Figure: histogram. x-axis: runtime of trajectories, 0 to 5000; y-axis: frequency, 0 to 0.2]

Statistics of the lengths of trajectories from initial structure to target (AUGC-sequences)

slide-23
SLIDE 23

[Figure: final part of the evolutionary trajectory with relay steps 36 to 44; structure 44 is shown. x-axis: time; y-axes: average structure distance to target $\Delta d_S$ and number of relay step]

End conformation of optimization

slide-24
SLIDE 24

[Figure: final part of the evolutionary trajectory with relay steps 36 to 44; structures 43 and 44 are shown. x-axis: time; y-axes: average structure distance to target $\Delta d_S$ and number of relay step]

Reconstruction of the last step 43 → 44

slide-25
SLIDE 25

[Figure: final part of the evolutionary trajectory with relay steps 36 to 44; structures 42 to 44 are shown. x-axis: time; y-axes: average structure distance to target $\Delta d_S$ and number of relay step]

Reconstruction of the last-but-one step 42 → 43 (→ 44)

slide-26
SLIDE 26

[Figure: final part of the evolutionary trajectory with relay steps 36 to 44; structures 41 to 44 are shown. x-axis: time; y-axes: average structure distance to target $\Delta d_S$ and number of relay step]

Reconstruction of step 41 → 42 (→ 43 → 44)

slide-27
SLIDE 27

[Figure: final part of the evolutionary trajectory with relay steps 36 to 44; structures 40 to 44 are shown. x-axis: time; y-axes: average structure distance to target $\Delta d_S$ and number of relay step]

Reconstruction of step 40 → 41 (→ 42 → 43 → 44)

slide-28
SLIDE 28

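[Figure: final part of the evolutionary trajectory with relay steps 36 to 44; structures 39 to 44 are shown, with arrows marking the direction of the evolutionary process and of the reconstruction. x-axis: time; y-axes: average structure distance to target $\Delta d_S$ and number of relay step]

Reconstruction of the relay series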

slide-29
SLIDE 29

[Figure legend: transition-inducing point mutations; neutral point mutations]

Change in RNA sequences during the final five relay steps 39 → 44

slide-30
SLIDE 30

In silico optimization in the flow reactor: Trajectory and relay steps

[Figure: evolutionary trajectory with relay steps marked. x-axis: time (arbitrary units), 0 to 1250; y-axis: average structure distance to target $\Delta d_S$, 0 to 50]

slide-31
SLIDE 31

[Figure: early part of the evolutionary trajectory with relay steps 08 to 14 and bars marking uninterrupted presence. x-axis: time (arbitrary units), 250 to 500; y-axes: average structure distance to target $\Delta d_S$ and number of relay step. Legend: transition-inducing point mutations; neutral point mutations]

28 neutral point mutations during a long quasi-stationary epoch

Neutral genotype evolution during phenotypic stasis

slide-32
SLIDE 32

[Figure: section of the evolutionary trajectory with structures of relay steps 18 to 31 and bars marking uninterrupted presence. x-axis: time (arbitrary units), 750 to 1250; y-axes: average structure distance to target $\Delta d_S$ and number of relay step]

A random sequence of minor or continuous transitions in the relay series

slide-33
SLIDE 33

[Figure: structures of relay steps 18 to 31]

A random sequence of minor or continuous transitions in the relay series

slide-34
SLIDE 34

[Figure: section of the evolutionary trajectory with bars marking uninterrupted presence. x-axis: time (arbitrary units), 750 to 1250; y-axes: average structure distance to target $\Delta d_S$ and number of relay step]

A random sequence of minor or continuous transitions in the relay series

slide-35
SLIDE 35

In silico optimization in the flow reactor: Main transitions

[Figure: evolutionary trajectory with relay steps and main transitions marked. x-axis: time (arbitrary units), 0 to 1250; y-axis: average structure distance to target $\Delta d_S$, 0 to 50]

slide-36
SLIDE 36

[Figure: structures at relay steps 00, 09, 31, and 44]

Three important steps in the formation of the tRNA clover leaf from a randomly chosen initial structure corresponding to three main transitions.

slide-37
SLIDE 37

[Figure: histograms for all transitions and for main transitions. x-axis: number of transitions, 0 to 100; y-axis: frequency, 0 to 0.3]

Statistics of the numbers of transitions from initial structure to target (AUGC-sequences)

slide-38
SLIDE 38

Alphabet   Runtime   Transitions   Main transitions   No. of runs
AUGC         385.6          22.5               12.6          1017
GUC          448.9          30.5               16.5           611
GC          2188.3          40.0               20.6           107

Statistics of trajectories and relay series (mean values of log-normal distributions)

slide-39
SLIDE 39

[Figure: early part of the evolutionary trajectory with relay steps 08 to 14 and bars marking uninterrupted presence. x-axis: time (arbitrary units), 250 to 500; y-axes: average structure distance to target $\Delta d_S$ and number of relay step. Legend: transition-inducing point mutations; neutral point mutations]

28 neutral point mutations during a long quasi-stationary epoch

Neutral genotype evolution during phenotypic stasis

slide-40
SLIDE 40

Variation in genotype space during optimization of phenotypes

Mean Hamming distance within the population and drift velocity of the population center in sequence space.
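
The first of these two quantities can be sketched directly: the mean Hamming distance within the population is the average over all pairs of sequences. The function names and the three-sequence population are hypothetical, and the sketch assumes sequences of equal length; the drift velocity of the population center is not covered here.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_hamming(population: list[str]) -> float:
    """Average Hamming distance over all unordered pairs in the population."""
    pairs = list(combinations(population, 2))
    if not pairs:
        return 0.0
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


# Hypothetical population of three sequences of chain length 4.
population = ["GGCC", "GGCA", "GACA"]
print(mean_pairwise_hamming(population))
```

A growing value signals that the population spreads out in sequence space, for example by neutral drift during a quasi-stationary epoch.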

slide-41
SLIDE 41

Spread of population in sequence space during a quasistationary epoch: t = 150

slide-42
SLIDE 42

Spread of population in sequence space during a quasistationary epoch: t = 170

slide-43
SLIDE 43

Spread of population in sequence space during a quasistationary epoch: t = 200

slide-44
SLIDE 44

Spread of population in sequence space during a quasistationary epoch: t = 350

slide-45
SLIDE 45

Spread of population in sequence space during a quasistationary epoch: t = 500

slide-46
SLIDE 46

Spread of population in sequence space during a quasistationary epoch: t = 650

slide-47
SLIDE 47

Spread of population in sequence space during a quasistationary epoch: t = 820

slide-48
SLIDE 48

Spread of population in sequence space during a quasistationary epoch: t = 825

slide-49
SLIDE 49

Spread of population in sequence space during a quasistationary epoch: t = 830

slide-50
SLIDE 50

Spread of population in sequence space during a quasistationary epoch: t = 835

slide-51
SLIDE 51

Spread of population in sequence space during a quasistationary epoch: t = 840

slide-52
SLIDE 52

Spread of population in sequence space during a quasistationary epoch: t = 845

slide-53
SLIDE 53

Spread of population in sequence space during a quasistationary epoch: t = 850

slide-54
SLIDE 54

Spread of population in sequence space during a quasistationary epoch: t = 855

slide-55
SLIDE 55

$S_k = \psi(I_{\cdot})$

$f_k = f(S_k)$

[Figure: sequence space → structure space → real numbers]

Mapping from sequence space into structure space and into function

slide-56
SLIDE 56

$S_k = \psi(I_{\cdot})$

$f_k = f(S_k)$

[Figure: sequence space → structure space → real numbers]

slide-57
SLIDE 57

$S_k = \psi(I_{\cdot})$

$f_k = f(S_k)$

[Figure: sequence space → structure space → real numbers]

The pre-image of the structure Sk in sequence space is the neutral network Gk

slide-58
SLIDE 58

Neutral networks are sets of sequences forming the same structure. $G_k$ is the pre-image of the structure $S_k$ in sequence space:

$G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$

The set is converted into a graph by connecting all pairs of sequences of Hamming distance one.

Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length. The number of sequences, $N = 4^n$, grows very quickly with chain length $n$ and soon becomes prohibitive for numerical computation.

Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches the neutral network to be studied.
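
The conversion of a pre-image into a graph can be sketched as follows: given a set of sequences that fold into the same structure, connect all pairs at Hamming distance one and collect the connected components. The folding step is not part of the sketch; the four-sequence pre-image is hypothetical and the function name `neutral_components` is an assumption.

```python
from itertools import combinations


def neutral_components(sequences):
    """Connected components of the graph that links sequences at Hamming distance one."""
    seqs = list(dict.fromkeys(sequences))  # drop duplicates, keep order
    parent = {s: s for s in seqs}

    def find(s):
        # Union-find root lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(seqs, 2):
        if sum(x != y for x, y in zip(a, b)) == 1:
            parent[find(a)] = find(b)

    groups = {}
    for s in seqs:
        groups.setdefault(find(s), set()).add(s)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical pre-image of one structure (sequences of equal length).
pre_image = ["GGCC", "GGCU", "GGUU", "AAUU"]
components = neutral_components(pre_image)
print(len(components), components[0])
print(4 ** 10, 4 ** 30)  # size of sequence space for n = 10 and n = 30
```

The pairwise comparison is quadratic in the size of the pre-image, which is acceptable for small examples only; the last line shows why exhaustive folding stops being feasible for longer chains.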

slide-59
SLIDE 59

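$G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$

Degree of neutrality of a single sequence, example: $\lambda_j = 12/27 = 0.444$

Mean degree of neutrality of the network: $\bar{\lambda}_k = \frac{\sum_{j \in G_k} \lambda_j^{(k)}}{|G_k|}$

Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa - 1)}$

$\bar{\lambda}_k > \lambda_{cr}$: network $G_k$ is connected
$\bar{\lambda}_k < \lambda_{cr}$: network $G_k$ is not connected

Alphabet size $\kappa$: $\kappa = 4$ for AUGC

κ   λ_cr    Alphabets
2   0.5     GC, AU
3   0.423   GUC, AUG
4   0.370   AUGC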

Mean degree of neutrality and connectivity of neutral networks
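
The tabulated thresholds can be checked numerically, reading the threshold as $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$ for alphabet size $\kappa$. The function names in this sketch are hypothetical.

```python
def connectivity_threshold(kappa: int) -> float:
    """Critical mean degree of neutrality for an alphabet of size kappa."""
    if kappa < 2:
        raise ValueError("alphabet size must be at least 2")
    return 1.0 - kappa ** (-1.0 / (kappa - 1))


def is_connected(mean_neutrality: float, kappa: int) -> bool:
    """Random-graph prediction: connected if the mean neutrality exceeds the threshold."""
    return mean_neutrality > connectivity_threshold(kappa)


for kappa in (2, 3, 4):
    print(kappa, round(connectivity_threshold(kappa), 3))

# Example value from the slide: 12 neutral neighbors out of 27.
print(is_connected(12 / 27, kappa=4))
```

The threshold falls with growing alphabet size, so a given degree of neutrality is more likely to yield a connected network over AUGC than over a two-letter alphabet.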

slide-60
SLIDE 60

A connected neutral network

slide-61
SLIDE 61

Giant Component

A multi-component neutral network

slide-62
SLIDE 62

[Figure: cloverleaf secondary structures, drawn from the 5'-end to the 3'-end]

Alphabet   Degree of neutrality
AU         -               -               0.073 ± 0.032
AUG        -               0.217 ± 0.051   0.201 ± 0.056
AUGC       0.275 ± 0.064   0.279 ± 0.063   0.313 ± 0.058
UGC        0.263 ± 0.071   0.257 ± 0.070   0.250 ± 0.064
GC         0.052 ± 0.033   0.057 ± 0.034   0.068 ± 0.034

Degree of neutrality of cloverleaf RNA secondary structures over different alphabets
slide-63
SLIDE 63

Reference for postulation and in silico verification of neutral networks

slide-64
SLIDE 64

Structure

slide-65
SLIDE 65

[Figure: a structure and one compatible sequence, drawn from the 5'-end to the 3'-end]

slide-66
SLIDE 66

[Figure: a structure and several compatible sequences, drawn from the 5'-end to the 3'-end]

slide-67
SLIDE 67

[Figure: a structure and several compatible sequences, drawn from the 5'-end to the 3'-end]

Base pairs: AU, UA, GC, CG, GU, UG
Single nucleotides: A, U, G, C
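
Compatibility can be tested mechanically: a sequence is compatible with a structure if every base pair of the structure is occupied by one of the six allowed pairs, while unpaired positions may hold any nucleotide. The sketch assumes structures written as parentheses (dot-bracket) strings; the function name and the example sequences are hypothetical.

```python
ALLOWED_PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def is_compatible(sequence: str, structure: str) -> bool:
    """True if every base pair of the structure can be formed by the sequence."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    stack = []  # positions of '(' that still wait for a partner
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            opening = stack.pop()
            if sequence[opening] + sequence[pos] not in ALLOWED_PAIRS:
                return False
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return True


print(is_compatible("GGGAAAUCC", "(((...)))"))  # pairs GC, GC, GU
print(is_compatible("GGGAAAACC", "(((...)))"))  # innermost pair would be GA
print(is_compatible("GGGAAAUCC", ".((...))."))  # same sequence, second structure
```

The last call shows one sequence that is compatible with two different structures; compatibility says nothing about which of them is the minimum free energy structure.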

slide-68
SLIDE 68

[Figure: structure $S_k$, its neutral network $G_k$, and its compatible set $C_k$, with $G_k \subset C_k$]

The compatible set $C_k$ of a structure $S_k$ consists of all sequences which form $S_k$ either as their minimum free energy structure (these sequences make up the neutral network $G_k$) or as one of their suboptimal structures.

slide-69
SLIDE 69

[Figure: structure $S_0$ and structure $S_1$]

The intersection of two compatible sets is always non-empty: $C_0 \cap C_1 \neq \emptyset$

slide-70
SLIDE 70

Reference for the definition of the intersection and the proof of the intersection theorem

slide-71
SLIDE 71

[Figure: one sequence drawn in two conformations, the minimum free energy conformation $S_0$ and the suboptimal conformation $S_1$]

A sequence at the intersection of two neutral networks is compatible with both structures

slide-72
SLIDE 72

[Figure: barrier tree with basin '0' around the minimum free energy structure S0 and basin '1' around the long-lived metastable structure S1]

Barrier tree for two long-lived structures

slide-73
SLIDE 73

A ribozyme switch

E.A. Schultes, D.P. Bartel, Science 289 (2000), 448-452

slide-74
SLIDE 74

Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)
slide-75
SLIDE 75

The sequence at the intersection: an RNA molecule that is 88 nucleotides long and can form both structures

slide-76
SLIDE 76

Two neutral walks through sequence space with conservation of structure and catalytic activity

slide-77
SLIDE 77

Sequence of mutants from the intersection to both reference ribozymes

slide-78
SLIDE 78

Dolomites

Examples of rugged landscapes on Earth

Bryce Canyon

slide-79
SLIDE 79

[Figure: fitness plotted over genotype space, with the start and the end of the walk marked]

Evolutionary optimization in absence of neutral paths in sequence space

slide-80
SLIDE 80

[Figure: fitness plotted over genotype space, with the start and the end of the walk, random drift periods, and adaptive periods marked]

Evolutionary optimization including neutral paths in sequence space

slide-81
SLIDE 81

Grand Canyon

Example of a landscape on Earth with ‘neutral’ ridges and plateaus

slide-82
SLIDE 82

Neutral ridges and plateaus

slide-83
SLIDE 83

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-84
SLIDE 84