Everything you wanted to know about VAMP but were afraid to ask

SLIDE 1

Everything you wanted to know about VAMP but were afraid to ask

Brooke Husic Stanford/FU Berlin PyEMMA Workshop February 21, 2019

SLIDE 2

First of all

Variational Approach for Markov Processes

Key papers:
Wu & Noé 2017, arXiv:1707.04659, “Variational approach…”
Paul et al, arXiv:1811.12551, “Identification of kinetic…”

SLIDE 3

First of all

Variational Approach for Markov Processes

[Figure: “Guesses” compared against the “Real answer” along a “Value” axis]

SLIDE 4

First of all

Variational Approach for Markov Processes

[Figure labels: A, B, C, D, E]

Our data: $Z_1, Z_2, \ldots, Z_{t-2}, Z_{t-1}, Z_t, Z_{t+1}$

SLIDE 5

First of all

Variational Approach for Markov Processes

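Our data: $Z_1, Z_2, \ldots, Z_{t-2}, Z_{t-1}, Z_t, Z_{t+1}$

Transition pairs: $[Z_t, Z_{t+1}]$, or more generally $[Z_t, Z_{t+\tau}]$

$X = [Z_1, Z_2, Z_3, Z_4, \ldots, Z_{t-1}]$
$Y = [Z_2, Z_3, Z_4, Z_5, \ldots, Z_t]$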
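The pairing of the data into X and Y can be sketched in a few lines of NumPy. This is only an illustration: the helper name `lagged_pairs` and the toy trajectory are assumptions, not part of the talk.

```python
import numpy as np

def lagged_pairs(traj, lag=1):
    """Split a trajectory into time-lagged data sets X and Y (hypothetical helper)."""
    traj = np.asarray(traj)
    if lag < 1 or lag >= len(traj):
        raise ValueError("lag must be between 1 and len(traj) - 1")
    X = traj[:-lag]  # Z_1, ..., Z_{T-lag}
    Y = traj[lag:]   # Z_{1+lag}, ..., Z_T
    return X, Y

# Toy trajectory standing in for Z_1, ..., Z_7
Z = np.arange(1, 8)

X, Y = lagged_pairs(Z, lag=1)          # pairs [Z_t, Z_{t+1}]
X_tau, Y_tau = lagged_pairs(Z, lag=3)  # pairs [Z_t, Z_{t+tau}] with tau = 3
```

Each row of X is matched with the row of Y observed one lag time later, which is the only input the scores discussed later need.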

SLIDE 6

Some history

[Figure: timeline of MSM construction pipelines. Pipeline labels: hand-selected features; pairwise RMSD; large sets of features; dimensionality reduction; atomic positions; state decomposition; MSM. Milestones: Zwanzig 1983; dPCA, tICA 2005, 2011, 2013; MSMBuilder 2009.]

Figure from: Husic & Pande 2018, JACS, “Markov State Models: From an Art to a Science”

SLIDE 7

The problem

Raw Trajectories

Featurization
▷ internal coordinate system
▷ transformations

Dimensionality Reduction
▷ PCA, TICA
▷ TICA lag time, # components

Clustering
▷ algorithm
▷ number of clusters

MSM
⊠ # timescales
⊠ lag time

Figure from: Husic & Pande 2017, J Chem Phys, “MSM lag time cannot be used for variational model selection”

Try 5 different featurizations?
Compare 3 different TICA lag times?
Search 1 different numbers of clusters?
Do chi angles help for a dihedral featurization?

Need a method to objectively evaluate modeling choices!

SLIDE 8

Back to history

[Figure: extended timeline of MSM construction pipelines, annotated ①, ②, ③. Pipeline labels: hand-selected features; pairwise RMSD; large sets of features; dimensionality reduction; atomic positions; state decomposition; cross validation; variational evaluation; training set; validation set; neural network; MSM. Milestones: Zwanzig 1983; dPCA, TICA 2005, 2011, 2013; MSMBuilder 2009; VAC 2013; GMRQ, VAMP 2015, 2017; VAMPnets 2017.]

Figure from: Husic & Pande 2018, JACS, “Markov State Models: From an Art to a Science”

SLIDE 9

Let’s make sure we’re clear on MSMs

Figure from: Husic & Pande 2018, JACS, “Markov State Models: From an Art to a Science”

This is it! This *is* the MSM.

Transition matrix
★ Thermodynamics (populations!)
★ Kinetics (transition probabilities!)
★ Dynamical processes (eigenvectors!)
★ Pathways (TPT!)
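To make "the transition matrix is the MSM" concrete, here is a minimal sketch that counts transitions in a toy state sequence and reads the populations off the resulting matrix. The function names, the toy trajectory, and the plain row-normalized estimator (no reversibility constraint) are assumptions made for illustration.

```python
import numpy as np

def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts at the given lag (hypothetical helper)."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def stationary_distribution(T):
    """Left eigenvector of T with the largest eigenvalue, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()

# Toy discrete trajectory over three states (made up for this example)
dtraj = np.array([0, 0, 1, 0, 1, 1, 2, 1, 2, 2, 1, 0, 0, 1, 2, 2, 1, 1, 0])

T = estimate_transition_matrix(dtraj, n_states=3)  # kinetics: transition probabilities
pi = stationary_distribution(T)                    # thermodynamics: populations
```

The eigenvectors of the same matrix give the dynamical processes, so everything on this slide is derived from T alone.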

SLIDE 10

The VAC

Key papers:
Noé & Nüske 2013, Multiscale Model Simul, “A Variational Approach…”
Nüske et al 2014, J Chem Theory Comput, “Variational Approach…”

$T(\tau)$

Transition matrix

SLIDE 11

The VAC

$T(\tau)\,\psi_i = \lambda_i \psi_i$

$t_i = -\tau / \ln|\lambda_i|$

Eigenvectors: dynamical processes
Eigenvalues: related to timescales

For a reversible, ergodic MSM, the eigenvalues have special properties:

  • They are real (a consequence of reversibility)
  • There is a unique maximum eigenvalue of 1 (Perron-Frobenius theorem)
  • All other eigenvalues have absolute values below 1
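A small numerical check of the eigenvalue properties and the timescale formula, as a sketch. The 3-state matrix is made up for illustration; it is symmetric, so it is reversible with a uniform stationary distribution, and `eigvalsh` can be used. A general reversible matrix would need `np.linalg.eig` instead.

```python
import numpy as np

tau = 1.0  # lag time, in arbitrary units (assumption for this example)

# Symmetric row-stochastic matrix: reversible with uniform populations
T = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])

# eigvalsh returns real eigenvalues in ascending order; reverse to descending
evals = np.linalg.eigvalsh(T)[::-1]

# Implied timescales t_i = -tau / ln|lambda_i|, skipping the stationary eigenvalue 1
timescales = -tau / np.log(np.abs(evals[1:]))
```

For this matrix the eigenvalues are 1, 0.9, and 0.7, so the slower process has the longer implied timescale.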

SLIDE 12

The VAC

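$T(\tau)\,\psi_i = \lambda_i \psi_i$

$t_i = -\tau / \ln|\lambda_i|$

The variational principle is for the eigenvalues

$\sum_{i=1}^{m} \hat{\lambda}_i \le \sum_{i=1}^{m} \lambda_i$

Left-hand side ($\hat{\lambda}_i$): eigenvalue predictions from MSM
Right-hand side ($\lambda_i$): unknown true eigenvalues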

SLIDE 13

The VAC

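$T(\tau)\,\psi_i = \lambda_i \psi_i$

$t_i = -\tau / \ln|\lambda_i|$

$\sum_{i=1}^{m} \hat{\lambda}_i \le \sum_{i=1}^{m} \lambda_i$

Left-hand side ($\hat{\lambda}_i$): eigenvalue predictions from MSM
Right-hand side ($\lambda_i$): unknown true eigenvalues

SCORE = $\sum_{i=1}^{m} \hat{\lambda}_i$

IMPORTANT: This score is only for the transition matrix defined at the given lag time $\tau$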
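The score and the inequality can be illustrated with a toy example. Here a made-up 3-state matrix plays the role of the unknown true dynamics, and a coarser 2-state model is built by lumping two of its states; the helper names `vac_score` and `coarse_grain` are hypothetical. Both models use the same lag time, as the note above requires.

```python
import numpy as np

def vac_score(T, m):
    """Sum of the m largest eigenvalues of a transition matrix."""
    evals = np.sort(np.real(np.linalg.eigvals(T)))[::-1]
    return evals[:m].sum()

def coarse_grain(T, pi, membership):
    """Lump states; membership[i, a] = 1 if fine state i belongs to coarse state a."""
    joint = membership.T @ (pi[:, None] * T) @ membership  # coarse joint probabilities
    return joint / joint.sum(axis=1, keepdims=True)

# Stand-in for the "true" dynamics (symmetric, so populations are uniform)
T_true = np.array([[0.9, 0.1, 0.0],
                   [0.1, 0.8, 0.1],
                   [0.0, 0.1, 0.9]])
pi = np.full(3, 1.0 / 3.0)

# Coarser model: keep state 0, merge states 1 and 2
membership = np.array([[1, 0],
                       [0, 1],
                       [0, 1]])
T_coarse = coarse_grain(T_true, pi, membership)

score_true = vac_score(T_true, m=2)      # 1 + 0.9
score_coarse = vac_score(T_coarse, m=2)  # 1 + 0.85
```

The lumped model underestimates the slow eigenvalue, so its score is lower. A higher score therefore indicates a model closer to the true slow dynamics.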

SLIDE 14

Reminder

Raw Trajectories

Featurization
▷ internal coordinate system
▷ transformations

Dimensionality Reduction
▷ PCA, TICA
▷ TICA lag time, # components

Clustering
▷ algorithm
▷ number of clusters

MSM
⊠ # timescales
⊠ lag time

Figure from: Husic & Pande 2017, J Chem Phys, “MSM lag time cannot be used for variational model selection”

Eligible regime for scoring MSMs:
✅ Try 5 different featurizations?
✅ Compare 3 different TICA lag times?
✅ Search 1 different numbers of clusters?
✅ Do chi angles help for a dihedral featurization?

Check 5 different MSM lag times?

SLIDE 15

Cross validation

Key paper: McGibbon & Pande 2015, J Chem Phys, “Variational cross-validation…”

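$\sum_{i=1}^{m} \hat{\lambda}_i \le \sum_{i=1}^{m} \lambda_i$

Left-hand side ($\hat{\lambda}_i$): eigenvalue predictions from MSM on the validation set
Right-hand side ($\lambda_i$): unknown true eigenvalues

SCORE = $\sum_{i=1}^{m} \hat{\lambda}_i$

Scoring a model on the same data used to build it will have a problem with overfitting

Data:

▷ Training set: make MSM (is there enough sampling?)
▷ Validation set: apply MSM and score eigenvalues
⨉ some number of iterations with different sets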
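The train/validate loop can be sketched as follows, in the spirit of the generalized matrix Rayleigh quotient (GMRQ) from the key paper. This is a minimal illustration under several assumptions: the data are simulated from a made-up 3-state matrix, the MSM is estimated from symmetrized counts, and all function names are hypothetical.

```python
import numpy as np

def simulate(T, n_steps, rng):
    """Sample a discrete trajectory from transition matrix T, starting in state 0."""
    traj = [0]
    for _ in range(n_steps - 1):
        traj.append(rng.choice(len(T), p=T[traj[-1]]))
    return np.array(traj)

def correlation_matrices(dtrajs, n_states, lag):
    """Symmetrized time-lagged correlation C and overlap S from discrete trajectories."""
    counts = np.zeros((n_states, n_states))
    for d in dtrajs:
        np.add.at(counts, (d[:-lag], d[lag:]), 1)
    counts = 0.5 * (counts + counts.T)  # simple reversible (symmetrized) estimate
    C = counts / counts.sum()
    S = np.diag(C.sum(axis=1))          # diagonal matrix of populations
    return C, S

def fit(C, S, m):
    """Top-m solutions of C v = lambda S v (assumes every state was visited)."""
    s_inv_sqrt = 1.0 / np.sqrt(np.diag(S))
    evals, U = np.linalg.eigh(s_inv_sqrt[:, None] * C * s_inv_sqrt[None, :])
    order = np.argsort(evals)[::-1][:m]
    return evals[order], s_inv_sqrt[:, None] * U[:, order]

def gmrq(V, C, S):
    """Rayleigh quotient of eigenvectors V evaluated on matrices (C, S)."""
    return np.trace(V.T @ C @ V @ np.linalg.inv(V.T @ S @ V))

rng = np.random.default_rng(0)
T_true = np.array([[0.9, 0.1, 0.0],
                   [0.1, 0.8, 0.1],
                   [0.0, 0.1, 0.9]])
dtrajs = [simulate(T_true, 500, rng) for _ in range(5)]

m, lag = 2, 1
train_scores, test_scores = [], []
for k in range(len(dtrajs)):  # leave one trajectory out per iteration
    train = dtrajs[:k] + dtrajs[k + 1:]
    test = [dtrajs[k]]
    C_tr, S_tr = correlation_matrices(train, 3, lag)
    C_te, S_te = correlation_matrices(test, 3, lag)
    evals, V = fit(C_tr, S_tr, m)                 # make MSM on the training set
    train_scores.append(gmrq(V, C_tr, S_tr))      # equals the sum of training eigenvalues
    test_scores.append(gmrq(V, C_te, S_te))       # apply MSM to the validation set

mean_test_score = np.mean(test_scores)
```

Model choices would be compared by their mean validation score rather than by their training score.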

SLIDE 16

An example

From Husic et al 2016, J Chem Phys, “Optimized parameter selection…”

SLIDE 17

Finally: the VAMP!

Key papers:
Wu & Noé 2017, arXiv:1707.04659, “Variational approach…”
Paul et al, arXiv:1811.12551, “Identification of kinetic…”

$T(\tau)$

Transition matrix

The transition matrix has certain properties due to the reversibility assumption. This includes having an eigendecomposition.

SLIDE 18

Finally: the VAMP!

$K(\tau)$

Transition matrix?

Consider now a different matrix that is not necessarily reversible. It may not have an eigendecomposition anymore, or its eigendecomposition may not be useful.

However, it will always have a singular value decomposition.

$\{\phi_i, \sigma_i, \phi_i\}$

$\sum_{i=1}^{m} \hat{\sigma}_i \le \sum_{i=1}^{m} \sigma_i$

SCORE = $\sum_{i=1}^{m} \hat{\sigma}_i$

The VAMP uses more general math to score models that may not be reversible
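A sketch of a singular-value score for a nonreversible model. The cyclic 3-state matrix is made up for illustration. The whitening by instantaneous covariances and the name "VAMP-1" for the sum of singular values follow my reading of Wu & Noé, so treat those details, and the helper `vamp1_score`, as assumptions rather than the talk's own definitions.

```python
import numpy as np

def vamp1_score(C00, C0t, Ctt, m):
    """Sum of the top-m singular values of C00^{-1/2} C0t Ctt^{-1/2}.

    Assumes C00 and Ctt are symmetric positive definite (no regularization).
    """
    def inv_sqrt(C):
        w, U = np.linalg.eigh(C)
        return U @ np.diag(1.0 / np.sqrt(w)) @ U.T

    K = inv_sqrt(C00) @ C0t @ inv_sqrt(Ctt)
    sigma = np.linalg.svd(K, compute_uv=False)  # returned in descending order
    return sigma[:m].sum()

# Cyclic, nonreversible dynamics: probability flows 0 -> 1 -> 2 -> 0
T = np.array([[0.8, 0.2, 0.0],
              [0.0, 0.8, 0.2],
              [0.2, 0.0, 0.8]])
pi = np.full(3, 1.0 / 3.0)  # T is doubly stochastic, so populations are uniform

# Covariances of state-indicator features at times t and t + tau
C00 = np.diag(pi)
C0t = np.diag(pi) @ T
Ctt = np.diag(pi @ T)

eigenvalues = np.linalg.eigvals(T)         # includes a complex-conjugate pair
score = vamp1_score(C00, C0t, Ctt, m=2)    # real-valued regardless
```

The eigenvalues of this matrix are complex, which is one sense in which the eigendecomposition "may not be useful", while the singular values remain real and can be summed into a score.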

SLIDE 19

What we’ve learned…

  • We have many choices when we make Markov state models
  • Luckily, we have the VAC to evaluate different choices objectively
  • But not the MSM lag time, of course.
  • We just have to do it under cross-validation to avoid overfitting
  • We can use the VAMP in the more general, nonreversible case
  • Which is the same as the VAC when we have an MSM!
  • With an objective metric, can’t we just make models automatically..?
  • Stay tuned!
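The claim that the VAMP gives the same answer as the VAC for an MSM can be checked numerically. This sketch uses a made-up reversible 3-state matrix with nonuniform populations. It relies on the singular values of the whitened matrix matching the absolute values of the eigenvalues, so the two sums agree when the leading eigenvalues are nonnegative, as they are here.

```python
import numpy as np

# Reversible (detailed-balance) transition matrix with nonuniform populations
T = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.10, 0.90]])
pi = np.array([0.25, 0.50, 0.25])

# Detailed balance: pi_i T_ij == pi_j T_ji
flux = pi[:, None] * T
detailed_balance = np.allclose(flux, flux.T)

# Whitened matrix diag(pi)^{1/2} T diag(pi)^{-1/2}, symmetric for a reversible T
K = np.sqrt(pi)[:, None] * T / np.sqrt(pi)[None, :]

eigenvalues = np.sort(np.real(np.linalg.eigvals(T)))[::-1]
singular_values = np.linalg.svd(K, compute_uv=False)

m = 2
vac_score = eigenvalues[:m].sum()       # eigenvalue-based score
vamp_score = singular_values[:m].sum()  # singular-value-based score
```

Both scores come out to 1.9 for this matrix, since its eigenvalues are 1, 0.9, and 0.8.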
SLIDE 20

Paper highlights

VAC theory

Noé & Nüske 2013, Multiscale Model Simul, “A Variational Approach…”
Nüske et al 2014, J Chem Theory Comput, “Variational Approach…”

Cross-validation

McGibbon & Pande 2015, J Chem Phys, “Variational cross-validation…”

VAMP theory

Wu & Noé 2017, arXiv:1707.04659, “Variational approach…”
Paul et al, arXiv:1811.12551, “Identification of kinetic…”

General overview/history of MSMs

Husic & Pande 2018, JACS, “Markov State Models: From an Art to a Science”

General overview of ML methods

Noé 2018, arXiv:1812.07669, “Machine learning…”