Data-driven model reduction for stochastic Burgers equations, Fei Lu (PowerPoint presentation)


slide-1
SLIDE 1

Data-driven model reduction for stochastic Burgers equations

Fei Lu

Department of Mathematics, Johns Hopkins University
Joint work with: Alexandre J. Chorin (UC Berkeley), Kevin K. Lin (U. of Arizona)
2nd Symposium on Machine Learning and Dynamical Systems, September 2020

1 / 23

slide-2
SLIDE 2

Consider a stochastic Burgers equation

v_t = ν v_xx − v v_x + f(x, t), x ∈ [0, 2π], periodic BC

N-mode Fourier-Galerkin truncation, k = 1, ..., N:

d/dt v̂_k = −νk² v̂_k + (ik/2) ∑_{|l|≤N, |k−l|≤N} v̂_l v̂_{k−l} + f̂_k(t)

Need N ≫ 1/ν and dt ∼ 1/N (CFL) → costly: ν = 10⁻⁴ → N ∼ 10⁴, time steps = 10⁴ T. To simulate 10⁴ time units, we need 10⁸ time steps!
Interested in: efficient simulation of (v̂_{1:K}), K ≪ N.
Question: a reduced closure model for (v̂_{1:K})?
Space-time reduction: reduce the spatial dimension + increase the time-step size.
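In numbers, the order-of-magnitude count above can be sketched as follows (the constants are illustrative, order of magnitude only):

```python
# Back-of-the-envelope cost of resolving the full model (order of magnitude only).
nu = 1e-4           # viscosity
N = round(1 / nu)   # need N >> 1/nu Fourier modes; take N ~ 1/nu = 10^4
T = 1e4             # time units to simulate
steps = T * N       # total steps, since dt ~ 1/N by the CFL condition
print(f"N = {N}, steps = {steps:.0e}")  # N = 10000, steps = 1e+08
```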

2 / 23

slide-3
SLIDE 3

Motivation: data assimilation with ensemble prediction

x′ = F(x) + U(x, y), resolved scales (v̂_{1:K})
y′ = G(x, y), subgrid scales (v̂_{K+1:N})

Data assimilation: partial noisy observation → prediction
Missing initial conditions → ensemble prediction
Can only afford to resolve x′ = F(x)

3 / 23

slide-4
SLIDE 4

Motivation: data assimilation with ensemble prediction

x′ = F(x) + U(x, y), resolved scales (v̂_{1:K})
y′ = G(x, y), subgrid scales (v̂_{K+1:N})

Data assimilation: partial noisy observation → prediction
Missing initial conditions → ensemble prediction
Can only afford to resolve x′ = F(x)

Objective: develop a closure reduced model of x that
• captures key statistical + dynamical properties
• can be used for ensemble simulations

4 / 23

slide-5
SLIDE 5

Closure modeling, model error UQ, subgrid parametrization

Direct constructions:
• nonlinear Galerkin [Foias, Jolly, Kevrekidis, Titi, ...]
• moment closure [Levermore, Morokoff, ...]
• Mori-Zwanzig formalism: memory → non-Markov process [Chorin, Hald, Kupferman, Stinis, Li, Darve, E, Karniadakis, Venturi, Duraisamy, ...]

Inference/data-driven ROM:
• PCA/POD, DMD, Koopman [Holmes, Lumley, Marsden, Mezic, Willcox, Kutz, Rowley, ...]
• ROM closure [Farhat, Carlberg, Iliescu, Wang, ...]
• stochastic models: SDEs/GLEs, time series models [Chorin/Majda/Ghil groups]
• equation-free [Kevrekidis, ...]
• manifold/machine learning [***...]

5 / 23

slide-6
SLIDE 6

Inference-based model reduction

6 / 23

slide-7
SLIDE 7

x′ = F(x) + U(x, y), y′ = G(x, y).

Data: {x(nh)}_{n=1}^N

KEY: approximate the distribution of the stochastic process
Approximate the discrete-time forward map: x_n = F_n(x_{1:n−1})
• curse of dimensionality
• parametric inference: use the structure of the map
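When the parametric map is linear in its coefficients, the inference step reduces to least squares (the conditional MLE under Gaussian noise). A toy sketch with a hypothetical one-dimensional map; the dynamics, basis, and sample size here are illustrative, not the talk's models:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data from a hypothetical 1-d map: x_n = x_{n-1} + h*sin(x_{n-1}) + noise.
h, T = 0.1, 2000
x = np.zeros(T)
for n in range(1, T):
    x[n] = x[n - 1] + h * np.sin(x[n - 1]) + 0.05 * rng.standard_normal()

# Parametric form x_n - x_{n-1} ~ c1*x_{n-1} + c2*sin(x_{n-1}):
# with Gaussian noise, the conditional MLE of (c1, c2) is ordinary least squares.
Phi = np.column_stack([x[:-1], np.sin(x[:-1])])
y = np.diff(x)
c, *_ = np.linalg.lstsq(Phi, y, rcond=None)
# c[1] should recover the true coefficient h = 0.1; c[0] should be near 0.
```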

7 / 23

slide-8
SLIDE 8

Discrete-time stochastic parametrization: NARMA(p, q) [Chorin-Lu15]

X_n = X_{n−1} + R_h(X_{n−1}) + Z_n,  Z_n = Φ_n + ξ_n,

Φ_n = ∑_{j=1}^p a_j X_{n−j} + ∑_{j=1}^r ∑_{i=1}^s b_{i,j} P_i(X_{n−j})  [auto-regression]
      + ∑_{j=1}^q c_j ξ_{n−j}  [moving average]

• R_h(X_{n−1}) from a numerical scheme for x′ ≈ F(x)
• Φ_n depends on the past; cf. NARMAX in system identification: Z_n = Φ(Z, X) + ξ_n

Tasks:
• Structure derivation: terms and orders (p, r, s, q) in Φ_n
• Parameter estimation: a_j, b_{i,j}, c_j, and σ, by conditional MLE
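As a concrete illustration, a scalar NARMA(p, q) of the above form can be simulated as follows; the one-step scheme R_h and all coefficients are hypothetical placeholders (not fitted values), and the P_i basis terms are omitted for brevity:

```python
import numpy as np

def simulate_narma(T, p=2, q=1, h=0.01, sigma=0.1, seed=0):
    """Simulate a scalar NARMA(p, q): X_n = X_{n-1} + R_h(X_{n-1}) + Z_n,
    Z_n = Phi_n + xi_n, with auto-regressive terms in X and moving-average
    terms in xi (the P_i basis terms are omitted for brevity)."""
    rng = np.random.default_rng(seed)
    Rh = lambda u: h * (u - u**3)        # placeholder one-step scheme for x' = F(x)
    a = np.array([0.05, -0.02])          # AR coefficients a_j (placeholders)
    c = np.array([0.3])                  # MA coefficients c_j (placeholders)
    xi = sigma * rng.standard_normal(T)  # white noise xi_n
    X = np.zeros(T)
    for n in range(max(p, q), T):
        # Phi_n = sum_{j=1}^p a_j X_{n-j} + sum_{j=1}^q c_j xi_{n-j}
        Phi = a @ X[n - p:n][::-1] + c @ xi[n - q:n][::-1]
        X[n] = X[n - 1] + Rh(X[n - 1]) + Phi + xi[n]
    return X

X = simulate_narma(1000)
```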

8 / 23

slide-9
SLIDE 9

Example: the two-layer Lorenz 96 model
• NARMA reproduces statistics: ACF, PDF [Chorin-Lu15PNAS]
• NARMA improves data assimilation [Lu-Tu-Chorin17MWR]

9 / 23

slide-10
SLIDE 10

Model reduction for dissipative PDEs:
nonlinear Galerkin ↓ parametric inference

10 / 23

slide-11
SLIDE 11

Kuramoto-Sivashinsky: v_t = −v_xx − ν v_xxxx − v v_x
Burgers: v_t = ν v_xx − v v_x + f(x, t)

Goal: a closed model for (v̂_{1:K}), K ≪ N.

d/dt v̂_k = −q_k^ν v̂_k + (ik/2) ∑_{|l|≤K, |k−l|≤K} v̂_l v̂_{k−l} + f̂_k(t) + (ik/2) ∑_{|l|>K or |k−l|>K} v̂_l v̂_{k−l}

View (v̂_{1:K}) ∼ x, (v̂_{k>K}) ∼ y:

x′ = F(x) + U(x, y), y′ = G(x, y).

TODO: represent the effects of the high modes on the low modes

11 / 23

slide-12
SLIDE 12

Derivation of a parametric form (KSE): v_t = −v_xx − ν v_xxxx − v v_x

Let v = u + w. In operator form: v_t = Av + B(v), with
du/dt = PAu + PB(u) + [PB(u + w) − PB(u)]
dw/dt = QAw + QB(u + w)

Nonlinear Galerkin: approximate inertial manifold (AIM)¹
dw/dt ≈ 0 ⇒ w ≈ A⁻¹QB(u + w) ⇒ w ≈ ψ(u)
Needs: spectral gap condition; dim(u) > K.

Parametrization with time delay (Lu-Lin-Chorin17): a time series (NARMA) model of the form

u_k^n = R_δ(u_k^{n−1}) + g_k^n + Φ_k^n,

with Φ_k^n := Φ_k^n(u^{n−p:n−1}, g^{n−p:n−1}) in the form

Φ_k^n = ∑_{j=1}^p [ c_{k,j}^v u_k^{n−j} + c_{k,j}^R R_δ(u_k^{n−j}) + c_{k,j}^w ∑_{|k−l|≤K, K<|l|≤2K or |l|≤K, K<|k−l|≤2K} u_l^{n−1} u_{k−l}^{n−j} ]

KEY: high modes = functions of the low modes

¹Foias, Constantin, Temam, Sell, Jolly, Kevrekidis, Titi et al. (88-94)

12 / 23

slide-13
SLIDE 13

Test setting: ν = 3.43, N = 128, dt = 0.001
Reduced model: K = 5, δ = 100 dt; 3 unstable modes, 2 stable modes

Long-term statistics:

[Figures: probability density function of Re v̂_4 and auto-correlation function; Data vs. Truncated system vs. NARMA]

13 / 23

slide-14
SLIDE 14

Prediction

[Figures: a typical forecast of v̂_4 (truncated system vs. NARMA) and RMSE of many forecasts vs. lead time]

Forecast time: the truncated system T ≈ 5; the NARMA system T ≈ 50 (≈ 2 Lyapunov times)

14 / 23

slide-15
SLIDE 15

Derivation of parametric form: stochastic Burgers, v_t = ν v_xx − v v_x + f(x, t)

Let v = u + w. In operator form:
du/dt = PAu + PB(u) + Pf + [PB(u + w) − PB(u)]
dw/dt = QAw + QB(u + w) + Qf

Spectral gap for Burgers? (likely not) → w(t) is not a function of u(t), but a functional of its path.

Integrate instead:
w(t) = e^{QAt} w(0) + ∫_0^t e^{QA(t−s)} QB(u(s) + w(s)) ds
⇒ w^n ≈ c_0 QB(u^n) + c_1 QB(u^{n−1}) + ··· + c_p QB(u^{n−p})

Linear-in-parameter approximation:
PB(u + w) − PB(u) = P[2(uw)_x + (w²)_x]/2 ≈ P[(uw)_x] + noise ≈ ∑_{j=0}^p c_j P[(u^n QB(u^{n−j}))_x] + noise

KEY: high modes = functionals of paths of the low modes
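A minimal sketch of the projections P, Q = I − P and of the quadratic term B(f) = −(f²)_x/2 in Fourier space, as they enter QB(u); the resolution N, cutoff K, and test field are illustrative, not the talk's setup:

```python
import numpy as np

# Illustrative resolution, cutoff, and field (not the talk's setup).
N, K = 128, 3
x = 2 * np.pi * np.arange(N) / N
v = np.sin(2 * x) + 0.3 * np.cos(5 * x)

vhat = np.fft.fft(v) / N                  # Fourier coefficients of v
k = np.fft.fftfreq(N, d=1.0 / N)          # integer wavenumbers
P = (np.abs(k) <= K).astype(float)        # P projects onto the low modes |k| <= K
uhat, what = P * vhat, (1 - P) * vhat     # v = u + w

def B(fhat):
    """Quadratic term B(f) = -(f^2)_x / 2, computed pseudospectrally."""
    f = np.fft.ifft(fhat * N).real
    return -0.5 * 1j * k * np.fft.fft(f * f) / N

QB_u = (1 - P) * B(uhat)   # the high-mode term QB(u) entering w^n ~ sum_j c_j QB(u^{n-j})
```

Here u = sin(2x) alone, so u² excites mode 4 > K and QB(u) is nonzero: the low modes generate energy in the high modes.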

15 / 23

slide-16
SLIDE 16

A time series (NARMA) model of the form

u_k^n = R_δ(u_k^{n−1}) + f_k^n + g_k^n + Φ_k^n,

with Φ_k^n := Φ_k^n(u^{n−p:n−1}, f^{n−p:n−1}) in the form

Φ_k^n = ∑_{j=1}^p [ c_{k,j}^v u_k^{n−j} + c_{k,j}^R R_δ(u_k^{n−j}) + c_{k,j}^w ∑_{|k−l|≤K, K<|l|≤2K or |l|≤K, K<|k−l|≤2K} u_l^{n−1} u_{k−l}^{n−j} ]

16 / 23

slide-17
SLIDE 17

Numerical tests: ν = 0.05, K0 = 4 → random shocks

Full model: N = 128, dt = 0.005
Reduced model: K = 8, δ = 20 dt

[Figure: energy spectrum, wavenumbers 1-8; True vs. Truncated vs. NAR]

17 / 23

slide-18
SLIDE 18

[Figures: cross-correlation functions of the energies, cov(|û_2|², |û_k|²) for k = 1, ..., 8, vs. time lag; True vs. Truncated vs. NAR]

Cross-ACF of energy (4th moments!)

18 / 23

slide-19
SLIDE 19

[Figures: trajectories of modes k = 1, ..., 8 vs. time; True vs. Truncated vs. NAR]

Trajectory prediction in response to force

19 / 23

slide-20
SLIDE 20

Space-time reduction:
• how small can K (spatial dimension) be?
• how large can δ (time-step size) be?
CFL number: |u| dt/dx ∼ |u| N dt ∼ |u| K δ
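Plugging in the numbers from the Burgers test (full model N = 128, dt = 0.005; reduced model K = 8, δ = 20 dt), with |u| taken to be O(1):

```python
# CFL numbers of the full vs. reduced Burgers models (|u| taken to be O(1)).
N, dt = 128, 0.005                # full model
K, delta = 8, 20 * 0.005          # reduced model: 16x fewer modes, 20x larger step
cfl_full = N * dt                 # ~ |u| N dt = 0.64
cfl_reduced = K * delta           # ~ |u| K delta = 0.8
speedup = (N / K) * (delta / dt)  # per-unit-time cost ratio = 320
print(cfl_full, cfl_reduced, speedup)
```

So the reduced model runs at a comparable CFL number while taking roughly 320 times fewer mode-steps per simulated time unit.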

20 / 23

slide-21
SLIDE 21

Summary and ongoing work

x′ = f(x) + U(x, y), y′ = g(x, y). Data: {x(nh)}_{n=1}^N

Data → (inference) "X′ = f(X) + Z(t, ω)" → (discretization + inference) "X_{n+1} = X_n + R_h(X_n) + Z_n" for prediction

Inference-based stochastic model reduction:
• non-intrusive time series (NARMA) models
• parametrize projections on path space: x_n = F_n(x_{1:n−1}) ≈ ∑_k c_k Φ_k(x_{n−p:n−1})
• x_n = F_n(x_{1:n−1}) ≈ E[x_n | x_{1:n−1}] → effective stochastic reduced model

21 / 23

slide-22
SLIDE 22

Open problems:
• general dissipative systems + model selection
• post-processing to predict shocks
• theoretical understanding of the approximation
  ◮ optimal in the basis space in L² (Lin-Lu19)
  ◮ distance between the two stochastic processes?

22 / 23

slide-23
SLIDE 23

References

Data-driven stochastic model reduction

◮ Chorin-Lu: Discrete approach to stochastic parametrization and dimension reduction in nonlinear dynamics. PNAS 112 (2015), no. 32, 9804–9809.

◮ Lu-Lin-Chorin: Comparison of continuous and discrete-time data-based modeling for hypoelliptic systems. CAMCoS 11 (2016), no. 8, 4227–4246.

◮ Lu-Lin-Chorin: Data-based stochastic model reduction for the Kuramoto-Sivashinsky equation. Physica D 340 (2017), 46–57.

◮ Lin-Lu: Data-driven model reduction, Wiener projections, and the Mori-Zwanzig formalism. Preprint (2019).

◮ Lu: Data-driven model reduction for stochastic Burgers equations. In preparation.

Data assimilation

◮ Lu-Tu-Chorin: Accounting for model error from unresolved scales in EnKFs: improving the forecast model. MWR 145 (2017).

Thank you!

FL acknowledges support from JHU, LBL, and NSF.

23 / 23