SLIDE 1

The challenge of bounded, non-Gaussian, non-linear and multi-scale variables

Craig H. Bishop University of Melbourne, Parkville, Victoria, Australia

SLIDE 2

Some Motivation

A major source of model error in weather and climate prediction models is clouds and precipitation.

Until recently, observations of clouds and precipitation have been studiously avoided in DA, so it is not surprising that they are a big source of model error.

The clouds/precipitation DA problem is cluttered with bounded, non-Gaussian, non-linear and multi-scale variables.

SLIDE 3

A cloud DA thought experiment

Imagine an infinite number of replicate Earths:
- same solar/GHG forcing
- each giving an independent realization of today's climate
- same observation types and locations but different random observation errors

SLIDE 4

Visible Geostationary Imagery that we do not yet assimilate

SLIDE 5

The ideal objective of Data Assimilation

Imagine an infinite number of replicate Earths:
- same solar forcing
- each giving an independent realization of today's prior
- same observation types and locations but different random observation errors

Each point represents a (u, u^2) pair from a distinct replicate Earth.

SLIDE 6

The ideal objective of Data Assimilation

Each cross represents a (u, y) pair from a distinct replicate Earth.

The error standard deviation of an unbiased observation of a bounded variable must tend to zero as the truth tends to zero (Bishop, 2019, Q. J. Roy. Met. Soc.).

SLIDE 7

The ideal objective of Data Assimilation

On our Earth, the observation of u^2 gives y = 1.71.

SLIDE 8

Ideal Data Assimilation (DA) in a simple model

Ideal DA gives the posterior pdf of the replicate Earths that have the same y value as our Earth.

The ideal posterior pdf is bi-modal. The bi-modality is caused by non-linearity.

The prior and posterior pdfs of u^2 are like gamma pdfs: highly non-Gaussian.
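The replicate-Earth idea can be sketched as a brute-force sampler: draw many (u, y) pairs and keep only the Earths whose observation is close to ours. The standard normal prior for u, the inverse-gamma error model with relative variance 0.25, the tolerance and the function names are all assumptions made for illustration; they are not taken from the slides.

```python
import random

def sample_inverse_gamma_obs(truth, rel_var, rng):
    """Draw an unbiased observation of a positive truth (assumed error model).

    Shape and scale are chosen so that the mean equals `truth` and
    var/mean^2 equals `rel_var`.
    """
    alpha = 2.0 + 1.0 / rel_var
    beta = truth * (alpha - 1.0)
    return beta / rng.gammavariate(alpha, 1.0)

def replicate_earth_posterior(y_obs=1.71, tol=0.05, n_earths=200_000,
                              rel_var=0.25, seed=0):
    """Keep u from every replicate Earth whose observation of u^2 is near y_obs."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_earths):
        u = rng.gauss(0.0, 1.0)  # assumed prior for u
        y = sample_inverse_gamma_obs(u * u, rel_var, rng)
        if abs(y - y_obs) < tol:
            kept.append(u)
    return kept

posterior_u = replicate_earth_posterior()
frac_negative = sum(u < 0 for u in posterior_u) / len(posterior_u)
frac_near_zero = sum(abs(u) < 0.3 for u in posterior_u) / len(posterior_u)
print(len(posterior_u), round(frac_negative, 2), round(frac_near_zero, 3))
```

Under these assumptions the retained u values split roughly evenly between a positive and a negative mode, with almost nothing near zero, which is the bi-modality described above.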

SLIDE 9

Current DA: 4DVar-No-Outer-Loop (US Navy) and EnKF (DWD)

4DVarNOL and EnKF estimate the posterior pdf using linear regression and perturbed observations:

$$u_i^a = u_i^f + \mathrm{covar}\left(u^f, y^{fT}\right)\left[\mathrm{covar}\left(y^f, y^{fT}\right)\right]^{-1}\left[y + \epsilon_i - \left(u_i^f\right)^2\right]$$

$$\left(u^2\right)_i^a = \left(u^2\right)_i^f + \mathrm{covar}\left(\left(u^2\right)^f, y^{fT}\right)\left[\mathrm{covar}\left(y^f, y^{fT}\right)\right]^{-1}\left[y + \epsilon_i - \left(u_i^f\right)^2\right]$$

The EnKF/4DVarNOL posterior pdf of u is very poor.

The EnKF and 4DVar-No-Outer-Loop (4DVarNOL) posterior pdf of u^2 is highly inaccurate. Also, analyzed u^2 values are not equal to the square of analyzed u values.

Both fail because of their linear, Gaussian assumptions.
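A minimal sketch of why separately regressed u and u^2 analyses disagree. It uses the textbook perturbed-observation EnKF gain, with the observation error variance added in the denominator; the standard normal prior, the additive Gaussian observation error and its variance of 0.5 are illustrative assumptions rather than the settings behind the slide.

```python
import random

def mean(values):
    return sum(values) / len(values)

def covar(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def perturbed_obs_enkf(u_f, y_obs, obs_var, rng):
    """Update u and u^2 separately by linear regression on perturbed innovations."""
    u2_f = [u * u for u in u_f]  # forecast of the observed variable
    denom = covar(u2_f, u2_f) + obs_var
    gain_u = covar(u_f, u2_f) / denom
    gain_u2 = covar(u2_f, u2_f) / denom
    u_a, u2_a = [], []
    for u, u2 in zip(u_f, u2_f):
        innovation = y_obs + rng.gauss(0.0, obs_var ** 0.5) - u2
        u_a.append(u + gain_u * innovation)
        u2_a.append(u2 + gain_u2 * innovation)
    return u_a, u2_a

rng = random.Random(1)
u_f = [rng.gauss(0.0, 1.0) for _ in range(5000)]
u_a, u2_a = perturbed_obs_enkf(u_f, y_obs=1.71, obs_var=0.5, rng=rng)

# Mean absolute gap between the analyzed u^2 and the square of the analyzed u.
inconsistency = mean([abs(u2 - u * u) for u, u2 in zip(u_a, u2_a)])
print(round(inconsistency, 2))
```

Because the prior of u is symmetric, u is almost uncorrelated with u^2, so u barely moves while u^2 is pulled strongly toward the observation. That is the source of the gap.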

SLIDE 10

Current DA: Incremental 4DVar (4DVar-with-outer-loop)

Fails due to the Gaussian assumption and the presence of multiple extrema (non-linearity).

The 4DVar posterior pdf of u is still poor.

The 4DVar posterior pdf of u^2 is highly inaccurate. However, analyzed u^2 values are now equal to the square of analyzed u values.

4DVar uses perturbed observations. Each posterior member is a local extreme value of a Gaussian approximation to the true posterior pdf.

SLIDE 11

Overview of Challenges

1. Observations of bounded non-Gaussian variables
   - EnKFs and 4DVar (with Gaussian assumptions) are inaccurate.
2. Assimilation of an on/off variable like rain when no rain is in the (ensemble) forecast
   - EnKFs, 4DVar and particle filters are very unsatisfactory.
3. Non-linear relationship between model variables and observed variables (cf. the above example and Leonhard Scheck's talk)
   - EnKFs, and VAR with no outer loop or with a grossly inaccurate TLM/adjoint, are inaccurate.
4. Multi-scale error structures
   - Somewhat challenging for all methods.

SLIDE 12

Overview of talk

Improvements to GIGG-EnKF (Bishop, 2016, QJRMS)

1. Background
2. The GIG-EnKF solution to Bayes' theorem for a gamma prior and inverse-gamma likelihood is now precise as K=>infinity; previously it was just approximate. Significance: Rigorous basis for GIG for all moments.
3. Test of standard GIG for a tropical cyclone surface wind energy assimilation problem. Significance: Standard GIG is better than EAKF/EnKF for this problem.
4. Local iterative regression to account for non-linearity in the observation operator. Significance: Greatly reduces analysis error.
5. Rigorous approach for dealing with on-off variables (rain, cloud, fire, etc.) with a gamma-based delta function. Significance: Justifies ignoring dry members when rain is observed.
6. Discussion of ways to include GIGG ideas/tools in other DA schemes.

SLIDE 13

Background: A typical EnKF serial observation assimilation scheme

for j = 1:p;  % where p is the number of observations
  Step 1: Do univariate Gaussian assimilation of $y_j$ to obtain $y_{ji}^a$, $i = 1,2,\ldots,K$.
  Step 2: Find the corresponding analysis ensemble for observations and model variables:
    $$y_{ki}^a = y_{ki}^f + \frac{\mathrm{covar}(y_k^f, y_j^f)}{\mathrm{var}(y_j^f)}\left(y_{ji}^a - y_{ji}^f\right), \quad k = 1,2,\ldots,p;\ i = 1,2,\ldots,K$$
    $$x_{\mu i}^a = x_{\mu i}^f + \frac{\mathrm{covar}(x_\mu^f, y_j^f)}{\mathrm{var}(y_j^f)}\left(y_{ji}^a - y_{ji}^f\right), \quad \mu = 1,2,\ldots,n;\ i = 1,2,\ldots,K$$
  Step 3: Let the analysis ensemble be the prior ensemble for the next observation:
    $$y_{ki}^f = y_{ki}^a, \quad k = 1,2,\ldots,p;\ i = 1,2,\ldots,K$$
    $$x_{\mu i}^f = x_{\mu i}^a, \quad \mu = 1,2,\ldots,n;\ i = 1,2,\ldots,K$$
end
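The serial scheme can be sketched in a few lines of plain Python. Step 1 here uses a deterministic scalar square-root style update, which is one common choice and an assumption of this sketch; the function names and the tiny four-member ensemble are illustrative only.

```python
def mean(values):
    return sum(values) / len(values)

def covar(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def gaussian_obs_update(y_f, y_obs, obs_var):
    """Step 1: scalar Gaussian update of the obs-space ensemble (assumed form)."""
    m, var_f = mean(y_f), covar(y_f, y_f)
    var_a = 1.0 / (1.0 / var_f + 1.0 / obs_var)
    mean_a = var_a * (m / var_f + y_obs / obs_var)
    shrink = (var_a / var_f) ** 0.5
    return [mean_a + shrink * (y - m) for y in y_f]

def serial_enkf(x_f, y_f, y_obs, obs_var):
    """x_f: n rows of K members; y_f: p rows of K prior observation estimates."""
    x = [row[:] for row in x_f]
    y = [row[:] for row in y_f]
    for j in range(len(y)):
        y_j_f = y[j][:]  # frozen copy of the prior for observation j
        y_j_a = gaussian_obs_update(y_j_f, y_obs[j], obs_var[j])
        increments = [a - f for a, f in zip(y_j_a, y_j_f)]
        var_j = covar(y_j_f, y_j_f)
        # Step 2: regress the increments onto every obs and model variable.
        for rows in (y, x):
            for row in rows:
                coeff = covar(row, y_j_f) / var_j
                for i in range(len(row)):
                    row[i] += coeff * increments[i]
        # Step 3: the updated x and y act as the prior for the next observation.
    return x, y

y_f = [[0.5, 1.5, 2.5, 3.5]]
x_f = [[1.0, 3.0, 5.0, 7.0]]  # exactly twice the observed variable
x_a, y_a = serial_enkf(x_f, y_f, y_obs=[1.0], obs_var=[1.0])
print(mean(y_a[0]), x_a[0], y_a[0])
```

With a purely linear relationship between model variable and observed variable, the regression keeps the two consistent, which is the case the non-linear examples later in the talk break.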

SLIDE 14

Background: Gaussian pdfs versus bounded pdfs

Haboob, Iraq, 27 April 2005.
Extratropical Cyclone Xynthia, 2010: 63 fatalities, losses of $2-4B, linked to an H2O plume.

SLIDE 15

Background: Gaussian anamorphosis problematic

Gaussian anamorphosis transforms a non-Gaussian variable to a Gaussian variable via a non-linear function; e.g. taking the log of a log-normal variable makes it Gaussian.

In observation space, this is problematic because a non-linear transform of an error-prone unbiased observation will create a biased observation of the non-linear function of the truth.

In model space, this is problematic because, while a good DA scheme will deliver a posterior ensemble whose variance is equal to the mean square error (mse) of the mean in the transformed space, in general the analysis ensemble variance will not equal the mse once the transformation is undone.

The bar on the right shows that imperfect Gaussian anamorphosis via the non-linear transform approach led to an overdispersive ensemble in the case considered in Bishop (2019, Q. J. Roy. Met. Soc.).
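The observation-space bias can be checked numerically. The sketch below draws unbiased inverse-gamma observations of a fixed truth and compares the mean of log(y) with log(truth); the truth value, the inverse-gamma shape of 6 and the sample size are assumptions chosen for illustration.

```python
import math
import random

def log_transform_bias(truth=2.0, alpha=6.0, n=100_000, seed=0):
    """Return (bias of y, bias of log y) for unbiased inverse-gamma observations."""
    rng = random.Random(seed)
    beta = truth * (alpha - 1.0)  # makes E[y] equal to truth
    total_y = 0.0
    total_log_y = 0.0
    for _ in range(n):
        y = beta / rng.gammavariate(alpha, 1.0)
        total_y += y
        total_log_y += math.log(y)
    return total_y / n - truth, total_log_y / n - math.log(truth)

bias_y, bias_log_y = log_transform_bias()
print(round(bias_y, 3), round(bias_log_y, 3))
```

The raw observations average to the truth, while their logs average to a value below log(truth), as Jensen's inequality requires for a concave transform.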

SLIDE 16

Background: Gaussian anamorphosis problematic

Non-linear transformation of observations and/or model variables unsatisfactory

SLIDE 17

A prior Gamma pdf

[Figure: gamma prior pdf, $\rho_{prior}(c)$, with the prior mean marked.]

SLIDE 18

Inverse-Gamma pdf of obs given truth

[Figure: gamma prior pdf, $\rho_{prior}(c)$, with the prior mean and the observed value y marked.]

$L(y_{possible} \mid c = y)$ is the inverse-gamma pdf of the observations that could occur if the truth were equal to the vertical "dot-dash" line.

SLIDE 19

The likelihood function

[Figure: gamma prior pdf, $\rho_{prior}(c)$, with the prior mean and the observed value y marked.]

$L(y \mid c)$, with y fixed at the observed value.

SLIDE 20

The posterior pdf is then a gamma

$$\rho_{post}(c \mid y) = \frac{L(y \mid c)\,\rho_{prior}(c)}{\int_0^{\infty} L(y \mid c)\,\rho_{prior}(c)\,dc} \quad \text{(a gamma posterior)}$$

[Figure: gamma prior pdf $\rho_{prior}(c)$, likelihood $L(y \mid c)$, the observed value y, the prior mean and the posterior mean.]
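A numerical check of the conjugacy can be sketched as follows. The parametrization is an assumption: the prior is Gamma(shape k, scale theta), and the observation given truth c is inverse-gamma with shape alpha and mean c (unbiased). Under that assumption the posterior is Gamma with shape k + alpha and rate 1/theta + (alpha - 1)/y. The specific numbers are illustrative.

```python
import math

def gamma_prior(c, k, theta):
    """Gamma pdf with shape k and scale theta."""
    return c ** (k - 1.0) * math.exp(-c / theta) / (math.gamma(k) * theta ** k)

def inv_gamma_likelihood(y, c, alpha):
    """pdf of observation y given truth c; the mean of y equals c."""
    beta = c * (alpha - 1.0)
    return (beta ** alpha / math.gamma(alpha)
            * y ** (-alpha - 1.0) * math.exp(-beta / y))

def posterior_mean_numeric(y, k, theta, alpha, c_max=40.0, n=40_000):
    """Posterior mean from Bayes' theorem by midpoint-rule quadrature."""
    dc = c_max / n
    num = den = 0.0
    for i in range(n):
        c = (i + 0.5) * dc
        weight = inv_gamma_likelihood(y, c, alpha) * gamma_prior(c, k, theta)
        num += c * weight
        den += weight
    return num / den

def posterior_mean_gamma(y, k, theta, alpha):
    """Mean of Gamma(shape k + alpha, rate 1/theta + (alpha - 1)/y)."""
    return (k + alpha) / (1.0 / theta + (alpha - 1.0) / y)

numeric = posterior_mean_numeric(1.71, k=4.0, theta=0.5, alpha=6.0)
analytic = posterior_mean_gamma(1.71, k=4.0, theta=0.5, alpha=6.0)
print(round(numeric, 4), round(analytic, 4))
```

The agreement between the two values is what makes the update closed-form: no quadrature is needed once the prior and likelihood parameters are known.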

SLIDE 21

Equation for posterior mean

[Equation: the posterior mean $c_{post}$ expressed in terms of the prior mean $c_{prior}$, the observed value $y$, and $\hat{P}_r$ and $\hat{R}_r$.]

Here $\hat{P}_r$ and $\hat{R}_r$ are the type 2 relative prior and observation variances; e.g.

$$\hat{P}_r = \frac{\mathrm{var}(c_{prior})}{\overline{c}_{prior}^{\,2} + \mathrm{var}(c_{prior})} = \frac{\sigma^2}{\mu^2 + \sigma^2}$$

The posterior mean equation has a Kalman-like gain, but everything else is inverted! (See Bishop, 2016, QJRMS and Bishop, 2018 for details of the GIG method.)
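One way to see where an inverted, Kalman-like form can come from is the short derivation below. It is a sketch under stated assumptions, not a transcription of the slide's equation: the prior is Gamma with shape $k$ and scale $\theta$, the observation given truth $c$ is inverse-gamma with shape $\alpha$ and mean $c$, and $\hat{c}_{prior}$ is a symbol introduced here for the rescaled prior mean.

```latex
% Assumed prior and likelihood (as functions of c):
%   \rho_{prior}(c) \propto c^{k-1} e^{-c/\theta}, \qquad
%   L(y \mid c) \propto c^{\alpha} e^{-(\alpha-1)c/y}
\begin{align}
\rho_{post}(c \mid y) &\propto c^{\,k+\alpha-1}
   \exp\!\left[-c\left(\frac{1}{\theta} + \frac{\alpha-1}{y}\right)\right]
   && \text{(a gamma pdf)} \\
c_{post} &= \frac{k+\alpha}{\dfrac{1}{\theta} + \dfrac{\alpha-1}{y}}
   && \text{(posterior mean)} \\
\hat{P}_r &= \frac{1}{k+1}, \qquad \hat{R}_r = \frac{1}{\alpha-1}
   && \text{(type 2 relative variances)} \\
\hat{c}_{prior} &\equiv (k+1)\,\theta = \frac{\overline{c}_{prior}}{1-\hat{P}_r} \\
c_{post}^{-1} &= \hat{c}_{prior}^{-1}
   + \frac{\hat{P}_r}{\hat{P}_r+\hat{R}_r}
     \left(y^{-1} - \hat{c}_{prior}^{-1}\right)
\end{align}
```

The last line follows by substituting $k + \alpha = 1/\hat{P}_r + 1/\hat{R}_r$ and $1/\theta = 1/(\hat{P}_r\,\hat{c}_{prior})$ into the posterior mean. The gain has the familiar P/(P+R) shape, but it acts on inverses of the mean and the observation.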

SLIDE 22

Improvement of GIG for high relative variances

Problem: Bishop's (2016, QJRMS) formulation (thick grey curve) departs from the true posterior (thin solid line) when the relative variance of the prior (dashed curve) and likelihood (dot-dash curve) is large.

Solution: Use the summation theorem for gamma pdfs to ensure that GIG-EnKF samples the true posterior (Bishop 2019, hopefully).

[Figure: panels (a) Bishop, 2016 and (b) Bishop, 2019, each showing the prior, the likelihood and the ob.]

[Equations: the GIG-EnKF analysis members $y_i^a$ in terms of the forecast members $y_i^f$, the observation $y$, a gamma-distributed random draw, and the relative variances $P_r$ and $R_r$.]

SLIDE 23

23

Improvement of GIG for high relative variances

GIG-EnKF now rigorous for all posterior moments

SLIDE 24

Background: The GIGG-EnKF serial observation assimilation scheme with linear regression (EAKF/EnSRF/EnKF)

for j = 1:p;  % where p is the number of observations
  Step 1: Decide whether the forecast and observation uncertainty associated with $y_j$ is best approximated by GIG-delta, GIG, IGG or Gaussian assumptions.
  Step 2: if (GIG-delta) then use ... to obtain $y_{ji}^a$, $i = 1,2,\ldots,K$;
          else if (GIG) then use ... to obtain $y_{ji}^a$, $i = 1,2,\ldots,K$;
          else if (IGG) then use ... to obtain $y_{ji}^a$, $i = 1,2,\ldots,K$;
          else if (Gaussian) then use ... to obtain $y_{ji}^a$, $i = 1,2,\ldots,K$
  Step 3: Find the corresponding analysis ensemble for observations and model variables:
    $$y_{ki}^a = y_{ki}^f + \frac{\mathrm{covar}(y_k^f, y_j^f)}{\mathrm{var}(y_j^f)}\left(y_{ji}^a - y_{ji}^f\right), \quad k = 1,2,\ldots,p;\ i = 1,2,\ldots,K$$
    $$x_{\mu i}^a = x_{\mu i}^f + \frac{\mathrm{covar}(x_\mu^f, y_j^f)}{\mathrm{var}(y_j^f)}\left(y_{ji}^a - y_{ji}^f\right), \quad \mu = 1,2,\ldots,n;\ i = 1,2,\ldots,K$$
  Step 4: Let the analysis ensemble be the prior ensemble for the next observation:
    $$y_{ki}^f = y_{ki}^a, \quad k = 1,2,\ldots,p;\ i = 1,2,\ldots,K$$
    $$x_{\mu i}^f = x_{\mu i}^a, \quad \mu = 1,2,\ldots,n;\ i = 1,2,\ldots,K$$
end

SLIDE 26

Simple DA testbed for TC like surface winds

A random draw from a TC relevant pdf

SLIDE 27

Another random draw from the simple testbed’s multi-scale pdf

Simple DA testbed for TC like surface winds

SLIDE 28

Simple DA testbed for TC like surface winds

Model states are defined by a random, multi-scale, TC-like (u, v) wind field.
Let observations be non-linear functions of u and v; e.g. kinetic energy, KE = (u^2 + v^2)/2, tanh(KE) or Heaviside(KE - constant).
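The three observation operators named above are simple to write down. The function names are illustrative, and the Heaviside convention at exactly zero (returning 0) is an assumption of this sketch.

```python
import math

def kinetic_energy(u, v):
    """KE = (u^2 + v^2) / 2."""
    return 0.5 * (u * u + v * v)

def saturating_ke(u, v):
    """tanh(KE): a bounded, saturating function of the wind."""
    return math.tanh(kinetic_energy(u, v))

def ke_exceeds(u, v, constant):
    """Heaviside(KE - constant): an on/off observation."""
    return 1.0 if kinetic_energy(u, v) > constant else 0.0

# Non-linearity in one line: the KE of the mean wind is not the mean KE.
members = [(1.0, 0.0), (-1.0, 0.0)]
mean_u = sum(u for u, _ in members) / len(members)
mean_v = sum(v for _, v in members) / len(members)
ke_of_mean = kinetic_energy(mean_u, mean_v)
mean_of_ke = sum(kinetic_energy(u, v) for u, v in members) / len(members)
print(ke_of_mean, mean_of_ke)
```

The two-member example is the smallest case in which a wind analysis and a KE analysis can disagree.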

SLIDE 29

Prior mean, obs, truth and GIG analysis using a 3000 member ensemble (no localization required).

Observed variable is KE = 0.5(u^2 + v^2). The distribution of random observations given the truth is an inverse-gamma pdf with a relative variance of 0.25.

[Panel label: GIG analysis mean]

SLIDE 30

Does the GIG variation on the EAKF improve the Kinetic Energy analysis?

The GIG-EnKF outperforms the EAKF under all metrics in all 8 independent sets of 50 trials. The only difference between the EAKF and GIG code is the univariate ensemble update. The linear regression code is identical.

SLIDE 31

Overview

Aim: Outline some improvements to GIGG-EnKF (Bishop, 2016, QJRMS)

1. Background
2. The GIG-EnKF solution to Bayes' theorem for a gamma prior and inverse-gamma likelihood is now precise as K=>infinity; previously it was just approximate. Significance: Rigorous basis for GIG for all moments.
3. Test of standard GIG for a tropical cyclone surface wind energy assimilation problem. Significance: Standard GIG is better than EAKF/EnKF for this problem.
4. Local iterative regression to account for non-linearity in the observation operator. Significance: Greatly reduces analysis error.
5. Rigorous approach for dealing with on-off variables (rain, cloud, fire, etc.) with a gamma-based delta function. Significance: Justifies ignoring dry members when rain is observed.
6. Discussion of attempts to include GIGG ideas/tools in a global solution procedure like 4DVar. Significance: The most accurate data assimilation schemes may need to support both global and local solvers.

SLIDE 33

Treatment of non-linearity of Kinetic Energy ob operator.

Linear regression from ob to model space yields inconsistencies!

Observed variable is KE = 0.5(u^2 + v^2). Standard GIG/EAKF uses linear regression to give an inconsistent analysis of (u^a, v^a) and (KE)^a. The bottom left panel gives (KE)^a. The bottom right panel gives

$$KE(u^a, v^a) = \frac{1}{2}\left[(u^a)^2 + (v^a)^2\right] \neq (KE)^a,$$

which is far less accurate than (KE)^a.

(Similar to Leonhard Scheck's finding for SEVIRI reflectance DA with LETKF.)

SLIDE 35

Background: The GIGG-EnKF serial observation assimilation scheme

(EAKF/EnSRF/EnKF)

Need to replace the linear regression step with something better!

SLIDE 36

New method to account for non-linearity in ob-operator: The observation to model space consistency iteration

3.1: Define the minimum list of variables, $\mathbf{x}^{h_j}$, required to predict the ob $y_j$ that was just assimilated; for example, in the KE example $\mathbf{x}_i^{h_j} = [u_j, v_j]_i^T$, where $u_j$, $v_j$ are the wind components required to predict the KE of the model state at $j$.
3.2: Find the usual GIG-EnKF model-space analysis, $(\mathbf{x}_i^{h_j})^a_{lin}$, using linear regression.
3.3: Starting with $\mathbf{x}_i^{h_j} = (\mathbf{x}_i^{h_j})^a_{lin}$, minimize

$$J = \frac{1}{2}\left\{y_{ji}^a - h_j\left[\mathbf{x}_i^{h_j}\right]\right\}^2$$

using an ensemble-space constrained Newton iteration on the gradient to obtain $(\mathbf{x}_i^{h_j})^a_{local}$ (the minimizer).
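The idea of step 3.3 can be illustrated in one dimension, where the observed variable is u^2. The sketch below swaps in a plain Gauss-Newton iteration for the ensemble-space constrained Newton iteration named on the slide, so it is a simplified stand-in rather than the method itself; the function name and starting values are hypothetical.

```python
def consistency_iteration(u_lin, y_a, n_iter=20, tol=1e-12):
    """Minimize J(u) = 0.5 * (y_a - u^2)^2 starting from u_lin.

    u_lin: member value from the linear-regression analysis (the first guess).
    y_a:   the analyzed observed variable for the same member (assumed > 0).
    """
    u = u_lin
    for _ in range(n_iter):
        residual = y_a - u * u
        if abs(residual) < tol:
            break
        slope = 2.0 * u  # derivative of the observation operator h(u) = u^2
        if slope == 0.0:
            break  # gradient vanishes; leave the member unchanged
        u += residual / slope
    return u

positive_member = consistency_iteration(0.4, 1.71)
negative_member = consistency_iteration(-2.0, 1.71)
print(positive_member, negative_member)
```

Each member converges to the root on its own side of zero, so the sign structure of the first guess is preserved and a bi-modal set of u values stays consistent with the analyzed u^2.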

SLIDE 37

The observation to model space consistency iteration. Test in 1D model in which only u^2 is observed

- Solid black line: prior pdf of the zonal wind (u) field.
- u^2 is observed at the 25th, 50th or 75th percentile of the prior pdf of obs (left to right).
- Dashed black line: true posterior pdf of the u field.
- Solid mauve line: GIG posterior pdf with linear regression.
- Solid cyan line: GIG posterior pdf with the non-linear observation to model space consistency iteration.

SLIDE 38

The observation to model space consistency iteration. Test in cloud model in which only rain is observed

Posselt and Bishop (2018, in review, QJRMS)

Panels: true posterior from MCMC; GIG with linear regression; GIG with observation to model space consistency iteration.
N_0r = slope intercept of the rain particle size distribution
Q_c0 = cloud to rain autoconversion parameter

SLIDE 39

New method to account for non-linearity in ob-operator: The observation to model space consistency iteration

3.4: Update the rest of the model state using $(\mathbf{x}_i^{h_j})^a_{local}$ and multivariate regression:

$$x_{\mu i}^a = x_{\mu i}^f + \mathrm{covar}\left(x_\mu^f, \mathbf{x}^{h_j f}\right)\left\{\mathrm{covar}\left(\mathbf{x}^{h_j f}, \mathbf{x}^{h_j f}\right)\right\}^{-1}\left[(\mathbf{x}_i^{h_j})^a_{local} - (\mathbf{x}_i^{h_j})^f\right], \quad \mu = 1,2,\ldots,n;\ i = 1,2,\ldots,K$$

$$(y_{ki}^a)_{lin} = y_{ki}^f + \frac{\mathrm{covar}(y_k^f, y_j^f)}{\mathrm{var}(y_j^f)}\left(y_{ji}^a - y_{ji}^f\right), \quad k = 1,2,\ldots,p;\ i = 1,2,\ldots,K$$

[Equation: piecewise definition of $y_{ki}^a$ for $k = 1,2,\ldots,p$; $i = 1,2,\ldots,K$, combining $h_k$ applied to the local analysis with $(y_{ki}^a)_{lin}$.]

SLIDE 40

The observation to model space consistency iteration. Test in 2D model

Observed variable is KE = 0.5(u^2 + v^2). Linear regression plus the consistency iteration improves the consistency of (u^a, v^a) and (KE)^a.

The bottom left panel gives (KE)^a. The bottom right panel gives

$$KE(u^a, v^a) = \frac{1}{2}\left[(u^a)^2 + (v^a)^2\right] \neq (KE)^a.$$

SLIDE 41

The observation to model space consistency iteration. Test in 2D model

Accuracy of direct and derived KE analyses are now the same

SLIDE 42

The observation to model space consistency iteration. 28 independent tests in 2D model

The ob-to-model space consistency iteration reduces mse in the (u,v) field by 75%; i.e. the standard deviation of analysis error is halved.

The chance of getting 28 consecutive wins (as above) by pure chance is 1 in 2^28, or about 1 in 2.7x10^8.
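The two numbers on this slide can be checked with a few lines of arithmetic, assuming each independent test is a fair coin flip under the null hypothesis that neither method is better.

```python
import math

mse_reduction = 0.75
# Remaining mse is 25% of the original, so the error standard deviation
# scales by sqrt(0.25).
std_ratio = math.sqrt(1.0 - mse_reduction)

n_wins = 28
# Probability of n_wins consecutive wins by chance is 1 in 2**n_wins.
odds_against = 2 ** n_wins

print(std_ratio, odds_against)
```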

SLIDE 44

Problem: No rain in ensemble forecast but rain is observed

EnKFs, 4DVAR, particle filters, etc. are all highly unsatisfactory in this case.

How would Bayes' theorem be used in this case?
Might an adaptation of the GIGG filter better deal with this problem?

SLIDE 45

No rain forecast as a gamma function limit

[Figure: prior/forecast pdf (dashed black lines) with mean $\overline{y^f}$, observation likelihood pdf, posterior/analysis pdf (dashed blue lines), and the observed value $y^o$.]

Prior/forecast pdf: $P_r = \mathrm{var}(y^f)/(\overline{y^f})^2$ with $P_r^{1/2} = 0.707$.
Observation likelihood pdf: $\tilde{R}_r = \mathrm{var}(y \mid y^t)/[(y^t)^2 + \mathrm{var}(y \mid y^t)] = 1/4$.

SLIDE 46

No rain forecast as a gamma function limit

[Figure: prior/forecast pdf (dashed black lines) with mean $\overline{y^f}$, observation likelihood pdf, posterior/analysis pdf (dashed blue lines), and the observed value $y^o$.]

Prior/forecast pdf: $P_r = \mathrm{var}(y^f)/(\overline{y^f})^2$ with $P_r^{1/2} = 1.41$.
Observation likelihood pdf: $\tilde{R}_r = \mathrm{var}(y \mid y^t)/[(y^t)^2 + \mathrm{var}(y \mid y^t)] = 1/4$.

SLIDE 47

No rain forecast as a gamma function limit

[Figure: prior/forecast pdf (dashed black lines) with mean $\overline{y^f}$, observation likelihood pdf, posterior/analysis pdf (dashed blue lines), and the observed value $y^o$.]

Prior/forecast pdf: $P_r = \mathrm{var}(y^f)/(\overline{y^f})^2$ with $P_r^{1/2} = 11.3$.
Observation likelihood pdf: $\tilde{R}_r = \mathrm{var}(y \mid y^t)/[(y^t)^2 + \mathrm{var}(y \mid y^t)] = 1/4$.

SLIDE 48

No rain forecast as a gamma function limit

[Figure: prior/forecast pdf (dashed black lines) with mean $\overline{y^f}$, observation likelihood pdf, posterior/analysis pdf (dashed blue lines), and the observed value $y^o$.]

Prior/forecast pdf: $P_r = \mathrm{var}(y^f)/(\overline{y^f})^2$ with $P_r^{1/2} = 181$.
Observation likelihood pdf: $\tilde{R}_r = \mathrm{var}(y \mid y^t)/[(y^t)^2 + \mathrm{var}(y \mid y^t)] = 1/4$.

SLIDE 49

No rain forecast as a gamma function limit: a gamma delta function

[Figure: prior/forecast pdf (dashed black lines) with mean $\overline{y^f}$, observation likelihood pdf, posterior/analysis pdf (dashed blue lines), and the observed value $y^o$.]

Prior/forecast pdf: $P_r = \mathrm{var}(y^f)/(\overline{y^f})^2$ with $P_r^{1/2} = 32{,}768$.
Observation likelihood pdf: $\tilde{R}_r = \mathrm{var}(y \mid y^t)/[(y^t)^2 + \mathrm{var}(y \mid y^t)] = 1/4$.

SLIDE 50

No rain forecast as a gamma function limit: a gamma delta function

[Figure: prior/forecast pdf (dashed black lines) with mean $\overline{y^f}$, observation likelihood pdf, posterior/analysis pdf (dashed blue lines), and the observed value $y^o$.]

Prior/forecast pdf: $P_r = \mathrm{var}(y^f)/(\overline{y^f})^2$ with $P_r^{1/2} = 32{,}768$.
Observation likelihood pdf: $\tilde{R}_r = \mathrm{var}(y \mid y^t)/[(y^t)^2 + \mathrm{var}(y \mid y^t)] = 1/4$.

Note that (i) the posterior mode is equal to the observed value, and (ii) the posterior mean is equal to the mode of the ob-likelihood function.

SLIDE 51

No rain forecast as a gamma function limit: a gamma delta function

[Figure: prior/forecast pdf (dashed black lines) with mean $\overline{y^f}$, observation likelihood pdf, posterior/analysis pdf (dashed blue lines), and the observed value $y^o$.]

Prior/forecast pdf: $P_r = \mathrm{var}(y^f)/(\overline{y^f})^2$ with $P_r^{1/2} = 32{,}768$.
Observation likelihood pdf: $\tilde{R}_r = \mathrm{var}(y \mid y^t)/[(y^t)^2 + \mathrm{var}(y \mid y^t)] = 1/4$.

Using a gamma delta function to represent the zero-rain prior pdf makes Bayes' theorem give a plausible posterior pdf.
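The limiting behaviour can be sketched with the closed-form gamma posterior. The assumptions here are loud ones: the prior is a gamma pdf whose mean is held fixed while its relative variance grows, the likelihood is inverse-gamma with type 2 relative variance 1/4 and mean equal to the truth, and the prior mean of 0.5 and observed value of 2.0 are hypothetical numbers, not read from the figures.

```python
def gig_posterior_mean_and_mode(prior_mean, prior_rel_var, y_obs,
                                obs_rel_var_type2=0.25):
    """Mean and mode of the gamma posterior for a gamma prior and
    inverse-gamma likelihood (assumed parametrization)."""
    k = 1.0 / prior_rel_var               # prior gamma shape
    theta = prior_mean * prior_rel_var    # prior gamma scale
    alpha = 1.0 / obs_rel_var_type2 + 1.0 # inverse-gamma shape
    shape = k + alpha
    rate = 1.0 / theta + (alpha - 1.0) / y_obs
    return shape / rate, (shape - 1.0) / rate

y_obs = 2.0
results = []
for sqrt_pr in (0.707, 1.41, 11.3, 181.0, 32768.0):
    mean_post, mode_post = gig_posterior_mean_and_mode(0.5, sqrt_pr ** 2, y_obs)
    results.append((sqrt_pr, mean_post, mode_post))
    print(sqrt_pr, round(mean_post, 4), round(mode_post, 4))

limit_mean, limit_mode = results[-1][1], results[-1][2]
```

In the large relative variance limit the posterior mode approaches the observed value and the posterior mean approaches alpha * y / (alpha - 1), the mode of the likelihood viewed as a function of the truth. Under these assumptions that reproduces notes (i) and (ii).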

SLIDE 52

gamma-delta + gamma pdf for case when some members dry and some wet

$$\rho_{precip}^{prior}(y_j) = w\,\rho_{dry}^{prior}(y_j) + (1 - w)\,\rho_{wet}^{prior}(y_j)$$

[Figure: $\rho_{dry}^{prior}(y_j)$, $\rho_{wet}^{prior}(y_j)$, the observation likelihood pdf with $\tilde{R}_r = \mathrm{var}(y \mid y^t)/[(y^t)^2 + \mathrm{var}(y \mid y^t)] = 1/4$, the posterior/analysis pdf (dashed blue lines), and the observed value $y^o$.]

In this case, only the mean and variance of the wet members determine the mean and variance of the posterior. Dry members ignored!
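One reading of "dry members ignored" is sketched below: fit a gamma pdf to the wet members by matching their mean and variance, then apply the closed-form gamma posterior. The moment-matching fit, the inverse-gamma likelihood with type 2 relative variance 1/4, the function name and the toy ensembles are all assumptions made for illustration.

```python
def wet_member_posterior(ensemble, y_obs, obs_rel_var_type2=0.25):
    """Posterior mean and variance using only the wet (non-zero) members."""
    wet = [m for m in ensemble if m > 0.0]
    n = len(wet)
    wet_mean = sum(wet) / n
    wet_var = sum((m - wet_mean) ** 2 for m in wet) / (n - 1)
    k = wet_mean ** 2 / wet_var   # gamma shape from moment matching
    theta = wet_var / wet_mean    # gamma scale from moment matching
    alpha = 1.0 / obs_rel_var_type2 + 1.0
    shape = k + alpha
    rate = 1.0 / theta + (alpha - 1.0) / y_obs
    return shape / rate, shape / rate ** 2

few_dry = wet_member_posterior([0.0, 0.0, 1.0, 2.0, 3.0, 2.0], y_obs=3.0)
many_dry = wet_member_posterior([1.0, 2.0, 3.0, 2.0] + [0.0] * 5, y_obs=3.0)
print(few_dry, many_dry)
```

The result does not change with the number of dry members, which is the property the slide highlights.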

SLIDE 53

Multi-variate GIGG-Delta filter also fits seamlessly in DART

SLIDE 54

GIGG-delta for coupled model DA: An idealized coupled model

Evolution is based on the Lorenz 96 model plus relaxation to adjacent levels. The blue line variable is analogous to zonal wind/current. Green bars give rainfall, which only occurs when upper-level divergence exceeds a small threshold. Rain magnitude is proportional to the product of the square of the surface wind's deviation from its climatological mean and the square root of the upper-level divergence. Rain increases the flux of momentum from upper levels to lower levels.
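Two ingredients of the description can be sketched directly: the standard single-level Lorenz 96 tendency and a rain diagnostic of the stated form. The forcing value, threshold, proportionality coefficient and function names are hypothetical, and the relaxation between levels and the momentum-flux feedback are not modelled here.

```python
def lorenz96_tendency(x, forcing=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F on a periodic ring."""
    n = len(x)
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
            for i in range(n)]

def rain_rate(surface_wind, clim_mean, upper_divergence,
              threshold=0.1, coeff=1.0):
    """Rain is zero unless upper-level divergence exceeds the threshold.

    Otherwise it is proportional to (wind deviation)^2 * sqrt(divergence).
    A non-negative threshold is assumed so the square root is defined.
    """
    if upper_divergence <= threshold:
        return 0.0
    return coeff * (surface_wind - clim_mean) ** 2 * upper_divergence ** 0.5

print(lorenz96_tendency([8.0] * 5))
print(rain_rate(3.0, 1.0, 4.0), rain_rate(3.0, 1.0, 0.05))
```

The on/off switch in `rain_rate` is what makes rain an episodic, bounded variable in this kind of testbed.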

SLIDE 55

GIGG-delta vs EnKF mse in case when EnKF has stabilizing inflation factor

7 independent 30 day DA cycles

Mse for wind/current: GIGG-delta (+ signs) and EnKF (x signs).
Mse for rain: GIGG-delta (+ signs) and EnKF (x signs).

GIGG-delta is much better than EnKF in this "stable-system versus stable-system" comparison.

Forecasts

SLIDE 57

Combining GIGG-EnKF with ob-to-model-space iteration with other methods

1. Use GIGG-EnKF with ob-to-model-space iteration to assimilate the non-Gaussian variables in the observation volume, then assimilate the rest using LETKF/VAR or another favorite quasi-Gaussian method.
2. To improve non-linear trajectories, use the 4D smoother form of GIGG-EnKF, then use Geir Evensen's suggestion of, say, tripling ob and relative ob error variances and doing step 1 three times. On the third iteration of this, the prior may look Gaussian and pure 4D-Var or LETKF may be appropriate.
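The logic behind tripling the observation error variance and assimilating three times can be illustrated in the scalar linear-Gaussian case, where the two routes give identical results. This is only a Gaussian analogue and an assumption of the sketch; it says nothing about how the relative variances are treated in the GIGG case. The numbers are hypothetical.

```python
def kalman_update(mean, var, y_obs, obs_var):
    """Scalar Kalman update; returns the posterior mean and variance."""
    gain = var / (var + obs_var)
    return mean + gain * (y_obs - mean), (1.0 - gain) * var

prior_mean, prior_var, y_obs, obs_var = 1.0, 2.0, 3.0, 0.5

# One pass with the true observation error variance.
single_pass = kalman_update(prior_mean, prior_var, y_obs, obs_var)

# Three passes with the observation error variance tripled.
m, v = prior_mean, prior_var
for _ in range(3):
    m, v = kalman_update(m, v, y_obs, 3.0 * obs_var)
three_pass = (m, v)

print(single_pass, three_pass)
```

In a non-linear problem the three smaller steps let the ensemble be re-linearized between passes, which is where the approach is expected to help.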

SLIDE 58

Multi-variate "all-at-once" assimilation of parameterized non-Gaussian pdfs not so easy

To do this, one first needs to find a compelling multi-variate statistical model of the moments of the prior.

... not so easy in the multi-variate case ...
Multi-variate Wishart, Gaussian and log-normal all fail.

SLIDE 59

Mean of powers of relative perturbation [(x-<x>)/<x>]

The Wishart distribution, and other such distributions based on sample covariances of samples from normal pdfs, produce spatially uniform relative variances of 2/(N-1), where N is the sample size. Hence, they are incapable of representing the variation in relative variance seen here.

A literature search failed to find a gamma-like multi-variate pdf capable of producing the mean of powers of relative perturbations shown here.

[Figure annotation: variance almost twice the size of the mean here.]
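The 2/(N-1) figure can be checked by Monte Carlo in the scalar case: draw many samples of size N from a normal pdf, compute each sample variance, and measure the relative variance of those sample variances. The sample size, trial count and function name are illustrative choices.

```python
import random

def relative_variance_of_sample_variance(n_sample, n_trials=50_000, seed=0):
    """Relative variance (var / mean^2) of the sample variance of normal draws."""
    rng = random.Random(seed)
    sample_variances = []
    for _ in range(n_trials):
        draws = [rng.gauss(0.0, 1.0) for _ in range(n_sample)]
        m = sum(draws) / n_sample
        sample_variances.append(
            sum((d - m) ** 2 for d in draws) / (n_sample - 1))
    mean_s2 = sum(sample_variances) / n_trials
    var_s2 = sum((s - mean_s2) ** 2 for s in sample_variances) / (n_trials - 1)
    return var_s2 / mean_s2 ** 2

estimate = relative_variance_of_sample_variance(n_sample=5)
expected = 2.0 / (5 - 1)
print(round(estimate, 3), expected)
```

The result depends only on N, not on the underlying variance, which is why such distributions give a spatially uniform relative variance.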

SLIDE 60

Multi-variate "all-at-once" assimilation of parameterized non-Gaussian pdfs not so easy

Perturbed obs 4DVar or EnKF plus a non-linear transformation to "normalish" variables results in an ensemble whose variance does not equal the mse after the inverse transform is performed (Bishop 2019, Q. J. Roy. Met. Soc.). However, ease of implementation in operational systems provides a compelling case for testing/tuning/engineering this approach.

GIGG-EnKF with ob-to-model-space iteration avoids these issues.

SLIDE 61

Conclusions

1. The GIG-EnKF solution to Bayes' theorem for a gamma prior and inverse-gamma likelihood is now precise for all posterior moments as K=>infinity; previously it was just approximate.
2. In idealized TC surface wind energy assimilation experiments, GIG soundly beat EAKF even using linear regression. The newly introduced ob-space to model-space non-linear regression iteration:
   i. Gave consistent model and ob space analyses.
   ii. Greatly reduced mean square error (mse).
   iii. Gave an analysis ensemble variance approximately equal to mse.
3. In 1D experiments, the new local ob-space to model-space iteration procedure gave fairly accurate multi-modal posteriors.
4. GIG-EnKF provides a rigorous approach for dealing with on-off variables (rain, cloud, fire, etc.) with a gamma-based delta function. The approach justifies ignoring dry members when rain is observed.
5. Attempts to model prior pdfs GLOBALLY for use in "all-obs-at-once" variational assimilation schemes were unsuccessful. Should operational centers consider simultaneously supporting both local and global solvers?

SLIDE 62

The observation to model space consistency iteration. 28 independent tests in 2D model

The ob-to-model space consistency iteration reduces mse in the KE field by 41%.

[Figure panels: Ob-space, Ob-space, Model-space, Model-space]

SLIDE 63

Ob-to-model-space consistency iteration helps! (Result for Kinetic Energy below, 7x4 independent trials)

The ob-to-model space consistency iteration reduces mse in the KE field by 41%. The chance of getting 7 wins by pure chance is 1 in 128.

SLIDE 64

The observation to model space consistency iteration. Test in 1D model

- Solid black line: prior pdf of the zonal wind (u) field.
- u^2 is observed at the 25th, 50th or 75th percentile of the prior (left to right).
- Dashed black line: true posterior pdf of the u field.
- Solid mauve line: GIGG posterior pdf with no outer loop.
- Solid cyan line: GIGG posterior pdf with outer loop.

SLIDE 65

Data assimilation for clouds and high-resolution models

Data Assimilation (DA)

Prior ensemble forecast, x^f_i, i=1,2,..,K
Imperfect observations, y
Data Assimilation weights y and x^f_i using uncertainty pdfs.
Posterior analyses, x^a_i, i=1,2,..,K: the initial conditions for the next ensemble forecast.

SLIDE 66

Next Decade Data Assimilation (DA): Challenges

Cloud DA

Most of the observations (pixels) in this image are not assimilated by operational Data Assimilation (DA) systems!

Even more disturbing: the extreme weather parts of the image (Irma, Jose, etc.) are completely ignored!!!

[Image labels: Irma, Jose]

SLIDE 67

Next Decade Data Assimilation (DA): Challenges

Non-Gaussian Episodic Variables

Near-zero positive semi-definite variables, like water vapor mixing ratio, cloud, rain, etc., inevitably have non-Gaussian uncertainties.

SLIDE 68

Next Decade Data Assimilation (DA): Challenges

Observed-variable = non-linear-function(model-variable)

The change-in-cloud-cover per change-in-mixing-ratio of water vapor is a highly non-linear function of mixing ratio.

Cloud = non-linear-function(H2O g/kg)

[Figure labels: No cloud, Cloud]