The challenge of bounded, non-Gaussian, non-linear and multi-scale variables
Craig H. Bishop
University of Melbourne, Parkville, Victoria, Australia
Some Motivation
- A major source of model error in weather and climate prediction models is clouds and precipitation.
- Until recently, observations of clouds and precipitation have been studiously avoided in DA, so it is not surprising that they are a big source of model error.
- The clouds/precip DA problem is cluttered with bounded, non-Gaussian, non-linear and multi-scale variables.
A cloud DA thought experiment
Imagine an infinite number of replicate Earths, with
- the same solar/GHG forcing
- each giving an independent realization of today's climate
- the same observation types and locations but different random observation errors
Visible Geostationary Imagery that we do not yet assimilate
5
The ideal objective of Data Assimilation
Imagine an infinite number of replicate Earths, with
- the same solar forcing
- each giving an independent realization of today's prior
- the same observation types and locations but different random observation errors
Each point represents a $(u, u^2)$ pair from a distinct replicate Earth.
Each cross represents a $(u, y)$ pair from a distinct replicate Earth.
The error standard deviation of an unbiased observation of a bounded variable must tend to zero as the truth tends to zero (Bishop, 2019, Q. J. Roy. Met. Soc.).
On our Earth, the observation of $u^2$ gives $y = 1.71$.
Ideal Data Assimilation (DA) in a simple model
Ideal DA gives the posterior pdf of replicate Earths having the same $y$ value as our Earth's $y$ value.
- Ideal posterior pdf is bi-modal. The bi-modality is caused by non-linearity.
- Prior and posterior pdfs of $u^2$ are like gamma pdfs: highly non-Gaussian.
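The replicate-Earth picture can be mimicked with a brute-force Monte Carlo sketch. Everything numerical here is an assumption made for illustration and is not taken from the slides: the prior `u ~ N(0, 1)`, the inverse-gamma shape `alpha = 5`, the sample size and the width of the acceptance window around `y = 1.71`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_earths = 1_000_000

# Hypothetical prior (assumption): u ~ N(0, 1) on each replicate Earth.
u = rng.normal(0.0, 1.0, n_earths)
truth = u**2

# Hypothetical observation model (assumption): unbiased inverse-gamma error.
# If G ~ Gamma(alpha, 1), then beta / G is inverse-gamma with mean beta / (alpha - 1),
# so beta = truth * (alpha - 1) gives E[y | truth] = truth and an error standard
# deviation that shrinks to zero as the truth tends to zero.
alpha = 5.0
y = truth * (alpha - 1.0) / rng.gamma(alpha, 1.0, n_earths)

# "Ideal DA": keep only the replicate Earths whose y is close to our Earth's y.
y_ours = 1.71
keep = np.abs(y - y_ours) < 0.02
u_post = u[keep]

# Histogram of the conditioned u values: two lobes, one at negative and one at positive u.
hist, edges = np.histogram(u_post, bins=21, range=(-3.15, 3.15))
print("replicates kept:", keep.sum())
print("fraction with |u| < 0.3:", np.mean(np.abs(u_post) < 0.3))
```

Because only $u^2$ is observed, the retained replicates cluster near both $+\sqrt{y}$ and $-\sqrt{y}$, which is the bi-modality described on the slide.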
Current DA: 4DVar-No-Outer-Loop (US Navy) and EnKF (DWD)
4DVarNOL and EnKF estimate the posterior pdf using linear regression and perturbed observations:
$$u_i^a = u_i^f + \mathrm{covar}\left(u^f, y^{fT}\right)\left[\mathrm{covar}\left(y^f, y^{fT}\right)\right]^{-1}\left(y - y_i^f\right)$$
$$\left(u^2\right)_i^a = \left(u^2\right)_i^f + \mathrm{covar}\left(\left(u^2\right)^f, y^{fT}\right)\left[\mathrm{covar}\left(y^f, y^{fT}\right)\right]^{-1}\left(y - y_i^f\right)$$
where $y_i^f = \left(u^2\right)_i^f + \epsilon_i$ are the perturbed observations.
- EnKF/4DVarNOL posterior pdf of $u$ is very poor.
- EnKF and 4DVar-No-Outer-Loop (4DVarNOL) posterior pdf of $u^2$ is highly inaccurate. Also, analyzed $u^2$ values are not equal to the square of analyzed $u$ values.
Fails due to linear, Gaussian assumptions.
Current DA: Incremental 4DVar (4DVar-with-outer-loop)
4DVar uses perturbed observations. Each posterior member is a local extreme value of a Gaussian approximation to the true posterior pdf.
- 4DVar posterior pdf of $u$ is still poor.
- 4DVar posterior pdf of $u^2$ is highly inaccurate. However, analyzed $u^2$ values are now equal to the square of analyzed $u$ values.
Fails due to the Gaussian assumption and the presence of multiple extrema (non-linearity).
Overview of Challenges
1. Observations of bounded non-Gaussian variables
   - EnKFs and 4DVar (with Gaussian assumptions) are inaccurate.
2. Assimilation of an on/off variable like rain when no rain is in the (ensemble) forecast
   - EnKFs, 4DVar and Particle Filters are very unsatisfactory.
3. Non-linear relationship between model variables and observed variables (cf. above example and Leonhard Scheck talk)
   - EnKFs and VAR with no outer loop, or with a grossly inaccurate TLM/adjoint, are inaccurate.
4. Multi-scale error structures
   - Somewhat challenging for all methods.
Overview of talk
Improvements to GIGG-EnKF (Bishop, 2016, QJRMS)
1. Background
2. GIG-EnKF solution to Bayes' theorem for gamma prior and inverse-gamma likelihood is now precise as K => infinity; previously it was just approximate. Significance: Rigorous basis for GIG for all moments.
3. Test of standard GIG for tropical cyclone surface wind energy assimilation problem. Significance: Standard GIG better than EAKF/EnKF for this problem.
4. Local iterative regression to account for non-linearity in observation operator. Significance: Greatly reduces analysis error.
5. Rigorous approach for dealing with on-off variables (rain, cloud, fire, etc.) with gamma based delta function. Significance: Justifies ignoring dry members when rain is observed.
6. Discussion of ways to include GIGG ideas/tools in other DA schemes.
Background: A typical EnKF serial observation assimilation scheme
for $j = 1:p$; % where $p$ is the number of observations
  Step 1: Do univariate Gaussian assimilation of $y_j$ to obtain $y_{ji}^a$, $i = 1, 2, \ldots, K$
  Step 2: Find corresponding analysis ensemble for observations and model variables
    $y_{ki}^a = y_{ki}^f + \dfrac{\mathrm{covar}(y_k^f, y_j^f)}{\mathrm{var}(y_j^f)}\left(y_{ji}^a - y_{ji}^f\right)$, for $k = 1, 2, \ldots, p$; $i = 1, 2, \ldots, K$
    $x_{\mu i}^a = x_{\mu i}^f + \dfrac{\mathrm{covar}(x_\mu^f, y_j^f)}{\mathrm{var}(y_j^f)}\left(y_{ji}^a - y_{ji}^f\right)$, for $\mu = 1, 2, \ldots, n$; $i = 1, 2, \ldots, K$
  Step 3: Let the analysis ensemble be the prior ensemble for the next observation
    $y_{ki}^f = y_{ki}^a$, for $k = 1, 2, \ldots, p$; $i = 1, 2, \ldots, K$
    $x_{\mu i}^f = x_{\mu i}^a$, for $\mu = 1, 2, \ldots, n$; $i = 1, 2, \ldots, K$
end
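The serial loop above can be sketched in a few lines of NumPy. This is a minimal illustration rather than any operational implementation: the function name `serial_enkf`, the array layout and the choice of a deterministic (EAKF-style) shift-and-contract update for Step 1 are all assumptions made here.

```python
import numpy as np

def serial_enkf(x, y_ens, y_obs, obs_var):
    """Serial ensemble update (illustrative sketch).

    x      : (n, K) forecast ensemble of model variables
    y_ens  : (p, K) forecast ensemble of the observed quantities
    y_obs  : (p,)   observed values
    obs_var: (p,)   observation error variances
    """
    x = np.array(x, dtype=float)
    y_ens = np.array(y_ens, dtype=float)
    K = y_ens.shape[1]
    for j in range(len(y_obs)):
        yf = y_ens[j].copy()
        mean_f, var_f = yf.mean(), yf.var(ddof=1)

        # Step 1: univariate Gaussian assimilation of y_j (assumed EAKF-style:
        # shift the mean, contract the perturbations).
        var_a = 1.0 / (1.0 / var_f + 1.0 / obs_var[j])
        mean_a = var_a * (mean_f / var_f + y_obs[j] / obs_var[j])
        ya = mean_a + np.sqrt(var_a / var_f) * (yf - mean_f)
        increment = ya - yf

        # Step 2: regress the observation-space increments onto every observed
        # quantity and every model variable.
        y_pert = yf - mean_f
        cov_yy = (y_ens - y_ens.mean(axis=1, keepdims=True)) @ y_pert / (K - 1)
        cov_xy = (x - x.mean(axis=1, keepdims=True)) @ y_pert / (K - 1)

        # Step 3: the analysis ensemble becomes the prior for the next observation.
        y_ens = y_ens + np.outer(cov_yy / var_f, increment)
        x = x + np.outer(cov_xy / var_f, increment)
    return x, y_ens

# Usage: one directly observed scalar variable and a 40-member ensemble.
rng = np.random.default_rng(3)
x0 = rng.normal(1.0, 2.0, size=(1, 40))
xa, ya = serial_enkf(x0, x0.copy(), np.array([3.0]), np.array([1.0]))
```

In this usage the observed quantity is the model variable itself, so the regression coefficient is one and the model-space analysis coincides with the observation-space analysis.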
Background: Gaussian pdfs versus bounded pdfs
[Images: Haboob, Iraq, 27 April 2005. Extratropical Cyclone Xynthia, 2010: 63 fatalities, losses $2-4B, linked to H2O plume.]
Background: Gaussian anamorphosis problematic
- Gaussian anamorphosis transforms a non-Gaussian variable to a Gaussian variable via a non-linear function; e.g. taking the log of a log-normal variable makes it Gaussian.
- In observation space, this is problematic because a non-linear transform of an error-prone unbiased observation will create a biased observation of the non-linear function of the truth.
- In model space, this is problematic because while a good DA scheme will deliver a posterior ensemble whose variance is equal to the mean square error (mse) of the mean in the transformed space, in general, analysis ensemble variance will not equal mse once the transformation is undone.
The bar on the right shows that an imperfect Gaussian anamorphosis via the non-linear transform approach led to an overdispersive ensemble in the case considered in Bishop (2019, Q. J. Roy. Met. Soc.).
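The observation-space objection can be checked numerically. The sketch below assumes, purely for illustration, an unbiased inverse-gamma observation error with shape `alpha = 5` and a true value of 2.0; neither number comes from the slides. It shows that the raw observations are unbiased while their logarithms are biased estimates of the log of the truth.

```python
import numpy as np

rng = np.random.default_rng(1)
truth = 2.0        # hypothetical true value (assumption)
alpha = 5.0        # hypothetical inverse-gamma shape (assumption)

# Unbiased inverse-gamma observations: E[y | truth] = truth.
y = truth * (alpha - 1.0) / rng.gamma(alpha, 1.0, 1_000_000)

raw_bias = y.mean() - truth                  # close to zero
log_bias = np.log(y).mean() - np.log(truth)  # clearly non-zero

print(f"bias of y:      {raw_bias:+.4f}")
print(f"bias of log(y): {log_bias:+.4f}")
```

The non-zero bias after the transform follows from Jensen's inequality: the mean of a concave function of $y$ is smaller than the function of the mean of $y$.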
Non-linear transformation of observations and/or model variables unsatisfactory
17
A prior Gamma pdf
[Figure: gamma prior pdf $\rho_{prior}(c)$, with the prior mean marked.]
Inverse-Gamma pdf of obs given truth
[Figure: gamma prior $\rho_{prior}(c)$, prior mean and $y$, the observed value, together with $L(y_{possible} \mid c = y)$, the inverse gamma pdf that gives the pdf of obs that could occur if the truth was equal to the vertical "dot-dash" line.]
The likelihood function
[Figure: gamma prior $\rho_{prior}(c)$, prior mean and $y$, the observed value, together with $L(y \mid c)$ with $y$ fixed at the observed value.]
The posterior pdf is then a gamma
$$\rho_{post}(c \mid y) = \frac{L(y \mid c)\,\rho_{prior}(c)}{\int_0^\infty L(y \mid c)\,\rho_{prior}(c)\,dc} \quad \text{(a gamma posterior)}$$
[Figure: gamma prior $\rho_{prior}(c)$, likelihood $L(y \mid c)$, observed value $y$, prior mean and posterior mean.]
Equation for posterior mean
$$\frac{1}{\bar{c}_{post}} = \frac{1}{\bar{c}_{prior}} + \hat{P}_r\left[\hat{P}_r + \hat{R}_r\right]^{-1}\left(\frac{1}{y} - \left(\hat{R}_r + 1\right)\frac{1}{\bar{c}_{prior}}\right)$$
where $y$ is the observed value and $\hat{P}_r$ and $\hat{R}_r$ are type 2 relative prior and observation variances; e.g.
$$\hat{P}_r = \frac{\mathrm{var}(c_{prior})}{\bar{c}_{prior}^{\,2} + \mathrm{var}(c_{prior})} = \frac{\sigma^2}{\mu^2 + \sigma^2}$$
The posterior mean equation has a Kalman-like gain but everything else is inverted! (See Bishop, 2016, QJRMS and Bishop, 2018 for details of the GIG method.)
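As a sanity check on the inverted, Kalman-like form of the posterior mean, the sketch below compares a closed-form expression against brute-force integration of Bayes' theorem for a gamma prior and an unbiased inverse-gamma likelihood. The closed form used here, $1/\bar{c}_{post} = 1/\bar{c}_{prior} + \hat{P}_r(\hat{P}_r + \hat{R}_r)^{-1}\left[1/y - (\hat{R}_r + 1)/\bar{c}_{prior}\right]$, is a reconstruction that follows from gamma/inverse-gamma conjugacy and should be read as an assumption about the slide's equation. The function names and test numbers are hypothetical.

```python
import numpy as np

def gig_posterior_mean(prior_mean, prior_var, y, R2):
    """Closed-form GIG posterior mean (reconstructed form, see lead-in)."""
    P2 = prior_var / (prior_mean**2 + prior_var)  # type 2 relative prior variance
    gain = P2 / (P2 + R2)                         # Kalman-like gain
    inv_post = 1.0 / prior_mean + gain * (1.0 / y - (R2 + 1.0) / prior_mean)
    return 1.0 / inv_post

def brute_force_posterior_mean(prior_mean, prior_var, y, R2, n=400_001):
    """Posterior mean from Bayes' theorem evaluated on a dense grid of truth values c."""
    k = prior_mean**2 / prior_var        # gamma prior shape
    theta = prior_var / prior_mean       # gamma prior scale
    alpha = 1.0 / R2 + 1.0               # inverse-gamma shape giving type 2 rel. variance R2
    c = np.linspace(1e-8, 40.0 * max(prior_mean, y), n)
    log_prior = (k - 1.0) * np.log(c) - c / theta
    beta = (alpha - 1.0) * c             # scale chosen so that E[y | c] = c
    log_like = alpha * np.log(beta) - beta / y   # terms independent of c dropped
    log_w = log_prior + log_like
    w = np.exp(log_w - log_w.max())
    return float(np.sum(c * w) / np.sum(w))

closed = gig_posterior_mean(2.0, 1.0, 1.5, 0.25)
numeric = brute_force_posterior_mean(2.0, 1.0, 1.5, 0.25)
print(closed, numeric)
```

The two numbers agree to within the accuracy of the grid, which supports reading the garbled slide equation in this form.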
Improvement of GIG for high relative variances
Problem: Bishop's (2016, QJRMS) formulation (thick grey curve) departs from the true posterior (thin solid line) when the relative variance of the prior (dashed curve) and likelihood (dot-dash curve) is large.
Solution: Use the summation theorem for gamma pdfs to ensure that GIG-EnKF samples the true posterior (Bishop 2019, hopefully).
[Figure: panels (a) Bishop, 2016 and (b) Bishop, 2019, each showing the prior, the likelihood and the ob.]
[Equation: analysis members $y_i^a$ expressed in terms of the forecast members $y_i^f$ and gamma-distributed random draws $y_i^{gig}$.]
$$\frac{1}{\bar{y}^a} = \frac{1}{\bar{y}^f} + \tilde{P}_r\left[\tilde{P}_r + \tilde{R}_r\right]^{-1}\left(\frac{1}{y} - \left(\tilde{R}_r + 1\right)\frac{1}{\bar{y}^f}\right)$$
$$P_r = \frac{\mathrm{var}(y^f)}{\left(\bar{y}^f\right)^2}, \quad \tilde{P}_r = \frac{\mathrm{var}(y^f)}{\left(\bar{y}^f\right)^2 + \mathrm{var}(y^f)}, \quad \text{and} \quad \tilde{R}_r = \frac{\mathrm{var}(y^o \mid y^t)}{\left(y^t\right)^2 + \mathrm{var}(y^o \mid y^t)}$$
GIG-EnKF now rigorous for all posterior moments
24
Background: The GIGG-EnKF serial observation assimilation scheme with linear regression (EAKF/EnSRF/EnKF)
for $j = 1:p$; % where $p$ is the number of observations
  Step 1: Decide whether forecast and observation uncertainty associated with $y_j$ is best approximated by GIG-delta, GIG, IGG or Gaussian assumptions.
  Step 2: if (GIG-delta) then use ... to obtain $y_{ji}^a$, $i = 1, 2, \ldots, K$;
          else if (GIG) then use ... to obtain $y_{ji}^a$, $i = 1, 2, \ldots, K$;
          else if (IGG) then use ... to obtain $y_{ji}^a$, $i = 1, 2, \ldots, K$;
          else if (Gaussian) then use ... to obtain $y_{ji}^a$, $i = 1, 2, \ldots, K$
  Step 3: Find corresponding analysis ensemble for observations and model variables
    $y_{ki}^a = y_{ki}^f + \dfrac{\mathrm{covar}(y_k^f, y_j^f)}{\mathrm{var}(y_j^f)}\left(y_{ji}^a - y_{ji}^f\right)$, for $k = 1, 2, \ldots, p$; $i = 1, 2, \ldots, K$
    $x_{\mu i}^a = x_{\mu i}^f + \dfrac{\mathrm{covar}(x_\mu^f, y_j^f)}{\mathrm{var}(y_j^f)}\left(y_{ji}^a - y_{ji}^f\right)$, for $\mu = 1, 2, \ldots, n$; $i = 1, 2, \ldots, K$
  Step 4: Let the analysis ensemble be the prior ensemble for the next observation
    $y_{ki}^f = y_{ki}^a$, for $k = 1, 2, \ldots, p$; $i = 1, 2, \ldots, K$
    $x_{\mu i}^f = x_{\mu i}^a$, for $\mu = 1, 2, \ldots, n$; $i = 1, 2, \ldots, K$
end
26
Simple DA testbed for TC like surface winds
[Figures: a random draw from a TC relevant pdf; another random draw from the simple testbed's multi-scale pdf.]
- Model states defined by random, multi-scale TC like $(u, v)$ wind field.
- Let observations be non-linear functions of $u$ and $v$; e.g. Kinetic Energy, $KE = (u^2 + v^2)/2$, $\tanh(KE)$ or $\mathrm{Heaviside}(KE - \mathrm{constant})$.
[Figure: prior mean, obs, truth and GIG analysis mean using a 3000 member ensemble (no localization required).]
Observed variable is $KE = 0.5(u^2 + v^2)$. Distribution of random observations given truth is an inverse gamma pdf with a relative variance of 0.25.
Does the GIG variation on the EAKF improve the Kinetic Energy analysis?
The GIG-EnKF outperforms the EAKF under all metrics in all 8 independent sets of 50 trials. The only difference between EAKF and GIG code is the univariate ensemble update. Linear regression code is identical.
Overview
Aim: Outline some improvements to GIGG-EnKF (Bishop, 2016, QJRMS)
1. Background
2. GIG-EnKF solution to Bayes' theorem for gamma prior and inverse-gamma likelihood is now precise as K => infinity; previously it was just approximate. Significance: Rigorous basis for GIG for all moments.
3. Test of standard GIG for tropical cyclone surface wind energy assimilation problem. Significance: Standard GIG better than EAKF/EnKF for this problem.
4. Local iterative regression to account for non-linearity in observation operator. Significance: Greatly reduces analysis error.
5. Rigorous approach for dealing with on-off variables (rain, cloud, fire, etc.) with gamma based delta function. Significance: Justifies ignoring dry members when rain is observed.
6. Discussion of attempts to include GIGG ideas/tools in a global solution procedure like 4DVar. Significance: The most accurate data assimilation schemes may need to support both global and local solvers.
32
Treatment of non-linearity of Kinetic Energy ob operator. Linear regression from ob to model space yields inconsistencies!
Observed variable is $KE = 0.5(u^2 + v^2)$. Standard GIG/EAKF uses linear regression to give an inconsistent analysis of $(u^a, v^a)$ and $(KE)^a$. Bottom left panel gives $(KE)^a$. Bottom right gives
$$KE(u^a, v^a) = \frac{1}{2}\left[\left(u^a\right)^2 + \left(v^a\right)^2\right] \ne (KE)^a,$$
which is far less accurate than $(KE)^a$.
(Similar to Leonhard Scheck finding for SEVIRI reflectance DA with LETKF)
Background: The GIGG-EnKF serial observation assimilation scheme
(EAKF/EnSRF/EnKF)
Need to replace the linear regression step with something better!
New method to account for non-linearity in ob-operator: The observation to model space consistency iteration
3.1: Define $[\mathbf{x}_i]_{h_j}$, the minimum list of variables required to predict the ob $y_j$ that was just assimilated; for example, in the KE example $[\mathbf{x}_i]_{h_j} = [u_j, v_j]^T$, where $u_j, v_j$ are the wind components required to predict the KE of the model state at $j$.
3.2: Find $[\mathbf{x}_i^a]_{h_j}^{lin}$, the usual GIG-EnKF model-space analysis using linear regression.
3.3: Starting with $[\mathbf{x}_i]_{h_j} = [\mathbf{x}_i^a]_{h_j}^{lin}$, minimize
$$J = \frac{1}{2}\left\{ y_{ji}^a - h_j\left([\mathbf{x}_i]_{h_j}\right) \right\}^2$$
using ensemble-space constrained Newton iteration on gradient to obtain $[\mathbf{x}_i^a]_{h_j}^{local}$ (the minimizer).
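Step 3.3 can be illustrated for the kinetic energy operator with a simplified iteration. The sketch below is not the ensemble-space constrained Newton iteration named on the slide: it swaps in an unconstrained Gauss-Newton step taken along the gradient of $h$, and the function names, starting state and target value are hypothetical.

```python
import numpy as np

def ke(x):
    """Kinetic energy observation operator h(u, v) = 0.5 * (u^2 + v^2)."""
    u, v = x
    return 0.5 * (u * u + v * v)

def ke_gradient(x):
    """Gradient of the KE operator with respect to (u, v), which is (u, v) itself."""
    return np.asarray(x, dtype=float)

def consistency_iteration(x_lin, y_a, h=ke, grad_h=ke_gradient, n_iter=20, tol=1e-24):
    """Minimize J = 0.5 * (y_a - h(x))^2 starting from the linear-regression analysis.

    Simplification (assumption): plain Gauss-Newton steps along the gradient of h,
    with no ensemble-space constraint.
    """
    x = np.array(x_lin, dtype=float)
    for _ in range(n_iter):
        residual = y_a - h(x)
        if 0.5 * residual * residual < tol:
            break
        g = grad_h(x)
        gg = g @ g
        if gg == 0.0:          # zero gradient: no direction to move in
            break
        x = x + g * residual / gg   # minimum-norm Gauss-Newton update
    return x

# Usage: the linear-regression analysis (1.0, 0.5) implies KE = 0.625, while the
# observation-space analysis for this member says KE = 2.0.
x_local = consistency_iteration([1.0, 0.5], 2.0)
print(x_local, ke(x_local))
```

Starting from the linear-regression analysis keeps the minimizer local, so the adjusted wind stays close to the regression result while matching the observation-space analysis value.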
The observation to model space consistency iteration. Test in 1D model in which only $u^2$ is observed
- Solid black line gives prior pdf of zonal wind ($u$) field
- $u^2$ is observed at 25th, 50th or 75th percentile of prior pdf of obs (left to right)
- Dashed black line gives true posterior pdf of $u$ field
- Solid mauve line is GIG posterior pdf with linear regression
- Solid cyan line is GIG posterior pdf with non-linear observation to model space consistency iteration
The observation to model space consistency iteration. Test in cloud model in which only rain is observed
Posselt and Bishop (2018, in review, QJRMS)
- True posterior from MCMC
- N_0r = slope intercept of the rain particle size distribution
- Q_c0 = cloud to rain autoconversion parameter
- GIG with linear regression
- GIG with observation to model space consistency iteration
New method to account for non-linearity in ob-operator: The observation to model space consistency iteration
3.4: Update the rest of the model state using $[\mathbf{x}_i^a]_{h_j}^{local}$ and multivariate regression
$$x_{\mu i}^a = x_{\mu i}^f + \mathrm{covar}\left(x_\mu^f, [\mathbf{x}^f]_{h_j}^T\right)\left[\mathrm{covar}\left([\mathbf{x}^f]_{h_j}, [\mathbf{x}^f]_{h_j}^T\right)\right]^{-1}\left([\mathbf{x}_i^a]_{h_j}^{local} - [\mathbf{x}_i^f]_{h_j}\right), \text{ for } \mu = 1, 2, \ldots, n;\ i = 1, 2, \ldots, K$$
$$\left(y_{ki}^a\right)_{lin} = y_{ki}^f + \frac{\mathrm{covar}(y_k^f, y_j^f)}{\mathrm{var}(y_j^f)}\left(y_{ji}^a - y_{ji}^f\right), \text{ for } k = 1, 2, \ldots, p;\ i = 1, 2, \ldots, K$$
[Equation: two-case definition of $y_{ki}^a$, for $k = 1, 2, \ldots, p$; $i = 1, 2, \ldots, K$, in terms of $h_k\left([\mathbf{x}_i^a]_{h_k}^{local}\right)$ and $\left(y_{ki}^a\right)_{lin}$.]
The observation to model space consistency iteration. Test in 2D model
Observed variable is $KE = 0.5(u^2 + v^2)$. Linear regression plus consistency iteration improves consistency of $(u^a, v^a)$ and $(KE)^a$. Bottom left panel gives $(KE)^a$. Bottom right gives
$$KE(u^a, v^a) = \frac{1}{2}\left[\left(u^a\right)^2 + \left(v^a\right)^2\right] \ne (KE)^a.$$
Accuracy of direct and derived KE analyses are now the same
42
The observation to model space consistency iteration. 28 independent tests in 2D model
Ob-to-model space consistency iteration reduces mse in the $(u, v)$ field by 75%; i.e. the standard deviation of analysis error is halved. The chance of getting 28 consecutive wins (as above) by pure chance is 1 in $2^{28} \approx 2.7 \times 10^8$.
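The arithmetic behind these two statements is short enough to spell out. The sketch assumes that, under pure chance, each independent test is a fair coin flip between the two methods.

```python
# Odds of winning every one of n independent fair coin flips: 1 in 2**n.
wins = 28
odds = 2**wins
print(f"1 in {odds:,}")           # 1 in 268,435,456

# A 75% reduction in mse leaves 25% of the mse, so the error standard
# deviation scales by sqrt(0.25) = 0.5.
mse_ratio = 1.0 - 0.75
std_ratio = mse_ratio**0.5
print(std_ratio)
```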
Problem: No rain in ensemble forecast but rain is observed
- EnKFs, 4DVAR, particle filters, etc. are all highly unsatisfactory in this case.
- How would Bayes' theorem be used in this case?
- Might an adaptation of the GIGG filter better deal with this problem?
No rain forecast as a gamma function limit
[Figure sequence: observation likelihood pdf and observed value $y^o$. Dashed black lines pertain to the prior/forecast pdf and dashed blue lines pertain to the posterior/analysis pdf. Successive panels use prior/forecast pdfs with $P_r^{1/2} = 0.707$, $1.41$, $11.3$, $181$ and $32{,}768$, where $P_r = \mathrm{var}(y^f)/\left(\bar{y}^f\right)^2$. In every panel $\tilde{R}_r = \mathrm{var}(y^o \mid y^t)/\left[\left(y^t\right)^2 + \mathrm{var}(y^o \mid y^t)\right] = 1/4$.]
No rain forecast as a gamma function limit: a gamma delta function
Note that (i) the posterior mode is equal to the observed value, and (ii) the posterior mean is equal to the mode of the ob-likelihood function.
Using a gamma delta function to represent the zero-rain-prior pdf makes Bayes' theorem give a plausible posterior pdf.
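The limiting behaviour noted above can be reproduced with the conjugate gamma/inverse-gamma update. The sketch relies on assumptions that are not on the slides: the prior mean is held fixed at a hypothetical value of 1.0 while the prior relative variance grows, the observed value is a hypothetical 2.0, and the posterior is written in shape/rate form derived from conjugacy. Only $\tilde{R}_r = 1/4$ and the list of $P_r^{1/2}$ values are taken from the figure labels.

```python
def gig_posterior_shape_rate(prior_mean, P_r, y_obs, R2):
    """Gamma posterior (shape, rate) for a gamma prior and inverse-gamma likelihood.

    P_r: type 1 relative prior variance, var / mean^2 (gamma shape k = 1 / P_r)
    R2 : type 2 relative observation variance (inverse-gamma shape alpha = 1 / R2 + 1)
    """
    k = 1.0 / P_r
    alpha = 1.0 / R2 + 1.0
    shape = k + alpha
    rate = k / prior_mean + (alpha - 1.0) / y_obs
    return shape, rate

y_obs, R2, prior_mean = 2.0, 0.25, 1.0   # hypothetical values except R2 = 1/4
results = []
for sqrt_P in [0.707, 1.41, 11.3, 181.0, 32768.0]:
    shape, rate = gig_posterior_shape_rate(prior_mean, sqrt_P**2, y_obs, R2)
    post_mean = shape / rate
    post_mode = (shape - 1.0) / rate
    results.append((sqrt_P, post_mean, post_mode))
    print(f"sqrt(P_r) = {sqrt_P:>9}: posterior mean = {post_mean:.4f}, mode = {post_mode:.4f}")

# Mode of the likelihood viewed as a function of the truth c: alpha * y / (alpha - 1).
alpha = 1.0 / R2 + 1.0
likelihood_mode = alpha * y_obs / (alpha - 1.0)
```

As the prior relative variance grows, the posterior mode approaches the observed value and the posterior mean approaches the mode of the likelihood, in line with points (i) and (ii).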
gamma-delta + gamma pdf for case when some members dry and some wet
$$\rho_{precip}^{prior}(y_j) = w\,\rho_{dry}^{prior}(y_j) + (1 - w)\,\rho_{wet}^{prior}(y_j)$$
[Figure: observation likelihood pdf, observed value $y^o$, $\rho_{dry}^{prior}(y_j)$ and $\rho_{wet}^{prior}(y_j)$; dashed blue lines pertain to the posterior/analysis pdf. $\tilde{R}_r = \mathrm{var}(y^o \mid y^t)/\left[\left(y^t\right)^2 + \mathrm{var}(y^o \mid y^t)\right] = 1/4$.]
In this case, only the mean and variance of the wet members determine the mean and variance of the posterior. Dry members ignored!
Multi-variate GIGG-Delta filter also fits seamlessly in DART
54
GIGG-delta for coupled model DA: An idealized coupled model
Evolution is based on the Lorenz 96 model plus relaxation to adjacent levels. The blue line variable is analogous to zonal wind/current. Green bars give rainfall, which only occurs when upper level divergence exceeds a small threshold. Rain magnitude is proportional to the product of the square of the surface wind's deviation from its climatological mean and the square root of upper level divergence. Rain increases the flux of momentum from upper levels to lower levels.
GIGG-delta vs EnKF mse in case when EnKF has stabilizing inflation factor (7 independent 30 day DA cycles)
- Mse for wind/current: GIGG-delta (+ signs) and EnKF (x signs)
- Mse for rain: GIGG-delta (+ signs) and EnKF (x signs)
GIGG-delta is much better than EnKF in this “stable-system versus stable-system” comparison.
Combining GIGG-EnKF with ob-to-model-space iteration with other methods
1. Use GIGG-EnKF with ob-to-model-space iteration to assimilate the non-Gaussian variables in the observation volume, then assimilate the rest using LETKF/VAR or another favorite quasi-Gaussian method.
2. To improve non-linear trajectories, use a 4D smoother form of GIGG-EnKF, then use Geir Evensen's suggestion of, say, tripling the ob and relative ob error variances and doing step 1 three times. On the third iteration of this, the prior may look Gaussian and pure 4D-Var or LETKF may be appropriate.
Multi-variate “all-at-once” assimilation of parameterized non-Gaussian pdfs not so easy
- To do this, one first needs to find a compelling multi-variate statistical model of the moments of the prior.
- … not so easy in the multi-variate case …
- Multi-variate Wishart, Gaussian and log-normal all fail
Mean of powers of relative perturbation [(x-<x>)/<x>]
[Figure annotation: variance almost twice the size of the mean here.]
The Wishart distribution and other such distributions based on sample covariances of samples from normal pdfs produce spatially uniform relative variances of 2/(N-1), where N is the sample size. Hence, they are incapable of representing the variation in relative variance seen here. A literature search failed to find a gamma-like multi-variate pdf capable of producing the mean of powers of relative perturbations shown here.
- Perturbed obs 4DVar or EnKF plus non-linear transformation to “normalish” variables results in an ensemble whose variance does not equal mse after the inverse transform is performed (Bishop 2019, Q. J. Roy. Met. Soc.). However, ease of implementation in operational systems provides a compelling case for testing/tuning/engineering this approach.
GIGG-EnKF with ob-to-model-space iteration avoids these issues
Conclusions
1. GIG-EnKF solution to Bayes' theorem for gamma prior and inverse-gamma likelihood is now precise for all posterior moments as K => infinity; previously it was just approximate.
2. In idealized TC surface wind energy assimilation experiments, GIG soundly beat EAKF even using linear regression. The newly introduced ob-space to model space non-linear regression iteration:
   i. Gave consistent model and ob space analyses.
   ii. Greatly reduced mean square error (mse).
   iii. Gave an analysis ensemble variance approximately equal to mse.
3. In 1D experiments, the new local ob-space to model space iteration procedure gave fairly accurate multi-modal posteriors.
4. GIG-EnKF provides a rigorous approach for dealing with on-off variables (rain, cloud, fire, etc.) with a gamma based delta function. The approach justifies ignoring dry members when rain is observed.
5. Attempts to model prior pdfs GLOBALLY for use in “all-obs-at-once” variational assimilation schemes were unsuccessful. Should operational centers consider simultaneously supporting both local and global solvers?
Ob-to-model-space consistency iteration helps! (Result for Kinetic Energy below, 7x4 independent trials)
The observation to model space consistency iteration. 28 independent tests in 2D model
[Figure: ob-space and model-space panels.]
Ob-to-model space consistency iteration reduces mse in KE field by 41%. Chance of getting 7 wins by pure chance is 1 in 128.
The observation to model space consistency iteration. Test in 1D model
- Solid black line gives prior pdf of zonal wind ($u$) field
- $u^2$ is observed at 25th, 50th or 75th percentile of prior (left to right)
- Dashed black line gives true posterior pdf of $u$ field
- Solid mauve line is GIGG posterior pdf with no outer loop
- Solid cyan line is GIGG posterior pdf with outer loop
Data assimilation for clouds and high-resolution models
Data Assimilation (DA)
- Prior ensemble forecast, $x_i^f$, $i = 1, 2, \ldots, K$
- Imperfect observations, $y$
- Data Assimilation weights $y$ and $x_i^f$ using uncertainty pdfs
- Posterior analyses, $x_i^a$, $i = 1, 2, \ldots, K$: the initial conditions for the next ensemble forecast
[Image labels: Irma, Jose]
Most of the observations (pixels) in this image are not assimilated by operational Data Assimilation (DA) systems! Even more disturbing: the extreme weather parts of the image (Irma, Jose, etc.) are completely ignored!
Next Decade Data Assimilation (DA): Challenges
Cloud DA
Near zero semi-positive definite variables, like water vapor mixing ratio, cloud, rain, etc., inevitably have non-Gaussian uncertainties.
Non-Gaussian Episodic Variables
The change-in-cloud-cover per change-in-mixing-ratio of water vapor is a highly non-linear function of mixing ratio.
Cloud = non-linear-function(H2O g/kg)
[Figure labels: No cloud, Cloud]
observed-variable = non-linear-function(model-variable)
[Image labels: Irma, Jose, small scale features, large scale features]
The scale of forecast errors for clouds is situation dependent.