Dealing with unknown discontinuities in data and models Kerry - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Dealing with unknown discontinuities in data and models

Kerry Gallagher John Stephenson Chris Holmes

slide-2
SLIDE 2

Discontinuities occur in both data and processes in the Earth and Environmental Sciences

Spatial: faults, topography, lithology, phase, composition,…

Temporal: climate, seismicity, tectonics,…

slide-3
SLIDE 3

[Figure: Nile discharge (m³ × 10⁸), 400 to 1400, against year, 1860 to 1980]

What is the appropriate question ?

What was the significance of the opening of the Aswan Dam ?

(data from Cobb 1978)

slide-4
SLIDE 4

[Figure: Nile discharge (m³ × 10⁸), 400 to 1400, against year, 1860 to 1980]

$$f(t) = \mu_1\, I(t \le t_c) + \mu_2\, I(t > t_c)$$

When was the change ?

(after Denison et al. 2002)
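The single change-point model ƒ(t) = μ1 I(t ≤ tc) + μ2 I(t > tc) can be fitted by scanning candidate values of tc. The following is a minimal sketch, not the method used in the talk. It assumes Gaussian noise with a common variance, so that least squares applies, and it uses synthetic data rather than the Nile record. The function name `fit_single_changepoint` is hypothetical.

```python
import numpy as np

def fit_single_changepoint(t, y):
    """Least-squares fit of f(t) = mu1*I(t <= tc) + mu2*I(t > tc).

    Tries every split that leaves at least one point on each side and
    returns (tc, mu1, mu2) for the split with the smallest residual
    sum of squares.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(t)
    t, y = t[order], y[order]
    best = None
    for k in range(1, len(t)):  # points 0..k-1 on the left, k.. on the right
        mu1, mu2 = y[:k].mean(), y[k:].mean()
        rss = ((y[:k] - mu1) ** 2).sum() + ((y[k:] - mu2) ** 2).sum()
        if best is None or rss < best[0]:
            best = (rss, t[k - 1], mu1, mu2)
    _, tc, mu1, mu2 = best
    return tc, mu1, mu2

# Synthetic series (an assumption, NOT the Nile data): mean drops after 1898.
rng = np.random.default_rng(0)
years = np.arange(1871, 1971)
values = np.where(years <= 1898, 1100.0, 850.0) + rng.normal(0.0, 50.0, years.size)
tc, mu1, mu2 = fit_single_changepoint(years, values)
```

This returns a single best estimate of tc. A Bayesian treatment would instead report a probability distribution over tc.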

slide-5
SLIDE 5

Data interpolation and prediction with discontinuities

Standard methods may be too smooth

[Figure, two panels: "Kriging model of synthetic step function" and "Kriging model of synthetic model"; axes X (s) and Y (ƒ(s)); legend: Realisation of true data, True Function, Kriging Fit (Gaussian)]

slide-6
SLIDE 6

Need a method that can deal with an unknown number of discontinuities in unknown locations

Partition Modelling

[Figure: "Partition model of synthetic data"; axes X (x) and Y (ƒ(x)); legend: True Function, Kriging Fit (Gaussian); labels: 1D, 2D]

slide-7
SLIDE 7

Formulating a Partition Model

How many discontinuities, where are they ?

  • Space partitioned into discrete regions
  • Partitions defined by Voronoi tessellation
  • Regression function, ƒ, specified within each region
  • Parameters: θ = (c1-N, ƒ1-N, N, σ²)

[Figure: Voronoi tessellation with Voronoi centres c1 to c6; axes X and ƒ(X)]
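A Voronoi partition assigns each location to its nearest centre, and the regression function is then fitted separately in each cell. The following is a minimal sketch, assuming the simplest regression function (a constant mean per cell) and a fixed, known set of centres. The function names and the two-centre example are hypothetical.

```python
import numpy as np

def voronoi_labels(points, centres):
    """Index of the nearest centre for each point (its Voronoi cell)."""
    d2 = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def fit_partition_means(points, values, centres):
    """Piecewise-constant regression: one mean per Voronoi cell."""
    labels = voronoi_labels(points, centres)
    means = np.full(len(centres), np.nan)  # NaN marks a cell with no data
    for j in range(len(centres)):
        in_cell = labels == j
        if in_cell.any():
            means[j] = values[in_cell].mean()
    return labels, means

# Hypothetical example: two centres split the unit square along x = 0.5.
centres = np.array([[0.25, 0.5], [0.75, 0.5]])
rng = np.random.default_rng(1)
points = rng.uniform(0.0, 1.0, size=(200, 2))
values = np.where(points[:, 0] < 0.5, 1.0, 3.0) + rng.normal(0.0, 0.1, 200)
labels, means = fit_partition_means(points, values, centres)
```

In the full partition model the number of centres N and their positions are themselves unknown and are sampled along with the other parameters.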

slide-8
SLIDE 8

Generating Partition Models

Bayes’ Theorem

$$p(\theta \mid D) \propto p(D \mid \theta)\, p(\theta)$$

Posterior ∝ Likelihood × Prior

D = observed data, θ = model parameters, y = value to be predicted

Use Markov chain Monte Carlo (MCMC) to sample the posterior distribution, p(θ|D)

Prediction

$$p(y \mid D) = \int_{\Theta} p(y \mid \theta, D)\, p(\theta \mid D)\, d\theta$$

Monte Carlo integration

$$p(y \mid D) \approx \frac{1}{N} \sum_{i=1}^{N} p(y \mid \theta_i, D), \qquad \theta_i \sim p(\theta \mid D) \ \text{(posterior distribution)}$$
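The Monte Carlo estimate of p(y|D) is an average of p(y|θi, D) over posterior samples θi. The following is a minimal sketch, assuming a toy model chosen only because its answer is known in closed form: Gaussian data with known σ and a flat prior on the mean. Posterior samples are drawn directly here, standing in for MCMC output. All names are hypothetical.

```python
import numpy as np

def normal_pdf(y, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def predictive_density(y, theta_samples, sd):
    """p(y|D) ~= (1/N) * sum_i p(y | theta_i, D) over posterior samples."""
    return normal_pdf(y, theta_samples, sd).mean()

rng = np.random.default_rng(2)
sigma = 1.0                                  # assumed known noise level
data = rng.normal(5.0, sigma, size=20)       # synthetic observations D

# With a flat prior, the posterior of the mean is N(data mean, sigma^2 / n).
post_mean = data.mean()
post_sd = sigma / np.sqrt(data.size)
theta_samples = rng.normal(post_mean, post_sd, size=20000)

y_new = 5.5
mc = predictive_density(y_new, theta_samples, sigma)
# Closed-form predictive for this toy model, used only as a check.
exact = normal_pdf(y_new, post_mean, np.sqrt(sigma**2 + post_sd**2))
```

For a partition model no closed form is available, so the sample average is the practical route.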

slide-9
SLIDE 9

Sampling with (transdimensional) MCMC

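Initialise θ

  • Propose new θ’
  • Calculate likelihood with new θ’
  • Accept new θ’ or retain current θ

Iterate

Distribution of accepted models θ ~ p(θ|D)

Acceptance criterion

$$\alpha(\theta, \theta') = \min\left\{1,\ \frac{p(\theta')}{p(\theta)} \cdot \frac{p(D \mid \theta')}{p(D \mid \theta)} \cdot \frac{p(\theta \mid \theta')}{p(\theta' \mid \theta)} \cdot R \cdot |J| \right\}$$

Factors, in order: Prior, Likelihood, Model Proposal, Jump proposal (R), Jacobian (|J|)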
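The propose, evaluate, accept loop can be illustrated with a fixed-dimension Metropolis-Hastings sampler. The following is a minimal sketch under stated assumptions, not the transdimensional sampler of the talk: the model is a single step with θ = (tc, μ1, μ2), σ is known, priors are flat, and the proposal is a symmetric random walk. In that case the proposal ratio, R and |J| all equal 1, and the acceptance ratio reduces to the posterior ratio. Function names and tuning values are hypothetical.

```python
import numpy as np

def log_posterior(theta, t, y, sigma, t_min, t_max):
    """Log prior + log likelihood (up to a constant) for the step model."""
    tc, mu1, mu2 = theta
    if not (t_min < tc < t_max):  # uniform prior on tc; flat on mu1, mu2
        return -np.inf
    model = np.where(t <= tc, mu1, mu2)
    return -0.5 * np.sum(((y - model) / sigma) ** 2)

def metropolis_hastings(t, y, sigma, theta0, step, n_iter, rng):
    """Random-walk Metropolis-Hastings; returns the chain of theta values."""
    theta = np.array(theta0, dtype=float)
    logp = log_posterior(theta, t, y, sigma, t.min(), t.max())
    chain = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        proposal = theta + step * rng.normal(size=theta.size)  # symmetric
        logp_new = log_posterior(proposal, t, y, sigma, t.min(), t.max())
        # Accept with probability min(1, posterior ratio), done in log space.
        if np.log(rng.uniform()) < logp_new - logp:
            theta, logp = proposal, logp_new
        chain[i] = theta  # a rejected proposal repeats the current theta
    return chain

# Synthetic data (an assumption): a step from 0 to 2 at t = 0.6.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 100)
y = np.where(t <= 0.6, 0.0, 2.0) + rng.normal(0.0, 0.3, t.size)
chain = metropolis_hastings(t, y, 0.3, theta0=(0.5, 0.0, 1.0),
                            step=np.array([0.02, 0.05, 0.05]),
                            n_iter=20000, rng=rng)
tc_mean, mu1_mean, mu2_mean = chain[5000:].mean(axis=0)  # drop burn-in
```

A transdimensional sampler adds moves that create or delete partitions. For those moves the factors R and |J| generally differ from 1 and must be worked out for each move type.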

slide-10
SLIDE 10

Sampling Partition Models: natural parsimony

Better data fit

Likelihood

slide-11
SLIDE 11

1D partition models for data interpolation

Atmospheric dust input to peat bogs

Looking for common signature in multiple systems

[Figure: legend: Mean ± 95% C.I., Max. Like.; annotations: 38,500 yr, 45,500 yr, 8,850 yr]

slide-12
SLIDE 12

Partition Models – 2D example function

slide-13
SLIDE 13

Partition Sampling – 2D single realisation

Multiple realisations … ensemble average (smooth, but maintain discontinuities)

slide-14
SLIDE 14

Partition Model Digital Elevation Model (DEM) example

[Figure, two panels: "Raw ERS Sample Image" and "Contour Plot of Partition Model"; both axes in pixels, 10 to 60; colour scale: Pixel Value]

slide-15
SLIDE 15

Partition Models

Application to spatially variable physical processes and parameters

Example from thermochronology

slide-16
SLIDE 16

Thermochronology: data are sensitive to the temperature history experienced by the host rock

e.g. apatite fission track analysis

p(D|θ) = ƒ(T(t),φ)

Likelihood is a non-linear function of unknown parameters at each location within each partition

slide-17
SLIDE 17

Model partition distribution and thermal histories

slide-18
SLIDE 18

The problem is to find

  • (a) how to partition the samples in 2D: (i) number of partitions, (ii) location of the partitions
  • (b) the distribution of thermal histories in each partition

slide-19
SLIDE 19

Inferred partition distribution and thermal histories

[Figure: inferred and true partition distributions; axes X1 and X2, 0 to 1; scale P, 0.05 to 0.25]

(Stephenson, Gallagher and Holmes 2006)

slide-20
SLIDE 20

Summary

  • Partition models allow for unknown number of discontinuities with unknown geometry in variable dimensions
  • Bayesian approach deals with the problem in terms of probabilities…intuitive for model choice
  • Obtain probability distributions (partitions, model parameters, posterior predictions)
  • Bayesian approach is naturally parsimonious
  • Potential for self-adaptive/self-regularising model parameterisation

slide-21
SLIDE 21

Sampling Partition Models: distribution on number of partitions

[Figure: p(k), 0 to 1, against No. of Partitions (k), 1 to 6]

slide-22
SLIDE 22

Traditionally, each sample is modelled independently..

ignores spatial relationships….

slide-23
SLIDE 23

Traditionally, each sample is modelled independently..

ignores spatial relationships….

..ideally want to group samples with common thermal history

slide-24
SLIDE 24

Traditionally, each sample is modelled independently..

…but the spatial relationships may be unknown…