SLIDE 1

Pattern recognition in nuclear fusion data by means of geometric methods in probabilistic spaces

Geert Verdoolaege

Department of Applied Physics, Ghent University, Ghent, Belgium
Laboratory for Plasma Physics, Royal Military Academy (LPP–ERM/KMS), Brussels, Belgium

ECEA 2017, November 21 – December 1, 2017

SLIDE 2

1. Stochastic uncertainty in fusion plasmas
2. Pattern recognition in probabilistic spaces
3. Geodesic least squares regression
4. Application in fusion science: edge-localized plasma instabilities
5. Application in astronomy: Tully-Fisher scaling
6. Conclusion

Overview

SLIDE 4

‘Star on earth’
Clean, safe, inexhaustible energy source
Magnetic confinement fusion: tokamak, stellarator, ...
Confine hot hydrogen isotope plasma with magnetic fields
ITER: next-generation international tokamak
Complex physical system, turbulent transport
Difficult to probe → uncertainty in measurements and models

Fusion energy

SLIDE 5

Sources of statistical uncertainty:
  Fluctuation of system properties
  Measurement noise

[Figures: plasma turbulence (PPPL), edge-localized modes (MAST), confinement time vs. density (JET)]

Uncertainty in fusion plasmas

SLIDE 7

Patterns ↔ distances

Difference/distance between points

SLIDE 8

Zooming in...

SLIDE 9

Mahalanobis distance

SLIDE 10

Family of probability distributions → differentiable manifold
Parameters = coordinates
Metric tensor: Fisher information matrix
Parametric probability model $p(x|\theta)$ ⇒ $g_{\mu\nu}(\theta) = -E\left[\frac{\partial^2}{\partial\theta^\mu \partial\theta^\nu} \ln p(x|\theta)\right]$, $\mu, \nu = 1, \ldots, m$
θ = m-dimensional parameter vector
Line element: $ds^2 = g_{\mu\nu}\, d\theta^\mu d\theta^\nu$
Minimum-length curve: geodesic
Rao geodesic distance (GD)

Information geometry
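
A minimal numerical sketch of the metric definition on this slide, assuming a univariate Gaussian with θ = (µ, σ) as the test model. The names `log_pdf` and `fisher_metric`, the step size and the grid settings are illustrative choices, not part of the talk. The expected negative Hessian of ln p is approximated with central finite differences and a Riemann sum over x; for this model the known result is g = diag(1/σ², 2/σ²).

```python
import math


def log_pdf(x, mu, sigma):
    """Log-density of a univariate Gaussian N(mu, sigma^2)."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2


def fisher_metric(mu, sigma, h=1e-3, n=4001, width=10.0):
    """Approximate g_ij = -E[d^2 ln p / dtheta_i dtheta_j] for theta = (mu, sigma)."""
    theta = [mu, sigma]
    dx = 2.0 * width * sigma / (n - 1)  # grid spacing over mu +/- width*sigma
    g = [[0.0, 0.0], [0.0, 0.0]]

    def shifted(x, i, j, si, sj):
        # ln p evaluated with theta_i and theta_j shifted by +/- h
        t = list(theta)
        t[i] += si * h
        t[j] += sj * h
        return log_pdf(x, t[0], t[1])

    for k in range(n):
        x = mu - width * sigma + k * dx
        p = math.exp(log_pdf(x, mu, sigma))
        for i in range(2):
            for j in range(2):
                # central finite difference for the (mixed) second derivative
                d2 = (shifted(x, i, j, 1, 1) - shifted(x, i, j, 1, -1)
                      - shifted(x, i, j, -1, 1) + shifted(x, i, j, -1, -1)) / (4.0 * h * h)
                g[i][j] -= p * d2 * dx  # accumulate -E[d2] as a Riemann sum
    return g
```

For example, `fisher_metric(1.0, 2.0)` should come out close to [[0.25, 0], [0, 0.5]], i.e. 1/σ² and 2/σ² on the diagonal.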

SLIDE 11

Pattern recognition:
  Classification, clustering
  Regression analysis
  Dimensionality reduction, visualization

Observation/prediction (structureless number) → distribution (structured object)
More information, more flexibility

Pattern recognition in probabilistic spaces

SLIDE 12

PDF: $p(x|\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$
Line element: $ds^2 = \frac{d\mu^2}{\sigma^2} + 2\,\frac{d\sigma^2}{\sigma^2}$
Hyperbolic geometry: Poincaré half-plane, Poincaré disk, Klein disk, ...
Analytic geodesic distance
https://www.youtube.com/watch?v=i9IUzNxeH4o

The univariate Gaussian manifold
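
The analytic geodesic distance can be sketched as follows. The slide does not spell out the formula; the closed form used here is the standard one that follows from the line element: rescaling µ → µ/√2 turns the metric into twice the Poincaré half-plane metric, which gives GD = √2 arccosh(1 + [(µ1 − µ2)² + 2(σ1 − σ2)²] / (4σ1σ2)). The function name `rao_distance` is illustrative.

```python
import math


def rao_distance(mu1, sigma1, mu2, sigma2):
    """Rao geodesic distance between N(mu1, sigma1^2) and N(mu2, sigma2^2)."""
    if sigma1 <= 0.0 or sigma2 <= 0.0:
        raise ValueError("standard deviations must be positive")
    num = (mu1 - mu2) ** 2 + 2.0 * (sigma1 - sigma2) ** 2
    # sqrt(2) times the half-plane distance between (mu/sqrt(2), sigma) points
    return math.sqrt(2.0) * math.acosh(1.0 + num / (4.0 * sigma1 * sigma2))
```

Two Gaussians with equal means whose standard deviations differ by a factor e are a distance √2 apart, whatever the absolute scale: only the ratio of the standard deviations matters in that case.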

SLIDE 13

[Figure panels: Original, Compressed]

The pseudosphere (tractroid)

SLIDE 14

Geodesics on the Gaussian manifold

SLIDE 15

Plasma energy confinement time w.r.t. global plasma parameters
[Figure panels: Euclidean, Geodesic]

Data visualization with uncertainty

SLIDE 17

Data uncertainty: measurement error, fluctuations, ...
Model uncertainty: missing variables, linear vs. nonlinear, Gaussian vs. non-Gaussian, ...
Heterogeneous data and error bars
Uncertainty on response ($y$) and predictor ($x_j$) variables
Atypical observations (outliers)
Near-collinearity of predictor variables
Data transformations, e.g. $\ln(y) = \ln(\beta_0) + \beta_1 \ln(x_1) + \beta_2 \ln(x_2) + \ldots + \beta_p \ln(x_p)$

Challenges in regression analysis

SLIDE 18

Workhorse: ordinary least squares (OLS)
Maximum likelihood (ML) / maximum a posteriori (MAP):
$p(y_i|x_i,\theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{y_i-\mu_i}{\sigma}\right)^2\right]$, with $\mu_i = f_i(x_i,\theta)$, e.g. $\mu_i = \beta_0 + \beta_1 x_i$
Need flexible and robust regression
Parameter estimation → distance minimization: Expected ↔ Measured

Michigan, circa 1890s.

Least squares and maximum a posteriori
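
A short sketch of why OLS and Gaussian maximum likelihood coincide, assuming a straight-line model µi = β0 + β1 xi and one fixed σ for all points. The function names and the five data points are made up for illustration. With σ fixed, the negative log-likelihood is the residual sum of squares up to a constant, so the OLS coefficients also minimize it.

```python
import math


def ols_fit(x, y):
    """Closed-form ordinary least squares for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def gaussian_nll(b0, b1, sigma, x, y):
    """Negative log-likelihood of the data under N(b0 + b1 * x_i, sigma^2)."""
    return sum(
        0.5 * math.log(2.0 * math.pi * sigma ** 2)
        + 0.5 * ((yi - (b0 + b1 * xi)) / sigma) ** 2
        for xi, yi in zip(x, y)
    )


# Hypothetical data, for illustration only
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [1.1, 2.9, 5.2, 6.8, 9.0]
b0_hat, b1_hat = ols_fit(x_demo, y_demo)
```

Moving either coefficient away from the OLS values increases `gaussian_nll` for any fixed σ, which is the "distance minimization" reading of the fit in its simplest Euclidean form.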

SLIDE 19

Minimum distance estimation (Wolfowitz, 1952):
  Which distribution does the model predict? vs. Which distribution do you observe?
Gaussian case: different means and standard deviations
Hellinger divergence (Beran, 1977)
Empirical distribution: kernel density estimate

The minimum distance approach
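
To make the Gaussian case concrete, here is a hedged sketch of the squared Hellinger distance between two univariate Gaussians with different means and standard deviations. It assumes the convention H² = 1 − ∫√(p q) dx (some authors include an extra factor 2), and the function names are illustrative. A brute-force numerical integral is included only as a cross-check of the closed form.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def hellinger_sq_gauss(mu1, s1, mu2, s2):
    """Closed-form squared Hellinger distance, convention H^2 = 1 - BC."""
    var_sum = s1 ** 2 + s2 ** 2
    # Bhattacharyya coefficient BC = integral of sqrt(p * q)
    bc = math.sqrt(2.0 * s1 * s2 / var_sum) * math.exp(-(mu1 - mu2) ** 2 / (4.0 * var_sum))
    return 1.0 - bc


def hellinger_sq_numeric(mu1, s1, mu2, s2, n=20001, width=12.0):
    """Same quantity from a Riemann sum, as a sanity check."""
    lo = min(mu1 - width * s1, mu2 - width * s2)
    hi = max(mu1 + width * s1, mu2 + width * s2)
    dx = (hi - lo) / (n - 1)
    bc = 0.0
    for k in range(n):
        x = lo + k * dx
        bc += math.sqrt(gaussian_pdf(x, mu1, s1) * gaussian_pdf(x, mu2, s2)) * dx
    return 1.0 - bc
```

In a minimum distance fit, one of the two distributions would be the model prediction and the other the observed (or kernel-density) distribution, and the model parameters would be tuned to make this distance small.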

SLIDE 20

Modeled and observed distribution

SLIDE 21

Example: fluid turbulence

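SLIDE 22

Modeled distribution:
$\frac{1}{\sqrt{2\pi}\,\sigma_{\mathrm{mod}}} \exp\left\{-\frac{1}{2}\,\frac{\left[y - \left(\beta_0 + \sum_{j=1}^m \beta_j x_{ij}\right)\right]^2}{\sigma_{\mathrm{mod}}^2}\right\}$, with $\sigma_{\mathrm{mod}}^2 = \sigma_y^2 + \sum_{j=1}^m \beta_j^2 \sigma_{x,j}^2$
Observed distribution:
$\frac{1}{\sqrt{2\pi}\,\sigma_{\mathrm{obs}}} \exp\left[-\frac{1}{2}\,\frac{(y - y_i)^2}{\sigma_{\mathrm{obs}}^2}\right]$
Rao GD between modeled and observed distribution

Model-based approach: regression on probabilistic manifold
To be estimated: $\sigma_{\mathrm{obs}}, \beta_0, \beta_1, \ldots, \beta_m$
iid data: minimize sum of squared GDs ⇒ geodesic least squares (GLS) regression
If $\sigma_{\mathrm{mod}} = \sigma_{\mathrm{obs}}$ → Mahalanobis distance

G. Verdoolaege et al., Entropy 17, 4602, 2015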

Geodesic least squares
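
A rough sketch of the GLS objective for a single predictor, not the author's implementation. Assumptions: homogeneous error bars `sigma_x` and `sigma_y`, the closed-form Rao distance for univariate Gaussians, and σobs parametrized by its logarithm to keep it positive. All function names are hypothetical, and `compass_search` is a deliberately crude derivative-free minimizer standing in for a proper optimizer.

```python
import math


def rao_gd(mu1, s1, mu2, s2):
    """Rao geodesic distance between N(mu1, s1^2) and N(mu2, s2^2)."""
    num = (mu1 - mu2) ** 2 + 2.0 * (s1 - s2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + num / (4.0 * s1 * s2))


def gls_objective(params, x, y, sigma_x, sigma_y):
    """Sum of squared GDs; params = (beta0, beta1, log sigma_obs)."""
    b0, b1, log_s_obs = params
    s_obs = math.exp(log_s_obs)
    # modeled standard deviation: sigma_y^2 + beta1^2 * sigma_x^2 (one predictor)
    s_mod = math.sqrt(sigma_y ** 2 + (b1 * sigma_x) ** 2)
    total = 0.0
    for xi, yi in zip(x, y):
        mu_mod = b0 + b1 * xi  # modeled mean at x_i
        total += rao_gd(mu_mod, s_mod, yi, s_obs) ** 2
    return total


def compass_search(f, start, step=0.5, tol=1e-6, max_iter=10000):
    """Crude coordinate-wise search (stand-in for a real optimizer)."""
    best = list(start)
    f_best = f(best)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for k in range(len(best)):
            for sign in (1.0, -1.0):
                trial = list(best)
                trial[k] += sign * step
                f_trial = f(trial)
                if f_trial < f_best:
                    best, f_best, improved = trial, f_trial, True
        if not improved:
            step *= 0.5  # shrink the step when no move helps
    return best, f_best
```

The objective is zero only when every modeled mean matches its observation and σobs equals σmod; in real data the fitted σobs absorbs scatter that the stated error bars do not explain. In practice a library optimizer such as `scipy.optimize.minimize` would replace `compass_search`.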

SLIDE 24

Repetitive instabilities in plasma edge
Magnetohydrodynamic origin

MAST, Culham Centre for Fusion Energy, UK

Edge-localized modes (ELMs)

SLIDE 25

Analogy 1: Solar flares

SLIDE 26

Analogy 2: Cooking pot

SLIDE 27

Confinement loss
Potential damaging effects
Impurity outflux
→ ELM control/mitigation
Energy ∝ (frequency)$^{-1}$

Importance of ELMs

SLIDE 28

32 recent JET discharges
Waiting time: time before ELM burst

Data extraction: waiting times

SLIDE 29

Energy carried from the plasma by an ELM

Data extraction: energies

SLIDE 30

Average waiting times and energies

SLIDE 31

Standard deviation / √n → error bars

Error bars on averages
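
The error-bar rule on this slide can be sketched as below, assuming the sample standard deviation (n − 1 denominator) is meant; the slide does not say which estimator was used. The function name is illustrative.

```python
import math
import statistics


def mean_with_error_bar(values):
    """Return (mean, standard error), with standard error = sample std / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values for a standard deviation")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)
```

Applied to the waiting times or energies of all ELMs in one discharge, this would give one averaged point with its error bar per discharge.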

SLIDE 32

$E_{\mathrm{ELM}} = \beta_0 + \beta_1 \Delta t_{\mathrm{ELM}}$, $\sigma_{E,\mathrm{obs}} \propto \mu_{E,\mathrm{obs}}$

Regression on averages

SLIDE 33

Regression results on pseudosphere

SLIDE 34

Multidimensional scaling:

Projected regression results

SLIDE 35

Average
Method   β0 (MJ)   β1 (MJ/s)
OLS      • 0.050   5.7
GLS      • 0.021   4.6

Individual
Method   β0 (MJ)   β1 (MJ/s)
OLS      0.024     3.2
GLS      • 0.022   4.2

Average vs. collective trend

SLIDE 37

Simple, tight relation for disk galaxies: $M_b = \beta_0 V_f^{\beta_1}$
  $M_b$ = total (stellar + gaseous) baryonic mass ($M_\odot$)
  $V_f$ = rotational velocity (km s$^{-1}$)
Various purposes:
  Distance indicator
  Constraints on galaxy formation models
  Test for alternatives to ΛCDM cosmological model (slope and scatter)

Baryonic Tully-Fisher Relation (BTFR)
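
Taking logarithms turns the power law into a straight line, ln Mb = ln β0 + β1 ln Vf, so the simplest way to estimate the slope is ordinary least squares in log space. The sketch below shows only that baseline; it ignores the uncertainties on both variables and is not the geodesic method. The function name and the synthetic numbers are hypothetical and are not BTFR values.

```python
import math


def fit_power_law(v, m):
    """Fit m = beta0 * v**beta1 by ordinary least squares on (ln v, ln m)."""
    lx = [math.log(vi) for vi in v]
    ly = [math.log(mi) for mi in m]
    n = len(lx)
    x_bar = sum(lx) / n
    y_bar = sum(ly) / n
    sxx = sum((a - x_bar) ** 2 for a in lx)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(lx, ly))
    beta1 = sxy / sxx                 # slope in log space = power-law exponent
    ln_beta0 = y_bar - beta1 * x_bar  # intercept in log space = ln(beta0)
    return math.exp(ln_beta0), beta1


# Synthetic, noise-free data generated from m = 2 * v**3 (illustration only)
v_demo = [50.0, 100.0, 150.0, 200.0, 250.0]
m_demo = [2.0 * vi ** 3 for vi in v_demo]
beta0_hat, beta1_hat = fit_power_law(v_demo, m_demo)
```

On noise-free data the fit recovers the generating coefficient and exponent; with real error bars on both Mb and Vf, the log transform also distorts the error distribution, which is one of the regression challenges listed earlier in the talk.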

SLIDE 38

47 gas-rich galaxies (McGaugh, Astron. J. 143, 40, 2012)
Loglinear ($\sigma_{\mathrm{obs},i} \equiv s_{\mathrm{obs}}$) and nonlinear ($\sigma_{\mathrm{obs},i} = r_{\mathrm{obs}} M_b$)
Benchmarking:
  Ordinary least squares (OLS)
  Bayesian: errors in all variables, marginalized standard deviations (Bayes)
  Geodesic least squares (GLS)
  Kullback-Leibler least squares (KLS)

Experiments

SLIDE 39

Loglinear regression

SLIDE 40

Nonlinear regression

SLIDE 41

Parameter distributions

SLIDE 42

$r_{M_b} \approx 38\%$, $r_{\mathrm{obs}} \approx 63\%$

GLS uncertainty estimates

SLIDE 43

Interpretation on pseudosphere

SLIDE 45

Probabilistic modeling of stochastic system properties
Information geometry: distance measure, geometrical intuition
Pattern recognition in probabilistic spaces
  More information, more flexibility
Geodesic least squares regression: flexible and robust
Easy to use, fast optimization

Conclusions