Improving Forecasts of Extreme Values by Machine Learning Models Using Occam's Razor
William W. Hsieh, University of British Columbia (Visiting Scientist at Univ. of Victoria)
American Meteorological Society Annual Meeting, January 2018, Austin
Introduction
- Machine learning (ML) methods were developed mostly for discrete data.
- In environmental science:
  - The data are mostly continuous.
  - Extreme values are important.
  - Are ML methods therefore ill-suited to extreme values?
- With continuous data: wait long enough, and a new predictor value will lie outside the training range => the ML model is extrapolating!
- Extreme learning machine (ELM): a 1-hidden-layer artificial neural network (ANN) with random weights at the hidden nodes.
  - Ensemble-average the output from 100 runs.
  - 3 choices of activation function at the hidden layer: (a) sigmoidal, (b) Gaussian (RadBas), (c) softplus.
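The ELM described above can be sketched in a few lines of NumPy: hidden weights and biases are drawn at random and fixed, and only the output weights are fitted by linear least squares, with the final prediction averaged over an ensemble of runs. The function names, hidden-layer size, and the use of tanh as the sigmoidal activation are illustrative assumptions, not details from the talk.

```python
import numpy as np

def elm_fit(X, y, n_hidden=30, activation=np.tanh, seed=None):
    """Fit one ELM: random fixed hidden weights, output weights by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random hidden weights (fixed)
    b = rng.standard_normal(n_hidden)                # random hidden biases (fixed)
    H = activation(X @ W + b)                        # hidden-layer outputs
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)     # linear solve for output weights
    return W, b, beta

def elm_predict(X, W, b, beta, activation=np.tanh):
    return activation(X @ W + b) @ beta

def elm_ensemble_predict(X_train, y_train, X_test, n_runs=100, **kw):
    """Ensemble-average the output over runs with different random weights."""
    preds = [elm_predict(X_test, *elm_fit(X_train, y_train, seed=i, **kw))
             for i in range(n_runs)]
    return np.mean(preds, axis=0)
```

Because only the output layer is fitted, each run reduces to one linear least-squares solve, which is what makes large ensembles cheap.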
[Figure: training data x sampled from the true signal y = x + 0.2x² (dashed). Solid line = linear regression; black curves = ELM with activation function (a) sigmoidal, (b) radial basis, (c) softplus; + = extrapolated values. Panel (d): the ELM solutions (a), (b), (c) over an extended domain.]
- Occam's razor: among competing hypotheses, the one with the fewest assumptions should be selected (parsimony).
- In the extrapolation region, Occam would avoid nonlinear ML models with many parameters; should we instead use a linear model?
- New idea:
  1) In predictor space, determine which test data points involve extrapolation (based on the Mahalanobis distance to the training dataset).
  2) Use the nonlinear ML solution to perform a linear extrapolation.
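Step 1 might be sketched as follows. The threshold choice here (the largest Mahalanobis distance found within the training set itself) is an illustrative assumption, as is the function name; the talk does not specify a threshold.

```python
import numpy as np

def mahalanobis_flags(X_train, X_test):
    """Flag test points whose Mahalanobis distance to the training data
    exceeds the largest distance seen in the training set (extrapolation)."""
    mu = X_train.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X_train, rowvar=False))

    def dist(X):
        d = X - mu
        # per-row quadratic form d_i . C^-1 . d_i
        return np.sqrt(np.einsum('ij,jk,ik->i', d, cov_inv, d))

    return dist(X_test) > dist(X_train).max()
```

Unlike Euclidean distance, the Mahalanobis distance accounts for the covariance between predictors, so "outside the training range" is judged relative to the shape of the training cloud.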
- Example: predict Vancouver airport (YVR) precipitation amount (on precipitation days). 3 predictors: SLP, humidity, Z500 (NCEP Reanalysis); 1971-76 for training, 1978-2000 for testing.
[Figure: scatter plots of training and test data in the (x1, x3) predictor plane, marking an outlier and the centre of the training cluster. Scheme 1: extrapolate from the nearest training neighbour, using the ML model to compute the gradient. Scheme 2: extrapolate from the centre of the training cluster, using these 2 points to compute the gradient for extrapolation.]
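The linear-extrapolation step amounts to a first-order Taylor expansion of the fitted nonlinear model around a reference point inside the training domain (the nearest training neighbour, or the cluster centre), with the gradient estimated by finite differences. A minimal sketch, in which the function name and the default step size h are illustrative assumptions:

```python
import numpy as np

def linear_extrapolate(model, x_out, x_ref, h=1e-2):
    """Predict at outlier x_out as model(x_ref) + grad model(x_ref) . (x_out - x_ref),
    with the gradient estimated by forward finite differences of step h."""
    f0 = model(x_ref)
    grad = np.array([(model(x_ref + h * e) - f0) / h
                     for e in np.eye(len(x_ref))])   # one difference per predictor
    return f0 + grad @ (x_out - x_ref)
```

Calling this with the ML model and a reference point near the training boundary replaces the model's own (possibly wild) nonlinear extrapolation with a straight-line continuation.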
- Use both extrapolation schemes, each with a fine and a coarse finite-difference estimate of the gradient => 4 extrapolation schemes.
  - Take the median of the 4 extrapolated values and the original value.
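The combining step above can be sketched as taking, at each test point, the median of the five candidate predictions (the four extrapolation schemes plus the original model output). The function name is illustrative.

```python
import numpy as np

def combine_predictions(original, scheme_preds):
    """Median over the original prediction and the 4 extrapolation schemes.

    original: array of shape (n,); scheme_preds: list of 4 arrays of shape (n,).
    """
    stacked = np.vstack([original] + list(scheme_preds))  # shape (5, n)
    return np.median(stacked, axis=0)
```

Using the median rather than the mean keeps any one badly behaved extrapolation scheme from dominating the combined prediction.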
- Compute the mean absolute error (MAE), and a skill score (SS) relative to the original ML model's MAE.
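A skill score of this form is conventionally SS = 1 - MAE_model / MAE_reference, with the original ML model supplying the reference MAE, so SS > 0 means the extrapolation scheme beats the original model. A sketch (function names illustrative):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def mae_skill_score(y_true, y_pred, y_ref_pred):
    """SS = 1 - MAE(model) / MAE(reference); positive SS beats the reference."""
    return 1.0 - mae(y_true, y_pred) / mae(y_true, y_ref_pred)
```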
- 4 datasets: YVR precipitation, streamflow at Englishman River (ENG) and Stave River (STA), sediment concentration at Fraser River (FRA).
  - Also reversed the training and testing data (rev).
- Ran ELM:
  - 200 trials with different random number sequences.
[Figure: MAE skill score (SS) on the extrapolated data for each dataset (ENG, ENG(rev), STA, STA(rev), YVR, YVR(rev), FRA(rev)) and each activation function (sigmoid, radbas, softplus).]
Simple alternative: train MLR (multiple linear regression) and use its output at the extrapolation points.
[Figure: MAE skill score (SS) on the extrapolated data using MLR, for the same datasets (ENG, ENG(rev), STA, STA(rev), YVR, YVR(rev), FRA(rev)) and activation functions (sigmoid, radbas, softplus).]
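This alternative can be sketched as follows, assuming the extrapolation points have already been flagged (e.g. by the Mahalanobis test in predictor space); the function name is illustrative:

```python
import numpy as np

def mlr_fallback(X_train, y_train, X_test, ml_pred, extrap_mask):
    """Fit MLR (with intercept) on the training data by least squares, and
    substitute its prediction at the flagged extrapolation points, keeping
    the nonlinear ML prediction everywhere else."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    mlr_pred = np.column_stack([np.ones(len(X_test)), X_test]) @ coef
    return np.where(extrap_mask, mlr_pred, ml_pred)
```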
Boxplots of the 21 medians (of SS over 200 trials) for MLR and for ELM with linear extrapolation, over the extrapolated data.
[Figure: boxplots of the MAE SS, RMSE SS and correlation SS medians for the two methods.]
Conclusion & future work
- For extreme values, ML models often perform nonlinear extrapolation.
- Following Occam, we propose linear extrapolation instead of nonlinear extrapolation:
  - Use the nonlinear ML solution to extrapolate linearly.
  - Or simply use an MLR model at the extrapolation points.
- Future improvements:
  - Determination of outliers by the Mahalanobis distance is not robust; replace it with a more robust method.
  - Some predictors may be discrete variables; the current linear extrapolation schemes will need modification.