State space methods for temporal GPs
Arno Solin Assistant Professor in Machine Learning Department of Computer Science Aalto University
GAUSSIAN PROCESS SUMMER SCHOOL September 11, 2019
@arnosolin
arno.solin.fi
State space methods for temporal GPs Arno Solin 2/44
Outline
◮ Motivation: Temporal models
◮ Three views into GPs
◮ State space models
◮ General likelihoods
◮ Spatio-temporal GPs
◮ Further extensions
◮ Recap
$$f(t) \sim \mathrm{GP}(\mu(t), \kappa(t, t')) \quad \text{(GP prior)}$$
$$\mathbf{y} \mid \mathbf{f} \sim \prod_i p(y_i \mid f(t_i)) \quad \text{(likelihood)}$$
◮ Let's focus on the GP prior only.
◮ A temporal Gaussian process (GP) is a random function $f(t)$ such that the joint distribution of $f(t_1), \ldots, f(t_n)$ is always Gaussian.
◮ The mean and covariance functions have the form
$$\mu(t) = \mathrm{E}[f(t)], \qquad \kappa(t, t') = \mathrm{E}[(f(t) - \mu(t))(f(t') - \mu(t'))^\mathsf{T}].$$
◮ Convenient for model specification, but expanding the kernel to a covariance matrix can be problematic (the notorious $O(n^3)$ scaling).
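As a minimal sketch of the kernel view, the following evaluates a covariance function into a covariance matrix and draws prior samples. The Matérn ν = 3/2 kernel, the zero mean, the hyperparameter values, and all names (`matern32`, `sigma2`, `ell`) are assumptions made for illustration, not choices taken from the slides.

```python
import numpy as np

def matern32(t, t_prime, sigma2=1.0, ell=1.0):
    """Matern nu=3/2 covariance evaluated for all pairs of inputs."""
    lam = np.sqrt(3.0) / ell
    tau = np.abs(t[:, None] - t_prime[None, :])
    return sigma2 * (1.0 + lam * tau) * np.exp(-lam * tau)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)

# Expanding the kernel into an n x n covariance matrix.
K = matern32(t, t)

# The Cholesky factorisation is the O(n^3) step; a small jitter on the
# diagonal keeps it numerically stable.
chol = np.linalg.cholesky(K + 1e-9 * np.eye(t.size))

# Three draws from the zero-mean GP prior, one per column.
samples = chol @ rng.standard_normal((t.size, 3))
```

Doubling the number of inputs makes the factorisation roughly eight times more expensive, which is the scaling issue the state space view is meant to avoid.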
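◮ The Fourier transform of a function $f(t) : \mathbb{R} \to \mathbb{R}$ is
$$\mathcal{F}[f](\mathrm{i}\,\omega) = \int_{\mathbb{R}} f(t) \exp(-\mathrm{i}\,\omega\,t)\,\mathrm{d}t.$$
◮ For a stationary GP, the covariance function can be written in terms of the difference between two inputs: $\kappa(t, t') \triangleq \kappa(t - t')$.
◮ Wiener–Khinchin: If $f(t)$ is a stationary Gaussian process with covariance function $\kappa(t)$, then its spectral density is $S(\omega) = \mathcal{F}[\kappa]$.
◮ Spectral representation of a GP in terms of the spectral density function:
$$S(\omega) = \mathrm{E}[\tilde{f}(\mathrm{i}\,\omega)\,\tilde{f}^\mathsf{T}(-\mathrm{i}\,\omega)].$$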
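The Wiener–Khinchin relation can be checked numerically. The sketch below assumes a Matérn ν = 3/2 covariance (an illustrative choice) and compares a plain Riemann-sum approximation of its Fourier transform with the known closed-form spectral density $4\lambda^3\sigma^2/(\lambda^2+\omega^2)^2$, where $\lambda = \sqrt{3}/\ell$. Grid sizes and variable names are assumptions.

```python
import numpy as np

sigma2, ell = 1.0, 1.0
lam = np.sqrt(3.0) / ell

# Grid wide enough for the covariance to have decayed to zero at the ends.
tau = np.linspace(-40.0, 40.0, 80001)
dtau = tau[1] - tau[0]
kappa = sigma2 * (1.0 + lam * np.abs(tau)) * np.exp(-lam * np.abs(tau))

omega = np.linspace(-4.0, 4.0, 9)

# kappa is even, so the Fourier integral reduces to a cosine transform,
# approximated here by a Riemann sum.
S_numeric = np.array([np.sum(kappa * np.cos(w * tau)) * dtau for w in omega])

# Closed-form spectral density of the Matern 3/2 covariance.
S_closed = 4.0 * lam**3 * sigma2 / (lam**2 + omega**2) ** 2
```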
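◮ Path or state space representation as the solution to a linear time-invariant (LTI) stochastic differential equation (SDE):
$$\mathrm{d}\mathbf{f} = \mathbf{F}\,\mathbf{f}\,\mathrm{d}t + \mathbf{L}\,\mathrm{d}\boldsymbol{\beta},$$
where $\mathbf{f} = (f, \mathrm{d}f/\mathrm{d}t, \ldots)$ and $\boldsymbol{\beta}(t)$ is a vector of Wiener processes.
◮ Equivalently, but more informally,
$$\frac{\mathrm{d}\mathbf{f}(t)}{\mathrm{d}t} = \mathbf{F}\,\mathbf{f}(t) + \mathbf{L}\,\mathbf{w}(t),$$
where $\mathbf{w}(t)$ is white noise.
◮ The model now consists of a drift matrix $\mathbf{F} \in \mathbb{R}^{m \times m}$, a diffusion matrix $\mathbf{L} \in \mathbb{R}^{m \times s}$, and the spectral density matrix of the white noise process $\mathbf{Q}_c \in \mathbb{R}^{s \times s}$.
◮ The scalar-valued GP can be recovered by $f(t) = \mathbf{h}^\mathsf{T}\,\mathbf{f}(t)$.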
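To make the ingredients concrete, here is a sketch of the state space matrices for a Matérn ν = 3/2 GP, where the state is $(f, \mathrm{d}f/\mathrm{d}t)$, together with a crude Euler–Maruyama simulation of the SDE. The kernel choice, hyperparameters, step size, and variable names are illustrative assumptions. Euler–Maruyama is used here only because it is the simplest scheme to read.

```python
import numpy as np

sigma2, ell = 1.0, 1.0
lam = np.sqrt(3.0) / ell

F = np.array([[0.0, 1.0], [-lam**2, -2.0 * lam]])   # drift matrix (m = 2)
L = np.array([[0.0], [1.0]])                        # diffusion matrix (s = 1)
Qc = np.array([[4.0 * lam**3 * sigma2]])            # white-noise spectral density
h = np.array([1.0, 0.0])                            # picks f(t) out of the state

# Euler-Maruyama simulation of df = F f dt + L dbeta.
rng = np.random.default_rng(1)
dt, n_steps = 1e-3, 10_000

# Start from the stationary state; diag(sigma2, lam^2 sigma2) is the known
# stationary covariance of this particular model.
state = rng.multivariate_normal(np.zeros(2), np.diag([sigma2, lam**2 * sigma2]))

path = np.empty(n_steps)
for k in range(n_steps):
    # Wiener increment with variance Qc * dt.
    dbeta = rng.normal(0.0, np.sqrt(Qc[0, 0] * dt), size=1)
    state = state + F @ state * dt + L @ dbeta
    path[k] = h @ state                             # f(t) = h^T f(t)
```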
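◮ The initial state is given by a stationary state $\mathbf{f}(0) \sim \mathrm{N}(\mathbf{0}, \mathbf{P}_\infty)$ which fulfils
$$\mathbf{F}\,\mathbf{P}_\infty + \mathbf{P}_\infty\,\mathbf{F}^\mathsf{T} + \mathbf{L}\,\mathbf{Q}_c\,\mathbf{L}^\mathsf{T} = \mathbf{0}.$$
◮ The covariance function at the stationary state can be recovered by
$$\kappa(t, t') = \begin{cases} \mathbf{h}^\mathsf{T}\,\mathbf{P}_\infty \exp((t' - t)\,\mathbf{F})^\mathsf{T}\,\mathbf{h}, & t' \geq t, \\ \mathbf{h}^\mathsf{T} \exp((t - t')\,\mathbf{F})\,\mathbf{P}_\infty\,\mathbf{h}, & t' < t, \end{cases}$$
where $\exp(\cdot)$ denotes the matrix exponential function.
◮ The spectral density function at the stationary state can be recovered by
$$S(\omega) = \mathbf{h}^\mathsf{T} (\mathbf{F} + \mathrm{i}\,\omega\,\mathbf{I})^{-1}\,\mathbf{L}\,\mathbf{Q}_c\,\mathbf{L}^\mathsf{T} (\mathbf{F} - \mathrm{i}\,\omega\,\mathbf{I})^{-\mathsf{T}}\,\mathbf{h}.$$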
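These three relations can be verified numerically for a small model. The sketch below assumes the Matérn ν = 3/2 state space model, solves the Lyapunov equation as a plain linear system (Kronecker-product vectorisation, chosen here to stay within NumPy), and uses a closed-form matrix exponential that holds only for this particular drift matrix. All function names are hypothetical.

```python
import numpy as np

sigma2, ell = 1.0, 1.0
lam = np.sqrt(3.0) / ell
F = np.array([[0.0, 1.0], [-lam**2, -2.0 * lam]])
L = np.array([[0.0], [1.0]])
Qc = np.array([[4.0 * lam**3 * sigma2]])
h = np.array([1.0, 0.0])
I2 = np.eye(2)

# Lyapunov equation F P + P F^T + L Qc L^T = 0, written as a linear system
# in the entries of P.
lhs = np.kron(F, I2) + np.kron(I2, F)
rhs = -(L @ Qc @ L.T).reshape(-1)
P_inf = np.linalg.solve(lhs, rhs).reshape(2, 2)

def expm_matern32(dt):
    """exp(F dt) in closed form; valid here because (F + lam I)^2 = 0."""
    return np.exp(-lam * dt) * (I2 + dt * (F + lam * I2))

def kappa_ss(tau):
    """Covariance recovered from the state space model (symmetric in tau)."""
    return h @ P_inf @ expm_matern32(abs(tau)).T @ h

def spectral_density_ss(omega):
    """S(omega) = h^T (F + i w I)^-1 L Qc L^T (F - i w I)^-T h."""
    G_plus = np.linalg.inv(F + 1j * omega * I2)
    G_minus = np.linalg.inv(F - 1j * omega * I2)
    return (h @ G_plus @ L @ Qc @ L.T @ G_minus.T @ h).real
```

For this model, `kappa_ss` should agree with $\sigma^2 (1 + \lambda|\tau|) \exp(-\lambda|\tau|)$ and `spectral_density_ss` with $4\lambda^3\sigma^2/(\lambda^2+\omega^2)^2$.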
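◮ Just as the kernel has to be evaluated into a covariance matrix for computations, the SDE can be solved at discrete time points $\{t_i\}_{i=1}^n$.
◮ The resulting model is a discrete state space model:
$$\mathbf{f}_i = \mathbf{A}_{i-1}\,\mathbf{f}_{i-1} + \mathbf{q}_{i-1}, \qquad \mathbf{q}_i \sim \mathrm{N}(\mathbf{0}, \mathbf{Q}_i),$$
where $\mathbf{f}_i = \mathbf{f}(t_i)$.
◮ The discrete-time model matrices are given by
$$\mathbf{A}_i = \exp(\mathbf{F}\,\Delta t_i), \qquad \mathbf{Q}_i = \int_0^{\Delta t_i} \exp(\mathbf{F}\,(\Delta t_i - \tau))\,\mathbf{L}\,\mathbf{Q}_c\,\mathbf{L}^\mathsf{T} \exp(\mathbf{F}\,(\Delta t_i - \tau))^\mathsf{T}\,\mathrm{d}\tau,$$
where $\Delta t_i = t_{i+1} - t_i$.
◮ If the model is stationary, $\mathbf{Q}_i$ is given by
$$\mathbf{Q}_i = \mathbf{P}_\infty - \mathbf{A}_i\,\mathbf{P}_\infty\,\mathbf{A}_i^\mathsf{T}.$$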
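The two expressions for the process noise covariance can be compared directly. The sketch below again assumes the Matérn ν = 3/2 model and its closed-form matrix exponential, and checks the stationary shortcut against a midpoint-rule approximation of the integral. Function names and the number of quadrature nodes are assumptions.

```python
import numpy as np

sigma2, ell = 1.0, 1.0
lam = np.sqrt(3.0) / ell
F = np.array([[0.0, 1.0], [-lam**2, -2.0 * lam]])
L = np.array([[0.0], [1.0]])
Qc = np.array([[4.0 * lam**3 * sigma2]])
P_inf = np.diag([sigma2, lam**2 * sigma2])   # stationary covariance (Matern 3/2)

def transition(dt):
    """A = exp(F dt); closed form because (F + lam I)^2 = 0 for this model."""
    return np.exp(-lam * dt) * (np.eye(2) + dt * (F + lam * np.eye(2)))

def process_noise(dt):
    """Q = P_inf - A P_inf A^T, the shortcut for stationary models."""
    A = transition(dt)
    return P_inf - A @ P_inf @ A.T

def process_noise_quadrature(dt, n_nodes=2000):
    """Midpoint-rule approximation of the integral definition of Q."""
    width = dt / n_nodes
    nodes = (np.arange(n_nodes) + 0.5) * width
    Q = np.zeros((2, 2))
    for s in nodes:
        E = transition(dt - s)
        Q += E @ L @ Qc @ L.T @ E.T * width
    return Q
```

Because both matrices depend only on the step length, irregularly sampled time points need no special treatment: each gap simply gets its own `transition(dt)` and `process_noise(dt)`.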
[Figure: three panels showing the covariance function κ(τ) against τ = t − t′, the spectral density function S(ω) against ω, and sample functions f(t) against the input t.]
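◮ Consider noisy observations $\{(t_i, y_i)\}_{i=1}^n$ with $y_i = f(t_i) + \varepsilon_i$, $\varepsilon_i \sim \mathrm{N}(0, \sigma_n^2)$.
◮ The kernel-based GP regression solution at a test input $t_*$ is
$$\mathrm{E}[f_*] = \mathbf{k}_*\,(\mathbf{K} + \sigma_n^2\,\mathbf{I})^{-1}\,\mathbf{y}, \qquad \mathrm{V}[f_*] = \kappa(t_*, t_*) - \mathbf{k}_*\,(\mathbf{K} + \sigma_n^2\,\mathbf{I})^{-1}\,\mathbf{k}_*^\mathsf{T}.$$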
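For reference, the batch GP regression solve that the sequential methods avoid can be sketched as follows. The Matérn ν = 3/2 kernel, the synthetic sine data, the noise level, and the function names are assumptions for illustration only.

```python
import numpy as np

def matern32(t, t_prime, sigma2=1.0, ell=1.0):
    """Matern nu=3/2 covariance evaluated for all pairs of inputs."""
    lam = np.sqrt(3.0) / ell
    tau = np.abs(t[:, None] - t_prime[None, :])
    return sigma2 * (1.0 + lam * tau) * np.exp(-lam * tau)

def gp_regression(t, y, t_star, noise_var, **kernel_args):
    """Batch GP posterior mean and variance at t_star (zero prior mean)."""
    K = matern32(t, t, **kernel_args) + noise_var * np.eye(t.size)
    k_star = matern32(t_star, t, **kernel_args)      # one row per test input
    chol = np.linalg.cholesky(K)                     # the O(n^3) step
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    mean = k_star @ alpha
    v = np.linalg.solve(chol, k_star.T)
    prior_var = matern32(t_star, t_star, **kernel_args).diagonal()
    var = prior_var - np.sum(v**2, axis=0)
    return mean, var

# Synthetic data: a noisy sine observed at irregular time points.
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 10.0, 50))
y = np.sin(t) + 0.1 * rng.standard_normal(t.size)
t_star = np.linspace(0.0, 10.0, 101)
mean, var = gp_regression(t, y, t_star, noise_var=0.01)
```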
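◮ The sequential solution (known as the 'Kalman filter') considers one data point at a time.
◮ Start from $\mathbf{m}_0 = \mathbf{0}$ and $\mathbf{P}_0 = \mathbf{P}_\infty$, and for each data point iterate the following steps.
◮ Kalman prediction:
$$\begin{aligned}
\mathbf{m}_{i|i-1} &= \mathbf{A}_{i-1}\,\mathbf{m}_{i-1|i-1}, \\
\mathbf{P}_{i|i-1} &= \mathbf{A}_{i-1}\,\mathbf{P}_{i-1|i-1}\,\mathbf{A}_{i-1}^\mathsf{T} + \mathbf{Q}_{i-1}.
\end{aligned}$$
◮ Kalman update:
$$\begin{aligned}
v_i &= y_i - \mathbf{h}^\mathsf{T}\,\mathbf{m}_{i|i-1}, \\
S_i &= \mathbf{h}^\mathsf{T}\,\mathbf{P}_{i|i-1}\,\mathbf{h} + \sigma_n^2, \\
\mathbf{K}_i &= \mathbf{P}_{i|i-1}\,\mathbf{h}\,S_i^{-1}, \\
\mathbf{m}_{i|i} &= \mathbf{m}_{i|i-1} + \mathbf{K}_i\,v_i, \\
\mathbf{P}_{i|i} &= \mathbf{P}_{i|i-1} - \mathbf{K}_i\,S_i\,\mathbf{K}_i^\mathsf{T}.
\end{aligned}$$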
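The prediction and update steps translate almost line by line into code. The following is a minimal sketch for a Matérn ν = 3/2 prior with Gaussian observation noise; the model choice, the closed-form transition matrix (valid for this drift matrix only), and the names `matern32_ss` and `kalman_filter` are assumptions, not an implementation from the lecture.

```python
import numpy as np

def matern32_ss(sigma2, ell):
    """Transition function, stationary covariance and h for Matern 3/2."""
    lam = np.sqrt(3.0) / ell
    F = np.array([[0.0, 1.0], [-lam**2, -2.0 * lam]])
    P_inf = np.diag([sigma2, lam**2 * sigma2])
    h = np.array([1.0, 0.0])

    def transition(dt):
        # exp(F dt) in closed form; valid because (F + lam I)^2 = 0.
        return np.exp(-lam * dt) * (np.eye(2) + dt * (F + lam * np.eye(2)))

    return transition, P_inf, h

def kalman_filter(t, y, noise_var, sigma2=1.0, ell=1.0):
    """Filtered means m_{i|i} and covariances P_{i|i} for sorted inputs t."""
    transition, P_inf, h = matern32_ss(sigma2, ell)
    m, P = np.zeros(2), P_inf.copy()        # m_0 = 0, P_0 = P_inf
    means, covs = [], []
    t_prev = t[0]                           # zero-length first step keeps the prior
    for t_i, y_i in zip(t, y):
        # Kalman prediction
        A = transition(t_i - t_prev)
        Q = P_inf - A @ P_inf @ A.T
        m, P = A @ m, A @ P @ A.T + Q
        # Kalman update
        v = y_i - h @ m                     # innovation
        S = h @ P @ h + noise_var           # innovation variance
        K = P @ h / S                       # Kalman gain
        m = m + K * v
        P = P - np.outer(K, K) * S
        means.append(m)
        covs.append(P)
        t_prev = t_i
    return np.array(means), np.array(covs)
```

Each step costs a fixed amount of work in the state dimension, so the whole sweep is linear in the number of data points. The filtered estimate at the last time point already conditions on all data, so it should match the batch GP posterior there.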
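◮ To condition all time-marginals on all data, run a backward sweep (Rauch–Tung–Striebel smoother):
$$\begin{aligned}
\mathbf{m}_{i+1|i} &= \mathbf{A}_i\,\mathbf{m}_{i|i}, \\
\mathbf{P}_{i+1|i} &= \mathbf{A}_i\,\mathbf{P}_{i|i}\,\mathbf{A}_i^\mathsf{T} + \mathbf{Q}_i, \\
\mathbf{G}_i &= \mathbf{P}_{i|i}\,\mathbf{A}_i^\mathsf{T}\,\mathbf{P}_{i+1|i}^{-1}, \\
\mathbf{m}_{i|n} &= \mathbf{m}_{i|i} + \mathbf{G}_i\,(\mathbf{m}_{i+1|n} - \mathbf{m}_{i+1|i}), \\
\mathbf{P}_{i|n} &= \mathbf{P}_{i|i} + \mathbf{G}_i\,(\mathbf{P}_{i+1|n} - \mathbf{P}_{i+1|i})\,\mathbf{G}_i^\mathsf{T}.
\end{aligned}$$
◮ The marginal mean and variance can be recovered by
$$\mathrm{E}[f_i] = \mathbf{h}^\mathsf{T}\,\mathbf{m}_{i|n}, \qquad \mathrm{V}[f_i] = \mathbf{h}^\mathsf{T}\,\mathbf{P}_{i|n}\,\mathbf{h}.$$
◮ The log marginal likelihood can be evaluated as a by-product of the Kalman update:
$$\log p(\mathbf{y}) = -\frac{1}{2} \sum_{i=1}^n \left[ \log |2\pi\,S_i| + v_i^\mathsf{T}\,S_i^{-1}\,v_i \right].$$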
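A compact sketch of the full forward-backward pass, again under the assumption of a Matérn ν = 3/2 prior with Gaussian noise and distinct, sorted time points. The function names are hypothetical. The forward loop accumulates the log marginal likelihood from the innovations, and the backward loop applies the Rauch–Tung–Striebel recursion.

```python
import numpy as np

def matern32_model(sigma2, ell):
    """Stationary covariance, measurement vector and transition function."""
    lam = np.sqrt(3.0) / ell
    F = np.array([[0.0, 1.0], [-lam**2, -2.0 * lam]])
    P_inf = np.diag([sigma2, lam**2 * sigma2])
    h = np.array([1.0, 0.0])

    def transition(dt):
        # exp(F dt) in closed form; valid here because (F + lam I)^2 = 0.
        return np.exp(-lam * dt) * (np.eye(2) + dt * (F + lam * np.eye(2)))

    return P_inf, h, transition

def filter_smoother(t, y, noise_var, sigma2=1.0, ell=1.0):
    """Smoothed marginal mean, variance and log marginal likelihood."""
    P_inf, h, transition = matern32_model(sigma2, ell)
    n = len(t)
    m_f, P_f = np.zeros((n, 2)), np.zeros((n, 2, 2))
    m, P, log_lik = np.zeros(2), P_inf.copy(), 0.0
    for i in range(n):                                   # forward Kalman sweep
        A = transition(t[i] - t[i - 1]) if i > 0 else np.eye(2)
        m, P = A @ m, A @ P @ A.T + (P_inf - A @ P_inf @ A.T)
        v, S = y[i] - h @ m, h @ P @ h + noise_var
        K = P @ h / S
        m, P = m + K * v, P - np.outer(K, K) * S
        log_lik += -0.5 * (np.log(2.0 * np.pi * S) + v**2 / S)
        m_f[i], P_f[i] = m, P
    m_s, P_s = m_f.copy(), P_f.copy()
    for i in range(n - 2, -1, -1):                       # backward RTS sweep
        A = transition(t[i + 1] - t[i])
        m_pred = A @ m_f[i]
        P_pred = A @ P_f[i] @ A.T + (P_inf - A @ P_inf @ A.T)
        G = P_f[i] @ A.T @ np.linalg.inv(P_pred)         # smoother gain
        m_s[i] = m_f[i] + G @ (m_s[i + 1] - m_pred)
        P_s[i] = P_f[i] + G @ (P_s[i + 1] - P_pred) @ G.T
    # E[f_i] = h^T m_{i|n} and V[f_i] = h^T P_{i|n} h for every i.
    return m_s @ h, P_s @ h @ h, log_lik
```

On the training inputs, the returned mean, variance and log marginal likelihood should agree with the batch GP regression quantities up to round-off, while the cost grows linearly in the number of data points.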
Explaining changes in number of births in the US
[Figure: heat maps of the matrices labelled K = k(X, X) and Q = k(X, X), each with rows and columns indexed 1 to 6.]
see Durrande et al. (2019)
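$$
\begin{bmatrix}
\mathbf{I} & & & & \\
-\mathbf{A}_1 & \mathbf{I} & & & \\
& -\mathbf{A}_2 & \mathbf{I} & & \\
& & \ddots & \ddots & \\
& & & -\mathbf{A}_n & \mathbf{I}
\end{bmatrix}^{-\mathsf{T}}
\begin{bmatrix}
\mathbf{P}_0 & & & & \\
& \mathbf{Q}_1 & & & \\
& & \mathbf{Q}_2 & & \\
& & & \ddots & \\
& & & & \mathbf{Q}_n
\end{bmatrix}^{-1}
\begin{bmatrix}
\mathbf{I} & & & & \\
-\mathbf{A}_1 & \mathbf{I} & & & \\
& -\mathbf{A}_2 & \mathbf{I} & & \\
& & \ddots & \ddots & \\
& & & -\mathbf{A}_n & \mathbf{I}
\end{bmatrix}^{-1}
$$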
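◮ The observation model might not be Gaussian:
$$f(t) \sim \mathrm{GP}(0, \kappa(t, t')), \qquad \mathbf{y} \mid \mathbf{f} \sim \prod_i p(y_i \mid f(t_i)).$$
◮ There exists a multitude of great methods to tackle general likelihoods with approximations of the form
$$Q(\mathbf{f} \mid \mathcal{D}) = \mathrm{N}(\mathbf{f} \mid \mathbf{m} + \mathbf{K}\boldsymbol{\alpha}, (\mathbf{K}^{-1} + \mathbf{W})^{-1}).$$
◮ Use those methods, but deal with the latent using state space models.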
$$\kappa(t, t') = \kappa_{\mathrm{Mat.}}^{\nu=3/2}(t, t') + \kappa_{\mathrm{Mat.}}^{\mathrm{year}}(t, t') + \kappa_{\mathrm{Mat.}}^{\mathrm{week}}(t, t')$$
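A sum of independent GP components has a simple state space counterpart: the component models are stacked block-diagonally and the measurement vectors are concatenated. The sketch below illustrates this with two Matérn ν = 3/2 components standing in for the slide's components; the hyperparameter values and all names are hypothetical.

```python
import numpy as np

def matern32_blocks(sigma2, ell):
    """Drift, stationary covariance, measurement vector and rate lam."""
    lam = np.sqrt(3.0) / ell
    F = np.array([[0.0, 1.0], [-lam**2, -2.0 * lam]])
    P_inf = np.diag([sigma2, lam**2 * sigma2])
    h = np.array([1.0, 0.0])
    return F, P_inf, h, lam

def block_diag(*blocks):
    """Stack square blocks along the diagonal (NumPy-only helper)."""
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    pos = 0
    for b in blocks:
        k = b.shape[0]
        out[pos:pos + k, pos:pos + k] = b
        pos += k
    return out

# Two components with hypothetical hyperparameters: a slow and a fast one.
F1, P1, h1, lam1 = matern32_blocks(sigma2=1.0, ell=5.0)
F2, P2, h2, lam2 = matern32_blocks(sigma2=0.3, ell=0.5)

F = block_diag(F1, F2)          # components evolve independently
P_inf = block_diag(P1, P2)      # joint stationary covariance
h = np.concatenate([h1, h2])    # f(t) is the sum of the component processes

def transition(dt):
    """exp(F dt) is block diagonal; each block uses the Matern 3/2 closed form."""
    A1 = np.exp(-lam1 * dt) * (np.eye(2) + dt * (F1 + lam1 * np.eye(2)))
    A2 = np.exp(-lam2 * dt) * (np.eye(2) + dt * (F2 + lam2 * np.eye(2)))
    return block_diag(A1, A2)

def kappa_sum(tau):
    """Covariance of the stacked model; equals the sum of the two kernels."""
    return h @ P_inf @ transition(abs(tau)).T @ h
```

The state dimension grows additively with the number of components, so the filtering cost per data point stays fixed regardless of how many observations there are.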
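GPs under the kernel formalism:
$$f(\mathbf{x}, t) \sim \mathrm{GP}(0, k(\mathbf{x}, t; \mathbf{x}', t')), \qquad y_i = f(\mathbf{x}_i, t_i) + \varepsilon_i$$
Stochastic partial differential equations:
$$\frac{\partial \mathbf{f}(\mathbf{x}, t)}{\partial t} = \mathcal{F}\,\mathbf{f}(\mathbf{x}, t) + \mathbf{L}\,\mathbf{w}(\mathbf{x}, t), \qquad y_i = \mathcal{H}_i\,\mathbf{f}(\mathbf{x}, t) + \varepsilon_i$$
[Figure: two panels of f(x, t) over location (x) and time (t), one annotated with the covariance k(x, t; x′, t′) and the other with the state at time t.]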
[Figure: estimate mean E[f(t, x)] over the temporal dimension t and the spatial dimension x.]
https://youtu.be/myCvUT3XGPc
https://youtu.be/iellGrlNW7k
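GPs under the kernel formalism (flexible model specification):
$$f(t) \sim \mathrm{GP}(0, \kappa(t, t')), \qquad \mathbf{y} \mid \mathbf{f} \sim \prod_i p(y_i \mid f(t_i))$$
Stochastic differential equations (inference / first-principles):
$$\mathrm{d}\mathbf{f}(t) = \mathbf{F}\,\mathbf{f}(t)\,\mathrm{d}t + \mathbf{L}\,\mathrm{d}\boldsymbol{\beta}(t), \qquad y_i \sim p(y_i \mid \mathbf{h}^\mathsf{T}\,\mathbf{f}(t_i))$$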
The examples and methods presented in this lecture are described in greater detail in the following works:
Hartikainen, J. and Särkkä, S. (2010). Kalman filtering and smoothing solutions to temporal Gaussian process regression models. Proceedings of IEEE International Workshop on Machine Learning for Signal Processing (MLSP).
Särkkä, S., Solin, A., and Hartikainen, J. (2013). Spatio-temporal learning via infinite-dimensional Bayesian filtering and smoothing. IEEE Signal Processing Magazine, 30(4):51–61.
Särkkä, S. (2013). Bayesian Filtering and Smoothing. Cambridge University Press, Cambridge, UK.
Särkkä, S. and Solin, A. (2019). Applied Stochastic Differential Equations. Cambridge University Press.
Solin, A. (2016). Stochastic Differential Equation Methods for Spatio-Temporal Gaussian Process Regression. Doctoral dissertation, Aalto University.
Durrande, N., Adam, V., Bordeaux, L., Eleftheriadis, S., and Hensman, J. (2019). Banded matrix operators for Gaussian Markov models in the automatic differentiation era. International Conference on Artificial Intelligence and Statistics (AISTATS). PMLR 89:2780–2789.
Nickisch, H., Solin, A., and Grigorievskiy, A. (2018). State space Gaussian processes with non-Gaussian likelihood. International Conference on Machine Learning (ICML). PMLR 80:3789–3798.
Solin, A., Hensman, J., and Turner, R.E. (2018). Infinite-horizon Gaussian processes. Advances in Neural Information Processing Systems (NeurIPS), pages 3490–3499.
Hou, Y., Kannala, J., and Solin, A. (2019). Multi-view stereo by temporal nonparametric fusion. International Conference on Computer Vision (ICCV).
◮ Homepage: http://arno.solin.fi
◮ Twitter: @arnosolin
S. Särkkä and A. Solin (2019). Applied Stochastic Differential Equations. Cambridge University Press.
Book PDF and codes for replicating examples available online.