SLIDE 1

Motivation Case Study Temporal Monitoring Spatio-temporal Monitoring Discussion References

Time series vs. point processes for space-time surveillance in public health

Michael Höhle 1,2 and Thais Correa 3

1 Department of Statistics, Ludwig-Maximilians-Universität München, Germany
2 Munich Center of Health Sciences, Germany
3 Department of Mathematics, Universidade Federal de Ouro Preto, Brazil

Joint workshop of 3 IBS German Region workgroups, Lübeck, 02-05 December 2009

M. Höhle 1/28

SLIDE 2

Outline

1. Motivation
2. Case study: Invasive Meningococcal Disease
3. Temporal Monitoring
4. Spatio-temporal Monitoring
5. Discussion

SLIDE 3

Motivation

Consider outbreak detection within the framework of statistical process control.

Contrast the count data time series approach in H. and Paul (2008) with the spatio-temporal point process approach in Assunção and Correa (2009).

Idea: extend the spatio-temporal method to a regression model based approach. Hence,

  • no quantitative comparisons
  • a few qualitative considerations

The aim of the talk is thus more to introduce the two views on outbreak detection.

SLIDE 4

Invasive Meningococcal Disease (IMD) in Germany

IMD is an infectious bacterial disease causing life-threatening meningitis and sepsis.

Co-operation with the German National Reference Centre for Meningococci, hosted at the Institute for Hygiene and Microbiology at the University of Würzburg.

The data consist of 336 cases of finetype B:P1.7-2,4:F1-5 (abbreviated B) and 300 cases of finetype C:P1.5,2:F3-3 (abbreviated C) during 2002–2008 in Germany.

Research questions:

  • Perform temporal, spatial and spatio-temporal routine surveillance of disease cases to detect emerging clusters
  • Quantify differences in the dynamics of the two finetypes

SLIDE 5

IMD in Germany – Temporal Incidence

Time series of the monthly number of cases for each finetype, aggregated over all of Germany:

[Figure: two panels, finetype B and finetype C; x-axis: time (2002 to 2008), y-axis: No. infected]

SLIDE 6

IMD in Germany – Spatial Incidence

Spatial resolution at postcode level. Multiplicity indicated by the radius of the circles:

[Figure: maps of case locations; panels (a) Finetype B and (b) Finetype C]

Also shown are the population densities (persons per km²) of the 413 administrative regions in Germany.

SLIDE 7

CUSUM based Temporal Monitoring (1)

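Prospective change-point detection in GLMs using likelihood ratio cumulative sum (CUSUM) methods:

$$T_A = \min\{n \geq 1 : r(\mathbf{X}_n) \geq A\}, \qquad r(\mathbf{X}_n) = \max_{1 \leq k \leq n} \frac{f_k(X_1, \ldots, X_n)}{f_\infty(X_1, \ldots, X_n)}$$

The joint distribution $f_\infty(\mathbf{x}_n)$ is modelled by independent variables given by a log-linear Poisson model, i.e.

$$f_\infty(\mathbf{x}_n) = \prod_{t=1}^{n} f(x_t; \beta)$$

Null hypothesis: the time-varying mean is $\mu_t = E_\infty(X_t)$, e.g. modelled as

$$\log \mu_t = \beta_0 + \beta_1 \cdot t + \sum_{s=1}^{S} \left( \beta_{2s} \sin\left(\frac{2\pi}{12} t\right) + \beta_{2s+1} \cos\left(\frac{2\pi}{12} t\right) \right).$$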
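
The log-linear seasonal mean can be evaluated numerically as in the following minimal sketch. The function name `seasonal_mean`, its arguments, and the coefficient values are illustrative assumptions, not taken from the talk or from the surveillance package. The sketch uses frequency 2πs/12 for the s-th harmonic, which is an assumption of this sketch; it coincides with the formula as printed on the slide when S = 1.

```python
import math

def seasonal_mean(t, beta, n_harmonics=1, period=12):
    """Return mu_t = exp(beta_0 + beta_1*t + sum of sine/cosine terms).

    beta is laid out as [beta_0, beta_1, beta_2, beta_3, ...], so the
    s-th harmonic uses beta[2*s] (sine) and beta[2*s + 1] (cosine).
    """
    log_mu = beta[0] + beta[1] * t
    for s in range(1, n_harmonics + 1):
        # Assumption: the s-th harmonic has frequency 2*pi*s/period.
        angle = 2.0 * math.pi * s * t / period
        log_mu += beta[2 * s] * math.sin(angle) + beta[2 * s + 1] * math.cos(angle)
    return math.exp(log_mu)

# Hypothetical coefficients, chosen only for illustration
# (they are not fitted to the IMD data).
beta = [math.log(4.0), 0.0, 0.5, 0.0]
monthly_means = [seasonal_mean(t, beta) for t in range(1, 13)]
```

In practice the coefficients would come from a Poisson GLM fitted to historic data; the sketch only shows how a fitted β maps to the in-control mean for each month.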

SLIDE 8

CUSUM based Temporal Monitoring (2)

Alternative hypothesis: the mean in the model for $f_k(x)$ is specified as

$$E_k(X_t) = \mu_t \cdot \kappa^{I(t \geq k)},$$

with $I(\cdot)$ being the indicator function and $\kappa > 1$ the prespecified relative change that the detector is tuned to detect optimally.

Historic data are used to estimate the regression parameters $\beta$ under the null model. In the subsequent monitoring the resulting estimation uncertainty is ignored.

For fixed $\beta$ and $\kappa$, the log-likelihood ratio monitoring can be performed in the recursive form

$$r_0 = 0, \qquad r_n = \max\left(0,\; r_{n-1} + \log \frac{f_n(x_n)}{f_\infty(x_n)}\right).$$
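
The recursion can be sketched for the Poisson case as follows. With in-control mean μt and out-of-control mean κμt, the log-likelihood ratio increment reduces to x·log κ − (κ − 1)·μt. The function name `poisson_cusum`, its arguments, and the toy counts are illustrative assumptions; this is not the implementation in the surveillance package.

```python
import math

def poisson_cusum(counts, mu, kappa, threshold):
    """Run r_n = max(0, r_{n-1} + log-likelihood ratio) over the counts.

    counts    : observed counts x_1, ..., x_n
    mu        : in-control means mu_1, ..., mu_n (same length as counts)
    kappa     : prespecified relative change (> 1)
    threshold : alarm threshold A

    Returns (alarm_time, path), where alarm_time is the 1-based index of
    the first n with r_n >= threshold (None if no alarm) and path holds
    the values r_1, r_2, ... up to the alarm.
    """
    r = 0.0
    path = []
    for n, (x, m) in enumerate(zip(counts, mu), start=1):
        # log of Po(x; kappa*m) / Po(x; m)
        increment = x * math.log(kappa) - (kappa - 1.0) * m
        r = max(0.0, r + increment)
        path.append(r)
        if r >= threshold:
            return n, path
    return None, path

# Toy data, invented for illustration: constant in-control mean of 2,
# with elevated counts from the fourth observation onwards.
toy_counts = [2, 2, 2, 8, 9, 10]
toy_mu = [2.0] * len(toy_counts)
alarm_time, cusum_path = poisson_cusum(toy_counts, toy_mu, kappa=2.0, threshold=4.5)
```

Resetting at zero is what makes the chart forget long in-control stretches, so only sustained excesses over the in-control mean accumulate towards the threshold.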

SLIDE 9

Results: Monitoring Meningococci Finetype C

Use the first 3 years to estimate β in a Poisson GLM with S = 1. Monitor with A = 4.5 and κ = 2.

[Figure: monthly No. infected against time (months), 2005 to 2008, with the in-control mean μt and the out-of-control mean κμt]

A Markov chain approximation (H., 2010) yields $\mathrm{median}_\infty(T_A) = 226$ and $P_\infty(T_A \leq 48) = 0.11$.

SLIDE 10

Spatio-temporal monitoring (1) – Setup

Let $X$ be a Poisson process in the three-dimensional region $W \times [0, T]$, where $W \subseteq \mathbb{R}^2$, with space-time intensity function $\lambda(s, t)$.

Let $N(A)$ denote the random number of events of $X$ in region $A \subseteq W \times [0, T]$ and define

$$\mu(A) = E(N(A)) = \int_A \lambda(s, t)\, ds\, dt.$$

In other words, we assume $N(A) \sim \mathrm{Po}(\mu(A))$.

Furthermore, we will assume a separable intensity $\lambda(s, t) = \mu\, \lambda_S(s)\, \lambda_T(t)$, with $\mu > 0$, and $\lambda_S(s)$ and $\lambda_T(t)$ denoting spatial and temporal marginal densities.
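
Under the separability assumption, the expected count of a product set factorises into a spatial and a temporal part. As a short worked step, writing $B \subseteq W$ for a spatial set and $I \subseteq [0, T]$ for a time interval (both symbols are introduced here for illustration only):

$$\mu(B \times I) = \int_I \int_B \mu\, \lambda_S(s)\, \lambda_T(t)\, ds\, dt = \mu \left( \int_B \lambda_S(s)\, ds \right) \left( \int_I \lambda_T(t)\, dt \right).$$

In particular, since $\lambda_S$ and $\lambda_T$ are densities integrating to one over $W$ and $[0, T]$ respectively, $\mu(W \times [0, T]) = \mu$.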

SLIDE 11

Spatio-temporal monitoring (2) – Hypothesis

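Null: intensity $\lambda(s, t) = \mu\, \lambda_S(s)\, \lambda_T(t)$.

Alternative: for a time $\tau > 0$ and a cylinder $C_\tau$ we have

$$\lambda(s, t) = \mu\, \lambda_S(s)\, \lambda_T(t)\, \bigl(1 + \varepsilon\, I_{C_\tau}(s, t)\bigr),$$

where $I_{C_\tau}(s, t)$ is an indicator function for $(s, t) \in C_\tau$ and $\varepsilon > 0$ is the predetermined relative intensity change.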

SLIDE 12

Spatio-temporal monitoring (3) – Hypothesis

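We constrain $\tau$ to be equal to one of the observed event times and let the cylinder be centred at the event location. We only consider alive cylinders, i.e. $C_{k,n} = B(s_k, \rho) \times (t_k, t_n]$, where $t_n$ is the time of the last event and $\rho > 0$ is a predetermined radius.

[Figure: cylinder with circular base B(s_k, ρ) of radius ρ inside W (x-y plane), extending from t_k to t_n along the time axis t]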

SLIDE 13

Spatio-temporal monitoring (4) – Likelihood

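The likelihood of the observations $\{(s_i, t_i);\ i = 1, \ldots, n\}$ under the null hypothesis is

$$L_\infty(n) = \left( \prod_{i=1}^{n} \lambda(s_i, t_i) \right) \exp\left( -\int_W \int_{[0,T]} \lambda(s, t)\, ds\, dt \right).$$

The likelihood under the alternative with cylinder $C_{k,n}$ is

$$L_k(n) = \left( \prod_{i=1}^{n} \lambda(s_i, t_i)\, \bigl(1 + \varepsilon\, I_{C_{k,n}}(s_i, t_i)\bigr) \right) \cdot \exp\left( -\int_W \int_{[0,T]} \lambda(s, t)\, ds\, dt - \varepsilon \int_{C_{k,n}} \lambda(s, t)\, ds\, dt \right).$$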

SLIDE 14

Likelihood ratio based changepoint detection (1)

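CUSUM test statistic:

$$R_n = \max_{1 \leq k \leq n} \frac{L_k(n)}{L_\infty(n)} = \max_{1 \leq k \leq n} \Lambda_{k,n},$$

when defining $\Lambda_{k,n} = L_k(n)/L_\infty(n)$.

Alternatively, the Shiryaev-Roberts test statistic is

$$R_n = \sum_{k=1}^{n} \Lambda_{k,n}.$$

With the specific pair of hypotheses one obtains

$$\Lambda_{k,n} = (1 + \varepsilon)^{N(C_{k,n})} \exp\bigl(-\varepsilon\, \mu(C_{k,n})\bigr).$$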
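
The closed form follows from dividing the likelihood under the alternative by the likelihood under the null, assuming both are evaluated with the same null intensity $\lambda$. The product terms cancel except for one factor $(1 + \varepsilon)$ per event inside the cylinder, and the exponential terms differ only by the extra integral over $C_{k,n}$. As a worked step:

$$\Lambda_{k,n} = \frac{L_k(n)}{L_\infty(n)} = \prod_{i=1}^{n} \bigl(1 + \varepsilon\, I_{C_{k,n}}(s_i, t_i)\bigr) \cdot \exp\left( -\varepsilon \int_{C_{k,n}} \lambda(s, t)\, ds\, dt \right) = (1 + \varepsilon)^{N(C_{k,n})} \exp\bigl(-\varepsilon\, \mu(C_{k,n})\bigr).$$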

SLIDE 15

Likelihood ratio based changepoint detection (2)

To compute $\Lambda_{k,n}$, the unknown $\mu(C_{k,n})$ is replaced by the estimator

$$\hat{\mu}(C_{k,n}) = \frac{1}{n} \cdot N\bigl(B(s_k, \rho) \times (0, t_n]\bigr) \cdot N\bigl(W \times (t_k, t_n]\bigr).$$

Changepoint detection: for increasing $n$, compute $R_n$ based on the $\hat{\Lambda}_{k,n}$'s until $R_n$ is above a predefined threshold $A$.

From theory on the Shiryaev-Roberts detector, we know that the average in-control run-length of the detector will be larger than or equal to $A$. For the CUSUM method no such statements are immediately available.

SLIDE 16

Likelihood ratio based changepoint detection (3)

Shiryaev-Roberts: once $R_n \geq A$, we assume that the $k$, $1 \leq k \leq n$, with the largest contribution $\Lambda_{k,n}$ determines $C_{k,n}$. The resulting $C_{k,n}$ is denoted the cluster estimate.

CUSUM: the cluster estimate is immediate from $\max_{1 \leq k \leq n} \Lambda_{k,n}$.
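
The plug-in estimator, both detection statistics, and the cluster estimate can be sketched directly from the definitions. This is a naive version that recomputes every term from scratch, written for clarity only; it is not the `stcd` implementation. The function name `estimated_lambdas`, the event layout `(x, y, t)`, and the toy data are assumptions made for illustration.

```python
import math

def estimated_lambdas(events, rho, eps):
    """Return the estimated Lambda_{k,n} for k = 1, ..., n.

    events : list of (x, y, t) tuples sorted by time t, all with t > 0
    rho    : cylinder radius
    eps    : prespecified relative intensity change
    """
    n = len(events)
    t_n = events[-1][2]
    lambdas = []
    for xk, yk, tk in events:
        n_ball = n_time = n_cyl = 0
        for x, y, t in events:
            near = math.hypot(x - xk, y - yk) <= rho  # location in B(s_k, rho)
            late = tk < t <= t_n                      # time in (t_k, t_n]
            n_ball += near           # N(B(s_k, rho) x (0, t_n])
            n_time += late           # N(W x (t_k, t_n])
            n_cyl += near and late   # N(C_{k,n})
        mu_hat = n_ball * n_time / n
        lambdas.append((1 + eps) ** n_cyl * math.exp(-eps * mu_hat))
    return lambdas

# Toy data, invented for illustration: ten scattered background events
# followed by four events close together in space and time.
background = [(10.0 * i, -10.0 * i, float(i)) for i in range(1, 11)]
cluster = [(0.0, 0.0, 11.0), (0.2, 0.0, 12.0), (0.0, 0.3, 13.0), (0.1, 0.1, 14.0)]
events = background + cluster

lambdas = estimated_lambdas(events, rho=0.5, eps=2)
r_shiryaev_roberts = sum(lambdas)   # Shiryaev-Roberts statistic R_n
r_cusum = max(lambdas)              # CUSUM statistic R_n
# 0-based index k of the cylinder with the largest contribution
cluster_start = max(range(len(lambdas)), key=lambdas.__getitem__)
```

Because the cylinder is open at $t_k$, the most recent event always contributes a value of exactly 1 (an empty cylinder), so the Shiryaev-Roberts sum grows by at least one term with every new event.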

SLIDE 17

Recursive Computations (1)

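For every new $n$, all $\Lambda_{k,n}$ terms in the summation of $R_n$ need to be re-computed. However, by careful consideration one can show that these computations can also be performed iteratively for $k < n + 1$:

$$N(C_{k,n+1}) = N(C_{k,n}) + I_{k,n+1}$$

$$\mu(C_{k,n+1}) = \mu(C_{k,n}) + \mu\bigl(B(s_k, \rho) \times (t_n, t_{n+1}]\bigr)$$

where $I_{k,n} = I(\lVert s_n - s_k \rVert \leq \rho)$ for $k < n$, and hence

$$\Lambda_{k,n+1} = \Lambda_{k,n}\, (1 + \varepsilon)^{I_{k,n+1}} \exp\Bigl(-\varepsilon\, \mu\bigl(B(s_k, \rho) \times (t_n, t_{n+1}]\bigr)\Bigr).$$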
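
A minimal sketch of this update for the case where μ(·) is treated as known. The function `update_lambdas` and the callable `mu_increment`, which is assumed to return μ(B(s_k, ρ) × (t_n, t_{n+1}]), are hypothetical names introduced for illustration; the version with the estimated mean needs additional bookkeeping that is not shown here.

```python
import math

def update_lambdas(lambdas, events, new_event, rho, eps, mu_increment):
    """Update Lambda_{k,n} to Lambda_{k,n+1} when a new event arrives.

    lambdas      : current values Lambda_{1,n}, ..., Lambda_{n,n}
    events       : the n events (x, y, t) seen so far, sorted by time
    new_event    : the (n+1)-th event (x, y, t)
    mu_increment : hypothetical callable (centre, t_prev, t_new) returning
                   the expected count in B(centre, rho) x (t_prev, t_new]
    """
    x_new, y_new, t_new = new_event
    t_prev = events[-1][2]
    updated = []
    for lam, (xk, yk, _tk) in zip(lambdas, events):
        hit = math.hypot(x_new - xk, y_new - yk) <= rho   # I_{k,n+1}
        factor = (1 + eps) if hit else 1
        growth = mu_increment((xk, yk), t_prev, t_new)
        updated.append(lam * factor * math.exp(-eps * growth))
    # The cylinder opened by the new event is still empty, so its ratio is 1.
    updated.append(1.0)
    return updated

# Toy setting, invented for illustration: every ball has the same
# expected rate of 0.5 events per unit of time.
def constant_rate(centre, t_prev, t_new):
    return 0.5 * (t_new - t_prev)

events = [(0.0, 0.0, 0.0)]
lambdas = [1.0]
second = (0.5, 0.0, 1.0)
lambdas = update_lambdas(lambdas, events, second, rho=1.0, eps=2, mu_increment=constant_rate)
events.append(second)
third = (5.0, 5.0, 3.0)
lambdas = update_lambdas(lambdas, events, third, rho=1.0, eps=2, mu_increment=constant_rate)
events.append(third)
```

Each update costs one pass over the existing cylinders, instead of the full double loop needed to recompute every count from scratch.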

SLIDE 18

Recursive Computations (2)

Similarly, the calculation of $\hat{\mu}(C_{k,n+1})$ can be performed iteratively.

In summary, $R_{n+1}$ can be computed by simple updating of the values of $\hat{\Lambda}_{k,n}$ together with a few simple additional calculations.

SLIDE 19

Results (1)

SR procedure with A = 300, ε = 2 and five different radii ρ.

[Figure: Rn against date of event, 2002 to 2009; legend values 0.5, 1, 1.5, 2, 2.5]

SLIDE 20

Results for Finetype C (2)

Comparing Shiryaev-Roberts and CUSUM detection when ε = 2 and ρ = 1.

[Figure: Rn against date of event, 2002 to 2008; panels (a) Shiryaev-Roberts and (b) CUSUM]

SLIDE 21

Results for Finetype C (3)

SR monitoring for ε = 2 and ρ = 1 with restarting after alarms.

[Figure: Rn against date of event, 2002 to 2009]

The first alarm occurs on 22-Nov-2005, with the corresponding cluster estimate starting on 29-Mar-2005 (24 events).

SLIDE 22

Results for Finetype C (4)

Spatial view of the obtained alarm with ε = 2 and ρ = 1.

[Figure: three map panels, for 2003, 2004 and 2005; horizontal axis 6 to 14, vertical axis 46 to 58]

SLIDE 23

Discussion and Outlook (1)

The presented space-time cluster detection method is an alternative to the space-time SaTScan procedure (Kulldorff, 2001) for point-referenced data.

No population-at-risk information is needed, only cases.

The time-continuous problem is converted to a time-discrete problem by considering the embedded sequence of events, analysed within the context of statistical process control.

The function stcd in the R package surveillance (H., 2007) implements the approach.

SLIDE 24

Discussion and Outlook (2)

Currently, the radius ρ is prespecified. A quantitative procedure for its choice could be

$$\hat{\rho}_{k,n} = \arg\sup_{\rho > 0} \Lambda_{k,n}(\rho),$$

i.e. as a generalized likelihood ratio scheme (H. and Paul, 2008).

Simulations in Assunção and Correa (2009) show that the space-time procedure is in most cases somewhat insensitive to the choice of radius.

SLIDE 25

Discussion and Outlook (3)

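Space-time process to time-discrete (univariate) process:

$$X_t = N(I_t \times W), \qquad t = 1, 2, \ldots, n,$$

where $I_t = ((t - 1)h, th]$ and $h$ is the interval length.

Under the null, and under the alternative with cylinder $(t_k, t_n] \times W$:

$$X_t \sim \mathrm{Po}\left( \mu \int_{I_t} \lambda_T(u)\, du \right) = \mathrm{Po}(\mu_t)$$

$$X_{t,k} \sim \mathrm{Po}\left( \mu_t + \varepsilon \mu \int_{I_t \cap (t_k, t_n]} \lambda_T(u)\, du \right) \approx \mathrm{Po}\left( \mu_t\, \kappa^{I(t \geq k)} \right),$$

with equality if the cylinder is $(kh, t_n] \times W$, and where $\kappa = 1 + \varepsilon$.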
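
The aggregation step itself, counting events per interval $I_t = ((t-1)h, th]$ over the whole region $W$, can be sketched as follows. The function name `bin_event_times` and the toy event times are assumptions made for illustration.

```python
import math

def bin_event_times(times, h, n_bins):
    """Count events per interval I_t = ((t-1)*h, t*h] for t = 1, ..., n_bins.

    Intervals are open on the left and closed on the right, so an event
    exactly at time t*h is counted in interval t. Events outside
    (0, n_bins*h] are ignored.
    """
    counts = [0] * n_bins
    for time in times:
        t = math.ceil(time / h)  # index of the interval containing the event
        if 1 <= t <= n_bins:
            counts[t - 1] += 1
    return counts

# Toy event times, invented for illustration.
toy_times = [0.5, 1.0, 1.5, 2.0, 2.5]
toy_counts = bin_event_times(toy_times, h=1.0, n_bins=3)
```

The resulting count series is the kind of input a count data chart expects, at the cost of discarding the locations and the exact event times within each interval.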

SLIDE 26

Discussion and Outlook (4)

Count data time series: alternative as an auto-regressive epidemic model of Held et al. (2005):

$$E_k(\mu_t) = \mu_t + I(t \geq k)\, \lambda\, y_{t-1}, \qquad t > 1,\ \lambda > 0$$

Continuous-time multivariate susceptible-infectious-recovered model (H., 2009): conditional intensity function $\lambda_i(t \mid \mathcal{F}_t)$ for a state change from susceptible to infectious of individual $i$:

$$\lambda_i(t \mid \mathcal{F}_t) = \exp\left( h_0(t) + z_i(t)^T \beta \right) + I(t \geq \tau) \sum_{j \in I(t)} f(\lVert s_i - s_j \rVert),$$

for $i = 1, \ldots, N$, with locations $s_i$ known and time $\tau > 0$.

SLIDE 27

Course Advertisement

Course: Spatial Statistics in R
Dates: 25-26 Feb 2010
Lecturers: Thomas Kneib and Michael Höhle
Location: Department of Statistics, Ludwig-Maximilians-Universität München, Munich, Germany
Language: German
Contents:

1. Visualization of spatial data with R. Interfacing with the open source GIS program GRASS.
2. Classical geostatistics
3. Markov random fields
4. Point processes

URL: http://www.statistik.lmu.de/R/

SLIDE 28

Literature I

Assunção, R. and Correa, T. (2009). Surveillance to detect emerging space-time clusters. Computational Statistics & Data Analysis, 53(8):2817–2830.

Held, L., Höhle, M., and Hofmann, M. (2005). A statistical framework for the analysis of multivariate infectious disease surveillance data. Statistical Modelling, 5:187–199.

Höhle, M. (2007). surveillance: An R package for the monitoring of infectious diseases. Computational Statistics, 22(4):571–582.

Höhle, M. (2009). Additive-multiplicative regression models for spatio-temporal epidemics. Biometrical Journal. To appear.

Höhle, M. (2010). Changepoint detection in categorical time series. In Kneib, T. and Tutz, G., editors, Statistical Modelling and Regression Structures – Festschrift in Honour of Ludwig Fahrmeir. Springer. To appear.

Höhle, M. and Paul, M. (2008). Count data regression charts for the monitoring of surveillance time series. Computational Statistics & Data Analysis, 52(9):4357–4368.

Kulldorff, M. (2001). Prospective time periodic geographical disease surveillance using a scan statistic. Journal of the Royal Statistical Society, Series A, 164:61–72.
