Bayesian regression with a categorical predictor (Alicia Johnson)


SLIDE 1

DataCamp Bayesian Modeling with RJAGS

Bayesian regression with a categorical predictor

BAYESIAN MODELING WITH RJAGS

Alicia Johnson

Associate Professor, Macalester College

SLIDE 2

Chapter 4 goals

Incorporate categorical predictors into Bayesian models
Engineer multivariate Bayesian regression models
Extend our methodology for Normal regression models to generalized linear models: Poisson regression

SLIDE 3

Rail-trail volume

Goal: Explore daily volume on a rail-trail in Massachusetts.

SLIDE 4

Modeling volume

$Y_i$ = trail volume (# of users) on day $i$

Model: $Y_i \sim N(m_i, s^2)$

SLIDE 5

Modeling volume by weekday

$Y_i$ = trail volume (# of users) on day $i$
$X_i$ = 1 for weekdays, 0 for weekends

Model: $Y_i \sim N(m_i, s^2)$

SLIDE 7

Modeling volume by weekday

$Y_i$ = trail volume (# of users) on day $i$
$X_i$ = 1 for weekdays, 0 for weekends

Model: $Y_i \sim N(m_i, s^2)$, where $m_i = a + bX_i$

SLIDE 8

Modeling volume by weekday

$Y_i$ = trail volume (# of users) on day $i$
$X_i$ = 1 for weekdays, 0 for weekends

Model: $Y_i \sim N(m_i, s^2)$, where $m_i = a + bX_i$

$a$ = typical weekend volume

SLIDE 9

Modeling volume by weekday

$Y_i$ = trail volume (# of users) on day $i$
$X_i$ = 1 for weekdays, 0 for weekends

Model: $Y_i \sim N(m_i, s^2)$, where $m_i = a + bX_i$

$a$ = typical weekend volume
$a + b$ = typical weekday volume
$b$ = contrast between typical weekday vs weekend volume
$s$ = residual standard deviation
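To make the roles of a, b, and s concrete, here is a minimal sketch that simulates volumes from the model above. The parameter values are hypothetical and chosen only for illustration; they are not estimates from the rail-trail data.

```r
# Hypothetical parameter values, chosen only for illustration
set.seed(1)
a <- 400    # typical weekend volume
b <- -50    # contrast between typical weekday and weekend volume
s <- 100    # residual standard deviation

# Indicator coding from the slide: X = 1 for weekdays, 0 for weekends
X <- rep(c(0, 1), each = 5000)

# Trend m_i = a + b * X_i, then Y_i ~ N(m_i, s^2)
m <- a + b * X
Y <- rnorm(length(X), mean = m, sd = s)

weekend_mean <- mean(Y[X == 0])   # should land near a
weekday_mean <- mean(Y[X == 1])   # should land near a + b
print(c(weekend = weekend_mean, weekday = weekday_mean))
```

The simulated weekend average sits near a and the weekday average near a + b, which is the sense in which b is a contrast rather than a weekday mean.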

SLIDE 10

Priors for a & b

Typical weekend volume is most likely around 400 users per day, but possibly as low as 100 or as high as 700 users.
We lack certainty about how weekday volume compares to weekend volume. It could be more, it could be less.

SLIDE 11

Prior for s

The standard deviation in volume from day to day (whether on weekdays or weekends) is equally likely to be anywhere between 0 and 200 users.

SLIDE 12

Bayesian model of volume by weekday status

$Y_i \sim N(m_i, s^2)$
$m_i = a + bX_i$
$a \sim N(400, 100^2)$
$b \sim N(0, 200^2)$
$s \sim \text{Unif}(0, 200)$
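One way to check that these priors say what the previous slides describe is to draw from them directly. The sketch below uses base R only; the number of draws and the variable names are assumptions made for illustration.

```r
set.seed(1)
n_draws <- 10000   # arbitrary number of prior draws

# Draws from the three priors in the model above
prior_a <- rnorm(n_draws, mean = 400, sd = 100)   # a ~ N(400, 100^2)
prior_b <- rnorm(n_draws, mean = 0, sd = 200)     # b ~ N(0, 200^2)
prior_s <- runif(n_draws, min = 0, max = 200)     # s ~ Unif(0, 200)

# Implied prior for typical weekday volume, a + b
prior_weekday <- prior_a + prior_b

# Share of weekend draws between 100 and 700 users (3 sd either side of 400)
share_in_range <- mean(prior_a > 100 & prior_a < 700)
print(share_in_range)
```

Nearly all draws of a fall between 100 and 700, matching the stated prior belief, while the draws of b are centered at 0 and spread widely in both directions.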

SLIDE 13

DEFINE the Bayesian model in RJAGS

$Y_i \sim N(m_i, s^2)$
$m_i = a + bX_i$
$a \sim N(400, 100^2)$
$b \sim N(0, 200^2)$
$s \sim \text{Unif}(0, 200)$

rail_model_1 <- "model{
    # Likelihood model for Y[i]

    # Prior models for a, b, s
}"

SLIDE 14

DEFINE the Bayesian model in RJAGS

$Y_i \sim N(m_i, s^2)$
$m_i = a + bX_i$
$a \sim N(400, 100^2)$
$b \sim N(0, 200^2)$
$s \sim \text{Unif}(0, 200)$

rail_model_1 <- "model{
    # Likelihood model for Y[i]
    for(i in 1:length(Y)) {
        Y[i] ~ dnorm(m[i], s^(-2))
    }

    # Prior models for a, b, s
    a ~ dnorm(400, 100^(-2))
    s ~ dunif(0, 200)
}"

SLIDE 15

DEFINE the Bayesian model in RJAGS

m[i] <- a + b[X[i]]
X[i] takes 2 levels: 1 = weekend, 2 = weekday
b has 2 levels: b[1], b[2]

Weekend trend ($m_i = a$):
m[i] <- a + b[1]

rail_model_1 <- "model{
    # Likelihood model for Y[i]
    for(i in 1:length(Y)) {
        Y[i] ~ dnorm(m[i], s^(-2))
        m[i] <- a + b[X[i]]
    }

    # Prior models for a, b, s
    a ~ dnorm(400, 100^(-2))
    s ~ dunif(0, 200)
}"

SLIDE 16

DEFINE the Bayesian model in RJAGS

m[i] <- a + b[X[i]]
X[i] takes 2 levels: 1 = weekend, 2 = weekday
b has 2 levels: b[1], b[2]

Weekend trend ($m_i = a$):
m[i] <- a + b[1]
b[1] <- 0

rail_model_1 <- "model{
    # Likelihood model for Y[i]
    for(i in 1:length(Y)) {
        Y[i] ~ dnorm(m[i], s^(-2))
        m[i] <- a + b[X[i]]
    }

    # Prior models for a, b, s
    a ~ dnorm(400, 100^(-2))
    s ~ dunif(0, 200)
    b[1] <- 0
}"

SLIDE 17

DEFINE the Bayesian model in RJAGS

m[i] <- a + b[X[i]]
X[i] takes 2 levels: 1 = weekend, 2 = weekday
b has 2 levels: b[1], b[2]

Weekend trend ($m_i = a$):
m[i] <- a + b[1]
b[1] <- 0

Weekday trend ($m_i = a + b$):
m[i] <- a + b[2]
b[2] ~ dnorm(0, 200^(-2))

rail_model_1 <- "model{
    # Likelihood model for Y[i]
    for(i in 1:length(Y)) {
        Y[i] ~ dnorm(m[i], s^(-2))
        m[i] <- a + b[X[i]]
    }

    # Prior models for a, b, s
    a ~ dnorm(400, 100^(-2))
    s ~ dunif(0, 200)
    b[1] <- 0
    b[2] ~ dnorm(0, 200^(-2))
}"
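The indexing trick b[X[i]] needs X coded as level numbers 1 and 2 rather than as a 0/1 indicator. The following sketch mimics it in plain R; the `weekday` vector and the parameter values are hypothetical and used only to show the mechanics.

```r
# Hypothetical weekday flags for six days (TRUE = weekday)
weekday <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)

# With levels ordered FALSE, TRUE the integer codes are
# 1 = weekend and 2 = weekday, as the model expects
X <- as.integer(factor(weekday, levels = c(FALSE, TRUE)))

# Hypothetical parameter values
a <- 400
b <- c(0, -50)   # b[1] is fixed at 0; b[2] is the weekday contrast

# R analogue of the JAGS line m[i] <- a + b[X[i]]
m <- a + b[X]
print(data.frame(weekday, X, m))
```

Because b[1] is pinned to 0, weekend days get the trend a and weekday days get a + b[2], which reproduces the indicator form m = a + bX.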

SLIDE 18

Let's practice!

SLIDE 19

Multivariate Bayesian regression

SLIDE 20

Modeling volume

$Y_i$ = trail volume (# of users) on day $i$

SLIDE 21

Modeling volume by weekday

$Y_i$ = trail volume (# of users) on day $i$
$X_i$ = 1 for weekdays, 0 for weekends

SLIDE 22

Modeling volume by temperature

$Y_i$ = trail volume (# of users) on day $i$
$Z_i$ = high temperature on day $i$ (in °F)

SLIDE 23

Modeling volume by temperature & weekday

$Y_i$ = trail volume (# of users) on day $i$
$X_i$ = 1 for weekdays, 0 for weekends
$Z_i$ = high temperature on day $i$ (in °F)

$Y_i \sim N(m_i, s^2)$
$m_i = a + bX_i + cZ_i$

Weekends: $m_i = a + cZ_i$
Weekdays: $m_i = (a + b) + cZ_i$

SLIDE 25

Modeling volume by temperature & weekday

$m_i = a + bX_i + cZ_i$

Weekends: $m_i = a + cZ_i$
Weekdays: $m_i = (a + b) + cZ_i$

$a$ = weekend y-intercept
$a + b$ = weekday y-intercept
$b$ = contrast between weekday vs weekend y-intercepts
$c$ = common slope
$s$ = residual standard deviation
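The two trend lines above share the slope c and differ only in their intercepts. A small sketch with hypothetical coefficient values (not estimates from the data) shows that the weekday versus weekend gap equals b at every temperature.

```r
# Trend m = a + b * x + c * z, with x = 1 for weekdays and 0 for weekends
trend <- function(a, b, c, x, z) a + b * x + c * z

# Hypothetical high temperatures in degrees Fahrenheit
temps <- c(50, 70, 90)

# Hypothetical coefficients: a = -100, b = -50, c = 6
weekend <- trend(a = -100, b = -50, c = 6, x = 0, z = temps)
weekday <- trend(a = -100, b = -50, c = 6, x = 1, z = temps)

gap <- weekday - weekend   # the same value, b, at every temperature
print(data.frame(temps, weekend, weekday, gap))
```

A constant gap is what "common slope" means here: the model as written has no interaction between weekday status and temperature.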

SLIDE 26

Priors for a and b

We lack certainty about the y-intercept for the relationship between temperature & weekend volume.
We lack certainty about how typical volume compares on weekdays vs weekends of similar temperature.

SLIDE 27

Priors for c and s

Whether on weekdays or weekends, we lack certainty about the association between trail volume & temperature.
The typical deviation from the trend is equally likely to be anywhere between 0 and 200 users.

SLIDE 28

Bayesian model of volume by weekday status

$Y_i \sim N(m_i, s^2)$
$m_i = a + bX_i + cZ_i$
$a \sim N(0, 200^2)$
$b \sim N(0, 200^2)$
$c \sim N(0, 20^2)$
$s \sim \text{Unif}(0, 200)$

SLIDE 29

DEFINE the Bayesian model in RJAGS

$Y_i \sim N(m_i, s^2)$
$m_i = a + bX_i + cZ_i$
$a \sim N(0, 200^2)$
$b \sim N(0, 200^2)$
$c \sim N(0, 20^2)$
$s \sim \text{Unif}(0, 200)$

rail_model_2 <- "model{
    # Likelihood model for Y[i]
    for(i in 1:length(Y)) {
        Y[i] ~ dnorm(m[i], s^(-2))
        m[i] <- a + b[X[i]] + c * Z[i]
    }

    # Prior models for a, b, c, s
    a ~ dnorm(0, 200^(-2))
    b[1] <- 0
    b[2] ~ dnorm(0, 200^(-2))
    c ~ dnorm(0, 20^(-2))
    s ~ dunif(0, 200)
}"
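Once a model string like the one above is defined, the remaining steps are to compile and simulate it. The sketch below is one possible way to do that, under several assumptions: the data are simulated stand-ins rather than the rail-trail data, the chain length is arbitrary, and the rjags calls run only if the rjags package (and JAGS itself) is installed. Note that JAGS parameterizes dnorm by mean and precision, which is why a standard deviation of 200 appears as 200^(-2).

```r
rail_model_2 <- "model{
    # Likelihood model for Y[i]
    for(i in 1:length(Y)) {
        Y[i] ~ dnorm(m[i], s^(-2))
        m[i] <- a + b[X[i]] + c * Z[i]
    }
    # Prior models for a, b, c, s
    a ~ dnorm(0, 200^(-2))
    b[1] <- 0
    b[2] ~ dnorm(0, 200^(-2))
    c ~ dnorm(0, 20^(-2))
    s ~ dunif(0, 200)
}"

# Precision is 1 / sd^2, so sd = 200 corresponds to 200^(-2)
sd_to_precision <- function(sd) sd^(-2)

# Hypothetical stand-in data (not the rail-trail data)
set.seed(1)
n <- 90
X <- sample(1:2, n, replace = TRUE)   # 1 = weekend, 2 = weekday
Z <- runif(n, min = 40, max = 95)     # high temperature in degrees F
Y <- rnorm(n, mean = 100 + c(0, -50)[X] + 5 * Z, sd = 80)

# COMPILE and SIMULATE, only when rjags is available
if (requireNamespace("rjags", quietly = TRUE)) {
    rail_jags <- rjags::jags.model(textConnection(rail_model_2),
                                   data = list(Y = Y, X = X, Z = Z),
                                   n.chains = 1, quiet = TRUE)
    rail_sim <- rjags::coda.samples(rail_jags,
                                    variable.names = c("a", "b", "c", "s"),
                                    n.iter = 1000)
    print(summary(rail_sim))
}
```

Writing the model string across several lines matters: a `#` comment runs to the end of its line, so a model collapsed onto a single line would comment out everything after the first `#`.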

SLIDE 30

Let's practice!

SLIDE 31

Poisson regression

SLIDE 32

Normal likelihood structure

$Y$ = volume (# of users) on a given day
$Y \sim N(m, s^2)$

Technically...
The Normal model assumes Y has a continuous scale and can be negative.
But Y is a discrete count and cannot be negative.
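To see how much probability a Normal model can place on impossible negative counts, one can evaluate its distribution function at 0. The values of m and s below are hypothetical, picked only to make the effect visible.

```r
# Hypothetical Normal model for daily volume
m <- 300   # mean volume
s <- 200   # standard deviation

# Probability the Normal model assigns to negative volumes, P(Y < 0)
p_negative <- pnorm(0, mean = m, sd = s)
print(round(p_negative, 3))
```

With these values roughly 7% of the Normal model's probability mass falls below zero, even though a count of trail users can never be negative.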

SLIDE 33

The Poisson model

$Y$ = volume (# of users) on a given day
$Y \sim \text{Pois}(l)$

Y is the # of independent events that occur in a fixed interval (0, 1, 2,...).
Rate parameter l represents the typical # of events per time interval (l > 0).
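A quick simulation illustrates both properties: Poisson draws are non-negative whole numbers, and their average sits near the rate l. The rate and sample size below are hypothetical.

```r
set.seed(1)
l <- 5   # hypothetical rate: typical number of events per interval

# Simulated counts from Y ~ Pois(l)
Y <- rpois(100000, lambda = l)

is_count <- all(Y >= 0 & Y == round(Y))   # only 0, 1, 2, ... occur
sample_mean <- mean(Y)                    # should be close to l
print(c(is_count = is_count, sample_mean = sample_mean))

# Probabilities of observing 0, 1, or 2 events
print(dpois(0:2, lambda = l))
```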

SLIDE 37

Poisson regression

$Y_i \sim \text{Pois}(l_i)$ where $l_i > 0$

SLIDE 38

Poisson regression

$Y_i \sim \text{Pois}(l_i)$ where $l_i > 0$
$l_i = a + bX_i + cZ_i$

SLIDE 39

Poisson regression

$Y_i \sim \text{Pois}(l_i)$ where $l_i > 0$
$l_i = a + bX_i + cZ_i$

A problem: Linking $l_i$ directly to the linear model assumes $l_i$ can be negative.

SLIDE 40

Poisson regression

$Y_i \sim \text{Pois}(l_i)$ where $l_i > 0$
$\log(l_i) = a + bX_i + cZ_i$

A solution: Use a log link function to link $l_i$ to the linear model. In turn:

$l_i = e^{a + bX_i + cZ_i}$
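The effect of the log link can be checked numerically. In this sketch the coefficient values are hypothetical; it shows that the rate stays positive even when the linear predictor is negative, and that b acts multiplicatively, so the weekday to weekend ratio of rates is exp(b) at any temperature.

```r
# Rate under the log link: l = exp(a + b * x + c * z)
rate <- function(a, b, c, x, z) exp(a + b * x + c * z)

# A negative linear predictor still gives a positive rate
rate_negative_lp <- rate(a = -3, b = 0, c = 0, x = 0, z = 0)

# Hypothetical coefficients, evaluated at 70 degrees F
weekend <- rate(a = 5, b = -0.2, c = 0.01, x = 0, z = 70)
weekday <- rate(a = 5, b = -0.2, c = 0.01, x = 1, z = 70)

ratio <- weekday / weekend   # equals exp(b)
print(c(rate_negative_lp = rate_negative_lp, ratio = ratio, exp_b = exp(-0.2)))
```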

SLIDE 42

Poisson regression in RJAGS

$Y_i \sim \text{Pois}(l_i)$
$\log(l_i) = a + bX_i + cZ_i$
$a \sim N(0, 200^2)$
$b \sim N(0, 2^2)$
$c \sim N(0, 2^2)$

poisson_model <- "model{
    # Likelihood model for Y[i]

    # Prior models for a, b, c
}"

SLIDE 43

Poisson regression in RJAGS

$Y_i \sim \text{Pois}(l_i)$
$\log(l_i) = a + bX_i + cZ_i$
$a \sim N(0, 200^2)$
$b \sim N(0, 2^2)$
$c \sim N(0, 2^2)$

poisson_model <- "model{
    # Likelihood model for Y[i]

    # Prior models for a, b, c
    a ~ dnorm(0, 200^(-2))
    b[1] <- 0
    b[2] ~ dnorm(0, 2^(-2))
    c ~ dnorm(0, 2^(-2))
}"

SLIDE 44

Poisson regression in RJAGS

$Y_i \sim \text{Pois}(l_i)$
$\log(l_i) = a + bX_i + cZ_i$
$a \sim N(0, 200^2)$
$b \sim N(0, 2^2)$
$c \sim N(0, 2^2)$

poisson_model <- "model{
    # Likelihood model for Y[i]
    for(i in 1:length(Y)) {
        Y[i] ~ dpois(l[i])
    }

    # Prior models for a, b, c
    a ~ dnorm(0, 200^(-2))
    b[1] <- 0
    b[2] ~ dnorm(0, 2^(-2))
    c ~ dnorm(0, 2^(-2))
}"

SLIDE 45

Poisson regression in RJAGS

$Y_i \sim \text{Pois}(l_i)$
$\log(l_i) = a + bX_i + cZ_i$
$a \sim N(0, 200^2)$
$b \sim N(0, 2^2)$
$c \sim N(0, 2^2)$

poisson_model <- "model{
    # Likelihood model for Y[i]
    for(i in 1:length(Y)) {
        Y[i] ~ dpois(l[i])
        log(l[i]) <- a + b[X[i]] + c*Z[i]
    }

    # Prior models for a, b, c
    a ~ dnorm(0, 200^(-2))
    b[1] <- 0
    b[2] ~ dnorm(0, 2^(-2))
    c ~ dnorm(0, 2^(-2))
}"

SLIDE 46

Caveats

$Y_i \sim \text{Pois}(l_i)$

Assumption: Among days with similar temperatures and weekday status, variance in $Y_i$ is equal to the mean of $Y_i$.
Our data demonstrate potential overdispersion: the variance is larger than the mean.
Though not perfect, this model is an OK place to start.
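A rough way to look for overdispersion is to compare the variance and mean of counts within a group of similar days. The sketch below uses simulated counts only. As an assumption made for contrast, it draws overdispersed counts from a negative binomial distribution, which is not part of the model on these slides.

```r
set.seed(1)
mu <- 50   # hypothetical mean count for a group of similar days

# Counts that satisfy the Poisson assumption: variance equals mean
pois_counts <- rpois(10000, lambda = mu)

# Overdispersed counts with the same mean: variance is mu + mu^2 / size
nb_counts <- rnbinom(10000, size = 5, mu = mu)

# Ratio of variance to mean; values well above 1 suggest overdispersion
dispersion <- function(y) var(y) / mean(y)

pois_ratio <- dispersion(pois_counts)
nb_ratio <- dispersion(nb_counts)
print(c(poisson = pois_ratio, overdispersed = nb_ratio))
```

With real data the comparison would be made among days with similar temperature and weekday status, since the Poisson assumption is conditional on the predictors.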

SLIDE 47

Let's practice!

SLIDE 48

Conclusion

SLIDE 49

Bayesian modeling with RJAGS

Define, compile, & simulate intractable Bayesian models.
Explore the Markov chain mechanics behind RJAGS simulation.

SLIDE 50

The power of Bayesian modeling

Combine insights from your data and priors to inform posterior insights.

SLIDE 51

The power of Bayesian modeling

Combine insights from your data and priors to inform posterior insights.
Conduct intuitive posterior inference: posterior credible intervals & probabilities.
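Posterior credible intervals and probabilities are simple summaries of the simulated chain. In the sketch below, `b_chain` is a hypothetical stand-in for a Markov chain of posterior draws of b; it is generated with rnorm for illustration and is not real RJAGS output.

```r
set.seed(1)

# Hypothetical stand-in for posterior draws of the weekday contrast b
b_chain <- rnorm(10000, mean = -80, sd = 30)

# 95% posterior credible interval: the middle 95% of the draws
ci_95 <- quantile(b_chain, probs = c(0.025, 0.975))

# Posterior probability that weekday volume is lower than weekend volume
p_b_negative <- mean(b_chain < 0)

print(ci_95)
print(p_b_negative)
```

The same two lines of summary code apply to any parameter's chain, which is part of what makes posterior inference intuitive.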

SLIDE 52

Foundational, flexible, & generalizable Bayesian models

my_model <- "model{
    # Likelihood model
    for(i in 1:length(Y)) {
        Y[i] ~ dnorm(m, s^(-2))
    }

    # Prior models
    m ~ dnorm(...)
    s ~ dunif(...)
}"

SLIDE 53

Foundational, flexible, & generalizable Bayesian models

my_model <- "model{
    # Likelihood model
    for(i in 1:length(Y)) {
        Y[i] ~ dnorm(m[i], s^(-2))
        m[i] <- a + b * X[i]
    }

    # Prior models
    a ~ dnorm(...)
    b ~ dnorm(...)
    s ~ dunif(...)
}"

SLIDE 54

Foundational, flexible, & generalizable Bayesian models

my_model <- "model{
    # Likelihood model
    for(i in 1:length(Y)) {
        Y[i] ~ dnorm(m[i], s^(-2))
        m[i] <- a + b[X[i]]
    }

    # Prior models
    a ~ dnorm(...)
    b[1] <- 0
    b[2] ~ dnorm(...)
    s ~ dunif(...)
}"

SLIDE 55

Foundational, flexible, & generalizable Bayesian models

my_model <- "model{
    # Likelihood model
    for(i in 1:length(Y)) {
        Y[i] ~ dnorm(m[i], s^(-2))
        m[i] <- a + b[X[i]] + c * Z[i]
    }

    # Prior models
    a ~ dnorm(...)
    b[1] <- 0
    b[2] ~ dnorm(...)
    c ~ dnorm(...)
    s ~ dunif(...)
}"

SLIDE 56

Foundational, flexible, & generalizable Bayesian models

my_model <- "model{
    # Likelihood model
    for(i in 1:length(Y)) {
        Y[i] ~ dpois(l[i])
        log(l[i]) <- a + b[X[i]] + c*Z[i]
    }

    # Prior models
    a ~ dnorm(...)
    b[1] <- 0
    b[2] ~ dnorm(...)
    c ~ dnorm(...)
}"

SLIDE 57

Thank you!