Evaluation, data science, and the causal revolution




slide-1
SLIDE 1

Evaluation, data science, and the causal revolution

January 15, 2020

PMAP 8521: Program Evaluation for Public Service Andrew Young School of Policy Studies • Georgia State University Spring 2020

slide-2
SLIDE 2

Plan for today

Data science and public service
Evidence, evaluation, and causation
Class details
Getting staRted!

slide-3
SLIDE 3

Data science and public service

slide-4
SLIDE 4
slide-5
SLIDE 5

“To responsibly unleash the power of data to benefit all Americans”

Data and government

slide-6
SLIDE 6
slide-7
SLIDE 7
slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11

How do you use all this data to make the world better?

slide-12
SLIDE 12

What is “statistics”?

Collecting and analyzing data from a representative sample in order to make inferences about a whole population
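
As a minimal sketch of sampling and inference in R: the population below is simulated, and its size, distribution, and the sample size of 500 are all invented for illustration. The point is only that a modest random sample can give an estimate, with a confidence interval, for a population we normally could not observe in full.

```r
# Hypothetical population: simulated incomes for 100,000 people
set.seed(8521)
population <- rlnorm(100000, meanlog = 10.8, sdlog = 0.6)

# Draw a simple random sample of 500 people
sample_incomes <- sample(population, size = 500)

# Use the sample to make an inference about the whole population
sample_mean <- mean(sample_incomes)
ci <- t.test(sample_incomes)$conf.int   # 95% confidence interval for the mean

# In real life the true population mean is unknown; we only have it
# here because the population is simulated
true_mean <- mean(population)

cat("Sample estimate:", round(sample_mean), "\n")
cat("95% CI:", round(ci[1]), "to", round(ci[2]), "\n")
cat("True population mean:", round(true_mean), "\n")
```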

slide-13
SLIDE 13

What is “data science”?

Big data • Machine learning • Artificial intelligence • Data mining • PR-speak for “statistics” • Cloud computing • Algorithms • Neural networks

slide-14
SLIDE 14

What is “data science”?

Turning raw data into understanding, insight, and knowledge

Collect • Analyze • Communicate

slide-15
SLIDE 15

What’s the difference?

Collect • Analyze • Communicate

Statistics

slide-16
SLIDE 16

What is “program evaluation”?

Measuring the effect of social programs on society

Data and statistics • Communication • Causal inference (econometrics)

slide-17
SLIDE 17

Evidence, evaluation, and causation

slide-18
SLIDE 18

What is the relationship between social science research and public policy & administration?

slide-19
SLIDE 19

Evidence-based medicine

slide-20
SLIDE 20

Modern evidence-based medicine

Apply evidence to clinical treatment decisions
Move away from clinical judgment and “craft knowledge”
Is this good?

slide-21
SLIDE 21

Can we find and measure evidence for policies and programs?

slide-22
SLIDE 22

Evidence-based policy

RAND health insurance study
Oregon Medicaid expansion
HUD’s Moving to Opportunity
Tennessee STAR

slide-23
SLIDE 23

Policy evidence industry

Jameel Poverty Action Lab (J-PAL)
Campbell Collaboration

slide-24
SLIDE 24

Should we have evidence for every policy or program?

No!

Science vs. art/craft/intuition

slide-25
SLIDE 25
slide-26
SLIDE 26

Where does program evaluation fit with all this?

It’s a method for collecting evidence for policies and programs

slide-27
SLIDE 27

Types of evaluation

Needs assessment
Design and theory assessment
Process evaluation and monitoring
Impact evaluation
Efficiency evaluation (CBA)

slide-28
SLIDE 28

[Figure: logic model for a truancy program, with a legend for Input, Activity, Output, and Outcome. Adapted from Provo School District, “Truancy Program Logic Model: FY 2011–2012.”]

Labels in the figure:
Law, parents, students, teachers, and administrators; Grants; Judges
PSD distributes truancy information to all families; […] to all schools in the district; # of people who know expectations
5 unexcused absences (5 total): 1st citation mailed home; # of 1st citations mailed
5 unexcused absences (10 total): 2nd citation mailed home + referral to truancy school; PowerPoint presentation + Explanation of state law + Instruction on PowerSchool; Students and parents attend truancy school; # of 2nd citations mailed; # of truancy school attendees
5 unexcused absences (15 total): 3rd citation mailed home + referral to truancy court; PSD Attendance Court (K–10); 4th District Juvenile Court (9–10); Meet with district social worker (11–12); Alternative plan created*; # of 3rd citations mailed; # of court attendees
No truancy; Reduced risk factors for delinquency; Increased commitment to school; Better grades; % increase in grades and attendance

* Because 11th and 12th graders who receive 3rd citations are generally unable to graduate from high school, district social workers no longer attempt to increase their commitment to school. As such, any outcomes that occur as a result of the alternative plans made for these students (work study programs, career development assistance, etc.) are only tangentially related to the outcomes of the truancy program itself. The system for creating alternative plans is an entirely separate program with its own logic model, goals, and outcomes.

slide-29
SLIDE 29

Three phases of truancy intervention

No truancy • Reduced risk factors • Increased commitment to school • Better grades

Theories of change

Impact evaluation!

slide-30
SLIDE 30
slide-31
SLIDE 31

Theory → impact

[Figure: grades plotted before, during, and after the program. Labels: pre-program grades, post-program grades, grades with program, grades without program, outcome change; program activities, program outcomes.]
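
One way to read the theory → impact figure is as simple arithmetic, sketched below in R. Every number is hypothetical, and "grades without program" is a counterfactual that can never be observed directly; here it is simply assumed.

```r
# All numbers are hypothetical, for illustration only
pre_program_grades     <- 2.4   # average GPA before the program
grades_with_program    <- 3.1   # observed average GPA after the program
grades_without_program <- 2.7   # assumed counterfactual: GPA with no program

# A before/after comparison mixes the program with everything else
# that changed over the same period
before_after_change <- grades_with_program - pre_program_grades

# Impact: the gap between the outcome with and without the program
program_impact <- grades_with_program - grades_without_program

c(before_after = before_after_change, impact = program_impact)
```

Under these assumed numbers the before/after change overstates the program's impact, because grades would have risen somewhat anyway.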

slide-32
SLIDE 32

[Figure: average number of absences by weeks before/after the truancy intervention. Lines show actual and predicted values; colored bands show 80% and 95% confidence; the truancy intervention is marked.]

slide-33
SLIDE 33
slide-34
SLIDE 34

Godwin’s Law for statistics

Correlation does not imply causation

Except when it does
Even if it doesn’t, this phrase is useless and kills discussion

slide-35
SLIDE 35
slide-36
SLIDE 36

Correlation vs. causation

How do we figure out correlation?

Math and statistics

How do we figure out causation?

Philosophy. No math.

slide-37
SLIDE 37
slide-38
SLIDE 38

How do we know if X causes Y?

X causes Y if…
…we intervene and change X without changing anything else…
…and Y changes
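
A minimal sketch of this definition in R, using a simulated world where we control the data-generating process. The function `make_y` and the effect size of 2 are assumptions made up for the example; the point is that we change X, hold everything else fixed, and check whether Y moves.

```r
set.seed(1234)
n <- 1000

# Hypothetical data-generating process: Y responds to X and to other things
other_factors <- rnorm(n)
make_y <- function(x, other) 2 * x + other   # assumed true effect of X is 2

# Intervene: set X to 0 for everyone, then to 1, changing nothing else
y_if_x0 <- make_y(x = 0, other = other_factors)
y_if_x1 <- make_y(x = 1, other = other_factors)

# Y changes when only X changes, so X causes Y in this simulated world
effect <- mean(y_if_x1 - y_if_x0)
effect
```

With real programs we cannot rerun the world with X switched on and off for the same people, which is why evaluation designs try to approximate this comparison.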

slide-39
SLIDE 39

Y “listens to” X

X isn’t the only thing that causes Y.
A light switch causes a light to go on, but not if the bulb is burned out (no Y despite X) or if the light was already on (Y without X).

slide-40
SLIDE 40

Causal relationships?

Lighting fireworks causes noise
Getting an MPA increases your earnings
Rooster crows are followed by sunrise
Colds go away a few days after you take vitamin C

slide-41
SLIDE 41

Causation

Causation = Correlation + time order + all other factors ruled out

How do you know if you have it right?
You need a philosophical model
That’s what this class is for!

slide-42
SLIDE 42

The causal revolution

slide-43
SLIDE 43

Causal diagrams

Directed acyclic graphs (DAGs)

Graphical model of the process that generates the data
Maps your philosophical model
Fancy math (“do-calculus”) tells you what to control for to find causation
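
A small simulated sketch in R of why "what to control for" matters. The DAG here (Z → X, Z → Y, X → Y) and all coefficients are invented for illustration, and the adjustment set is chosen by hand rather than derived with do-calculus.

```r
set.seed(2020)
n <- 5000

# Hypothetical DAG: Z -> X, Z -> Y, X -> Y (Z is a confounder)
z <- rnorm(n)
x <- 0.8 * z + rnorm(n)
y <- 1.5 * x + 2 * z + rnorm(n)   # assumed true causal effect of X on Y is 1.5

# Naive model: ignores Z, so the backdoor path X <- Z -> Y stays open
naive <- unname(coef(lm(y ~ x))["x"])

# Adjusted model: controlling for Z closes the backdoor path
adjusted <- unname(coef(lm(y ~ x + z))["x"])

round(c(naive = naive, adjusted = adjusted), 2)
```

The naive estimate should land well above 1.5 because it absorbs part of Z's effect, while the adjusted estimate should sit close to 1.5.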

slide-44
SLIDE 44
slide-45
SLIDE 45
slide-46
SLIDE 46
slide-47
SLIDE 47
slide-48
SLIDE 48
slide-49
SLIDE 49

Break

Set up an RStudio.cloud account if you haven’t already
Go to https://andhs.co/rstudio to join the class workspace

slide-50
SLIDE 50

Ask me anything!

slide-51
SLIDE 51

Class details

slide-52
SLIDE 52
slide-53
SLIDE 53
slide-54
SLIDE 54

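library(estimatr)  # provides iv_robust()

# Two-stage least squares: treatment is the instrument for bed_net
model_2sls <- iv_robust(health ~ bed_net | treatment,
                        data = bed_nets)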
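
For intuition only, here is a rough base-R sketch of the two stages that two-stage least squares runs. The data are simulated; the variable names mirror the slide, but the effect size of 5, the unobserved confounder, and the take-up rule are all assumptions. Standard errors from a hand-rolled second stage are not valid, which is one reason to use a dedicated function in practice.

```r
set.seed(54)
n <- 5000

# Simulated (hypothetical) data using the variable names from the slide
treatment  <- rbinom(n, 1, 0.5)   # randomly assigned instrument
unobserved <- rnorm(n)            # confounder we cannot measure
bed_net <- as.numeric(0.3 * unobserved + 1.5 * treatment + rnorm(n) > 0.75)
health  <- 50 + 5 * bed_net + 4 * unobserved + rnorm(n)  # assumed effect is 5

# Stage 1: predict bed net use from the instrument
stage1 <- lm(bed_net ~ treatment)
bed_net_hat <- fitted(stage1)

# Stage 2: regress the outcome on the predicted bed net use
stage2 <- lm(health ~ bed_net_hat)

naive_est <- unname(coef(lm(health ~ bed_net))["bed_net"])
iv_est    <- unname(coef(stage2)["bed_net_hat"])
c(naive = naive_est, two_stage = iv_est)
```

The naive regression is biased upward by the unobserved confounder, while the two-stage estimate should fall near the assumed effect of 5.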

slide-55
SLIDE 55

Class technology

slide-56
SLIDE 56

The tidyverse

slide-57
SLIDE 57

The tidyverse

slide-58
SLIDE 58

R code, but reads like English!

library(tidyverse)  # dplyr verbs and ggplot2
library(scales)     # dollar() for axis labels

# Total and average damages by month
strike_damages_month <- bird_strikes %>%
  group_by(Month) %>%
  summarize(total_damages = sum(Cost, na.rm = TRUE),
            average_damages = mean(Cost, na.rm = TRUE))

# Column chart of total damages per month
ggplot(data = strike_damages_month,
       mapping = aes(x = Month, y = total_damages)) +
  geom_col() +
  scale_y_continuous(labels = dollar) +
  labs(x = "Month", y = "Total damages",
       title = "Really expensive collisions happen in the fall?",
       subtitle = "Don't fly in August or October?",
       caption = "Source: FAA Wildlife Strike Database")

slide-59
SLIDE 59

Sucking

There is no way to go from knowing nothing about a subject to knowing something about a subject without going through a period of much frustration and suckiness.
Push through. You'll suck less.

Hadley Wickham, author of ggplot2 and the tidyverse

slide-60
SLIDE 60

Sucking

slide-61
SLIDE 61

Am I making you computer scientists?

No!

You don’t need to be a mechanic to drive a car safely
You don’t need to be a computer scientist or developer to use R safely

slide-62
SLIDE 62

Learning R

slide-63
SLIDE 63

You can do this.

slide-64
SLIDE 64

Goals for the class

Speak and do causation
Design rigorous evaluations
Change the world with data
Become an expert with R

slide-65
SLIDE 65

Prerequisites

Math skills: Basic algebra

Computer science skills: None

Statistical skills: Regression and differences in means (ideally; you can survive without it, though)

slide-66
SLIDE 66

Miscellanea

slide-67
SLIDE 67

Class expectations

Late work • Technology • Participation • Other?

slide-68
SLIDE 68

Getting staRted!

slide-69
SLIDE 69

Goals for the class

andhs.co/survey