Visualizing Public Health Data Anamaria Crisan, MSc PhD Candidate - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Visualizing Public Health Data

Anamaria Crisan, MSc PhD Candidate University of British Columbia

@amcrisan http://cs.ubc.ca/~acrisan acrisan@cs.ubc.ca

1

slide-2
SLIDE 2

I will attempt to make two points

  • The differences between clinical medicine and public health, and the vis implications

  • Provide an overview of the state of visualization within a public health domain

2

slide-3
SLIDE 3

Public Health and Clinical Medicine

3

slide-4
SLIDE 4

Clinical Medicine vs. Public Health

Clinical Medicine

  • Targets individual patients

Public Health

  • Targets populations

4

slide-5
SLIDE 5

Clinical Medicine vs. Public Health

Clinical Medicine

  • Targets individual patients
  • Diagnosis and treatment focused

Public Health

  • Targets populations
  • Prevention and control focused

5

slide-6
SLIDE 6

Clinical Medicine vs. Public Health

Clinical Medicine

  • Targets individual patients
  • Diagnosis and treatment focused
  • Interventions are typically pharmaceutical interventions

Public Health

  • Targets populations
  • Prevention and control focused
  • Pharmaceutical, but also more commonly environmental and behavioral interventions

6

slide-7
SLIDE 7

Clinical Medicine vs. Public Health

Clinical Medicine

  • Targets individual patients
  • Diagnosis and treatment focused
  • Interventions are typically pharmaceutical interventions
  • Decision makers typically siloed specialists (doctors, nurses, etc.)

Public Health

  • Targets populations
  • Prevention and control focused
  • Pharmaceutical, but also more commonly environmental and behavioral interventions
  • Decision makers diverse, not necessarily specialists (community leaders, etc.)

7

slide-8
SLIDE 8

Clinical Medicine vs. Public Health

Clinical Medicine

  • Targets individual patients
  • Diagnosis and treatment focused
  • Interventions are typically pharmaceutical interventions
  • Decision makers typically siloed specialists (doctors, nurses, etc.)

Public Health

  • Targets populations
  • Prevention and control focused
  • Pharmaceutical, but also more commonly environmental and behavioral interventions
  • Decision makers diverse, not necessarily specialists (community leaders, etc.)

Example: treating a lung cancer patient (clinical medicine) vs. an anti-smoking campaign (public health). Both use data, even the same data, in different ways.

8

slide-9
SLIDE 9

Clinicians, Researchers, Patients

Visualization consumers in clinical medicine

§ Currently data vis tends to emphasize clinical medicine applications and targets clinicians, researchers, and patients

9

slide-10
SLIDE 10

Nurses, Clinicians, Medical Health Officers, Researchers, Community Leaders, Politicians, Patients

§ Public Health has much more multidisciplinary decision making teams

§ More data & diverse data types = more informed decision making

§ BUT – different stakeholder abilities to interpret data & different needs

§ Gap: few vis applications for public health

Visualization consumers in public health

10

slide-11
SLIDE 11

What are Public Health Data?

11

slide-12
SLIDE 12

Person Place Time

The Epidemiological Trinity

What are Public Health Data?

12

slide-13
SLIDE 13

Person Place Time

Outcomes; Whole Genome Sequences (WGS), pathogen or human; Treatment

What are Public Health Data?

Contact & Social networks

13

slide-14
SLIDE 14

Person Place Time

Contact & Social networks; Outcomes; Whole Genome Sequences (WGS), pathogen or human; Treatment

My project: Tuberculosis (TB) WGS

What are Public Health Data?

14

slide-15
SLIDE 15

Person Place Time

Location Geographic Context

What are Public Health Data?

15

slide-16
SLIDE 16

Person Place Time

What are Public Health Data?

16

slide-17
SLIDE 17

What are Public Health Data?

Via EHRs data are passively collected about entire populations over time

17

slide-18
SLIDE 18

The State of Data Vis in Public Health

18

slide-19
SLIDE 19

§ Barriers for creating data visualizations are lowering

§ Many domain specialists (scientists, public servants) routinely create data visualizations

§ Guidance on what makes a good data visualization is absent

§ Domain specialists don’t read the vis literature

§ Lack of guidance = inefficient unsupervised exploration of vis design space

§ “Hit or Miss” ad hoc design solutions

The state of visualization in public health

19

slide-20
SLIDE 20

§ Barriers for creating data visualizations are lowering

§ Many domain specialists (scientists, public servants) routinely create data visualizations

§ Guidance on what makes a good data visualization is absent

§ Domain specialists don’t read the vis literature

§ Lack of guidance = inefficient unsupervised exploration of vis design space

§ “Hit or Miss” ad hoc design solutions

§ Our proposed solution: systematically create an explorable vis design space

The state of visualization in public health

20

slide-21
SLIDE 21

Design Spaces : A quick primer

Design spaces are made of visualization design choices of varying utility (+ 0 - )

Source: Sedlmair (2012) “Design Study Methodology”

slide-22
SLIDE 22

Design Spaces : A quick primer

Source: Munzner (2014) “Visualization Analysis and Design”

GOAL – nudge domain specialists toward better design choice solutions

22

slide-23
SLIDE 23

Design Spaces : A quick primer

BUT – how do we systematically describe a design space to promote good exploration?

Source: Munzner (2014) “Visualization Analysis and Design”

slide-24
SLIDE 24

§ Our observation: there are a lot of figures in research papers, let’s study them!

§ Challenge: methods for systematic assessment of data visualizations don’t exist

§ Systematic matters! Shows the good, the bad, and the common

§ Existing studies (setvis, treevis, vishealth) are not systematic reviews of a specialist's domain

§ We combined methods from epidemiology with infovis to construct a design space

Constructing a design space

24

slide-25
SLIDE 25

Our approach allows us to answer three different questions

§ Scope: Infectious Disease Genomic Epidemiology literature

§ Objective: Identify and enumerate the kinds of visualizations generated for different topics of infectious disease genomic epidemiology

25

slide-26
SLIDE 26

  • Literature Analysis
  • Qualitative Data Visualization Analysis
  • Quantitative Data Visualization Analysis

Our approach allows us to answer three different questions

§ Scope: Infectious Disease Genomic Epidemiology literature

§ Objective: Identify and enumerate the kinds of visualizations generated for different topics of infectious disease genomic epidemiology

26

slide-27
SLIDE 27

  • Literature Analysis: WHY are researchers visualizing data?
  • Qualitative Data Visualization Analysis: HOW are researchers visualizing data, and WHAT are they visualizing?
  • Quantitative Data Visualization Analysis: HOW MANY examples are there of specific visualizations?

Our approach allows us to answer three different questions

§ Scope: Infectious Disease Genomic Epidemiology literature

§ Objective: Identify and enumerate the kinds of visualizations generated for different topics of infectious disease genomic epidemiology

27

slide-28
SLIDE 28

https://amcrisan.shinyapps.io/gevit_gallery/

Unpublished & still some work to be done so please don’t distribute

28

slide-29
SLIDE 29

Implications of our findings

  • Surprise finding: a lot of data in data visualizations were not visualized!
  • Pedagogical implications:
      • Can we give people more complex vis applications when their vis skills are kind of low?
      • How can we improve vis literacy?
      • I think a design space is a useful discussion tool
  • Software development implications:
      • Discussion of a design space in bioinformatics development
      • GEViT is a resource to provide alternative designs
      • Alternative designs also reveal gaps where vis research is needed
  • Human-in-the-loop implications:
      • Need to think beyond image recognition problems
      • Might be premature to apply AI methods (no good training data)

29

slide-30
SLIDE 30

Additional Slides

30

slide-31
SLIDE 31

An overview of our results so far

  • Literature Analysis: Understanding the structure of genomic epidemiology papers promotes systematicity via intelligent sampling
      • Total sample ~18K papers on genomic epidemiology
      • Defined strata by pathogen (document structure) and a priori concepts (domain knowledge)
      • Literature analysis stratified sampling yielded ~850 figures for analysis from 221 papers
  • Qualitative Analysis: Developed GEViT, a Genomic Epidemiology Visualization Typology
      • Developed a typology to systematically describe charts using three descriptive axes: chart types, chart combinations, and chart enhancements
  • Quantitative Analysis: It’s nearly all phylogenetic trees, across all pathogens and concepts, but there are also a lot of tables and plain text
  • Surprising general conclusion: most data in these data visualizations are not visualized
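The stratified sampling step can be illustrated with a minimal Python sketch. It assumes each paper is a dictionary with `pathogen` and `concept` fields and draws a fixed number of papers per stratum. The field names, the toy corpus, the `per_stratum` parameter, and the fixed-quota rule are all assumptions for illustration; the slides do not state how many papers were drawn per stratum or how the draw was implemented.

```python
import random
from collections import defaultdict


def stratified_sample(papers, strata_key, per_stratum, seed=0):
    """Draw up to `per_stratum` papers at random from each stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable

    # Group papers by their stratum label.
    strata = defaultdict(list)
    for paper in papers:
        strata[strata_key(paper)].append(paper)

    # Sample within each stratum; small strata contribute all their papers.
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical toy corpus: pathogens and concepts are made up for the example.
papers = [
    {"id": 1, "pathogen": "tuberculosis", "concept": "transmission"},
    {"id": 2, "pathogen": "tuberculosis", "concept": "transmission"},
    {"id": 3, "pathogen": "tuberculosis", "concept": "drug resistance"},
    {"id": 4, "pathogen": "influenza", "concept": "transmission"},
    {"id": 5, "pathogen": "influenza", "concept": "transmission"},
]


def stratum(paper):
    """Stratum label: pathogen crossed with a priori concept."""
    return (paper["pathogen"], paper["concept"])


sample = stratified_sample(papers, stratum, per_stratum=1)
print([p["id"] for p in sample])
```

Sampling within strata rather than from the pooled corpus keeps rare pathogen and concept combinations represented, which a simple random draw from a large corpus could easily miss.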

31

slide-32
SLIDE 32

“Identify and enumerate the kinds of visualizations generated per topic of i.d. genomic epidemiology”


An overview of our approach

32

slide-33
SLIDE 33

“Identify and enumerate the kinds of visualizations generated per topic of i.d. genomic epidemiology”

1. Text mining analysis of document corpus

An overview of our approach

33

slide-34
SLIDE 34

“Identify and enumerate the kinds of visualizations generated per topic of i.d. genomic epidemiology”

1. Text mining analysis of document corpus
2. Systematically sample papers

An overview of our approach

34

slide-35
SLIDE 35

“Identify and enumerate the kinds of visualizations generated per topic of i.d. genomic epidemiology”

1. Text mining analysis of document corpus
2. Systematically sample papers
3. Create a code set to classify research figures consistently

An overview of our approach

35

slide-36
SLIDE 36

“Identify and enumerate the kinds of visualizations generated per topic of i.d. genomic epidemiology”

1. Text mining analysis of document corpus
2. Systematically sample papers
3. Create a code set to classify research figures consistently
4. Apply code set to research figures

An overview of our approach

36

slide-37
SLIDE 37

“Identify and enumerate the kinds of visualizations generated per topic of i.d. genomic epidemiology”

1. Text mining analysis of document corpus
2. Systematically sample papers
3. Create a code set to classify research figures consistently
4. Apply code set to research figures
5. Descriptive statistics (literally count)

An overview of our approach

37
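The last step, descriptive statistics as literal counting, might look like the short Python sketch below. It assumes each coded figure is a dictionary holding the pathogen of its source paper and the list of chart types assigned to it. The record layout, field names, and example values are hypothetical and are not the actual code set or data behind these slides.

```python
from collections import Counter, defaultdict

# Hypothetical coded figures: each record is one research figure after the
# code set has been applied. Values are invented for illustration.
coded_figures = [
    {"pathogen": "tuberculosis", "chart_types": ["phylogenetic tree"]},
    {"pathogen": "tuberculosis", "chart_types": ["phylogenetic tree", "table"]},
    {"pathogen": "influenza", "chart_types": ["phylogenetic tree"]},
    {"pathogen": "influenza", "chart_types": ["table"]},
]


def count_chart_types(figures):
    """Count how often each chart type appears across all figures."""
    counts = Counter()
    for figure in figures:
        counts.update(figure["chart_types"])
    return counts


def count_chart_types_by(figures, field):
    """Count chart types separately for each value of `field`."""
    grouped = defaultdict(Counter)
    for figure in figures:
        grouped[figure[field]].update(figure["chart_types"])
    return dict(grouped)


overall = count_chart_types(coded_figures)
by_pathogen = count_chart_types_by(coded_figures, "pathogen")

for chart_type, n in overall.most_common():
    print(f"{chart_type}: {n}")
```

Counting per stratum as well as overall is what allows a statement such as "nearly all phylogenetic trees, across all pathogens and concepts" to be checked rather than assumed.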