Visualizing Public Health Data
Anamaria Crisan, MSc PhD Candidate University of British Columbia
@amcrisan http://cs.ubc.ca/~acrisan acrisan@cs.ubc.ca
I will attempt to make two points: the differences between clinical medicine and
implications
Clinical Medicine vs. Public Health
§ Clinical Medicine: pharmaceutical interventions; specialists (doctors, nurses, etc.)
§ Public Health: commonly environmental and behavioral interventions; not necessarily specialists (community leaders, etc.)
§ Example: treating a lung cancer patient (Clinical Medicine) vs. an anti-smoking campaign (Public Health)
§ Both use data, even the same data, in different ways
Clinicians, Researchers, Patients
§ Currently data vis tends to emphasize clinical medicine applications and targets clinicians, researchers, and patients
Nurses, Clinicians, Medical Health Officers, Researchers, Community Leaders, Politicians, Patients
§ Public Health has much more multidisciplinary decision-making teams
§ More data & diverse data types = more informed decision making
§ BUT – different stakeholder abilities to interpret data & different needs
§ Gap: few vis applications for public health
The Epidemiological Trinity
Person, Place, Time
§ Person: outcomes; treatment; Whole Genome Sequences (WGS), pathogen or human; contact & social networks
§ My project: Tuberculosis (TB) WGS
§ Place: location; geographic context
Via EHRs data are passively collected about entire populations over time
§ Barriers for creating data visualizations are lowering
§ Many domain specialists (scientists, public servants) routinely create data visualizations
§ Guidance on what makes a good data visualization is absent
§ Domain specialists don’t read the vis literature
§ Lack of guidance = inefficient unsupervised exploration of vis design space
§ “Hit or Miss” ad hoc design solutions
§ Our proposed solution: systematically create an explorable vis design space
Design spaces are made of visualization design choices of varying utility (+ 0 -)
Source: Sedlmair (2012) “Design Study Methodology”
Source: Munzner (2014) “Visualization Analysis and Design”
GOAL – nudge domain specialists toward better design choice solutions
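One way to picture a design space with (+ 0 -) utility ratings is as a small data structure. The sketch below is only an illustration: the `DesignChoice` class, the `better_choices` helper, and the placeholder names `choice_a`, `choice_b`, `choice_c` are all hypothetical and do not come from the talk or the cited sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignChoice:
    """One visualization design choice with a utility rating (assumed encoding)."""
    name: str
    utility: int  # +1 for "+", 0 for "0", -1 for "-"


def better_choices(space):
    """Return the choices rated "+", sorted by name for a stable order."""
    return sorted((c for c in space if c.utility > 0), key=lambda c: c.name)


# Placeholder design space; names and ratings are illustrative only.
design_space = [
    DesignChoice("choice_a", +1),
    DesignChoice("choice_b", 0),
    DesignChoice("choice_c", -1),
]

recommended = better_choices(design_space)
print([c.name for c in recommended])
```

Nudging a specialist then amounts to surfacing the "+" choices first while still leaving the rest of the space visible for exploration.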
BUT – how do we systematically describe the design space to promote good exploration?
Source: Munzner (2014) “Visualization Analysis and Design”
§ Our observation: there are a lot of figures in research papers, let’s study them!
§ Challenge: methods for systematic assessment of data visualizations don’t exist
§ Systematic matters! Shows the good, the bad, and the common
§ Existing studies (setvis, treevis, vishealth) are not systematic reviews of a specialist's domain
§ We combined methods from epidemiology with infovis to construct a design space
§ Scope: Infectious Disease Genomic Epidemiology literature
§ Objective: Identify and enumerate the kinds of visualizations generated for different topics
§ Literature Analysis: WHY are researchers visualizing data?
§ Qualitative Data Visualization Analysis: HOW are researchers visualizing data, and WHAT are they visualizing?
§ Quantitative Data Visualization Analysis: HOW MANY examples are there?
Unpublished & still some work to be done so please don’t distribute
Surprise finding: a lot of data in data visualizations were not visualized!
Literature Analysis: Understanding the structure of genomic epidemiology papers promotes systematicity via intelligent sampling
Qualitative Analysis: Developed GEViT, a Genomic Epidemiology Visualization Typology
types, chart combinations, and chart enhancements
Quantitative Analysis: It’s nearly all phylogenetic trees, across all pathogens and concepts, but there are also a lot of tables and plain text
Surprising general conclusion: most data in these data visualizations are not visualized
An overview of our approach
1. Text mining analysis of document corpus
2. Systematically sample papers
3. Create a codeset to classify research figures consistently
4. Apply codeset to research figures
5. Descriptive statistics (literally count)
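The sampling and counting steps of the approach can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method used in the study: the helpers `stratified_sample` and `count_codes`, the record layout, the topic names, and the toy data are all hypothetical. The code labels "phylogenetic tree" and "table" are borrowed from the summary, but the counts here are made up for the example.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(papers, per_stratum, seed=0):
    """Draw up to `per_stratum` papers from each topic group (assumed strata)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_topic = defaultdict(list)
    for paper in papers:
        by_topic[paper["topic"]].append(paper)
    sample = []
    for topic in sorted(by_topic):
        group = by_topic[topic]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


def count_codes(coded_figures):
    """Descriptive statistics: count how often each code was applied."""
    counts = Counter()
    for figure in coded_figures:
        counts.update(figure["codes"])
    return counts


# Toy corpus; ids and topics are placeholders.
papers = [
    {"id": "p1", "topic": "topic_a"},
    {"id": "p2", "topic": "topic_a"},
    {"id": "p3", "topic": "topic_b"},
]
sample = stratified_sample(papers, per_stratum=1)

# Toy result of applying a codeset to figures from sampled papers.
coded_figures = [
    {"paper": "p1", "codes": ["phylogenetic tree", "table"]},
    {"paper": "p3", "codes": ["phylogenetic tree"]},
]
counts = count_codes(coded_figures)
print(counts.most_common())
```

Sampling per topic rather than from the whole corpus keeps rare topics represented, which is one plausible reading of "systematically sample".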