Color in visualizations: IMPROVING YOUR DATA VISUALIZATIONS IN PYTHON


slide-1
SLIDE 1

Color in visualizations

IMPROVING YOUR DATA VISUALIZATIONS IN PYTHON

Nick Strayer

Instructor

slide-2
SLIDE 2

How color is used

Differentiates classes of data
Encodes continuous values
Should be used carefully

slide-3
SLIDE 3

Color can be beautiful

Boring → eye-catching
Variety is good

slide-4
SLIDE 4

Meaning is often applied to colors via culture/personal experience

Parlapiano, A. (2016, November 1). There are many ways to map election results. We've tried most of them. New York Times. Retrieved from https://www.nytimes.com/

slide-5
SLIDE 5

slide-6
SLIDE 6

Color can be misleading...

"It is evident that the color-size illusion is present in a marked degree [no matter what] arrangement."
C.J. Warden & E.L. Flynn, 1926

slide-7
SLIDE 7

A remedy for the color-size illusion

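import seaborn as sns

# Default bars: only the fill color separates a bar from the background
sns.barplot(x = values, y = ids)
# A black outline gives every bar the same clearly visible boundary
sns.barplot(x = values, y = ids, edgecolor = 'black')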

slide-8
SLIDE 8

# Alternatively, draw every bar in the same color
sns.barplot(x = values, y = ids, color = 'cadetblue')

slide-9
SLIDE 9

Let's paint some data!

slide-10
SLIDE 10

Continuous color palettes

Nick Strayer

Instructor

slide-11
SLIDE 11

slide-12
SLIDE 12

# Sequential palette running from a light tint up to steelblue
blue_scale = sns.light_palette("steelblue")
sns.palplot(blue_scale)
# Sequential palette running from a dark shade up to orangered
red_scale = sns.dark_palette("orangered")
sns.palplot(red_scale)

slide-13
SLIDE 13

slide-14
SLIDE 14

Keep it simple

indy_oct = pollution.query("year == 2015 & city == 'Indianapolis'")
# as_cmap = True returns a continuous colormap instead of a list of colors
blue_scale = sns.light_palette("steelblue", as_cmap = True)
sns.heatmap(indy_oct[['O3']], cmap = blue_scale)

slide-15
SLIDE 15

Keep it simple

indy_oct = pollution.query("year == 2015 & city == 'Indianapolis'")
# Counterexample: the rainbow-like 'jet' colormap.
# sns.color_palette() refuses 'jet', so pass the matplotlib colormap name directly
sns.heatmap(indy_oct[['O3']], cmap = 'jet')

slide-16
SLIDE 16

Be aware of color blindness

Avoid transitions between green and red
Palettes that use intensity are safer
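
One rough way to see why intensity-based palettes are the safer choice is to compare how much lightness changes along a ramp. The sketch below uses only the standard library. The WCAG relative-luminance formula stands in for perceived intensity; the helper names (`relative_luminance`, `ramp`, `luminance_span`) and the RGB endpoints are assumptions made for illustration, and the ramps are plain linear blends, not what seaborn computes internally.

```python
def srgb_to_linear(c):
    """Undo the sRGB gamma for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """WCAG relative luminance of an (r, g, b) tuple with channels in [0, 1]."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def ramp(start, end, n=6):
    """Linear blend from start to end in n steps (illustrative only)."""
    return [
        tuple(s + (e - s) * i / (n - 1) for s, e in zip(start, end))
        for i in range(n)
    ]


def luminance_span(colors):
    """Difference between the lightest and darkest color of a ramp."""
    lums = [relative_luminance(c) for c in colors]
    return max(lums) - min(lums)


# Hypothetical endpoints: a pale blue-gray to steelblue, and a red to a green.
pale = (0.93, 0.94, 0.96)
steelblue = (70 / 255, 130 / 255, 180 / 255)
red = (214 / 255, 39 / 255, 40 / 255)
green = (44 / 255, 160 / 255, 44 / 255)

intensity_ramp = ramp(pale, steelblue)
red_green_ramp = ramp(red, green)

print("intensity ramp span:", round(luminance_span(intensity_ramp), 3))
print("red-green ramp span:", round(luminance_span(red_green_ramp), 3))
```

A larger span means the steps stay distinguishable even when hue information is lost. This is a luminance heuristic, not a simulation of any particular form of color blindness.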

slide-17
SLIDE 17

Encoding neutral values

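# Diverging palette between hues 250 (blue) and 0 (red) with a light midpoint
pal_light = sns.diverging_palette(250, 0)
# The same hues with a dark midpoint, for dark backgrounds
pal_dark = sns.diverging_palette(250, 0, center = 'dark')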
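
A diverging palette only encodes a neutral value if the color scale is centered on it. A minimal sketch, assuming the neutral value is known ahead of time: `symmetric_limits` is a hypothetical helper, not a seaborn function, and its two outputs are meant to be passed as the `vmin` and `vmax` arguments of a plot such as `sns.heatmap`.

```python
def symmetric_limits(values, center=0.0):
    """Return (vmin, vmax) placed equally far from center, so that the
    midpoint of a diverging palette lines up with the neutral value."""
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    half_range = max(abs(v - center) for v in values)
    return center - half_range, center + half_range


# Hypothetical deviations from a baseline, skewed toward positive values.
deviations = [-1.5, 0.25, 2.0, 4.0]
vmin, vmax = symmetric_limits(deviations)
print(vmin, vmax)  # -4.0 4.0
```

Without symmetric limits, skewed data would shift the neutral color away from the neutral value.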

slide-18
SLIDE 18

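import matplotlib.pyplot as plt

# Light background: pair it with a palette that starts light
# (on matplotlib 3.6 and later this style is named 'seaborn-v0_8-white')
plt.style.use('seaborn-white')
light_palette = sns.light_palette("orangered")
sns.scatterplot(x = 'CO', y = 'NO2', hue = 'O3', data = lb_2012, palette = light_palette)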

slide-19
SLIDE 19

# Dark background: pair it with a palette that starts dark
plt.style.use('dark_background')
dark_palette = sns.dark_palette("orangered")
sns.scatterplot(x = 'CO', y = 'NO2', hue = 'O3', data = lb_2012, palette = dark_palette)

slide-20
SLIDE 20

Let's continue in the exercises

slide-21
SLIDE 21

Categorical palettes

Nick Strayer

Instructor

slide-22
SLIDE 22

slide-23
SLIDE 23

Limits in perception

Try to limit to 10 or fewer categories
Keep color-blindness in mind

sns.palplot(sns.color_palette('Set2', 11))
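
When a column has more categories than a palette can separate, one option is to keep the most frequent ones and pool the rest. A minimal sketch using only the standard library: `collapse_categories` and the sample city list are hypothetical, and choosing categories by frequency is an assumption rather than something the slides prescribe.

```python
from collections import Counter


def collapse_categories(labels, max_categories=10, other="other"):
    """Keep the most frequent labels and replace the rest with `other`,
    so that at most max_categories distinct values remain."""
    labels = list(labels)
    counts = Counter(labels)
    if len(counts) <= max_categories:
        return labels
    # Reserve one slot for the pooled 'other' group.
    keep = {label for label, _ in counts.most_common(max_categories - 1)}
    return [label if label in keep else other for label in labels]


# Hypothetical data: two frequent cities and three rare ones.
cities = ["Long Beach"] * 4 + ["Cincinnati"] * 3 + ["Denver", "Houston", "Fairbanks"]
collapsed = collapse_categories(cities, max_categories=3)
print(sorted(set(collapsed)))
```

The collapsed list can then be assigned to a new column and used as the `hue` variable.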

slide-24
SLIDE 24

# Add a column that keeps the cities of interest and labels every other city 'other'
pollution['interesting cities'] = [x if x in ['Long Beach', 'Cincinnati'] else 'other'
                                   for x in pollution['city']]
sns.scatterplot(x="NO2", y="SO2",
                hue = 'interesting cities',
                palette='Set2',
                data=pollution.query('year == 2014 & month == 12'))

slide-25
SLIDE 25

# Preview each qualitative ColorBrewer palette, titled with its name
colorbrewer_palettes = ['Set1', 'Set2', 'Set3', 'Accent', 'Paired', 'Pastel1', 'Pastel2', 'Dark2']
for pal in colorbrewer_palettes:
    sns.palplot(pal=sns.color_palette(pal))
    plt.title(pal, loc = 'left')

slide-26
SLIDE 26

Ordinal data (a)

Has order between classes
A set number of distinct classes

slide-27
SLIDE 27

Ordinal data (b)

Has order between classes
A set number of distinct classes

slide-28
SLIDE 28

Ordinal data (c)

Has order between classes
A set number of distinct classes
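
Because ordinal classes are both ordered and finite, each class can be given its own evenly spaced position along a sequential color scale. A minimal sketch of that idea, not taken from the slides: `ordinal_positions`, its `low`/`high` parameters, and the class names are all hypothetical.

```python
def ordinal_positions(levels, low=0.2, high=1.0):
    """Map ordered class labels to evenly spaced positions in [low, high].

    The positions can index a sequential colormap; starting above 0 keeps
    the lowest class from fading into a light background."""
    levels = list(levels)
    if len(levels) == 1:
        return {levels[0]: high}
    step = (high - low) / (len(levels) - 1)
    return {level: low + step * i for i, level in enumerate(levels)}


# Hypothetical ordered classes, lowest first.
positions = ordinal_positions(["low", "medium", "high"], low=0.0, high=1.0)
print(positions)  # {'low': 0.0, 'medium': 0.5, 'high': 1.0}
```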

slide-29
SLIDE 29

# Preview each sequential ColorBrewer palette with a different number of steps (4 to 9)
colorbrewer_palettes = ['Reds', 'Blues', 'YlOrBr', 'PuBuGn', 'GnBu', 'Greys']
for i, pal in enumerate(colorbrewer_palettes):
    sns.palplot(pal=sns.color_palette(pal, n_colors=i+4))

slide-30
SLIDE 30

import pandas as pd

# Make a tertile column using qcut()
pollution['NO2 Tertial'] = pd.qcut(pollution['NO2'], 3, labels = False)
# Plot colored by the computed tertiles
sns.scatterplot(x="CO", y="SO2", hue='NO2 Tertial', palette="OrRd",
                data=pollution.query("city == 'Long Beach' & year == 2014"))
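
To make the binning step concrete, here is a standard-library sketch of what cutting a variable into tertiles does. It is an illustration, not pandas' implementation: `tertile_labels` and the sample values are hypothetical, and values equal to a cut point are sent to the lower bin on the assumption that bins are closed on the right.

```python
from bisect import bisect_left
from statistics import quantiles


def tertile_labels(values):
    """Label each value 0, 1 or 2 according to the third of the data it falls in."""
    # Two cut points split the data into three equally populated groups.
    cuts = quantiles(values, n=3, method="inclusive")
    # bisect_left sends a value equal to a cut point to the lower bin.
    return [bisect_left(cuts, v) for v in values]


# Hypothetical NO2 readings.
no2 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(tertile_labels(no2))  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

The integer labels keep the order of the classes, which is what lets a sequential palette such as "OrRd" encode them.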

slide-31
SLIDE 31

Let's color some categories
