
SLIDE 1

MEN ALSO LIKE SHOPPING
REDUCING GENDER BIAS AMPLIFICATION USING CORPUS-LEVEL CONSTRAINTS

Jieyu Zhao (1,3), Tianlu Wang (1), Mark Yatskar (2,4), Vicente Ordonez (1), Kai-Wei Chang (1,3)

(1) University of Virginia, (2) University of Washington, (3) UCLA, (4) Allen Institute for AI

SLIDE 2

Dataset Gender Bias (imsitu.org)

[Figure: Female vs. Male split in the dataset, 33% / 66%]

SLIDE 3

Model Bias After Training (imsitu.org)

[Figure: Female vs. Male split in model predictions, 16% / 84%]

SLIDE 4

Why does this happen?

It is good for accuracy.

SLIDE 5

Algorithmic Bias in Grounded Setting

[Diagram: World, Dataset, Model. Labels: dusting, cooking, faucet, fork]

SLIDE 6

Algorithmic Bias in Grounded Setting

[Diagram: World, Dataset, Model. Example: woman cooking. Labels: dusting, cooking, faucet, fork]

SLIDE 7

Algorithmic Bias in Grounded Setting

[Diagram: World, Dataset, Model. Dataset examples: man fixing faucet, woman cooking. Labels: dusting, cooking, faucet, fork]

SLIDE 8

Algorithmic Bias in Grounded Setting

[Diagram: World, Dataset, RBA, Model. Labels: dusting, cooking, faucet, fork]

SLIDE 9

Algorithmic Bias in Grounded Setting

[Diagram: World, Dataset, RBA, Model. Labels: dusting, cooking, faucet, fork]

RBA:
Reduce amplification ~50%
Negligible loss in performance

SLIDE 10

Contributions

imSitu vSRL (events), COCO MLC (objects)
data, model, RBA

High dataset gender bias
47% (events) and 38% (objects) exhibit strong bias

Models amplify existing gender bias
~70% of objects and events have bias amplification

Reducing bias amplification
~50% reduction in amplification
Insignificant loss in performance

SLIDE 11

Outline

imSitu vSRL (events), COCO MLC (objects)
data, model, RBA

1. Background
2. Dataset Bias
3. Bias Amplification
4. Reducing Bias Amplification

SLIDE 12

imSitu Visual Semantic Role Labeling (vSRL) (events)

COOKING
ROLES       NOUNS
AGENT       woman
FOOD        vegetable
CONTAINER   pot
TOOL        spatula

FrameNet, WordNet, Internet

Yatskar et al. CVPR ’16, Yang et al. NAACL ’16, Gupta and Malik arXiv ’16

SLIDE 14

imSitu Visual Semantic Role Labeling (vSRL) (events)

COOKING
ROLES       NOUNS
AGENT       woman
FOOD        vegetable
CONTAINER   pot
TOOL        spatula

Model: Convolutional Neural Network, Regression, Conditional Random Field

Yatskar et al. CVPR ’16, Yang et al. NAACL ’16, Gupta and Malik arXiv ’16

SLIDE 16

imSitu Visual Semantic Role Labeling (vSRL) (events)

COOKING
ROLES       NOUNS
AGENT       woman
FOOD        vegetable
CONTAINER   pot
TOOL        spatula

Model: Convolutional Neural Network, Regression, Conditional Random Field

The model needs to capture correlations between variables.
It can use that same machinery to amplify gender bias.

Yatskar et al. CVPR ’16, Yang et al. NAACL ’16, Gupta and Malik arXiv ’16

SLIDE 17

COCO Multi-Label Classification (MLC) (objects)

Internet, COCO

Caption: a woman is smiling in a kitchen near a pizza on a stove
Inferred Label: WOMAN

Objects
PIZZA    yes
ZEBRA    no
FRIDGE   yes
CAR      no
…

SLIDE 18

COCO Multi-Label Classification (MLC) (objects)

WOMAN
PIZZA    yes
ZEBRA    no
FRIDGE   yes
CAR      no
…

Model: Convolutional Neural Network, Regression, Conditional Random Field

SLIDE 19

Related Work

Implicit Bias

image search (Kay et al., 2015)
search advertising (Sweeney, 2013)
online news (Ross and Carter, 2011)
credit score (Hardt et al., 2016)

Classifier class imbalance

Barocas and Selbst, 2014; Dwork et al., 2012; Feldman et al., 2015; Zliobaite, 2015

word vector (Bolukbasi et al., 2016)

SLIDE 20

Outline

imSitu vSRL (events), COCO MLC (objects)
data, model, RBA

1. Background
2. Dataset Bias
3. Model Bias Amplification
4. Reducing Bias Amplification

SLIDE 21

Defining Dataset Bias (events)

Training Set

COOKING
ROLES   NOUNS
AGENT   woman
FOOD    stir-fry

COOKING
ROLES   NOUNS
AGENT   man
FOOD    noodle

Training Gender Ratio (verb), shown for cooking:

\[ \frac{\#(\text{cooking}, \text{man})}{\#(\text{cooking}, \text{man}) + \#(\text{cooking}, \text{woman})} = \frac{1}{3} \]
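
The gender ratio defined on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `gender_ratio` and the three-example `training_set` are assumptions chosen to reproduce the 1/3 shown on the slide, not data from imSitu.

```python
from collections import Counter

def gender_ratio(pairs, item):
    """Return #(item, man) / (#(item, man) + #(item, woman))."""
    counts = Counter(gender for (name, gender) in pairs if name == item)
    total = counts["man"] + counts["woman"]
    if total == 0:
        raise ValueError(f"no gendered examples for {item!r}")
    return counts["man"] / total

# Hypothetical training set: two cooking images with a woman, one with a man.
training_set = [
    ("cooking", "woman"),
    ("cooking", "woman"),
    ("cooking", "man"),
]

cooking_ratio = gender_ratio(training_set, "cooking")
print(cooking_ratio)  # one third of the cooking agents are men
```

A ratio of 0.5 would mean the item is balanced; values toward 0 or 1 indicate a skew toward woman or man respectively, following the definition above.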

SLIDE 22

Defining Dataset Bias (objects)

Training Set

WOMAN
snowboard      yes
refrigerator   no
bowl           no

MAN
snowboard      yes
refrigerator   no
bowl           no

Training Gender Ratio (noun), shown for snowboard:

\[ \frac{\#(\text{snowboard}, \text{man})}{\#(\text{snowboard}, \text{man}) + \#(\text{snowboard}, \text{woman})} = \frac{2}{3} \]

SLIDE 23

Gender Dataset Bias

[Figure: histograms of % of items (y-axis) by Gender Ratio (x-axis) for imSitu Verb and COCO Noun; regions marked Unbiased, Male bias, Female bias]

SLIDE 24

Gender Dataset Bias

[Figure: histograms of % of items (y-axis) by Gender Ratio (x-axis) for imSitu Verb and COCO Noun; regions marked Unbiased, Male bias, Female bias]

Example imSitu verbs marked on the plot: coaching, lecturing, repairing, shopping, braiding, washing, cooking

SLIDE 25

Gender Dataset Bias

[Figure: histograms of % of items (y-axis) by Gender Ratio (x-axis) for imSitu Verb and COCO Noun; regions marked Unbiased, Male bias, Female bias]

Example COCO nouns marked on the plot: surfboard, skateboard, fork, bed, refrigerator, ski

SLIDE 26

Gender Dataset Bias

[Figure: histograms of % of items (y-axis) by Gender Ratio (x-axis) for imSitu Verb and COCO Noun; regions marked Unbiased, Male bias, Female bias]

imSitu Verb: 64.6% bias
COCO Noun: 86.6% bias

SLIDE 27

Gender Dataset Bias

[Figure: histograms of % of items (y-axis) by Gender Ratio (x-axis) for imSitu Verb and COCO Noun; regions marked Unbiased, Male bias, Female bias]

imSitu Verb: 64.6% bias, 46.9% strong bias (>2:1)
COCO Noun: 86.6% bias, 37.9% strong bias (>2:1)

SLIDE 28

Outline

imSitu vSRL (events), COCO MLC (objects)
data, model, RBA

1. Background
2. Dataset Bias
3. Bias Amplification
4. Reducing Bias Amplification

SLIDE 29

Defining Bias Amplification (events)

What does the model predict on unseen data?

Development Set

COOKING
ROLES   NOUNS
AGENT   man
FOOD    noodle

COOKING
ROLES   NOUNS
AGENT   woman
FOOD    stir-fry

Predicted Gender Ratio (verb)

SLIDE 30

Defining Bias Amplification (events)

Development Set

COOKING
ROLES   NOUNS
AGENT   man
FOOD    noodle

COOKING
ROLES   NOUNS
AGENT   woman
FOOD    stir-fry

Predicted Gender Ratio (verb), shown for cooking, counted over model predictions:

\[ \frac{\#(\text{cooking}, \text{man})}{\#(\text{cooking}, \text{man}) + \#(\text{cooking}, \text{woman})} = \frac{1}{6} \]
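
Comparing the predicted ratio for cooking (1/6) with its training ratio (1/3) gives a per-item amplification. The sketch below is one plausible way to compute it, not necessarily the exact metric used in the talk: the function `bias_amplification` is a hypothetical helper, and treating a balanced item as having zero amplification is an assumption made here for simplicity.

```python
def bias_amplification(train_ratio, pred_ratio):
    """Signed shift of the predicted ratio in the direction of the training bias.

    Ratios are #(item, man) / (#(item, man) + #(item, woman)).
    Positive values mean the model is more skewed than the training data.
    """
    if train_ratio > 0.5:   # item leans toward "man" in training
        return pred_ratio - train_ratio
    if train_ratio < 0.5:   # item leans toward "woman" in training
        return train_ratio - pred_ratio
    return 0.0              # assumption: a balanced item has no bias direction

# Cooking: 1/3 of training agents are men, but only 1/6 of predicted agents.
cooking_amp = bias_amplification(1 / 3, 1 / 6)
print(round(cooking_amp, 3))
```

Here the model moves cooking a further 1/6 toward woman, so the existing skew is amplified rather than merely reproduced.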

SLIDE 31

Model Bias Amplification

[Figure: Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis) for imSitu Verb and COCO Noun; regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio]

SLIDE 32

Model Bias Amplification

[Figure: Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis) for imSitu Verb and COCO Noun; regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio; shaded Amplification Zone]

SLIDE 33

Model Bias Amplification

[Figure: Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis) for imSitu Verb and COCO Noun; regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio; shaded Amplification Zone]

Example verbs marked on the plot: washing, cooking, assembling, autographing

SLIDE 34

Model Bias Amplification

[Figure: Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis) for imSitu Verb and COCO Noun; regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio; shaded Amplification Zone]

imSitu Verb: 69% amplify bias, .05 bias
COCO Noun: 73% amplify bias, .04 bias

SLIDE 35

Model Bias Amplification

[Figure: Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis) for imSitu Verb and COCO Noun; regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio; shaded Amplification Zone]

imSitu Verb: 69% amplify bias, .05 bias; > 2:1 initial bias: .07 bias
COCO Noun: 73% amplify bias, .04 bias; > 2:1 initial bias: .08 bias

SLIDE 36

Summary

[Figure: Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis); regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio]

Can we remove gender bias amplification and still maintain performance?

SLIDE 37

Summary

[Figure: Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis); regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio]

Can we remove gender bias amplification and still maintain performance?

Performance Goal: as good as the original
Fairness Goal: not more biased than the data it was trained on

SLIDE 38

Outline

imSitu vSRL (events), COCO MLC (objects)
data, model, RBA

1. Background
2. Dataset Bias
3. Bias Amplification
4. Reducing Bias Amplification

SLIDE 39

Reducing Bias Amplification (RBA)

Dataset, Model, RBA

Corpus-level constraints on model output (ILP)
Reuse model inference through Lagrangian relaxation

★ Doesn’t require model retraining
★ Can be applied to any structured model

SLIDE 40

Reducing Bias Amplification (RBA)

Base model: CRF inference, written as an Integer Linear Program:

\[ \max_{\{y_i\}} \sum_i s(y_i, \text{image}) \]

SLIDE 41

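Reducing Bias Amplification (RBA)

Integer Linear Program:

\[ \max_{\{y_i\}} \sum_i s(y_i, \text{image}) \quad \text{s.t.} \quad \forall \text{ points}: \; |\text{Training Ratio} - \text{Predicted Ratio}| \le \text{margin} \]

The Predicted Ratio is a function of all outputs, f(y1 … yn).

[Figure: Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis); line marks Matched gender ratio with a Margin band; points marked Violating margin or Within margin]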
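
The margin constraint can be checked directly once training and predicted ratios are known. The following is a minimal sketch under stated assumptions: the helper names, the margin value of 0.05, and the predicted ratio for snowboard are illustrative and do not come from the slides; the cooking and snowboard training ratios reuse the 1/3 and 2/3 examples from earlier slides.

```python
def violates_margin(train_ratio, pred_ratio, margin):
    """True if the predicted ratio drifts more than `margin` from training."""
    return abs(train_ratio - pred_ratio) > margin

def violation_rate(train, pred, margin):
    """Return (fraction of violating items, sorted list of violating items)."""
    violating = sorted(
        item for item in train
        if violates_margin(train[item], pred[item], margin)
    )
    return len(violating) / len(train), violating

train = {"cooking": 1 / 3, "snowboard": 2 / 3}
pred = {"cooking": 1 / 6, "snowboard": 0.70}   # snowboard value is hypothetical

rate, violating = violation_rate(train, pred, margin=0.05)
print(rate, violating)
```

In this toy setting cooking falls outside the margin while snowboard stays within it, which corresponds to the "Violating margin" and "Within margin" points in the figure.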

SLIDE 43

Reducing Bias Amplification (RBA)

Integer Linear Program, solved with Lagrangian Relaxation:

Inference:
\[ \max_{\{y_i\}} \sum_i s(y_i, \text{image}) \]

Constraints:
\[ \forall \text{ points}: \; |\text{Training Ratio} - \text{Predicted Ratio}| \le \text{margin} \]

The Predicted Ratio is a function of all outputs, f(y1 … yn).

Sontag et al., 2011; Rush and Collins, 2012; Chang and Collins, 2011; Peng et al., 2015; Chang et al., 2013; Dalvi, 2015

SLIDE 44

Lagrangian Relaxation

COOKING
ROLES   NOUNS
AGENT   woman
FOOD    pancake

COOKING
ROLES   NOUNS
AGENT   woman
FOOD    vegetable

\[ \max_{\{y_i\}} \sum_i s(y_i, \text{image}) \quad \text{s.t.} \quad |\text{Training Ratio} - \text{Predicted Ratio}| \le \text{margin} \]

(1/2)

Sontag et al., 2011; Rush and Collins, 2012; Chang and Collins, 2011; Peng et al., 2015; Chang et al., 2013; Dalvi, 2015

SLIDE 45

Lagrangian Relaxation

Lagrange Multiplier (𝝁) Per Constraint

Loop: inference → update 𝝁 → update potentials → inference …

COOKING
ROLES   NOUNS
AGENT   woman
FOOD    pancake

COOKING
ROLES   NOUNS
AGENT   woman
FOOD    vegetable

\[ \max_{\{y_i\}} \sum_i s(y_i, \text{image}) \quad \text{s.t.} \quad |\text{Training Ratio} - \text{Predicted Ratio}| \le \text{margin} \]

(1/2)

Sontag et al., 2011; Rush and Collins, 2012; Chang and Collins, 2011; Peng et al., 2015; Chang et al., 2013; Dalvi, 2015
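
The inference, update 𝝁, update potentials loop can be illustrated with a deliberately small Python sketch. It is a simplification under several assumptions, not the authors' implementation: there is a single verb, each image only chooses between "man" and "woman", the scores and step size are made up, and the two multipliers `mu_low` and `mu_high` stand for the lower and upper side of one margin constraint. The real system would call the CRF's own inference instead of the per-image comparison used here.

```python
def rba_inference(scores, train_ratio, margin, step=0.1, max_iters=100):
    """Toy Lagrangian relaxation for one verb with a corpus-level ratio constraint.

    scores: list of dicts with base-model scores for "man" and "woman".
    Constraint: train_ratio - margin <= (#man / n) <= train_ratio + margin.
    """
    n = len(scores)
    mu_low = mu_high = 0.0          # one multiplier per side of the constraint
    preds = []
    for _ in range(max_iters):
        # Update potentials: the multipliers shift the score of "man".
        shift = mu_low - mu_high
        # Inference: each image is decoded independently under shifted scores.
        preds = ["man" if s["man"] + shift > s["woman"] else "woman" for s in scores]
        count_man = preds.count("man")
        g_low = count_man - (train_ratio - margin) * n    # >= 0 when satisfied
        g_high = (train_ratio + margin) * n - count_man   # >= 0 when satisfied
        if g_low >= 0 and g_high >= 0:
            break                                         # all constraints hold
        # Update mu: projected subgradient step, multipliers stay non-negative.
        mu_low = max(0.0, mu_low - step * g_low)
        mu_high = max(0.0, mu_high - step * g_high)
    return preds

# Hypothetical scores: the base model prefers "woman" for both cooking images.
scores = [
    {"man": 1.0, "woman": 2.0},   # e.g. the pancake image
    {"man": 1.0, "woman": 1.2},   # e.g. the vegetable image
]
result = rba_inference(scores, train_ratio=0.5, margin=0.05)
print(result)
```

With a training ratio of 1/2, the multiplier grows until the less confident image flips to "man", while the base model itself is never retrained; only its scores are adjusted at inference time.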

SLIDE 49

Lagrangian Relaxation

Lagrange Multiplier (𝝁) Per Constraint

Loop: inference → update 𝝁 → update potentials → inference …

COOKING
ROLES   NOUNS
AGENT   woman
FOOD    pancake

COOKING
ROLES   NOUNS
AGENT   man
FOOD    vegetable

\[ \max_{\{y_i\}} \sum_i s(y_i, \text{image}) \quad \text{s.t.} \quad |\text{Training Ratio} - \text{Predicted Ratio}| \le \text{margin} \]

(1/2)

Sontag et al., 2011; Rush and Collins, 2012; Chang and Collins, 2011; Peng et al., 2015; Chang et al., 2013; Dalvi, 2015

SLIDE 51

Gender Bias De-amplification in imSitu

[Figure: Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis); regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio with a Margin band; points marked Violating margin or Within margin]

imSitu Verb: Violation: 72.6%, .050 bias, 24.07 acc.

SLIDE 52

Gender Bias De-amplification in imSitu

[Figure: two panels of Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis); regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio with a Margin band; points marked Violating margin or Within margin]

imSitu Verb: Violation: 72.6%, .050 bias, 24.07 acc.
imSitu Verb w/ RBA: Violation: 50.5%, .024 bias, 23.97 acc.
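
As a quick arithmetic check on the imSitu numbers above, the relative change can be computed directly. The helper `relative_reduction` is a hypothetical name introduced here; the inputs are the figures reported on this slide.

```python
def relative_reduction(before, after):
    """Fraction by which a quantity shrinks going from `before` to `after`."""
    return (before - after) / before

# imSitu Verb figures from the slide, without and with RBA.
bias_reduction = relative_reduction(0.050, 0.024)       # amplified bias
violation_reduction = relative_reduction(72.6, 50.5)    # % of verbs violating margin
accuracy_drop = 24.07 - 23.97                           # absolute accuracy points

print(f"bias amplification reduced by {bias_reduction:.0%}")
print(f"margin violations reduced by {violation_reduction:.0%}")
print(f"accuracy drop: {accuracy_drop:.2f} points")
```

On these numbers the amplified bias is roughly halved while accuracy changes by about a tenth of a point, which is the trade-off the slide is pointing at.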

SLIDE 53

Gender Bias De-amplification in COCO

[Figure: Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis); regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio with a Margin band; points marked Violating margin or Within margin]

COCO Noun: Violation: 60.6%, .032 bias, 45.27 mAP

SLIDE 54

Gender Bias De-amplification in COCO

[Figure: two panels of Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis); regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio with a Margin band; points marked Violating margin or Within margin]

COCO Noun: Violation: 60.6%, .032 bias, 45.27 mAP
COCO Noun w/ RBA: Violation: 36.4%, .022 bias, 45.19 mAP

SLIDE 55

Gender Bias De-amplification in COCO

[Figure: two panels of Predicted Gender Ratio (y-axis) vs. Gender Ratio (x-axis); regions marked Unbiased, Male bias, Female bias; line marks Matched gender ratio with a Margin band; points marked Violating margin or Within margin]

COCO Noun: Violation: 60.6%, .032 bias, 45.27 mAP
COCO Noun w/ RBA: Violation: 36.4%, .022 bias, 45.19 mAP

Performance Goal: as good as the original
Fairness Goal: not more biased than the data it was trained on

SLIDE 56

Contributions

imSitu vSRL (events), COCO MLC (objects)
data, model, RBA

High dataset gender bias
47% (events) and 38% (objects) exhibit strong bias

Models amplify existing gender bias
~70% of objects and events have bias amplification

Reducing bias amplification
~50% reduction in amplification
Insignificant loss in performance

SLIDE 57

Future Work

data, model, RBA

Can existing data be made more balanced?
Do all models amplify equally? e.g. different objectives
Other direct applications? e.g. co-ref, racial bias