[PPT] - The Analysis of Faces in Brains and Machines Rafael PowerPoint Presentation

SLIDE 1

The Analysis of Faces in Brains and Machines

9.523 Aspects of a Computational Theory of Intelligence
Rafael Reif
stay tuned...

SLIDE 2

Why is face analysis important for intelligence?

Remember/recognize people we’ve seen before

Categorization – e.g. gender, race, age, kinship

Social communication – emotions/mood, intentions, trustworthiness, competence or intelligence, attractiveness

Scene understanding, e.g. direction of gaze suggests focus of attention

SLIDE 3

Why is face recognition hard?

changing pose
changing illumination
changing expression
clutter
occlusion
aging

SLIDE 4

How good are we at face recognition?

Jenkins, White, Van Montfort & Burton, Cognition, 2011

SLIDE 5

Face recognition performance in humans

[Figure: human face recognition scores, with chance performance marked; data from testmybrain.org]

Wilmer et al., 2012
Duchaine & Nakayama, 2006

SLIDE 6

Face recognition performance in humans

Bruce et al., 1999

Which of the 10 photos on the bottom depicts the target face?

Viewers are ~ 70% correct
Performance degrades with changes in pose, expression

Only slight improvement with short video clip of target

Importance of familiar vs. unfamiliar face recognition!

SLIDE 7

How good are the best machines?

Public databases of face images serve as benchmarks:

Labeled Faces in the Wild (LFW, http://vis-www.cs.umass.edu/lfw)
  > 13,000 images of celebrities, 5,749 different identities

YouTube Faces Database (YTF, http://www.cs.tau.ac.il/~wolf/ytfaces)
  3,425 videos, 1,595 different identities

Private face image datasets:

(Facebook) Social Face Classification dataset
  4.4 million face photos, 4,030 different identities

(Google) 100-200 million face images, ~ 8 million different identities

                      LFW      YTF
Facebook DeepFace     97.4%    91.4%
Google FaceNet        99.6%    95.1%
Human performance     97.5%    89.7%

SLIDE 8

Machine vision applications of face recognition

surveillance
access control
security, forensics

SLIDE 9

More applications of face recognition

content-based image retrieval
social media
graphics, HCI
humanoid robots

SLIDE 10

Aspects of face processing

Face detection – find image regions that contain faces
Face identification – who is the person?
Categorization – gender, age, race
Facial expression – mood, emotion
Non-verbal social perception and communication

SLIDE 11

It all began with Takeo Kanade (1973)…

PhD thesis, Picture Processing System by Computer Complex and Recognition of Human Faces

Special purpose algorithms to locate eyes, nose, mouth, boundaries of face

~ 40 geometric features, e.g. ratios of distances and angles between features

SLIDE 12

Eigenfaces for recognition (Turk & Pentland)

Principal Components Analysis (PCA)

Goal: reduce the dimensionality of the data while retaining as much information as possible in the original dataset

PCA allows us to compute a linear transformation that maps data from a high dimensional space to a lower dimensional subspace
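
The PCA step can be sketched as follows. This is a minimal illustration, assuming NumPy and synthetic data; the function name `pca_project`, the array shapes, and the use of an SVD to obtain the principal directions are choices made for this example and are not taken from the slides.

```python
import numpy as np

def pca_project(data, k):
    """Project rows of `data` (n_samples x n_dims) onto the top-k principal components."""
    mean = data.mean(axis=0)
    centered = data - mean
    # SVD of the centered data: rows of vt are the principal directions,
    # ordered by decreasing singular value (i.e. decreasing variance).
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]                    # k x n_dims linear transformation
    projected = centered @ components.T    # n_samples x k low-dimensional coordinates
    explained = (s[:k] ** 2).sum() / (s ** 2).sum()  # fraction of variance retained
    return mean, components, projected, explained

rng = np.random.default_rng(0)
# Synthetic data (an assumption): 200 points in 50 dimensions that mostly vary along 3 directions.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

mean, components, projected, explained = pca_project(data, k=3)
print(projected.shape, explained)
```

Here three components retain nearly all of the variance because the synthetic data were built from three latent directions; real face images would typically need more.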

SLIDE 13

Typical sample training set…

One or more images per person

Aligned & cropped to common pose, size

Simple background

Sample images from the Yale face database, results from C. deCoro
http://www.cs.princeton.edu/~cdecoro/eigenfaces/

SLIDE 14

Eigenfaces for recognition (Turk & Pentland)

Perform PCA on a large set of training images, to create a set of eigenfaces, $E_i(x,y)$, that span the data set

First components capture most of the variation across the data set, later components capture subtle variations

Each face image $F(x,y)$ can be expressed as a weighted combination of the eigenfaces $E_i(x,y)$:

$F(x,y) = \Psi(x,y) + \sum_i w_i E_i(x,y)$

$\Psi(x,y)$: average face (across all faces)

http://vismod.media.mit.edu/vismod/demos/facerec/basic.html

SLIDE 15

Representing individual faces

Each face image $F(x,y)$ can be expressed as a weighted combination of the eigenfaces $E_i(x,y)$:

$F(x,y) = \Psi(x,y) + \sum_i w_i E_i(x,y)$

Recognition process:

(1) Compute weights $w_i$ for novel face image
(2) Find image $m$ in face database with most similar weights, e.g.

$\min_m \sum_{i=1}^{k} (w_i - w_i^m)^2$
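
The two-step recognition process can be sketched in a few lines. This is a hedged illustration, assuming NumPy, flattened and aligned images, and random arrays standing in for real face photos; the helper names (`fit_eigenfaces`, `face_weights`, `recognize`) are hypothetical and not from Turk & Pentland or the slides.

```python
import numpy as np

def fit_eigenfaces(faces, k):
    """faces: (n_faces, n_pixels) array of flattened, aligned images."""
    psi = faces.mean(axis=0)                       # average face Psi
    _, _, vt = np.linalg.svd(faces - psi, full_matrices=False)
    return psi, vt[:k]                             # eigenfaces E_1..E_k as orthonormal rows

def face_weights(face, psi, eigenfaces):
    """Weights w_i such that face is approximately Psi + sum_i w_i E_i."""
    return eigenfaces @ (face - psi)

def recognize(face, psi, eigenfaces, gallery_weights):
    """Index m of the gallery face minimizing sum_i (w_i - w_i^m)^2."""
    w = face_weights(face, psi, eigenfaces)
    dists = ((gallery_weights - w) ** 2).sum(axis=1)
    return int(np.argmin(dists))

rng = np.random.default_rng(1)
gallery = rng.random((20, 32 * 32))                # stand-in for 20 aligned 32x32 faces
psi, eigenfaces = fit_eigenfaces(gallery, k=10)
gallery_weights = (gallery - psi) @ eigenfaces.T   # one row of k weights per gallery face

# A slightly noisy copy of gallery face 7 plays the role of the novel image.
probe = gallery[7] + 0.01 * rng.normal(size=gallery.shape[1])
match = recognize(probe, psi, eigenfaces, gallery_weights)
print(match)
```

Because the eigenfaces are orthonormal, each weight is just the dot product of the mean-subtracted image with one eigenface, so step (1) costs k dot products per image.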

SLIDE 16

Changing expressions & lighting

Eigenfaces approach handles changes in facial expression ok…

… but not changes in lighting

(results from C. deCoro)

SLIDE 17

Face detection: Viola & Jones

Multiple view-based classifiers based on simple features that best discriminate faces vs. non-faces

Most discriminating features learned from thousands of samples of face and non-face image windows

Attentional mechanism: cascade of increasingly discriminating classifiers improves performance

SLIDE 18

Viola & Jones use simple features

Use simple rectangle features:
  Σ I(x,y) in gray area – Σ I(x,y) in white area
within 24 x 24 image sub-windows

Initially consider 160,000 potential features per sub-window!
Features computed very efficiently

Which features best distinguish face vs. non-face?

Learn most discriminating features from thousands of samples of face and non-face image windows
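
One way rectangle features can be computed cheaply is the integral image (summed-area table) described in Viola & Jones' paper, which the slide does not name explicitly. The sketch below assumes NumPy and a random 24 x 24 window; the function names and the particular two-rectangle layout are illustrative assumptions.

```python
import numpy as np

def integral_image(img):
    """Summed-area table padded with a leading row and column of zeros."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Sum over the left (gray) half minus sum over the right (white) half."""
    half = width // 2
    gray = rect_sum(ii, top, left, height, half)
    white = rect_sum(ii, top, left + half, height, half)
    return gray - white

rng = np.random.default_rng(2)
window = rng.integers(0, 256, size=(24, 24))   # stand-in for a 24 x 24 sub-window
ii = integral_image(window)
value = two_rect_feature(ii, top=4, left=6, height=8, width=12)
print(value)
```

After the table is built once per image, every rectangle sum costs four lookups regardless of the rectangle's size, which is what makes evaluating a very large feature pool feasible.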

SLIDE 19

Learning the best features

AdaBoost

x = image window
f = feature
p = +1 or -1
θ = threshold

weak classifier using one feature: $h(x, f, p, \theta) = 1$ if $p\,f(x) < p\,\theta$, and $0$ otherwise

n training samples, equal weights, known classes: (x1,w1,1) … (xn,wn,0)
normalize weights
find next best weak classifier
use classification errors to update weights

τ

final classifier

~ 200 features yields good results for “monolithic” classifier
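
The boosting loop on this slide can be sketched as follows. This is a simplified illustration, assuming NumPy, precomputed feature values on synthetic data, and the weight update from Viola & Jones' published formulation (correctly classified samples are scaled by β = ε / (1 − ε)); the exhaustive threshold search and all function names are assumptions made for brevity, not the authors' implementation.

```python
import numpy as np

def best_stump(feature_values, labels, weights):
    """Search all (feature, threshold, polarity) stumps for the lowest weighted error.

    A stump predicts 1 when p * f(x) < p * theta, else 0.
    """
    best = (np.inf, 0, 0.0, 1)
    for j in range(feature_values.shape[1]):
        f = feature_values[:, j]
        for theta in np.unique(f):
            for p in (1, -1):
                pred = (p * f < p * theta).astype(int)
                err = weights[pred != labels].sum()
                if err < best[0]:
                    best = (err, j, theta, p)
    return best

def adaboost(feature_values, labels, rounds):
    n = len(labels)
    weights = np.full(n, 1.0 / n)                  # equal initial weights
    stumps = []
    for _ in range(rounds):
        weights = weights / weights.sum()          # normalize weights
        err, j, theta, p = best_stump(feature_values, labels, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)      # guard against division by zero
        beta = err / (1.0 - err)
        pred = (p * feature_values[:, j] < p * theta).astype(int)
        # Shrink the weights of correctly classified samples so that
        # later rounds concentrate on the remaining errors.
        weights = weights * np.where(pred == labels, beta, 1.0)
        stumps.append((j, theta, p, np.log(1.0 / beta)))
    return stumps

def predict(stumps, feature_values):
    """Final classifier: weighted vote of the selected stumps."""
    votes = sum(a * (p * feature_values[:, j] < p * t) for j, t, p, a in stumps)
    total = sum(a for _, _, _, a in stumps)
    return (votes >= 0.5 * total).astype(int)

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))                      # 200 samples, 5 hypothetical feature values each
y = (X[:, 0] + X[:, 1] > 0).astype(int)            # synthetic "face" (1) vs. "non-face" (0) labels
stumps = adaboost(X, y, rounds=30)
accuracy = (predict(stumps, X) == y).mean()
print(accuracy)
```

Each round selects one feature, so the number of rounds equals the number of features in the final classifier.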

SLIDE 20

“Attentional cascade” of increasingly discriminating classifiers

Early classifiers use a few highly discriminating features, low threshold

1st classifier uses two features, removes 50% non-face windows

later classifiers distinguish harder examples

Increases efficiency
Allows use of many more features

→ Cascade of 38 classifiers, using ~6000 features
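
The early-rejection logic of such a cascade can be sketched as below. The three stages are toy scoring functions invented for this example (they are not boosted classifiers and not Viola & Jones' features), and the thresholds and test windows are likewise assumptions; only the control flow is the point.

```python
import numpy as np

def run_cascade(stages, window):
    """Return (is_face, stages_evaluated) for one image window.

    `stages` is a list of (score_fn, threshold) pairs ordered from cheapest to
    most expensive; a window is rejected as soon as one stage scores below
    its threshold, so most non-face windows never reach the later stages.
    """
    for n, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, n
    return True, len(stages)

# Toy stand-ins for the boosted stage classifiers (purely illustrative).
stages = [
    (lambda w: w.mean(), 0.3),                              # cheap first test, low threshold
    (lambda w: w[:12].mean() - w[12:].mean() + 0.5, 0.4),   # compares top and bottom halves
    (lambda w: w.std(), 0.05),                              # requires some texture
]

dark = np.full((24, 24), 0.1)                               # fails the first stage
flat = np.full((24, 24), 0.6)                               # passes two stages, fails the third
checker = 0.2 + 0.8 * (np.indices((24, 24)).sum(axis=0) % 2)  # passes all stages

results = [run_cascade(stages, w) for w in (dark, flat, checker)]
print(results)
```

The efficiency gain comes from the ordering: the cheapest stage sees every window, while the expensive stages only see the few windows that survive.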

SLIDE 21

Training with normalized faces

5000 faces
many more non-face patches

faces are normalized for scale, rotation

small variation in pose

SLIDE 22

Viola & Jones results

With additional diagonal features, classifiers were created to handle image rotations and profile views

SLIDE 23

Feature based vs. holistic processing

Face Inversion Effect

inversion disrupts recognition of faces more than other objects

prosopagnosics do not show inversion effect

Composite Face Effect

identical top halves seen as different when aligned with different bottom halves

when misaligned, top halves perceived as identical

SLIDE 24

Feature based vs. holistic processing

Whole-Part Effect

Identification of the “studied” face is significantly better in the whole vs. part condition

Test conditions

Which features are more diagnostic?

Eyebrows are important!

SLIDE 25

View generalization mediated by motion?

Hypothesis: Temporal association is used to link multiple views of a person’s face

12 female faces scanned for 3D shape and visual texture
image sequences were created that morph between two different faces
observers viewed morph sequences, back and forth
same or different person? (shown separated in time)
performance within morph groups was compromised by temporal association

✔

Wallis & Bulthoff, PNAS, 2001

SLIDE 26

The power of averages (Burton & colleagues)

Improves accuracy in the recognition of famous faces

- PCA
- commercial system
- human experiments

average “texture”
average “shape”

SLIDE 27


Faces everywhere...
