SLIDE 1

Real-World Facial Expression Recognition (真实世界人脸表情识别)

Weihong Deng (邓伟洪)

Beijing University of Posts and Telecommunications http://whdeng.cn/Emotion/projects.html

SLIDE 2

Outline

01 Introduction & Background
02 Facial Expression Databases
03 Our Works
04 Latest Survey

SLIDE 3

Evolution Creates Facial Expressions

  • Charles Darwin theorized that emotional expression evolved by natural selection.
  • Important for survival: the fear expression directly lets our eyes absorb more light and our lungs take in more air.
  • Improves group fitness: surprise indicates that something new has happened; sadness is a signal to the group that something is wrong.
  • Humans share similar facial muscles.

SLIDE 4

Basic Emotions are Universal

  • Paul Ekman designed the widely acknowledged Facial Action Coding System (FACS).
  • Ekman claimed that basic emotional expressions are in fact universal across cultures, enacted by similar muscle groups.
  • In the 1960s, Ekman identified six core expressions: happiness, fear, surprise, disgust, sadness, anger.

[Figure: Paul Ekman; the common muscle groups]

SLIDE 5

Outline

01 Introduction & Background
02 Facial Expression Databases
03 Our Works
04 Latest Survey

SLIDE 6

Prototype Databases

[Timeline, 2007-2017: JAFFE, Multi-PIE, TFD, MMI, CK+, Oulu-CASIA, ...]

  • Previously widely used facial expression datasets are lab-controlled and small-scale:
    MMI: 2,900 videos, 75 subjects
    JAFFE: 213 images, 10 females
    CK+: 596 videos, 123 subjects
    Oulu-CASIA: 2,880 videos, 80 subjects

SLIDE 7

[Figure: spectrum from posed to spontaneous to micro-expressions]

Acted Facial Expression In The Wild (AFEW)
  • 1,809 videos from movies and TV shows
  • 7 basic facial expressions
  • Three annotators
  • More than 330 subjects, ages 1-77 years

Facial Expression Recognition 2013 (FER-2013)
  • 35,887 images from the Internet
  • 48x48 pixels, grayscale
  • 184 emotion-related keywords
  • 7 basic facial expressions

Micro-Expression Datasets (suppressed emotions, difficult to observe)
  • SMIC
  • CASME, CASME II, CAS(ME)2
  • MEVIEW (Micro-Expressions VIdEos in the Wild)

SLIDE 8

[Figure: spectrum from lab-controlled to movies to in-the-wild]

EmotioNet
  • 1,000,000 images from the Internet
  • 457 concepts of emotion-related keywords
  • 23 basic and compound emotion categories
  • Action Units

AffectNet
  • 1,000,000 images from the Internet
  • 1,250 emotion-related keywords
  • 8 basic emotion categories
  • Valence and arousal
SLIDE 9

Advanced Databases

[Timeline, 2007-2017: JAFFE, Multi-PIE, TFD, MMI, CK+, Oulu-CASIA, FER2013, EmotiW, EmotioNet, RAF-DB, AffectNet, ...]

  • Datasets collected from the real world are more diverse and naturalistic, and most of them contain large-scale samples.
  • Facial expression datasets in-the-wild:
    FER2013: https://github.com/npinto/fer2013
    SFEW, AFEW: https://cs.anu.edu.au/few/
    EmotioNet: http://cbcsl.ece.ohio-state.edu/dbform_emotionet.html
    AffectNet: http://mohammadmahoor.com/affectnet/
    RAF-DB, RAF-ML: http://whdeng.cn/Emotion/projects.html
    Aff-Wild: https://ibug.doc.ic.ac.uk/resources/first-affect-wild-challenge/
    ExpW: http://mmlab.ie.cuhk.edu.hk/projects/socialrelation/index.html

Li, S., & Deng, W. Deep facial expression recognition: A survey. CoRR abs/1804.08348 (2018).

SLIDE 10

Contents

Datasets, from basic to complex: lab-controlled / posed → movies → in-the-wild / spontaneous → micro-expressions

  • Expression Labeling
  • Dataset Bias
  • Latest Survey
  • Discussions

SLIDE 11

Outline

01 Introduction & Background
02 Facial Expression Databases
03 Our Works
04 Latest Survey

SLIDE 12

Two Annotation Challenges

  • Crowdsourcing: 315 volunteers online; each image labeled 40 times; 1,200,000 labels in total.
  • Learning from labels: annotations cover basic, compound, and blended emotions, so each image carries a probability distribution over expression categories.

[Figure: example probability distribution over the expression categories]

SLIDE 13

Real-world Affective Face Database (RAF-DB): Collection

1. Image Collection
  • Source: Flickr (image social network), queried through its API:
    https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key={}&text={}&tags={}&per_page={}&page={}&sort=relevance
  • Emotion-related keywords such as 'smile', 'crying', 'OMG', ...
  • XML response → interpreted into URLs of the images → download
  • 60,000 images downloaded; ~30,000 images retained
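For concreteness, a minimal Python sketch of this collection step follows, assuming a valid Flickr API key. The endpoint and query parameters are the ones listed above; the keyword list, paging value, and the static-photo URL pattern built from the XML attributes are illustrative assumptions, not the exact crawler used for RAF-DB.

```python
# Hedged sketch of the RAF-DB-style collection step, not the original crawler.
import requests
import xml.etree.ElementTree as ET

API_KEY = "YOUR_FLICKR_API_KEY"          # placeholder, assumption
KEYWORDS = ["smile", "crying", "OMG"]    # emotion-related keywords from the slide

def search_photo_urls(keyword, per_page=500, page=1):
    """Query flickr.photos.search and parse the XML response into image URLs."""
    resp = requests.get(
        "https://api.flickr.com/services/rest/",
        params={
            "method": "flickr.photos.search",
            "api_key": API_KEY,
            "text": keyword,
            "tags": keyword,
            "per_page": per_page,
            "page": page,
            "sort": "relevance",
        },
    )
    root = ET.fromstring(resp.content)
    # Build standard Flickr static-photo URLs from the <photo> attributes.
    return [
        f"https://live.staticflickr.com/{p.get('server')}/{p.get('id')}_{p.get('secret')}.jpg"
        for p in root.iter("photo")
    ]

for kw in KEYWORDS:
    for url in search_photo_urls(kw):
        image_bytes = requests.get(url).content  # download step
        # ... save image_bytes to disk for the annotation stage
```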

S. Li, W. Deng, and J. Du, "Reliable crowdsourcing and deep locality preserving learning for expression recognition in the wild," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2584-2593.

SLIDE 14

Real-world Affective Face Database (RAF-DB): Annotation

2. Image Annotation
  • Crowdsourcing: 315 well-trained annotators (volunteers online) were asked to label each facial image with one of the seven basic categories.
  • Each image was annotated independently a sufficient number of times, i.e., around 40 times in our experiment, yielding 1,200,000 labels in total to learn from.

SLIDE 15

Real-world Affective Face Database (RAF-DB): Reliability Estimation

3. Reliability Estimation
  • Filter out noisy annotators and unreliable labels.
  • An Expectation Maximization (EM) framework was used to iteratively optimize and assess each labeler's reliability, converging to an optimal reliability estimate.
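To make the idea concrete, here is a simplified Dawid-Skene-style EM sketch in Python, not the paper's exact formulation: each annotator is modeled by a single scalar reliability, and `labels[i]` is a hypothetical dict mapping annotator id to the label (0-6) given to image i.

```python
# Simplified EM for annotator reliability (Dawid-Skene flavor); an
# illustration of the slide's idea, not the exact model of the CVPR'17 paper.
import numpy as np

def em_reliability(labels, n_classes=7, n_iter=20):
    """labels: list of dicts, one per image, {annotator_id: label in 0..6}."""
    annotators = sorted({a for img in labels for a in img})
    rel = {a: 0.8 for a in annotators}              # initial reliability guess
    posteriors = []
    for _ in range(n_iter):
        # E-step: posterior over each image's true label given reliabilities.
        posteriors = []
        for img in labels:
            logp = np.zeros(n_classes)
            for a, y in img.items():
                for c in range(n_classes):
                    p = rel[a] if y == c else (1.0 - rel[a]) / (n_classes - 1)
                    logp[c] += np.log(p + 1e-12)
            p = np.exp(logp - logp.max())
            posteriors.append(p / p.sum())
        # M-step: reliability = expected fraction of an annotator's correct votes.
        for a in annotators:
            num = den = 0.0
            for img, p in zip(labels, posteriors):
                if a in img:
                    num += p[img[a]]
                    den += 1.0
            rel[a] = num / max(den, 1.0)
    return rel, posteriors
```

Labels from annotators whose estimated reliability falls below a threshold would then be discarded, matching the "filter out unreliable labels" step above.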

SLIDE 16

Real-world Affective Face Database (RAF-DB): Database Statistics

  • 29,672 real-world images;
  • a 7-dimensional expression distribution vector for each image;
  • two different subsets: a single-label subset, including 7 classes of basic emotions, and a two-tab subset, including 12 classes of compound emotions;
  • 5 accurate landmark locations, 37 automatic landmark locations, and race, age-range, and gender attribute annotations per image.
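As a toy illustration of how the ~40 crowdsourced votes per image become the 7-dimensional expression distribution vector mentioned above (the class order and vote counts here are made-up assumptions):

```python
# Illustrative only: turning per-image crowdsourced votes into the
# 7-dimensional expression distribution vector. Class order is an assumption.
import numpy as np

CLASSES = ["surprise", "fear", "disgust", "happiness", "sadness", "anger", "neutral"]
votes = ["happiness"] * 29 + ["surprise"] * 8 + ["neutral"] * 3   # 40 labels
counts = np.array([votes.count(c) for c in CLASSES], dtype=float)
distribution = counts / counts.sum()   # -> [0.2, 0, 0, 0.725, 0, 0, 0.075]
```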

SLIDE 17

Compound Emotions

Background:
1. "Nonverbal communication", M. Anderson, 1987.
2. "Facial expression and emotion", P. Ekman, 1993.
3. "Compound facial expressions of emotion", Martinez et al., PNAS 2014.

S. Du, Y. Tao, and A. M. Martinez, "Compound facial expressions of emotion," Proceedings of the National Academy of Sciences, vol. 111, no. 15, pp. E1454-E1462, 2014:

"While past research had identified facial expressions associated with a single internally felt category (e.g., the facial expression of happiness when we feel joyful), we have recently studied facial expressions observed when people experience compound emotions (e.g., the facial expression of happy surprise when we feel joyful in a surprised way, as, for example, at a surprise birthday party)."

SLIDE 18

Real-world Affective Face Database (RAF-DB)

SLIDE 19

Action Units: RAF-DB is more diverse

[Figure: Action Unit combinations for surprise, fear, joy, anger, disgust, and sadness in CK+ vs. RAF-DB]

S. Li and W. Deng, "Reliable crowdsourcing and deep locality preserving learning for unconstrained facial expression recognition," IEEE Transactions on Image Processing, vol. 28, no. 1, pp. 356-370, Jan 2019.

SLIDE 20

DLP-CNN: Deep Locality-preserving CNN

Architecture: input face images → C R P C R P C R C R P C R C R F R F → separable features, trained jointly with the softmax loss and the locality-preserving loss (weighted by λ) to yield discriminative features.
(C: convolution layer; P: max-pooling layer; R: ReLU layer; F: fully connected layer)

Our goal:

$$\min \sum_{j,k} T_{jk}\,\lVert x_j - x_k \rVert_2^2, \qquad T_{jk} = \begin{cases} 1, & x_k \text{ is among the } k \text{ nearest neighbors of } x_j \text{, or } x_j \text{ is among the } k \text{ nearest neighbors of } x_k \\ 0, & \text{otherwise} \end{cases}$$

Locality-Preserving Loss:

$$\mathcal{L}_{LP} = \frac{1}{2n} \sum_{i=1}^{n} \Bigl\lVert x_i - \frac{1}{k} \sum_{x \in N_k(x_i)} x \Bigr\rVert_2^2$$

where $x_i$ denotes the deep feature of the $i$-th sample and $N_k(x_i)$ its $k$-nearest-neighbor set.
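A hedged PyTorch sketch of the locality-preserving loss as reconstructed above. Computing the k nearest neighbors inside the mini-batch and restricting them to same-class samples are simplifying assumptions of this sketch (and the batch is assumed to contain more than k examples per class); they are not necessarily the paper's exact procedure.

```python
# Sketch of L_LP = 1/(2n) * sum_i ||x_i - c_i||^2, where c_i is the centroid
# of x_i's k nearest (same-class) neighbors. Batch-level kNN is an assumption.
import torch

def lp_loss(features, labels, k=5):
    """features: (n, d) deep features; labels: (n,) expression labels."""
    dist = torch.cdist(features, features)              # pairwise L2 distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)   # same-class mask
    dist = dist.masked_fill(~same, float("inf"))        # restrict kNN to same class
    dist.fill_diagonal_(float("inf"))                   # exclude self-matches
    knn = dist.topk(k, largest=False).indices           # (n, k) neighbor indices
    centroids = features[knn].mean(dim=1)               # (n, d) neighbor centroids
    return 0.5 * ((features - centroids) ** 2).sum(dim=1).mean()
```

Per the architecture above, the total training objective would combine this with the classification term as softmax_loss + λ * lp_loss(features, labels).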

SLIDE 21

DLP-CNN: Deep Locality-preserving CNN

S. Li and W. Deng, "Reliable crowdsourcing and deep locality preserving learning for unconstrained facial expression recognition," IEEE Transactions on Image Processing, vol. 28, no. 1, pp. 356-370, Jan 2019.

SLIDE 22

DLP-CNN: Experiment Results

Table 1. Expression recognition performance of different DCNNs on RAF-DB. The metric is the mean diagonal value of the confusion matrix.

[Table not reproduced; the compared DCNNs are those of [6], [7], and [8].]

6. Simonyan & Zisserman, arXiv:1409.1556 (2014).
7. Krizhevsky et al., NIPS, pp. 1097-1105 (2012).
8. Wen et al., ECCV, pp. 499-515 (2016).

SLIDE 23

DLP-CNN: Experiment Results

Table 2. Comparison of DLP-CNN with other state-of-the-art methods on the CK+, SFEW, and MMI databases. To validate the generalization of our model, the well-trained DLP-CNN was employed as a feature extraction tool without fine-tuning.

[Table not reproduced; the compared methods are [9]-[22].]

9. Zhong et al., CVPR, pp. 2562-2569 (2012).
10. Lv et al., SMARTCOMP, pp. 303-308 (2014).
11. Liu et al., FG, pp. 1-6 (2013).
12. Liu et al., ACCV, pp. 143-157 (2014).
13. Mollahosseini et al., WACV, pp. 1-10 (2016).
14. Liu et al., IEEE TIP, 25(12):5920-5932 (2016).
15. Shojaeilangari et al., IEEE TIP, 24(7):2140-2152 (2015).
16. Eleftheriadis et al., IEEE TIP, 24(1):189-204 (2015).
17. Liu et al., CVPR, pp. 1749-1756 (2014).
18. Ng et al., ICMI, pp. 443-449 (2015).
19. Yu et al., ICMI, pp. 435-442 (2015).
20. Kim et al., ICMI, pp. 427-434 (2015).
21. Jung et al., CVPR, pp. 2983-2991 (2015).
22. Sariyanidi et al., IEEE TIP, 26(4):1965-1978 (2017).

SLIDE 24

Blended Emotions

  • Plutchik's Wheel of Emotions: many emotions are simply a combination of basic emotions, or are derived from one (or more) of them.
  • Real-life emotions are often blended, involving several simultaneous superposed or masked emotions. Many students of emotion have noted that facial expressions may contain more than one message (Ekman & Friesen, 1969; Izard, 1971; Plutchik, 1962; Tomkins, 1963).

SLIDE 25

Real-world Affective Face Multi-Label (RAF-ML): Blended Emotions with Multiple Labels

  • 4,908 images from the Internet
  • 40 independent labelers per image
  • Blended emotions with multiple labels: each image carries an emotion distribution

[Figure: example image with its probability distribution over expression categories]

Li, Shan, and Weihong Deng. "Blended Emotion in-the-Wild: Multi-label Facial Expression Recognition Using Crowdsourced Annotations and Deep Locality Feature Learning." International Journal of Computer Vision (2019): 1-23.

SLIDE 26

DBM-CNN: Deep Bi-Manifold CNN

Notation: $x^f$ denotes samples in the feature manifold; $x^l$ denotes samples in the label manifold.

Our goal:

$$\min \sum_{j,k} \bigl(T_{jk}^{f} + T_{jk}^{l}\bigr)\, \lVert x_j^f - x_k^f \rVert_2^2, \qquad T_{jk}^{*} = \begin{cases} 1, & x_k^{*} \text{ is a } k\text{NN of } x_j^{*} \text{, or vice versa} \\ 0, & \text{otherwise} \end{cases}$$

Bi-Manifold Loss:

$$\mathcal{L}_{BM} = \frac{1}{2n} \sum_{i=1}^{n} \Bigl\lVert 2x_i^f - \frac{1}{k} \sum_{x \in N_k^f(x_i)} x^f - \frac{1}{k} \sum_{x \in N_k^l(x_i)} x^f \Bigr\rVert_2^2$$

[Figure: two-dimensional deep feature embedding by DBM-CNN on RAF-ML]
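Analogously, a minimal PyTorch sketch of the bi-manifold loss reconstructed above, again under the batch-level-kNN assumption: neighbors are found once in feature space ($N_k^f$) and once in label space ($N_k^l$), and both centroids are averaged over deep features.

```python
# Sketch of L_BM = 1/(2n) * sum_i ||2*x_i^f - c_i^feat - c_i^label||^2, with
# c_i^feat / c_i^label the feature-space centroids of x_i's kNN in the feature
# manifold and in the label manifold. Batch-level kNN is an assumption.
import torch

def knn_centroids(space, features, k):
    """Find each sample's kNN in `space`, average the matching rows of `features`."""
    dist = torch.cdist(space, space)
    dist.fill_diagonal_(float("inf"))          # exclude self-matches
    knn = dist.topk(k, largest=False).indices  # (n, k) neighbor indices
    return features[knn].mean(dim=1)           # (n, d) centroids in feature space

def bi_manifold_loss(features, label_dists, k=5):
    """features: (n, d) deep features x^f; label_dists: (n, 7) crowdsourced
    emotion distributions, i.e., the label-manifold coordinates x^l."""
    c_feat = knn_centroids(features, features, k)       # feature-manifold term
    c_label = knn_centroids(label_dists, features, k)   # label-manifold term
    diff = 2.0 * features - c_feat - c_label
    return 0.5 * (diff ** 2).sum(dim=1).mean()
```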

SLIDE 27

DBM-CNN: Deep Bi-Manifold CNN

Smoothness in terms of both face appearance and emotion perception

SLIDE 28

DBM-CNN: Experiment Results

Table 1. Comparison of DBM-CNN with other training models using MLkNN.
*For each evaluation criterion, "↓" indicates that smaller is better, while "↑" indicates that bigger is better. **Bold values indicate the best result in terms of each performance index.

Li, Shan, and Weihong Deng. "Blended Emotion in-the-Wild: Multi-label Facial Expression Recognition Using Crowdsourced Annotations and Deep Locality Feature Learning." International Journal of Computer Vision (2019): 1-23.

SLIDE 29

DBM-CNN: Experiment Results

Table 2. Experimental results comparing features on RAF-ML using different algorithms.

SLIDE 30

DBM-CNN: Experiment Results

Table 2 (continued). Experimental results comparing features on RAF-ML using different algorithms.

SLIDE 31

A Deeper Look at Facial Expression Dataset Bias

Datasets play an important role in the progress of facial expression recognition algorithms, but they may suffer from obvious biases caused by different cultures and collection conditions. Hence, evaluating methods only under an intra-database protocol leaves them lacking generalization capability on unseen samples at test time.

Li, S., & Deng, W. A Deeper Look at Facial Expression Dataset Bias. CoRR abs/1904.11150 (2019).

SLIDE 32

A Deeper Look at Facial Expression Dataset Bias

Capture bias: each dataset tends to have its own preferences during the construction process.

Category bias: annotators of each dataset may perceive the emotion conveyed in an image differently, and many images tend to express more than one emotion, which increases the uncertainty of annotation.

Experiment I: Database Recognition. Experiment II: Cross-dataset Generalization.

Li, S., & Deng, W. A Deeper Look at Facial Expression Dataset Bias. CoRR abs/1904.11150 (2019).

SLIDE 33

ECAN: deep Emotion-Conditional Adaption Network

In traditional MMD, only the marginal distributions are restricted. Since domain invariance does not imply discriminativeness, and class-distribution bias exists across domains, samples in the target domain are still prone to misclassification.

ECAN: We explore the underlying label information of the target data and match both the marginal and the class-conditional distributions to mitigate the discrepancy. With the re-weighted MMD redistributing the class distribution and the class-conditional MMD learning a conditionally invariant transformation, the discriminative separating hyperplane can generalize well to the target data.
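ECAN's basic building block is the maximum mean discrepancy (MMD). Below is a hedged sketch of the standard biased RBF-kernel MMD² estimator; ECAN's class-conditional variant would apply the same estimator per expression class using pseudo-labels on the target domain, with the class re-weighting described above. The fixed bandwidth here is an assumption.

```python
# Standard (biased) RBF-kernel MMD^2 estimator between two feature batches;
# the building block behind marginal/class-conditional distribution matching.
import torch

def rbf_kernel(x, y, sigma=1.0):
    # k(a, b) = exp(-||a - b||^2 / (2 * sigma^2))
    return torch.exp(-torch.cdist(x, y) ** 2 / (2.0 * sigma ** 2))

def mmd2(source, target, sigma=1.0):
    """source: (m, d) source-domain features; target: (n, d) target features."""
    k_ss = rbf_kernel(source, source, sigma).mean()
    k_tt = rbf_kernel(target, target, sigma).mean()
    k_st = rbf_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st  # small when the two distributions match
```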

SLIDE 34

Domain Adaptation: From RAF-DB to other datasets

SLIDE 35

Domain Adaptation: Experimental Results

Li, S., & Deng, W. A Deeper Look at Facial Expression Dataset Bias. CoRR abs/1904.11150 (2019).

SLIDE 36

Outline

01 Introduction & Background
02 Facial Expression Databases
03 Our Works
04 Latest Survey

SLIDE 37

Applications

Emotional artificial intelligence can be used for great purposes to help people and humanity:
  • Therapies with autistic children
  • Automatic sensing of behavioral cues of depression
  • Ads and apps for marketing & entertainment
  • Driver monitoring systems for road safety
  • Lie detection and monitoring for suspicious behavior
  • And so on ...

SLIDE 38

Latest Progresses

  • Self-supervised AU detection: Y. Li, J. Zeng, S. Shan, CVPR 2019
  • Micro-expression recognition with small sample size: S. Wang, B. Li, Y. Liu, W. Yan, Y. Chen, X. Fu, Neurocomputing 2018
  • 3D FER: Z. Chen, D. Huang, Y. Wang, L. Chen, ACM MM 2018
  • Facial expression synthesis: L. Song, Z. Lu, R. He, Z. Sun et al., ACM MM 2018
  • Cross-domain color FER: W. Zheng, Y. Zong, X. Zhou, M. Xin, TAC 2018
  • Spatial and temporal patterns for dynamic FER: S. Wang, Z. Zheng, S. Yin, J. Yang et al., TPAMI 2019

SLIDE 39

Latest Survey

For a more detailed and long-term survey, please refer to:

SLIDE 40

[Figure: timeline of datasets and algorithms, 2007-2017.
Datasets: CK+, MMI, FER2013, EmotiW, EmotioNet, RAF-DB, AffectNet.
Algorithms: Zhao et al. [15] (LBP-TOP, SVM); Shan et al. [12] (LBP, AdaBoost); Zhi et al. [19] (NMF); Zhong et al. [20] (sparse learning); Tang [130] (CNN; winner of FER2013); Kahou et al. [57] (CNN, DBN, DAE; winner of EmotiW 2013); Fan et al. [108] (CNN-LSTM, C3D; winner of EmotiW 2016); LP loss, tuplet cluster loss, Island loss, ...; HoloNet, PPDN, IACNN, FaceNet2ExpNet, ...
Reference numbers follow the survey's bibliography.]

Latest Survey

Li, S., & Deng, W. Deep facial expression recognition: A survey. CoRR abs/1804.08348 (2018).

  • Facial Expression Datasets
  • Deep Facial Expression Recognition
  • Pre-processing
  • Face alignment
  • Data augmentation
  • Face normalization
  • Deep networks (CNN, DBN, DAE, RNN and

GAN)

  • Facial expression classification
SLIDE 42

State of the Art

[Figure: state-of-the-art accuracies (50-100%) and performance tendency on different facial expression databases: lab-controlled, real-world, and micro-expression]

Li, S., & Deng, W. Deep facial expression recognition: A survey. CoRR abs/1804.08348 (2018).

SLIDE 43

Additional Related Issues

  • Occlusion and non-frontal head pose
  • FER on infrared data
  • FER on 3D static and dynamic data
  • Facial expression synthesis
  • Visualization techniques
  • Other special problems

Li, S., & Deng, W. Deep facial expression recognition: A survey. CoRR abs/1804.08348 (2018).

SLIDE 44

Future Trends

  • Beyond the six basic emotions:
    Facial Action Coding System (Action Units)
    Compound emotions & blended emotions
    Objective labeling of subjective expressions
  • Multimodal FER:
    Audio and video
    Infrared and thermal images
    Depth information from 3D face models
    Physiological data
  • Challenging variations:
    Head poses and facial occlusions (users are always spontaneous)
    Race/identity dependence
    Cross-dataset FER for generalization

Li, S., & Deng, W. Deep facial expression recognition: A survey. CoRR abs/1804.08348 (2018).

SLIDE 45

Acknowledgements

Collaborator: Shan Li (李珊)

[1] Shan Li, Weihong Deng. Blended Emotion in-the-Wild: Multi-label Facial Expression Recognition Using Crowdsourced Annotations and Deep Locality Feature Learning. International Journal of Computer Vision 127(6-7): 884-906 (2019).
[2] Shan Li, Weihong Deng. Reliable Crowdsourcing and Deep Locality-Preserving Learning for Unconstrained Facial Expression Recognition. IEEE Trans. Image Processing 28(1): 356-370 (2019).
[3] Shan Li, Weihong Deng. A Deeper Look at Facial Expression Dataset Bias. arXiv:1904.11150 (2019).
[4] Shan Li, Weihong Deng. Deep Facial Expression Recognition: A Survey. arXiv:1804.08348v2 (2018).

SLIDE 46

Thanks!