SLIDE 1 Classification of Rare Recipes Requires Linguistic Features as Special Ingredients
Elham Mohammadi, Nada Naji, Louis Marceau, Marc Queudot, Eric Charton, Leila Kosseim, and Marie-Jean Meurs
Banque Nationale du Canada · Concordia University · Université du Québec à Montréal
SLIDE 2
Contents
❖ Introduction
❖ Dataset and Tasks
❖ Methodology
❖ Results and Discussion
❖ Conclusion
SLIDE 4
Introduction
❖ Motivation
➢ Many real-life scenarios involve highly imbalanced datasets.
➢ Extraction of discriminative features
■ Discriminative features can be used alongside distributed representations.
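A common way to cope with highly imbalanced data is to weight the training loss toward rare classes. Below is a minimal sketch of inverse-frequency class weighting; the label counts are hypothetical toy values, not the actual DEFT distribution.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights inversely proportional to class frequency.

    Rare classes receive larger weights, so a weighted loss pays
    more attention to them during training.
    """
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}

# Toy, heavily imbalanced label set (hypothetical counts):
labels = ["Easy"] * 80 + ["Very Easy"] * 15 + ["Difficult"] * 5
weights = inverse_frequency_weights(labels)
print(weights)  # the rarest class gets the largest weight
```

With this normalization, each class contributes equally to the total weighted count, regardless of how rare it is.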
SLIDE 5
Introduction
❖ Goal
➢ Investigating the effectiveness of discriminative features in a classification task with imbalanced data
SLIDE 7
Dataset and Tasks
❖ DEFT (Défi Fouille de Textes) 2013 (Grouin et al., 2013)
➢ A dataset of French cooking recipes labelled as
■ Task 1: Level of difficulty
- Very Easy, Easy, Fairly Difficult, and Difficult
■ Task 2: Meal type
- Starter, Main Dish, and Dessert
SLIDE 8
Dataset Statistics
[Tables: class distributions for Task 1 and Task 2]
SLIDE 10
Methodology
❖ Neural sub-model
➢ Embedding layer: pretrained BERT or CamemBERT
➢ Hidden layer: CNN or GRU
➢ Pooling layer: Attention, Average, Max
❖ Linguistic sub-model
➢ Feature extractor
■ Linguistic features were extracted and selected following Charton et al. (2014)
➢ Fully-connected layer
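The linguistic sub-model above starts from hand-crafted features fed to a fully-connected layer. The sketch below illustrates the feature-extraction step with a few simple surface features (token, sentence, and digit counts); these are illustrative stand-ins, not the actual feature set of Charton et al. (2014).

```python
import re

def extract_linguistic_features(recipe_text):
    """Extract a small vector of surface linguistic features from a recipe.

    The chosen features here (token count, sentence count, digit count,
    average sentence length) are illustrative only.
    """
    tokens = recipe_text.split()
    sentences = [s for s in re.split(r"[.!?]+", recipe_text) if s.strip()]
    n_digits = sum(tok.isdigit() for tok in tokens)
    avg_sentence_len = len(tokens) / max(len(sentences), 1)
    return [len(tokens), len(sentences), n_digits, avg_sentence_len]

recipe = "Preheat the oven to 180 degrees. Mix 3 eggs with the flour. Bake for 40 minutes."
features = extract_linguistic_features(recipe)
print(features)
```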
SLIDE 11
Experiments
❖ The joint model
❖ The independent neural-based sub-model
❖ Fine-tuned BERT and CamemBERT models
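The joint model combines the two sub-models by concatenating the pooled neural representation with the linguistic feature vector before classification. A minimal forward-pass sketch in plain Python; the dimensions and random weights are placeholders (in the paper, the neural representation comes from BERT/CamemBERT followed by a CNN/GRU and pooling).

```python
import math
import random

random.seed(0)

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def joint_forward(neural_repr, linguistic_feats, weights, bias):
    """Joint-model forward pass (sketch): concatenate the pooled neural
    representation with the linguistic features, then apply one
    fully-connected layer followed by softmax over the classes."""
    x = neural_repr + linguistic_feats  # vector concatenation
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

# Toy dimensions: 4 classes matches Task 1 (Very Easy ... Difficult).
n_neural, n_ling, n_classes = 4, 3, 4
W = [[random.gauss(0, 0.1) for _ in range(n_neural + n_ling)]
     for _ in range(n_classes)]
b = [0.0] * n_classes
probs = joint_forward([0.2, -0.1, 0.5, 0.3], [16.0, 3.0, 5.3], W, b)
print(probs)
```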
SLIDE 13
Results: Task 1
SLIDE 14
Results: Task 1 (Per-class F1)
SLIDE 15
Results: Task 2
SLIDE 16
Results: Task 2 (Per-class F1)
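Per-class F1 is what makes rare-class behaviour visible where overall accuracy hides it. A minimal sketch using hypothetical gold/predicted labels over Task 2's classes (Starter, Main Dish, Dessert):

```python
def per_class_f1(gold, pred, cls):
    """Per-class F1: harmonic mean of precision and recall for one class."""
    tp = sum(g == cls and p == cls for g, p in zip(gold, pred))
    fp = sum(g != cls and p == cls for g, p in zip(gold, pred))
    fn = sum(g == cls and p != cls for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy predictions on an imbalanced set (hypothetical labels):
gold = ["Main Dish"] * 8 + ["Dessert"] + ["Starter"]
pred = ["Main Dish"] * 8 + ["Main Dish"] + ["Starter"]
f1_main = per_class_f1(gold, pred, "Main Dish")
f1_dessert = per_class_f1(gold, pred, "Dessert")
print(f1_main, f1_dessert)  # high F1 on the frequent class, 0.0 on the missed rare one
```

Here 9 of 10 examples are predicted correctly (90% accuracy), yet the rare Dessert class has an F1 of zero, which is exactly the failure mode the per-class breakdown exposes.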
SLIDE 17
Discussion
❖ The joint model is more effective in Task 1 than in Task 2
➢ The linguistic features used for Task 2
■ might not be as representative of the classes as those for Task 1
■ are significantly sparser
➢ The improvement from the joint model is greater for rare classes
SLIDE 19
Conclusion
❖ In both tasks, the joint models outperform their purely neural counterparts
❖ The improvement from the joint models is greater in Task 1
❖ The improvement from the joint models is more significant for rare classes
❖ The strength of the joint architecture lies in its handling of rare classes
SLIDE 21
Thank you!