SLIDE 1 Classification of Rare Recipes Requires Linguistic Features as Special Ingredients
Elham Mohammadi, Nada Naji, Louis Marceau, Marc Queudot, Eric Charton, Leila Kosseim, and Marie-Jean Meurs
Banque Nationale du Canada · Concordia University · Université du Québec à Montréal
SLIDE 2
Contents
❖ Introduction
❖ Dataset and Tasks
❖ Methodology
❖ Results and Discussion
❖ Conclusion
SLIDE 4
Introduction
❖ Motivation
➢ Many real-life scenarios involve highly imbalanced datasets.
➢ Extraction of discriminative features
■ Discriminative features can be used alongside distributed representations.
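A common way to cope with highly imbalanced data is to weight the training loss toward rare classes. Below is a minimal sketch of inverse-frequency class weighting; the label counts are hypothetical toy values, not the actual DEFT distribution.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights inversely proportional to class frequency.

    Rare classes receive larger weights, so a weighted loss pays
    more attention to them during training.
    """
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}

# Toy, heavily imbalanced label set (hypothetical counts):
labels = ["Easy"] * 80 + ["Very Easy"] * 15 + ["Difficult"] * 5
weights = inverse_frequency_weights(labels)
print(weights)  # the rarest class gets the largest weight
```

With this normalization, each class contributes equally to the total weighted count, regardless of how rare it is.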
SLIDE 5
Introduction
❖ Goal
➢ Investigating the effectiveness of discriminative features in a classification task with imbalanced data
SLIDE 7
Dataset and Tasks
❖ DEFT (Défi Fouille de Textes) 2013 (Grouin et al., 2013)
➢ A dataset of French cooking recipes labelled as
■ Task 1: Level of difficulty
- Very Easy, Easy, Fairly Difficult, and Difficult
■ Task 2: Meal type
- Starter, Main Dish, and Dessert
SLIDE 8
Dataset Statistics
[Tables: class distributions for Task 1 and Task 2]
SLIDE 10
Methodology
❖ Neural sub-model
➢ Embedding layer: pretrained BERT or CamemBERT
➢ Hidden layer: CNN or GRU
➢ Pooling layer: Attention, Average, Max
❖ Linguistic sub-model
➢ Feature extractor
■ Linguistic features were extracted and selected following Charton et al. (2014)
➢ Fully-connected layer
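The linguistic sub-model above starts from hand-crafted features fed to a fully-connected layer. The sketch below illustrates the feature-extraction step with a few simple surface features (token, sentence, and digit counts); these are illustrative stand-ins, not the actual feature set of Charton et al. (2014).

```python
import re

def extract_linguistic_features(recipe_text):
    """Extract a small vector of surface linguistic features from a recipe.

    The chosen features here (token count, sentence count, digit count,
    average sentence length) are illustrative only.
    """
    tokens = recipe_text.split()
    sentences = [s for s in re.split(r"[.!?]+", recipe_text) if s.strip()]
    n_digits = sum(tok.isdigit() for tok in tokens)
    avg_sentence_len = len(tokens) / max(len(sentences), 1)
    return [len(tokens), len(sentences), n_digits, avg_sentence_len]

recipe = "Preheat the oven to 180 degrees. Mix 3 eggs with the flour. Bake for 40 minutes."
features = extract_linguistic_features(recipe)
print(features)
```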
SLIDE 11
Experiments
❖ The joint model
❖ The independent neural-based sub-model
❖ Fine-tuned BERT and CamemBERT models
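The joint model combines the two sub-models by concatenating the pooled neural representation with the linguistic feature vector before classification. A minimal forward-pass sketch in plain Python; the dimensions and random weights are placeholders (in the paper, the neural representation comes from BERT/CamemBERT followed by a CNN/GRU and pooling).

```python
import math
import random

random.seed(0)

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def joint_forward(neural_repr, linguistic_feats, weights, bias):
    """Joint-model forward pass (sketch): concatenate the pooled neural
    representation with the linguistic features, then apply one
    fully-connected layer followed by softmax over the classes."""
    x = neural_repr + linguistic_feats  # vector concatenation
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

# Toy dimensions: 4 classes matches Task 1 (Very Easy ... Difficult).
n_neural, n_ling, n_classes = 4, 3, 4
W = [[random.gauss(0, 0.1) for _ in range(n_neural + n_ling)]
     for _ in range(n_classes)]
b = [0.0] * n_classes
probs = joint_forward([0.2, -0.1, 0.5, 0.3], [16.0, 3.0, 5.3], W, b)
print(probs)
```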
SLIDE 13
Results: Task 1
SLIDE 14
Results: Task 1 (Per-class F1)
SLIDE 15
Results: Task 2
SLIDE 16
Results: Task 2 (Per-class F1)
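Per-class F1 is what makes rare-class behaviour visible where overall accuracy hides it. A minimal sketch using hypothetical gold/predicted labels over Task 2's classes (Starter, Main Dish, Dessert):

```python
def per_class_f1(gold, pred, cls):
    """Per-class F1: harmonic mean of precision and recall for one class."""
    tp = sum(g == cls and p == cls for g, p in zip(gold, pred))
    fp = sum(g != cls and p == cls for g, p in zip(gold, pred))
    fn = sum(g == cls and p != cls for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy predictions on an imbalanced set (hypothetical labels):
gold = ["Main Dish"] * 8 + ["Dessert"] + ["Starter"]
pred = ["Main Dish"] * 8 + ["Main Dish"] + ["Starter"]
f1_main = per_class_f1(gold, pred, "Main Dish")
f1_dessert = per_class_f1(gold, pred, "Dessert")
print(f1_main, f1_dessert)  # high F1 on the frequent class, 0.0 on the missed rare one
```

Here 9 of 10 examples are predicted correctly (90% accuracy), yet the rare Dessert class has an F1 of zero, which is exactly the failure mode the per-class breakdown exposes.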
SLIDE 17
Discussion
❖ The joint model is more effective in Task 1 than in Task 2
➢ The linguistic features used for Task 2
■ might not be as representative of the classes as those for Task 1
■ are significantly sparser
➢ The improvement from the joint model is greater for rare classes
SLIDE 19
Conclusion
❖ In both tasks, the joint models outperform their purely neural counterparts
❖ The improvement from the joint models is greater in Task 1
❖ The improvement from the joint models is more significant for rare classes
❖ The strength of the joint architecture lies in its handling of rare classes
SLIDE 21
Thank you!