SLIDE 1

Contextualization of Morphological Inflection

Ekaterina Vylomova¹, Ryan Cotterell², Timothy Baldwin¹, Trevor Cohn¹, Jason Eisner²

¹School of Computing and Information Systems, The University of Melbourne

²Department of Computer Science, Johns Hopkins University

Vylomova, Cotterell, Baldwin, Cohn, Eisner Contextualization of Morphological Inflection 1 / 39

SLIDE 2

Language Modelling

This is Marvin:

SLIDE 3

Language Modelling

OK, Marvin, which word comes next: Two cats are ___
Hmm, let me guess ...

sitting   3.01 × 10⁻⁴
play      2.87 × 10⁻⁴
running   2.53 × 10⁻⁴
nice      2.32 × 10⁻⁴
lost      1.97 × 10⁻⁴
playing   1.66 × 10⁻⁴
sat       1.54 × 10⁻⁴
plays     1.32 × 10⁻⁴
...

SLIDE 4

Language Modelling

Let’s add a constraint by providing a lemma: Two cats are [PLAY]
That narrows things down a lot ...

sitting   3.01 × 10⁻⁴
play      2.87 × 10⁻⁴
running   2.53 × 10⁻⁴
nice      2.32 × 10⁻⁴
lost      1.97 × 10⁻⁴
playing   1.66 × 10⁻⁴
sat       1.54 × 10⁻⁴
plays     1.32 × 10⁻⁴
...
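The lemma constraint can be sketched as a toy filter over the language model's candidate distribution. This is a minimal sketch, not the paper's model: the probabilities are the illustrative numbers from slide 3, and the PLAY paradigm is an assumed inflection table.

```python
# Unconstrained next-word distribution (illustrative numbers from slide 3)
next_word = {
    "sitting": 3.01e-4, "play": 2.87e-4, "running": 2.53e-4,
    "nice": 2.32e-4, "lost": 1.97e-4, "playing": 1.66e-4,
    "sat": 1.54e-4, "plays": 1.32e-4,
}

# Assumed set of inflected forms of the lemma PLAY
paradigm = {"play", "plays", "played", "playing"}

def constrain(dist, forms):
    """Keep only candidates that realize the lemma, then renormalize."""
    kept = {w: p for w, p in dist.items() if w in forms}
    z = sum(kept.values())
    return {w: p / z for w, p in kept.items()}

constrained = constrain(next_word, paradigm)
best = max(constrained, key=constrained.get)
```

Note that the constrained model still ranks "play" above "playing": the lemma narrows the choice to the paradigm, but picking the right slot within it is exactly the contextual inflection problem this talk addresses.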

SLIDE 5

Language Modelling

Hey, this reminds me a bit of ... a wug ... and a second wug:

SLIDE 6

Morphological (Re-)Inflection

... as well as the SIGMORPHON morphological inflection task

SIGMORPHON Shared Task 2016–2019

PLAY + PRESENT PARTICIPLE → playing
played + PRESENT PARTICIPLE → playing

Lemma  Tag       Form
RUN    PAST      ran
RUN    PRES;1SG  run
RUN    PRES;2SG  run
RUN    PRES;3SG  runs
RUN    PRES;PL   run
RUN    PART      running
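The shared-task format above amounts to mapping (lemma, tag) pairs to surface forms. A minimal sketch as a lookup table over the RUN paradigm shown on the slide (a real system must generalize to unseen lemmas; the point of the wug test is precisely that a table cannot):

```python
# Toy paradigm table in the SIGMORPHON (lemma, tag) -> form format
paradigm = {
    ("RUN", "PAST"): "ran",
    ("RUN", "PRES;1SG"): "run",
    ("RUN", "PRES;2SG"): "run",
    ("RUN", "PRES;3SG"): "runs",
    ("RUN", "PRES;PL"): "run",
    ("RUN", "PART"): "running",
}

def inflect(lemma, tag):
    """Look up the surface form for a (lemma, tag) pair."""
    return paradigm[(lemma, tag)]
```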


2018: ~96% accuracy on average in the high-resource setting

SLIDE 7

Morphological (Re-)Inflection

Contextualization: But why choose PRESENT PARTICIPLE? Context!

SIGMORPHON Shared Task 2016–2019

PLAY + PRESENT PARTICIPLE → playing
played + PRESENT PARTICIPLE → playing

SLIDE 8

Morphological (Re-)Inflection

Contextualization: The tags must be inferred from the context!

SIGMORPHON Shared Task 2018 Task 2

SubTask 1: Two cats are ??? together
  TWO/NUM CAT/N+PL BE/AUX+PRES+3PL PLAY TOGETHER/ADV

SubTask 2: Two cats are ??? together
  PLAY

SLIDE 9

Morphological (Re-)Inflection

Contextualization: The tags must be inferred from the context!

SIGMORPHON Shared Task 2018 Task 2

SubTask 1: Two cats are playing together
  TWO/NUM CAT/N+PL BE/AUX+PRES+3PL PLAY TOGETHER/ADV

SubTask 2: Two cats are playing together
  PLAY

SLIDE 10

A Hybrid (Structured–unstructured) Model

Let’s predict both tags and forms!


[Diagram: lemmatized sequence → predicted tag sequence → predicted form sequence]

SLIDE 11

A Hybrid (Structured–unstructured) Model

... or, in other words, p(w, m | ℓ) = (∏_{i=1}^{n} p(w_i | ℓ_i, m_i)) · p(m | ℓ)
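The factorization can be made concrete in a few lines: the joint log-probability is the log-probability of the tag sequence given the lemmas plus a sum of per-token form log-probabilities. The component models below are toy stand-ins with made-up probabilities, not the paper's trained networks.

```python
import math

def joint_log_prob(forms, tags, lemmas, form_model, tag_seq_model):
    """log p(w, m | l) = log p(m | l) + sum_i log p(w_i | l_i, m_i)."""
    lp = math.log(tag_seq_model(tags, lemmas))   # p(m | l), e.g. a sequence model over tags
    for w, l, m in zip(forms, lemmas, tags):
        lp += math.log(form_model(w, l, m))      # p(w_i | l_i, m_i), a per-token inflector
    return lp

# Hypothetical stand-in components for illustration only
form_model = lambda w, l, m: 0.9 if (l, m, w) == ("PLAY", "V;PART", "playing") else 0.1
tag_seq_model = lambda tags, lemmas: 0.5

lp = joint_log_prob(["playing"], ["V;PART"], ["PLAY"], form_model, tag_seq_model)
```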


[Diagram: lemmatized sequence → predicted tag sequence → predicted form sequence]

SLIDE 12

A Hybrid (Structured–unstructured) Model

... or, in other words, p(w, m | ℓ) = (∏_{i=1}^{n} p(w_i | ℓ_i, m_i)) · p(m | ℓ)


[Diagram: lemmatized sequence → neural CRF (Lample et al., 2016) predicts the tag sequence, p(m | ℓ) → hard monotonic attention model (Aharoni et al., 2017) predicts each form, p(w_i | ℓ_i, m_i)]
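Decoding with this hybrid model proceeds in two stages: predict a tag sequence from the lemmatized sentence, then inflect each lemma with its predicted tag. The sketch below uses hypothetical rule stubs in place of the neural CRF and the hard monotonic attention inflector, purely to show the data flow:

```python
def predict_tags(lemmas):
    # Stand-in for argmax_m p(m | l): crude per-lemma heuristics,
    # where the real model is a neural CRF over the whole sequence.
    return ["AUX;PRES;3PL" if l == "BE" else "V;PART" for l in lemmas]

def inflect(lemma, tag):
    # Stand-in for argmax_w p(w | l, m): the real model is a
    # character-level seq2seq with hard monotonic attention.
    table = {("BE", "AUX;PRES;3PL"): "are", ("PLAY", "V;PART"): "playing"}
    return table.get((lemma, tag), lemma.lower())

lemmas = ["BE", "PLAY"]
tags = predict_tags(lemmas)                         # stage 1: tag sequence
forms = [inflect(l, m) for l, m in zip(lemmas, tags)]  # stage 2: forms
```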

SLIDE 13

Languages and Grammar Categories

Let’s test the model on a wide variety of languages!

SLIDE 14

Languages and Grammar Categories

Languages differ in what is explicitly morphosyntactically marked, and how:


  • Bulgarian (bg), Slavic
  • English (en), Germanic
  • Basque (eu), isolate
  • Finnish (fi), Uralic
  • Gaelic (ga), Celtic
  • Hindi (hi), Indic
  • Italian (it), Romance
  • Latin (la), Romance
  • Polish (pl), Slavic
  • Swedish (sv), Germanic

SLIDE 15

Languages and Grammar Categories

Some languages use word order to express relations between words, while others use morphosyntactic marking:

English: Kim gives Sandy an interesting book
Polish: Jenia daje Maszy ciekawą książkę

SLIDE 16

Languages and Grammar Categories

Some languages use word order to express relations between words, while others use morphosyntactic marking:

English: Kim gives Sandy an interesting book (Subject, IObject, DObject)
Polish: Jenia daje Maszy ciekawą książkę (Nom, Dat, Acc.Fem.Sg, Acc.Sg)

== Maszy daje Jenia ciekawą książkę
== ciekawą książkę daje Jenia Maszy
!= Jenie daje Masza ciekawą książkę
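The contrast can be illustrated directly: when grammatical roles are read off case marking rather than position, reordering the words changes nothing. A toy role reader for the Polish example, with simplified case tags that are my own illustrative annotation, not the slide's gloss:

```python
def read_roles(tagged_words):
    """Map case-marked words to grammatical roles, ignoring word order."""
    case_to_role = {"Nom": "Subject", "Dat": "IObject", "Acc": "DObject"}
    return {case_to_role[c]: w for w, c in tagged_words if c in case_to_role}

svo = [("Jenia", "Nom"), ("daje", "V"), ("Maszy", "Dat"),
       ("ciekawą", "Adj.Acc"), ("książkę", "Acc")]
scrambled = [("Maszy", "Dat"), ("daje", "V"), ("Jenia", "Nom"),
             ("ciekawą", "Adj.Acc"), ("książkę", "Acc")]
```

Both orders yield the same role assignment; an English-style reader keyed on position would not survive the same permutation.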

SLIDE 17

Experiments

How well can such categories and corresponding forms be predicted in each language?

SLIDE 18

Experiments

How well can such categories and corresponding forms be predicted in each language? Do linguistic features enhance performance?

SLIDE 19

Experiments

How well can such categories and corresponding forms be predicted in each language? Do linguistic features enhance performance? Does morphological complexity affect empirical performance?

SLIDE 20

Experiments

Data: Universal Dependencies v1.2
Baselines: the SIGMORPHON 2018 shared task baseline, as well as the best-performing system of that year


(Nivre et al., 2016)

SLIDE 21

Experiments

SM: biLSTM encoder–decoder with a context window of size 2
input = concat(left and right forms, lemma, tags, character-level centre lemma)
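The SM input described above might be assembled roughly as follows. The exact field layout is an assumption for illustration, not the published feature set; the `<PRED>` placeholder marks the target position.

```python
def window_features(forms, lemmas, tags, i, size=2):
    """Concatenate neighbouring forms (±size window), the centre lemma
    and tag, and the centre lemma's characters into one feature list."""
    left = forms[max(0, i - size):i]
    right = forms[i + 1:i + 1 + size]
    return left + right + [lemmas[i], tags[i]] + list(lemmas[i])

forms = ["Two", "cats", "are", "<PRED>", "together"]
lemmas = ["TWO", "CAT", "BE", "PLAY", "TOGETHER"]
tags = ["NUM", "N;PL", "AUX;PRES;3PL", "?", "ADV"]
feats = window_features(forms, lemmas, tags, 3)
```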


(Cotterell et al., 2018)

SLIDE 22

Experiments

CPH: biLSTM encoder–decoder with no restriction on the context window size
input = concat(full context, lemma, tags, character-level centre lemma)
also predicts the target tags as an auxiliary task

Direct: a simpler model that relies only on forms and lemmas


(Kementchedjhieva et al., 2018)

SLIDE 23

Experiments

Let’s condition only on contextual forms and lemmas (1-best accuracy for form prediction):

[Bar chart: 1-best form-prediction accuracy (25–100%) per language (BG, EN, EU, FI, GA, HI, IT, LA, PL, SV) for model 1. Direct]

SLIDE 24

Experiments

Now also supply contextual tag information, still predicting forms only:

[Bar chart: 1-best form-prediction accuracy per language for models 1. Direct and 2. SM]

SLIDE 25

Experiments

Now use a wider context and predict tags as an auxiliary task:

[Bar chart: 1-best form-prediction accuracy per language for models 1. Direct, 2. SM, and 3. CPH]

SLIDE 26

Experiments

Finally, use neural CRF to predict tag sequence and hard monotonic attention model for forms:

[Bar chart: 1-best form-prediction accuracy per language for models 1. Direct, 2. SM, 3. CPH, and 4. Joint]

SLIDE 27

Experiments

How far are we from the results for forms predicted from gold tag sequence?

[Bar chart: 1-best form-prediction accuracy per language for models 1. Direct, 2. SM, 3. CPH, 4. Joint, and 5. Gold Tags]

SLIDE 28

Discussion

Q1: Do linguistic features help?

Yes, they do!

  • Most systems that make use of morphological tags outperform the “Direct” baseline on most languages
  • Joint prediction of tags and forms further improves the results

SLIDE 29

Discussion

Q2: Does morphological complexity impact empirical performance?

Yes, it does!

  • Performance drops in languages with rich case systems, such as Slavic and Uralic
  • The model needs to learn which grammatical categories should be in agreement

SLIDE 30

Discussion

Q3: How well is agreement captured?

Adjective–Noun agreement (AMod) is captured quite well.

Verb–Noun agreement (Subject–Verb) is more challenging, since the agreement categories can vary depending on tense.

General-purpose inference of agreement categories is still a challenging task!

SLIDE 31

Discussion

Q4: Where does most uncertainty come from?

Inherent and Contextual Morphological Categories

  • Contextual categories participate in agreement: adjectival number, case, and gender; verbal gender; etc.
  • Inherent categories express the speaker’s intentions: nominal number, verbal tense

SLIDE 32

Discussion

Q4: Where does most uncertainty come from?

Inherent and Contextual Morphological Categories

  • Contextual categories participate in agreement: adjectival number, case, and gender; verbal gender; etc.
  • Inherent categories express the speaker’s intentions: nominal number, verbal tense


Most uncertainty comes from inherent categories!

SLIDE 33

Discussion

Q4: Where does most uncertainty come from?

Inherent and Contextual Morphological Categories

  • Contextual categories participate in agreement: adjectival number, case, and gender; verbal gender; etc.
  • Inherent categories express the speaker’s intentions: nominal number, verbal tense


Most uncertainty comes from inherent categories! Often, such categories must be inferred.

SLIDE 34

Discussion

Q5: Which language is least affected by lemmatization?

SLIDE 35

Discussion

Q5: Which language is least affected by lemmatization?

SLIDE 36

Discussion

Q5: Which language is least affected by lemmatization?

Word Order vs. Morphology

Most information about roles and dependencies is expressed non-morphologically, e.g. through word order or prepositions:

EN: Kim gives Sandy an interesting book → KIM GIVE SANDY AN INTERESTING BOOK
PL: Jenia daje Maszy ciekawą książkę → JENIA DAWAĆ MASZA CIEKAWY KSIĄŻKA


Why English?

SLIDE 37

Discussion

Q5: Which language is least affected by lemmatization?

Word Order vs. Morphology

Most information about roles and dependencies is expressed non-morphologically, e.g. through word order or prepositions:

EN: Kim gives Sandy an interesting book → KIM GIVE SANDY AN INTERESTING BOOK
PL: Jenia daje Maszy ciekawą książkę → JENIA DAWAĆ MASZA CIEKAWY KSIĄŻKA


Why English?
EN: fixed SVO order → roles are still there
PL: flexible word order → roles are partially lost

SLIDE 38

Future Directions

Evaluation of grammaticality

How well do neural models model grammaticality?

Data de-biasing (e.g., En–Ru)

smart student → umnyj.Nom.Masc.Sg student.Nom.Sg
augment with: smart student → umnaja.Nom.Fem.Sg studentka.Nom.Fem.Sg
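The augmentation idea above can be sketched as a tag-aware swap over target tokens: for each masculine translation, add a feminine counterpart. The swap table and the dot-separated tag strings are illustrative assumptions matching the slide's example, not a real de-biasing pipeline.

```python
# Hypothetical masculine -> feminine substitutions for the slide's example
FEM_SWAP = {
    "umnyj.Nom.Masc.Sg": "umnaja.Nom.Fem.Sg",
    "student.Nom.Sg": "studentka.Nom.Fem.Sg",
}

def augment(pairs):
    """For each (source, target) pair, add a feminine variant when a swap applies."""
    out = list(pairs)
    for src, tgt in pairs:
        fem = [FEM_SWAP.get(tok, tok) for tok in tgt.split()]
        if fem != tgt.split():
            out.append((src, " ".join(fem)))
    return out

data = [("smart student", "umnyj.Nom.Masc.Sg student.Nom.Sg")]
augmented = augment(data)
```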

SLIDE 39

Thank you! Questions?
