Higher Order Structures in Minimalist Derivations Greg Kobele - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Higher Order Structures in Minimalist Derivations

Greg Kobele TAG+13

Universität Leipzig

slide-2
SLIDE 2

Intro

slide-3
SLIDE 3

Intro

Grammar formalisms, like programming languages, are useful because they allow us to factor our explanation of linguistic behaviour into a statement of abstract regularities (the grammar), and a description of how these are computed online (the parser/parser-generator)

1

slide-4
SLIDE 4

Intro

Current MG parsing algorithms

  • needlessly explode state space (making beam search implausible)

  • are based on (exponentially less succinct) MCFGs
  • have only extrema on GLC lattice (inherited from MCFG)

1

slide-5
SLIDE 5

Intro

We exploit the structure of MGs to define an MG-specific TD parsing strategy

  • structures search space by ’sharing’ infinite classes of items
  • bringing us closer to LC

This gives a formal (very literal) reconstruction of popular psycholinguistic ideas about the human sentence processing mechanism

1

slide-6
SLIDE 6

MGs

slide-7
SLIDE 7

Overview

a formalization of Chomsky’s “minimalist program”

  • I think they are an exact formalization
  • I am interested in them because they are a bridge between linguistics and computer science

2

slide-8
SLIDE 8

Properties

MGs belong to the family of mildly context-sensitive (MCS) grammar formalisms

  • TAG is Monadic CFTG, and MG is (contained in) MRTG
  • Share the regularity of derivation trees
  • TALs are all well-nested MCFLs, but MLs are the non-well-nested MCFLs
      separation (Kanazawa & Salvati, 2010): {w#w | w ∈ L, L is in CFL − EDT0L}
      well-nested MCFLs can have crossing dependencies, but not between syntactically complicated objects

3

slide-9
SLIDE 9

Minimalist Grammars

  • To specify a grammar, we need to specify two things:
      1. The features (which features we will use in our grammar)
      2. The lexicon (which syntactic feature sequences are assigned to which words)

4

slide-10
SLIDE 10

Features

Features come in pairs

  • =x and x
  • +y and -y

Like in CG, categories are structured

  • list of features

Tradition calls categories feature bundles, e.g. =n.d.-k

5

slide-11
SLIDE 11

Data structure

Binary branching trees

  • internal node labels: > and <
  • leaf labels: (w, δ) and t

Headed trees

  head( >(u,v) ) = head( v )
  head( <(u,v) ) = head( u )
  head( l ) = l
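The head function on headed trees can be sketched in Haskell as follows. This is only a minimal sketch: the names `Label`, `Tree`, `PointLeft`, `PointRight`, `headOf` and `example` are assumptions of this example rather than names from the talk, and feature bundles are kept as plain strings.

```haskell
-- Leaf labels: a lexical leaf (w, δ) or a trace t.
data Label = Lex String String  -- (word w, remaining feature bundle δ), both plain strings here
           | Trace
  deriving (Eq, Show)

-- Binary branching trees; the internal label points at the daughter containing the head.
data Tree = PointLeft  Tree Tree   -- written <(u,v): the head is in u
          | PointRight Tree Tree   -- written >(u,v): the head is in v
          | Leaf Label
  deriving (Eq, Show)

-- head( >(u,v) ) = head( v );  head( <(u,v) ) = head( u );  head( l ) = l
headOf :: Tree -> Label
headOf (PointRight _ v) = headOf v
headOf (PointLeft  u _) = headOf u
headOf (Leaf l)         = l

-- Hypothetical derived tree for "every boy will laugh":
-- >( <(every, boy), <(will, <(laugh, t)) ), with all features but s checked.
example :: Tree
example =
  PointRight
    (PointLeft (Leaf (Lex "every" "")) (Leaf (Lex "boy" "")))
    (PointLeft (Leaf (Lex "will" "s"))
               (PointLeft (Leaf (Lex "laugh" "")) (Leaf Trace)))
```

Following the arrows from the root of `example` leads to the leaf for will, so `headOf example` is `Lex "will" "s"`.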

6

slide-12
SLIDE 12

Merge

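[Figure: merge combines a tree whose head has features =x.γ with a tree whose head has features x.δ; the result is a tree rooted in <, in which the two heads have features γ and δ.]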

7

slide-13
SLIDE 13

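Move

[Figure: move applies to a tree whose head has features +y.γ and which contains a subtree whose head has features -y; the result is a tree rooted in >, whose head has features γ.]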

8

slide-14
SLIDE 14

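Move

[Figure: move applies to a tree whose head has features +y.γ and which contains a subtree whose head has features -y; the result is a tree rooted in >, whose head has features γ.]

SMC: no other possible mover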

8

slide-15
SLIDE 15

A working example

boy     n
every   =n.d.-k
laugh   =d.v
will    =v.+k.s

9

slide-16
SLIDE 16

Representing derivations

10

slide-17
SLIDE 17

Representing derivations

  • 1. select every

every

10

slide-18
SLIDE 18

Representing derivations

  • 1. select every
  • 2. select boy

every boy

10

slide-19
SLIDE 19

Representing derivations

  • 1. select every
  • 2. select boy
  • 3. merge 1 and 2

[DP every [NP boy ]]

Derivation: merge(every, boy)

10

slide-20
SLIDE 20

Representing derivations

  • 1. select every
  • 2. select boy
  • 3. merge 1 and 2

[DP every [NP boy ]]

  • 4. select laugh

Derivation so far: laugh, merge(every, boy)

10

slide-21
SLIDE 21

Representing derivations

  • 1. select every
  • 2. select boy
  • 3. merge 1 and 2

[DP every [NP boy ]]

  • 4. select laugh
  • 5. merge 4 and 3

[VP laugh [DP every boy ]]

Derivation: merge(laugh, merge(every, boy))

10

slide-22
SLIDE 22

Representing derivations

  • 1. select every
  • 2. select boy
  • 3. merge 1 and 2

[DP every [NP boy ]]

  • 4. select laugh
  • 5. merge 4 and 3

[VP laugh [DP every boy ]]

  • 6. select will

Derivation so far: will, merge(laugh, merge(every, boy))

10

slide-23
SLIDE 23

Representing derivations

  • 1. select every
  • 2. select boy
  • 3. merge 1 and 2

[DP every [NP boy ]]

  • 4. select laugh
  • 5. merge 4 and 3

[VP laugh [DP every boy ]]

  • 6. select will
  • 7. merge 6 and 5

[IP will [VP laugh [DP every boy ]]]

Derivation: merge(will, merge(laugh, merge(every, boy)))

10

slide-24
SLIDE 24

Representing derivations

  • 1. select every
  • 2. select boy
  • 3. merge 1 and 2

[DP every [NP boy ]]

  • 4. select laugh
  • 5. merge 4 and 3

[VP laugh [DP every boy ]]

  • 6. select will
  • 7. merge 6 and 5

[IP will [VP laugh [DP every boy ]]]

  • 8. move every boy

[IP [DP every boy ] [I′ will [VP laugh t ]]]

Derivation: move(merge(will, merge(laugh, merge(every, boy))))

10

slide-25
SLIDE 25

The determinacy of movement

Derivation: move(merge(will, merge(laugh, merge(every, boy))))

Attract Closest, Minimal Link, Shortest Move, SMC: there can only be one thing moving for a particular reason at any time

11

slide-27
SLIDE 27

The determinacy of movement

Derivation: move(merge(will, merge(laugh, merge(every, boy))))

The proof objects of minimalism

  • are first order (i.e. trees)

11

slide-28
SLIDE 28

The determinacy of movement

Derivation: move(merge(will, merge(laugh, merge(every, boy))))

The proof objects of minimalism

  • are first order (i.e. trees)
  • the proofs of any proposition (e.g. S) form a regular tree language

11

slide-29
SLIDE 29

Towards MCFGs (I.)

  • a categorized string is a pair φ = (u, δ), where u is a string and δ is a feature bundle
  • an expression is a finite sequence of categorized strings φ0, . . . , φn
  • each φi, 1 ≤ i ≤ n represents a moving subtree
  • φ0 represents the rest of the tree

12

slide-30
SLIDE 30

Towards MCFGs (II.)

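[Figure: the merge step on trees, =x.γ plus x.δ yielding a tree rooted in < with heads γ and δ]

$$\frac{(u,\ {=}x.\gamma),\ \phi_1, \dots, \phi_m \qquad (v,\ x.\delta),\ \psi_1, \dots, \psi_n}{(u,\ \gamma),\ \phi_1, \dots, \phi_m,\ (v,\ \delta),\ \psi_1, \dots, \psi_n}$$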

13

slide-31
SLIDE 31

Towards MCFGs (II.)

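[Figure: the merge step on trees, =x.γ plus x yielding a tree rooted in < with head γ]

$$\frac{(u,\ {=}x.\gamma),\ \phi_1, \dots, \phi_m \qquad (v,\ x),\ \psi_1, \dots, \psi_n}{(uv,\ \gamma),\ \phi_1, \dots, \phi_m,\ \psi_1, \dots, \psi_n}$$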

13

slide-32
SLIDE 32

Towards MCFGs (III.)

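[Figure: the move step on trees, +y.γ containing -y yielding a tree rooted in > with head γ]

$$\frac{(u,\ {+}y.\gamma),\ \phi_1, \dots, \phi_{j-1},\ (v,\ {-}y),\ \phi_{j+1}, \dots, \phi_m}{(vu,\ \gamma),\ \phi_1, \dots, \phi_{j-1},\ \phi_{j+1}, \dots, \phi_m}$$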
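The merge and move rules on expressions can be tried out with a small Haskell sketch. It follows the three rules exactly as stated on these slides and nothing more: head and merged phrase are always concatenated as uv, and move only covers the case where the mover has exactly the feature -y left. All names (`Feature`, `Expression`, `merge`, `move`, `<+>`, `lexItem`, `sentence`) are assumptions of this example, and words are joined with a space purely for readability.

```haskell
import Data.List (partition)

data Feature = Sel String   -- =x
             | Cat String   -- x
             | Pos String   -- +y
             | Neg String   -- -y
  deriving (Eq, Show)

type CatString  = (String, [Feature])       -- categorized string (u, δ)
type Expression = (CatString, [CatString])  -- φ0 together with the movers φ1 .. φn

-- Concatenate two strings with a space, ignoring empty strings.
(<+>) :: String -> String -> String
"" <+> v  = v
u  <+> "" = u
u  <+> v  = u ++ " " ++ v

merge :: Expression -> Expression -> Maybe Expression
merge ((u, Sel x : gamma), phis) ((v, Cat x' : delta), psis)
  | x /= x'    = Nothing
  | null delta = Just ((u <+> v, gamma), phis ++ psis)             -- (v, x): stays in place
  | otherwise  = Just ((u, gamma), phis ++ [(v, delta)] ++ psis)   -- (v, x.δ): becomes a mover
merge _ _ = Nothing

move :: Expression -> Maybe Expression
move ((u, Pos y : gamma), phis) =
  case partition (startsWith y) phis of
    ([(v, [Neg _])], rest) -> Just ((v <+> u, gamma), rest)  -- exactly one candidate (SMC)
    _                      -> Nothing                        -- none, several, or a non-final move
  where
    startsWith f (_, Neg g : _) = f == g
    startsWith _ _              = False
move _ = Nothing

lexItem :: String -> [Feature] -> Expression
lexItem w fs = ((w, fs), [])

every, boy, laugh, will :: Expression
every = lexItem "every" [Sel "n", Cat "d", Neg "k"]
boy   = lexItem "boy"   [Cat "n"]
laugh = lexItem "laugh" [Sel "d", Cat "v"]
will  = lexItem "will"  [Sel "v", Pos "k", Cat "s"]

-- move(merge(will, merge(laugh, merge(every, boy))))
sentence :: Maybe Expression
sentence = do
  dp <- merge every boy
  vp <- merge laugh dp
  ip <- merge will vp
  move ip
```

With the working lexicon from earlier in the talk, `sentence` evaluates to `Just (("every boy will laugh", [Cat "s"]), [])`, passing through the intermediate expressions (every boy, d.-k), then (laugh, v), (every boy, -k), then (will laugh, +k.s), (every boy, -k).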

14

slide-33
SLIDE 33

Automata

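A rule like:

$$\frac{(u,\ {+}y.\gamma),\ \phi_1, \dots, \phi_{j-1},\ (v,\ {-}y),\ \phi_{j+1}, \dots, \phi_m}{(vu,\ \gamma),\ \phi_1, \dots, \phi_{j-1},\ \phi_{j+1}, \dots, \phi_m}$$

gives us an ldmbutts (tree-to-string) production:

$$\mathrm{move}(q(u, v_1, \dots, v_{j-1}, v, v_{j+1}, \dots, v_m)) \to q'(vu, v_1, \dots, v_{j-1}, v_{j+1}, \dots, v_m)$$

where

$$q = {+}y.\gamma,\ \delta_1, \dots, \delta_{j-1},\ {-}y,\ \delta_{j+1}, \dots, \delta_m \qquad q' = \gamma,\ \delta_1, \dots, \delta_{j-1},\ \delta_{j+1}, \dots, \delta_m$$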

15

slide-34
SLIDE 34

An example

16

slide-35
SLIDE 35

An example

every : (every, =n.d.-k)

16

slide-36
SLIDE 36

An example

every : (every, =n.d.-k)
boy : (boy, n)

16

slide-37
SLIDE 37

An example

merge : (every boy, d.-k)
  every : (every, =n.d.-k)
  boy : (boy, n)

16

slide-38
SLIDE 38

An example

laugh : (laugh, =d.v)
merge : (every boy, d.-k)
  every : (every, =n.d.-k)
  boy : (boy, n)

16

slide-39
SLIDE 39

An example

merge : (laugh, v), (every boy, -k)
  laugh : (laugh, =d.v)
  merge : (every boy, d.-k)
    every : (every, =n.d.-k)
    boy : (boy, n)

16

slide-40
SLIDE 40

An example

will : (will, =v.+k.s)
merge : (laugh, v), (every boy, -k)
  laugh : (laugh, =d.v)
  merge : (every boy, d.-k)
    every : (every, =n.d.-k)
    boy : (boy, n)

16

slide-41
SLIDE 41

An example

merge : (will laugh, +k.s), (every boy, -k)
  will : (will, =v.+k.s)
  merge : (laugh, v), (every boy, -k)
    laugh : (laugh, =d.v)
    merge : (every boy, d.-k)
      every : (every, =n.d.-k)
      boy : (boy, n)

16

slide-42
SLIDE 42

An example

move : (every boy will laugh, s)
  merge : (will laugh, +k.s), (every boy, -k)
    will : (will, =v.+k.s)
    merge : (laugh, v), (every boy, -k)
      laugh : (laugh, =d.v)
      merge : (every boy, d.-k)
        every : (every, =n.d.-k)
        boy : (boy, n)

16

slide-44
SLIDE 44

A slightly larger example

boy     n
every   =n.d.-k
laugh   =d.v
will    =v.+k.s
to      =v.i
seem    =i.v

17

slide-45
SLIDE 45

More derivations

[Figure: the derivation move(merge(will, merge(laugh, merge(every, boy)))), in which merge(seem, merge(to, ·)) may be iterated any number of times (merge∗)]

18

slide-46
SLIDE 46

Yoda

boy     n
every   =n.d.-k
laugh   =d.v
will    =v.+k.s
to      =v.i
seem    =i.v
ε       =v.v.-top
ε       =s.+top.c

19

slide-47
SLIDE 47

Remnant movement

Derivation: move(merge(ε, move(merge(will, merge(ε, merge(laugh, merge(every, boy)))))))

20

slide-48
SLIDE 48

Parsing

slide-49
SLIDE 49

Top-down parsing

Items represent cuts of derivation tree

  • 21
slide-50
SLIDE 50

Top-down parsing

Items represent cuts of derivation tree move

  • 21
slide-51
SLIDE 51

Top-down parsing

Items represent cuts of derivation tree move merge

  • 21
slide-52
SLIDE 52

Top-down parsing

Items represent cuts of derivation tree move merge

  • merge
  • 21
slide-53
SLIDE 53

Top-down parsing

Items represent cuts of derivation tree move merge

  • merge
  • merge
  • 21
slide-54
SLIDE 54

Top-down parsing

Items represent cuts of derivation tree move merge

  • merge
  • merge

every

  • 21
slide-55
SLIDE 55

Top-down parsing

Items represent cuts of derivation tree

move(merge(•, merge(•, merge(every, boy))))

21

slide-56
SLIDE 56

Top-down parsing

Items represent cuts of derivation tree

move(merge(will, merge(•, merge(every, boy))))

21

slide-57
SLIDE 57

Top-down parsing

Items represent cuts of derivation tree

move(merge(will, merge(laugh, merge(every, boy))))

21

slide-58
SLIDE 58

Local trees

this exploits: MG derivation trees form a local set s

22

slide-59
SLIDE 59

Local trees

this exploits: MG derivation trees form a local set move +k.s;-k

22

slide-60
SLIDE 60

Local trees

this exploits: MG derivation trees form a local set move merge =v.+k.s v;-k

22

slide-61
SLIDE 61

Local trees

this exploits: MG derivation trees form a local set move merge =v.+k.s merge =d.v d.-k

22

slide-62
SLIDE 62

Local trees

this exploits: MG derivation trees form a local set move merge =v.+k.s merge =d.v merge =n.d.-k n

22

slide-63
SLIDE 63

Local trees

this exploits: MG derivation trees form a local set move merge =v.+k.s merge =d.v merge every n

22

slide-64
SLIDE 64

Local trees

this exploits: MG derivation trees form a local set move merge =v.+k.s merge =d.v merge every boy

22

slide-65
SLIDE 65

Local trees

this exploits: MG derivation trees form a local set move merge will merge =d.v merge every boy

22

slide-66
SLIDE 66

Local trees

this exploits: MG derivation trees form a local set move merge will merge laugh merge every boy

22

slide-67
SLIDE 67

Undoing movement

  • When we hypothesize a move node:

move +k.s;-k

23

slide-68
SLIDE 68

Undoing movement

  • When we hypothesize a move node:

move +k.s;-k

  • We next must hypothesize where the mover is:

move merge

  • merge
  • d.-k

23

slide-69
SLIDE 69

Appearances can be deceiving

Every boy will (seem to)∗ laugh

move merge

  • merge
  • d.-k

move merge

  • merge
  • merge
  • merge
  • d.-k

move merge

  • merge
  • merge
  • merge
  • merge
  • merge
  • d.-k

24

slide-70
SLIDE 70

If only. . .

move merge

  • merge
  • d.-k

25

slide-71
SLIDE 71

If only. . .

move merge

  • merge
  • d.-k
  • Might work in this case,
  • but is there a non-analysis specific principle?

25

slide-72
SLIDE 72

Structure in derivations

MG derivations are subregular (Graf): (tier-based) strictly local

  • strict locality: conjunction of negative literals (with immediate successor)
  • tier-based: relativized successors (⊳T, where T ⊆ Σ)

26

slide-73
SLIDE 73

Example (strings)

Primary stress: ⊳ := ⊳σ́

  • Have primary stress: ¬($ ⊳ $)
  • Have at most one stress: ¬(σ́ ⊳ σ́)
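The two stress constraints can be checked mechanically. The following Haskell sketch assumes that a word is a list of syllables, that the tier contains exactly the stressed syllables, and that $ marks both edges of the tier; the names `Syll`, `Sym`, `tier`, `tsl` and `primaryStress` are made up for this illustration.

```haskell
-- Syllables are either stressed or unstressed (an assumption of this sketch).
data Syll = Stressed | Unstressed
  deriving (Eq, Show)

-- Tier symbols, extended with the edge marker $.
data Sym a = Edge | Sym a
  deriving (Eq, Show)

-- Project the tier (given as a predicate) and add edge markers on both sides.
tier :: (a -> Bool) -> [a] -> [Sym a]
tier onTier w = [Edge] ++ map Sym (filter onTier w) ++ [Edge]

-- A word is well formed iff no forbidden pair is adjacent on the tier,
-- i.e. the conjunction of negative literals over the relativized successor holds.
tsl :: Eq a => (a -> Bool) -> [(Sym a, Sym a)] -> [a] -> Bool
tsl onTier forbidden w = all (`notElem` forbidden) (zip t (drop 1 t))
  where t = tier onTier w

primaryStress :: [Syll] -> Bool
primaryStress = tsl (== Stressed)
  [ (Edge, Edge)                     -- ¬($ ⊳ $): have primary stress
  , (Sym Stressed, Sym Stressed) ]   -- ¬(σ́ ⊳ σ́): have at most one stress
```

A word with exactly one stressed syllable passes, since its tier is $ σ́ $. A word with no stress fails on the pair ($, $), and a word with two stresses fails on (σ́, σ́), however many unstressed syllables intervene.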

27

slide-74
SLIDE 74

Example (trees)

Movement (Graf): ⊳ := ⊳+k,-k

  • Movers gonna move: ¬($ ⊳ ℓ)
  • No movement without movement: ¬(move ⊳ $)
  • No competition: ¬(move ⊳ ℓ1, ℓ2)

28

slide-75
SLIDE 75

Argument structure via n-grams

Every lexical item ℓ appears in a derivation with a unique local context

  • depends exclusively on positive feature sequence (=x and +y)

(will, =v.+k.s)

29

slide-76
SLIDE 76

Argument structure via n-grams

Every lexical item ℓ appears in a derivation with a unique local context

  • depends exclusively on positive feature sequence (=x and +y)

merge
  (will, =v.+k.s)

29

slide-77
SLIDE 77

Argument structure via n-grams

Every lexical item ℓ appears in a derivation with a unique local context

  • depends exclusively on positive feature sequence (=x and +y)

move
  merge
    (will, =v.+k.s)
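Reading the local context off the positive features takes only a few lines of Haskell. In this sketch the names `Feature`, `Op` and `localContext` are assumptions of the example, and the context is returned bottom-up, starting with the operation immediately above the lexical item.

```haskell
data Feature = Sel String   -- =x
             | Cat String   -- x
             | Pos String   -- +y
             | Neg String   -- -y
  deriving (Eq, Show)

data Op = Merge | Move
  deriving (Eq, Show)

-- The operations dominating a lexical item, listed bottom-up.
-- Each leading =x contributes a merge and each leading +y a move;
-- the first non-positive feature ends the context.
localContext :: [Feature] -> [Op]
localContext (Sel _ : fs) = Merge : localContext fs
localContext (Pos _ : fs) = Move  : localContext fs
localContext _            = []
```

For (will, =v.+k.s), `localContext [Sel "v", Pos "k", Cat "s"]` is `[Merge, Move]`: will sits under a merge node, which in turn sits under a move node.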

29

slide-78
SLIDE 78

Exploiting regularities in derivations

  • When we hypothesize a move node:

move +k.s;-k

30

slide-79
SLIDE 79

Exploiting regularities in derivations

  • When we hypothesize a move node:

move +k.s;-k

  • We know it immediately dominates a mover (on the relevant tier):

move +k.s d.-k

30

slide-80
SLIDE 80

A sketch

  • 31
slide-81
SLIDE 81

A sketch

move

  • 31
slide-82
SLIDE 82

A sketch

move merge

  • 31
slide-83
SLIDE 83

A sketch

move merge every

  • 31
slide-84
SLIDE 84

A sketch

move merge every boy

31

slide-85
SLIDE 85

A sketch

move merge

  • merge

every boy

31

slide-86
SLIDE 86

A sketch

move merge will merge every boy

31

slide-87
SLIDE 87

A sketch

move merge will merge

  • merge

every boy

31

slide-88
SLIDE 88

A sketch

move merge will merge laugh merge every boy

31

slide-90
SLIDE 90

A basic ’hole’ data structure

α g xs

  • g is a gorn address: where we are in the derived tree

data Hole t b x = Hole t [(b,x)]

32

slide-91
SLIDE 91

A basic ’hole’ data structure

α g xs

  • g is a gorn address: where we are in the derived tree
  • xs is a (finite) list of

data Hole t b x = Hole t [(b,x)]

32

slide-92
SLIDE 92

A basic ’hole’ data structure

α g xs

  • g is a gorn address: where we are in the derived tree
  • xs is a (finite) list of
      • derivations with holes: elements in separate tiers

data Hole t b x = Hole t [(b,x)]

32

slide-93
SLIDE 93

A basic ’hole’ data structure

α g xs

  • g is a gorn address: where we are in the derived tree
  • xs is a (finite) list of
      • derivations with holes: elements in separate tiers
      • . . . paired with feature bundles: information about the occupied tier

data Hole t b x = Hole t [(b,x)]

32

slide-94
SLIDE 94

Unmerge1

  • Given

α : g xs

33

slide-95
SLIDE 95

Unmerge1

  • Given

α : g xs

  • merge could have applied

merge =x.α : g0 us x : g1 vs

33

slide-97
SLIDE 97

Unmerge1

  • Given

α : g xs

  • merge could have applied

merge =x.α : g0 us x : g1 vs

where xs = sort (us ++ vs)
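The side condition xs = sort (us ++ vs) means that, read top-down, unmerge has to guess how the movers xs of the parent item are divided between the two daughters. One way to enumerate the candidates is sketched below; it assumes that xs is already sorted and contains no duplicates, and the name `splits` is hypothetical.

```haskell
import Data.List (sort)

-- All ways of distributing the elements of xs over two lists,
-- keeping the relative order of the elements inside each list.
splits :: [a] -> [([a], [a])]
splits []       = [([], [])]
splits (x : xs) = concat [ [(x : us, vs), (us, x : vs)] | (us, vs) <- splits xs ]

-- Sanity check of the slide's condition for one sorted input.
conditionHolds :: Bool
conditionHolds = all (\(us, vs) -> sort (us ++ vs) == "abc") (splits "abc")
```

A list of n movers has 2^n such splits. Each returned pair consists of two sorted lists whenever the input is sorted, so every candidate satisfies the condition on the slide.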

33

slide-98
SLIDE 98

Unmove

  • Given

α : g xs

34

slide-99
SLIDE 99

Unmove

  • Given

α : g xs

  • move could have applied

move +y.α : g1 x.-y : g0 x.-y xs

34

slide-100
SLIDE 100

Unmerge2

  • Given

α g xs

35

slide-101
SLIDE 101

Unmerge2

  • Given

α g xs

  • merge could have applied to a mover

merge =x.α g1 us x.-y vs

35

slide-103
SLIDE 103

Unmerge2

  • Given

α g xs

  • merge could have applied to a mover

merge =x.α g1 us x.-y vs

where xs = sort (us ++ vs)

35

slide-104
SLIDE 104

Completion (I)

  • Given

x.-y x.-y

36

slide-105
SLIDE 105

Completion (I)

  • Given

x.-y x.-y

  • this is the tree you’re looking for

36

slide-106
SLIDE 106

ATNs and filling gaps

  • Psycholinguists
      • you process moved items (fillers)
      • and then you try to find where they moved from (gap)
  • TD MG parsing
      to process filler, first find gap!
  • Here
      • unmove constructs a filler
      • unmerge2 constructs a gap
      • complete fills the gap

37

slide-107
SLIDE 107

Remnant movement

Derivation: move(merge(ε, move(merge(will, merge(ε, merge(laugh, merge(every, boy)))))))

boy     n
every   =n.d.-k
laugh   =d.v
will    =v.+k.s
ε       =v.v.-top
ε       =s.+top.s

38

slide-108
SLIDE 108

Remnant movement

s ǫ

38

slide-109
SLIDE 109

Remnant movement

move +top.s 1 v.-top 0 v.-top

38

slide-110
SLIDE 110

Remnant movement

move +top.s 1 merge v.-top =v.v.-top 00 v 01

38

slide-111
SLIDE 111

Remnant movement

move +top.s 1 merge v.-top ǫ v 01

38

slide-112
SLIDE 112

Remnant movement

move +top.s 1 merge v.-top ǫ merge =d.v d.-k

38

slide-113
SLIDE 113

Remnant movement

move +top.s 1 merge v.-top ǫ merge laugh d.-k

38

slide-114
SLIDE 114

Remnant movement

move merge =s.+top.s 10 s 11 merge v.-top ǫ merge laugh d.-k

38

slide-115
SLIDE 115

Remnant movement

move merge ǫ s 11 merge v.-top ǫ merge laugh d.-k

38

slide-116
SLIDE 116

Remnant movement

move merge ǫ move +k.s 111 d.-k 110 d.-k merge v.-top ǫ merge laugh d.-k

38

slide-117
SLIDE 117

Remnant movement

move merge ǫ move +k.s 111 merge d.-k =n.d.-k 1100 n 1101 merge v.-top ǫ merge laugh d.-k

38

slide-118
SLIDE 118

Remnant movement

move merge ǫ move +k.s 111 merge d.-k every n 1101 merge v.-top ǫ merge laugh d.-k

38

slide-119
SLIDE 119

Remnant movement

move merge ǫ move +k.s 111 merge d.-k every boy merge v.-top ǫ merge laugh d.-k

38

slide-120
SLIDE 120

Remnant movement

move merge ǫ move merge =v.+k.s 1110 v.-top merge d.-k every boy merge v.-top ǫ merge laugh d.-k

38

slide-121
SLIDE 121

Remnant movement

move merge ǫ move merge will v.-top merge d.-k every boy merge v.-top ǫ merge laugh d.-k

38

slide-122
SLIDE 122

Remnant movement

move merge ǫ move merge will merge ǫ merge laugh d.-k merge d.-k every boy

38

slide-123
SLIDE 123

Remnant movement

move(merge(ε, move(merge(will, merge(ε, merge(laugh, merge(every, boy)))))))

38

slide-124
SLIDE 124

Enforcing the SMC

Recall:

  • 1. Movers gonna move : ¬($ ⊳ ℓ)
  • 2. No movement without movement : ¬(move ⊳ $)
  • 3. No competition : ¬(move ⊳ ℓ1, ℓ2)

How are these being enforced?

39

slide-125
SLIDE 125

Enforcing the SMC

Recall:

  • 1. Movers gonna move : ¬($ ⊳ ℓ)
  • 2. No movement without movement : ¬(move ⊳ $)
  • 3. No competition : ¬(move ⊳ ℓ1, ℓ2)

How are these being enforced?

  • 1. Two ways of generating a mover:

39

slide-126
SLIDE 126

Enforcing the SMC

Recall:

  • 1. Movers gonna move : ¬($ ⊳ ℓ)
  • 2. No movement without movement : ¬(move ⊳ $)
  • 3. No competition : ¬(move ⊳ ℓ1, ℓ2)

How are these being enforced?

  • 1. Two ways of generating a mover:
      • via unmerge2 (i.e. a gap): must be filled
      • via unmove (i.e. a filler): born dominated

39

slide-127
SLIDE 127

Enforcing the SMC

Recall:

  • 1. Movers gonna move : ¬($ ⊳ ℓ)
  • 2. No movement without movement : ¬(move ⊳ $)
  • 3. No competition : ¬(move ⊳ ℓ1, ℓ2)

How are these being enforced?

  • 1. Two ways of generating a mover:
  • 2. move nodes and movers postulated simultaneously

39

slide-128
SLIDE 128

Enforcing the SMC

Recall:

  • 1. Movers gonna move : ¬($ ⊳ ℓ)
  • 2. No movement without movement : ¬(move ⊳ $)
  • 3. No competition : ¬(move ⊳ ℓ1, ℓ2)

How are these being enforced?

  • 1. Two ways of generating a mover:
  • 2. move nodes and movers postulated simultaneously
  • 3. via restrictions

39

slide-129
SLIDE 129

Restricting Unmove

α : g xs ⇒ move +y.α : g1 x.-y : g0 x.-y xs

As long as nothing in xs is on the -y tier

40

slide-130
SLIDE 130

Restricting Unmerge2

α g xs ⇒ merge =z.α g1 us z.-y vs

If there is something on the -y tier in xs, it must complete this gap

In other words, the -y tier is hereby blocked!

41

slide-131
SLIDE 131

Completion (II)

A partial proof tree with an n-ary hole is an operation of type (α1 → · · · → αn → t) → t

The ’α’s are the types of the arguments to the hole

Upper bounds on

  1. number of holes
  2. their arity

depending on number of -y feature types in lexicon

42

slide-132
SLIDE 132

Completion (III)

x.-y ∆ x.-y xs ⇒ ∆[us1, . . . , usk]

Conditions

  • xs = sort ( us1 ++ ... ++ usk )
  • each substitution path is free for the relevant tier

43

slide-133
SLIDE 133

A Note on Semantic Interpretation

[[merge]] → λm, n.(| m ⊕ n |)
[[merge]] → λm, n.(| m ⊕ n |)
[[move]] → λm.m
[[move]] → λm.mk

[[ℓ]] = I(ℓ)↑

(| f m n |) = do x <- m
                 y <- n
                 return (f x y)
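The bracket notation (| f m n |) is ordinary monadic lifting of a two-place function. It is written out below as a plain Haskell function. The slide does not say which monad the interpretation lives in, so `Maybe` and lists are used here purely as stand-ins, and the names `idiom2`, `example` and `agrees` are assumptions of this sketch.

```haskell
import Control.Monad (liftM2)

-- (| f m n |): run m, run n, and apply f to the two results.
idiom2 :: Monad k => (a -> b -> c) -> k a -> k b -> k c
idiom2 f m n = do
  x <- m
  y <- n
  return (f x y)

-- Stand-in example in the Maybe monad:
-- a missing argument would make the whole result missing.
example :: Maybe String
example = idiom2 (++) (Just "every ") (Just "boy")

-- idiom2 agrees with the library function liftM2 (checked here on lists).
agrees :: Bool
agrees = idiom2 (+) [1, 2] [10 :: Int] == liftM2 (+) [1, 2] [10]
```

Here `example` is `Just "every boy"`, and `agrees` is `True`, since both sides evaluate to `[11, 12]`.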

44

slide-134
SLIDE 134

The meaning of partial parse trees

move merge every boy
  = λf.[[move]](f ([[merge]] [[every]] [[boy]]))
  = λf.f ([[every]] [[boy]])↑k

FA 45

slide-135
SLIDE 135

Conclusion : Exploiting structure

  • MGs have more structure in their derivations than is being made use of
  • how can we take advantage of it?
  • Simple intersection w/ regular sets: (will, =rvs.+pkq.pcs), where δ(q, will) = r
  • how to do scheduling to obtain a version of the present algorithm?
  • Left-corner parsing (for CFGs) has similar looking partial proof trees
  • can we use these ideas to get a left-corner parser for MGs and solve the problem of left branch movement?

46