Lecture 17: Expressive grammars
Julia Hockenmaier


SLIDE 1

Lecture 17: Expressive grammars
CS498JH: Introduction to NLP (Fall 2012)
http://cs.illinois.edu/class/cs498jh
Julia Hockenmaier
juliahmr@illinois.edu, 3324 Siebel Center
Office Hours: Wednesday, 12:15-1:15pm

SLIDE 2

Why grammar?

A grammar mediates between surface strings and meaning representations: parsing maps a string to a meaning representation, generation maps a meaning representation to a string.

Surface string: 'Mary saw John'

Meaning representations:
  • Logical form: saw(Mary, John)
  • Dependency graph: saw, with dependents Mary and John
  • Pred-arg structure: [PRED saw, AGENT Mary, PATIENT John]

SLIDE 3

Grammar formalisms

  • Formalisms provide a language in which linguistic theories can be expressed and implemented.
  • Formalisms define elementary objects (trees, strings, feature structures) and recursive operations which generate complex objects from simple objects.
  • Formalisms may impose constraints (e.g. on the kinds of dependencies they can capture).

SLIDE 4

How do grammar formalisms differ?

Formalisms define different representations:

  • Tree-Adjoining Grammar (TAG): fragments of phrase-structure trees
  • Lexical-Functional Grammar (LFG): annotated phrase-structure trees (c-structure) linked to feature structures (f-structure)
  • Combinatory Categorial Grammar (CCG): syntactic categories paired with meaning representations
  • Head-Driven Phrase Structure Grammar (HPSG): complex feature structures (attribute-value matrices)

SLIDE 5

The dependencies so far:

Arguments:
  • Verbs take arguments: subject, object, complements, ...
  • Heads subcategorize for their arguments.

Adjuncts/Modifiers:
  • Adjectives modify nouns; adverbs modify VPs or adjectives; PPs modify NPs or VPs.
  • Modifiers subcategorize for the head.

Typically, these are local dependencies: they can be expressed within individual CFG rules, e.g.

VP → Adv Verb NP

SLIDE 6

Context-free grammars

CFGs capture only nested dependencies:
  • The dependency graph is a tree.
  • The dependencies do not cross.

SLIDE 7

Beyond CFGs: Nonprojective dependencies

Dependencies: a tree with crossing branches.

These arise in the following constructions:
  • (Non-local) scrambling (in free word order languages):
    'Die Pizza hat Klaus versprochen zu bringen'
    ('The pizza, Klaus promised to bring')
  • Extraposition ('The guy is coming who is wearing a hat')
  • Topicalization ('Cheeseburgers, I thought he likes')
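Nonprojectivity is easy to test mechanically: two dependency arcs cross iff exactly one endpoint of one arc lies strictly between the endpoints of the other. Below is a minimal sketch in Python (not from the lecture; the arc encoding and the simplified analysis of the German example are assumptions for illustration).

def is_projective(arcs):
    """True iff no two dependency arcs cross.
    Arcs are (head, dependent) pairs over word positions; 0 is the root."""
    spans = [tuple(sorted(arc)) for arc in arcs]
    for a, b in spans:
        for c, d in spans:
            if a < c < b < d:  # (c,d) starts inside (a,b) but ends outside
                return False
    return True

# 'Die(1) Pizza(2) hat(3) Klaus(4) versprochen(5) zu(6) bringen(7)'
# Simplified, assumed analysis: root->hat, hat->versprochen,
# versprochen->Klaus, versprochen->bringen, bringen->zu, bringen->Pizza
arcs = [(0, 3), (3, 5), (5, 4), (5, 7), (7, 6), (7, 2), (2, 1)]
print(is_projective(arcs))  # False: root->hat (0,3) crosses bringen->Pizza (2,7)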

SLIDE 8

Beyond CFGs: Nonlocal dependencies

Dependencies form a DAG (a node may have multiple incoming edges).

These arise in the following constructions:
  • Control ('He has promised me to go'), raising ('He seems to go')
  • Wh-movement ('the man who you saw yesterday is here again')
  • Non-constituent coordination (right-node raising, gapping, argument-cluster coordination)

SLIDE 9

Non-local dependencies


SLIDE 10

Long-range dependencies

Bounded long-range dependencies: limited distance between the head and the argument.

Unbounded long-range dependencies: arbitrary distance (within the same sentence) between the head and the argument.

Unbounded long-range dependencies cannot (in general) be represented with CFGs. Chomsky's solution: add null elements (and coindexation).

SLIDE 11

Unbounded nonlocal dependencies

Wh-questions and relative clauses contain unbounded nonlocal dependencies, where the missing NP may be arbitrarily deeply embedded:

'the sushi that [you told me [John saw [Mary eat]]]'
'what [did you tell me [John saw [Mary eat]]]?'

Linguists call this phenomenon wh-extraction (wh-movement).

SLIDE 12

Non-local dependencies in wh-extraction

The object of 'eat' is realized outside the verb phrase, as the head of the relative clause:

(NP (NP the sushi)
    (SBAR (IN that)
          (S (NP you)
             (VP (V told) (NP me)
                 (S (NP John)
                    (VP (V saw)
                        (S (NP Mary)
                           (VP (V eat)))))))))

SLIDE 13

The trace analysis of wh-extraction

(NP (NP the sushi)
    (SBAR (IN that)
          (S (NP you)
             (VP (V told) (NP me)
                 (S (NP John)
                    (VP (V saw)
                        (S (NP Mary)
                           (VP (V eat) (NP *T*)))))))))

The null element *T* is a trace, coindexed with 'the sushi'.

SLIDE 14

Slash categories for wh-extraction

Because only one element can be extracted, we can use slash categories: S/NP stands for an S with an NP missing inside it. This is still a CFG: the set of nonterminals is finite.

Generalized Phrase Structure Grammar (GPSG), Gazdar et al. (1985)

(NP (NP the sushi)
    (SBAR (IN that)
          (S/NP (NP you)
                (VP/NP (V told) (NP me)
                       (S/NP (NP John)
                             (VP/NP (V saw)
                                    (S/NP (NP Mary)
                                          (VP/NP (V eat)))))))))
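A minimal sketch of this idea (toy grammar and parser, not from the lecture): slash categories such as S/NP and VP/NP are simply names of nonterminals, so the grammar below is an ordinary CFG. The rule and category names (NPrel, Vtold, ...) are hypothetical.

import itertools

# Slash categories are just nonterminal names: the grammar is still a CFG.
LEXICON = {"the": "Det", "sushi": "N", "that": "IN",
           "you": "NP", "me": "NP", "John": "NP", "Mary": "NP",
           "told": "Vtold", "saw": "Vsaw", "eat": "Veat"}

RULES = {"NPrel": [("NPcore", "SBAR")],        # the sushi + relative clause
         "NPcore": [("Det", "N")],
         "SBAR": [("IN", "S/NP")],             # that + S missing an NP
         "S/NP": [("NP", "VP/NP")],
         "VP/NP": [("Vtold", "NP", "S/NP"),    # told me [S/NP]
                   ("Vsaw", "S/NP"),           # saw [S/NP]
                   ("Veat",)]}                 # eat + gap: the missing NP

def splits(n, k):
    """All ways to cut range(0, n) into k contiguous non-empty spans."""
    for cuts in itertools.combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        yield list(zip(bounds, bounds[1:]))

def parse(sym, words):
    """Return one parse of `words` as category `sym`, or None."""
    if len(words) == 1 and LEXICON.get(words[0]) == sym:
        return (sym, words[0])
    for rhs in RULES.get(sym, ()):
        for spans in splits(len(words), len(rhs)):
            subs = [parse(child, words[i:j])
                    for child, (i, j) in zip(rhs, spans)]
            if all(sub is not None for sub in subs):
                return (sym, *subs)
    return None

tree = parse("NPrel", "the sushi that you told me John saw Mary eat".split())
print(tree)  # a nested-tuple tree mirroring the bracketed structure above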

SLIDE 15

German: center embedding

...daß ich [Hans schwimmen] sah
...that I Hans swim saw
'...that I saw [Hans swim]'

...daß ich [Maria [Hans schwimmen] helfen] sah
...that I Maria Hans swim help saw
'...that I saw [Maria help [Hans swim]]'

...daß ich [Anna [Maria [Hans schwimmen] helfen] lassen] sah
...that I Anna Maria Hans swim help let saw
'...that I saw [Anna let [Maria help [Hans swim]]]'

SLIDE 16

Dutch: cross-serial dependencies

...dat ik Hans zag zwemmen
...that I Hans saw swim
'...that I saw [Hans swim]'

...dat ik Maria Hans zag helpen zwemmen
...that I Maria Hans saw help swim
'...that I saw [Maria help [Hans swim]]'

...dat ik Anna Maria Hans zag laten helpen zwemmen
...that I Anna Maria Hans saw let help swim
'...that I saw [Anna let [Maria help [Hans swim]]]'

Such cross-serial dependencies require mildly context-sensitive grammars.

SLIDE 17

Two mildly context-sensitive formalisms: TAG and CCG

SLIDE 18

The Chomsky Hierarchy

Regular ⊂ Context-free ⊂ Mildly context-sensitive ⊂ Context-sensitive ⊂ Recursively enumerable

SLIDE 19

Mildly context-sensitive grammars

  • Contain all context-free grammars/languages.
  • Can be parsed in polynomial time (TAG/CCG: O(n⁶)).
  • (Strong generative capacity) Capture certain kinds of dependencies: nested (like CFGs) and cross-serial (like the Dutch example), but not the MIX language: the set of strings w ∈ {a, b, c}* that contain equal numbers of a's, b's and c's.
  • Have the constant growth property: the length of strings grows in a linear way. The power-of-2 language {a^(2^n)} does not have the constant growth property.
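MIX is trivial to recognize with a program; the interesting question is whether a grammar formalism can generate it. A quick sketch of the definition (not course code):

from collections import Counter

def in_mix(w: str) -> bool:
    """True iff w is over {a, b, c} and has equal numbers of a's, b's and c's."""
    counts = Counter(w)
    return (set(w) <= {"a", "b", "c"}
            and counts["a"] == counts["b"] == counts["c"])

assert in_mix("") and in_mix("acbbca") and not in_mix("aabbc")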

SLIDE 20

TAG and CCG are lexicalized formalisms

The lexicon:
  • pairs words with elementary objects
  • specifies all language-specific information (e.g. subcategorization information)

The grammatical operations:
  • are universal
  • define (and impose constraints on) recursion

SLIDE 21

Tree-Adjoining Grammar

SLIDE 22

(Lexicalized) Tree-Adjoining Grammar

A. K. Joshi and Y. Schabes (1996), Tree-Adjoining Grammars. In G. Rozenberg and A. Salomaa (eds.), Handbook of Formal Languages.

TAG is a tree-rewriting formalism:
  • TAG defines operations (substitution, adjunction) on trees.
  • The elementary objects in TAG are trees (not strings).

TAG is lexicalized:
  • Each elementary tree is anchored to a lexical item (word).
  • "Extended domain of locality": the elementary tree contains all arguments of the anchor.
  • TAG requires a linguistic theory which specifies the shape of these elementary trees.

TAG is mildly context-sensitive:
  • It can capture Dutch cross-serial dependencies but is still efficiently parseable.

SLIDE 23

Extended domain of locality

We want to capture all arguments of a word in a single elementary object. We also want to retain certain syntactic structures (e.g. VPs). Our elementary objects are tree fragments, e.g. for 'eats':

(S NP↓ (VP (VBZ eats) NP↓))

SLIDE 24

TAG substitution (arguments)

Substitution replaces a substitution node (a leaf marked with ↓, e.g. X↓ or Y↓) with an elementary tree whose root carries the same label.

[Figure: an elementary tree α1 with substitution nodes X↓ and Y↓; trees α2 (rooted in X) and α3 (rooted in Y) are substituted into them. The derived tree shows the result; the derivation tree records α2 and α3 as children of α1.]

SLIDE 25

TAG adjunction

Adjunction inserts an auxiliary tree β1 at an internal node X of a tree α1: the subtree rooted at X is excised, β1 (whose root is X and whose foot node is X*) is plugged in, and the excised subtree is substituted at the foot node.

[Figure: tree α1 with an internal node X; auxiliary tree β1 with root X and foot node X*; the derived tree, and the derivation tree recording β1 as adjoined into α1.]

SLIDE 26

The effect of adjunction

  • No adjunction: TSG (Tree Substitution Grammar). TSG is context-free.
  • Sister adjunction: TIG (Tree Insertion Grammar). TIG is also context-free, but has a linguistically more adequate treatment of modifiers.
  • Wrapping adjunction: TAG (Tree-Adjoining Grammar). TAG is mildly context-sensitive.

SLIDE 27

A small TAG lexicon

α1: (S NP↓ (VP (VBZ eats) NP↓))     anchored to 'eats'
α2: (NP John)
α3: (NP tapas)
β1: (VP (RB always) VP*)            auxiliary tree with foot node VP*

SLIDE 28

A TAG derivation

Substitute α2 (John) and α3 (tapas) at the two NP↓ substitution nodes of α1 (eats).

Derivation tree: α1 with children α2 and α3.

SLIDE 29

A TAG derivation (continued)

After substitution: (S (NP John) (VP (VBZ eats) (NP tapas))). Now adjoin β1 (always) at the VP node.

Derivation tree: α1 with children α2, α3 and β1.

SLIDE 30

A TAG derivation (continued)

Derived tree: (S (NP John) (VP (RB always) (VP (VBZ eats) (NP tapas))))
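The whole derivation can be replayed with a few lines of code. A minimal sketch (toy encoding, not the lecture's: trees are (label, children) tuples, a leaf 'X↓' is a substitution node, and 'X*' is the foot node of an auxiliary tree):

def substitute(tree, site, alpha):
    """Replace the first substitution node `site`↓ in `tree` by tree `alpha`."""
    def walk(t):
        label, children = t
        if label == site + "↓" and not children:
            return alpha, True
        out, done = [], False
        for c in children:
            c2, hit = (c, False) if done else walk(c)
            done = done or hit
            out.append(c2)
        return (label, out), done
    new, ok = walk(tree)
    assert ok, f"no {site}↓ node found"
    return new

def adjoin(tree, site, beta):
    """Adjoin auxiliary tree `beta` at the first internal `site` node:
    excise the subtree there, insert beta, put the subtree at beta's foot."""
    def plug_foot(t, excised):
        label, children = t
        if label == site + "*" and not children:
            return excised
        return (label, [plug_foot(c, excised) for c in children])
    def walk(t):
        label, children = t
        if label == site and children:
            return plug_foot(beta, t), True
        out, done = [], False
        for c in children:
            c2, hit = (c, False) if done else walk(c)
            done = done or hit
            out.append(c2)
        return (label, out), done
    new, ok = walk(tree)
    assert ok, f"no internal {site} node found"
    return new

def words(t):
    """Read the yield (the derived string) off a tree."""
    label, children = t
    return label if not children else " ".join(words(c) for c in children)

a1 = ("S", [("NP↓", []), ("VP", [("VBZ", [("eats", [])]), ("NP↓", [])])])
a2 = ("NP", [("John", [])])
a3 = ("NP", [("tapas", [])])
b1 = ("VP", [("RB", [("always", [])]), ("VP*", [])])

t = substitute(a1, "NP", a2)   # John fills the subject NP↓
t = substitute(t, "NP", a3)    # tapas fills the object NP↓
t = adjoin(t, "VP", b1)        # always wraps around the VP
print(words(t))                # John always eats tapas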

SLIDE 31

aⁿbⁿ: cross-serial dependencies

Elementary trees: an initial S tree and an auxiliary tree (with foot node S*) anchored to 'a' and 'b'. Repeated adjunction of the auxiliary tree derives strings like 'aabb' in which the dependencies between matching a's and b's cross.

[Figure: the elementary trees and the step-by-step derivation of 'aabb'.]

SLIDE 32

Combinatory Categorial Grammar

SLIDE 33

CCG: the machinery

Categories: specify subcat lists of words/constituents.

Combinatory rules: specify how constituents can combine.

The lexicon: specifies which categories a word can have.

Derivations: spell out the process of combining constituents.

SLIDE 34

CCG categories

Simple (atomic) categories: NP, S, PP

Complex categories (functions) return a result when combined with an argument:
  • VP, intransitive verb:  S\NP
  • Transitive verb:        (S\NP)/NP
  • Adverb:                 (S\NP)\(S\NP)
  • Prepositions:           ((S\NP)\(S\NP))/NP,  (NP\NP)/NP,  PP/NP

SLIDE 35

Function application

Combines a function X/Y or X\Y with its argument Y to yield the result X:

(S\NP)/NP  NP  ⇒  S\NP     ('eats' + 'tapas')
NP  S\NP  ⇒  S             ('John' + 'eats tapas' → 'John eats tapas')

SLIDE 36

Function application

Forward application (>):   (S\NP)/NP  NP  ⇒  S\NP    ('eats tapas')
Backward application (<):  NP  S\NP  ⇒  S            ('John eats tapas')

Used in all variants of categorial grammar.
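A minimal sketch of these two rules in Python (hypothetical encoding, not course code): an atomic category is a string, and a complex category stores its result, slash direction, and argument.

from dataclasses import dataclass
from typing import Union

Cat = Union[str, "Complex"]  # atomic categories are plain strings

@dataclass(frozen=True)
class Complex:
    result: "Cat"
    slash: str   # "/" seeks its argument to the right, "\" to the left
    arg: "Cat"
    def __str__(self):
        return f"({self.result}{self.slash}{self.arg})"

def forward_apply(f, a):
    """> :  X/Y  Y  =>  X"""
    if isinstance(f, Complex) and f.slash == "/" and f.arg == a:
        return f.result

def backward_apply(a, f):
    """< :  Y  X\\Y  =>  X"""
    if isinstance(f, Complex) and f.slash == "\\" and f.arg == a:
        return f.result

eats = Complex(Complex("S", "\\", "NP"), "/", "NP")   # (S\NP)/NP
vp = forward_apply(eats, "NP")   # eats + tapas       =>  S\NP
s = backward_apply("NP", vp)     # John + eats tapas  =>  S
print(vp, s)                     # (S\NP) S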

SLIDE 37

A (C)CG derivation

[Figure: an example CCG derivation.]

SLIDE 38

Function composition

Harmonic forward composition (>B):    X/Y  Y/Z  ⇒  X/Z
Harmonic backward composition (<B):   Y\Z  X\Y  ⇒  X\Z
Forward crossing composition (>Bx):   X/Y  Y\Z  ⇒  X\Z
Backward crossing composition (<Bx):  Y/Z  X\Y  ⇒  X/Z

SLIDE 39

Type-raising

Forward type-raising (>T):   X  ⇒  T/(T\X)
Backward type-raising (<T):  X  ⇒  T\(T/X)

SLIDE 40

Type-raising and composition

Type-raising: X → T/(T\X)
Turns an argument into a function:
  NP → S/(S\NP)             (subject)
  NP → (S\NP)\((S\NP)/NP)   (object)

Harmonic composition: X/Y Y/Z → X/Z
Composes two functions (complex categories):
  (S\NP)/PP  PP/NP  →  (S\NP)/NP
  S/(S\NP)  (S\NP)/NP  →  S/NP

Crossing function composition: X/Y Y\Z → X\Z
Composes two functions (complex categories):
  (S\NP)/S  S\NP  →  (S\NP)\NP
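These rule schemata can be stated directly over a toy encoding of categories (a sketch under assumed representations, not course code): atomic categories are strings, and a complex category X/Y is the tuple ('/', X, Y).

def fcomp(x, y):
    """>B :  X/Y  Y/Z  =>  X/Z"""
    if x[0] == "/" and y[0] == "/" and x[2] == y[1]:
        return ("/", x[1], y[2])

def fxcomp(x, y):
    """>Bx :  X/Y  Y\\Z  =>  X\\Z"""
    if x[0] == "/" and y[0] == "\\" and x[2] == y[1]:
        return ("\\", x[1], y[2])

def traise(x, t):
    """>T :  X  =>  T/(T\\X)"""
    return ("/", t, ("\\", t, x))

S, NP, PP = "S", "NP", "PP"
SvNP = ("\\", S, NP)                            # S\NP

# (S\NP)/PP  PP/NP  =>  (S\NP)/NP
print(fcomp(("/", SvNP, PP), ("/", PP, NP)))

# S/(S\NP)  (S\NP)/NP  =>  S/NP   (type-raised subject composed with the verb)
print(fcomp(traise(NP, S), ("/", SvNP, NP)))

# (S\NP)/S  S\NP  =>  (S\NP)\NP
print(fxcomp(("/", SvNP, S), SvNP))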

SLIDE 41

Type-raising and composition

[Figure: derivations using type-raising and composition for wh-movement (a relative clause) and for right-node raising.]

SLIDE 42

Function application (> and <):
  X/Y  Y  ⇒  X
  Y  X\Y  ⇒  X

Harmonic composition (>B and <B):
  X/Y  Y/Z  ⇒  X/Z
  Y\Z  X\Y  ⇒  X\Z

Crossing composition (>Bx and <Bx):
  X/Y  Y\Z  ⇒  X\Z
  Y/Z  X\Y  ⇒  X/Z

Generalized composition (>Bn and <Bn):
  X/Y  (...(Y|Z1)|...)|Zn  ⇒  (...(X|Z1)|...)|Zn
  (...(Y|Z1)|...)|Zn  X\Y  ⇒  (...(X|Z1)|...)|Zn

Type-raising (>T and <T):
  X  ⇒  T/(T\X)
  X  ⇒  T\(T/X)

SLIDE 43

Dutch cross-serial dependencies

ik    Maria   Hans   zag          helpen                zwemmen
NP    NP      NP     (S\NP)/S     ((S\NP)\NP)/(S\NP)    S\NP
                                  ---------------------------- >
                                  (S\NP)\NP
                     ----------------------------------------- >B×
                     ((S\NP)\NP)\NP
              ------------------------------------------------ <
              (S\NP)\NP
      -------------------------------------------------------- <
      S\NP
--------------------------------------------------------------- <
S

(>B× here is second-order crossed composition: 'zag' composes into both NP arguments of 'helpen zwemmen'.)
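The derivation above can be machine-checked with the same toy tuple encoding of categories (a sketch, not course code); >B× is implemented as second-order crossed composition, which is how the step from (S\NP)/S and (S\NP)\NP to ((S\NP)\NP)\NP comes out.

# Categories: atomic = string, complex = (slash, result, argument).
S, NP = "S", "NP"
SvNP    = ("\\", S, NP)                   # S\NP
zag     = ("/", SvNP, S)                  # (S\NP)/S
helpen  = ("/", ("\\", SvNP, NP), SvNP)   # ((S\NP)\NP)/(S\NP)
zwemmen = SvNP                            # S\NP

def fapply(f, a):
    """> :  X/Y  Y  =>  X"""
    assert f[0] == "/" and f[2] == a
    return f[1]

def bapply(a, f):
    """< :  Y  X\\Y  =>  X"""
    assert f[0] == "\\" and f[2] == a
    return f[1]

def fxcomp2(f, g):
    """>B2x :  X/Y  (Y\\Z1)\\Z2  =>  (X\\Z1)\\Z2"""
    assert f[0] == "/" and g[0] == "\\" and g[1][0] == "\\" and g[1][1] == f[2]
    return ("\\", ("\\", f[1], g[1][2]), g[2])

cluster = fapply(helpen, zwemmen)   # helpen zwemmen     : (S\NP)\NP
cluster = fxcomp2(zag, cluster)     # zag helpen zwemmen : ((S\NP)\NP)\NP
x = bapply(NP, cluster)             # + Hans             : (S\NP)\NP
x = bapply(NP, x)                   # + Maria            : S\NP
x = bapply(NP, x)                   # + ik               : S
assert x == S
print("derivation checks out:", x)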

SLIDE 44

Another example

CCG derivations are binary trees, so we can use standard chart parsing techniques. CCG derivations represent long-range dependencies and complement-adjunct distinctions directly.

[Figure: an example derivation.]

SLIDE 45

Combinatory Categorial Grammar

  • CCG is lexicalized: the "rules" of the grammar are completely general; all language-specific information is given in the lexicon.
  • CCG is mildly context-sensitive: it can capture Dutch crossing dependencies, but is still efficiently parseable.
  • CCG has a flexible constituent structure.
  • CCG has a unified treatment of extraction and coordination.
  • CCG has a transparent syntax-semantics interface: every syntactic category and operation has a semantic counterpart.
  • CCG rules are monotonic: there is no movement and there are no traces.

SLIDE 46

Today's key concepts

Phenomena that require extensions of standard context-free grammars:
  • non-local dependencies
  • cross-serial dependencies

Two lexicalized formalisms:
  • Tree-Adjoining Grammar
  • Combinatory Categorial Grammar