CS498JH: Introduction to NLP (Fall 2012)
http://cs.illinois.edu/class/cs498jh
Julia Hockenmaier
juliahmr@illinois.edu 3324 Siebel Center Office Hours: Wednesday, 12:15-1:15pm
Lecture 17: Expressive grammars
Surface string: 'Mary saw John'
Parsing maps the surface string to a meaning representation; generation maps the meaning representation to a surface string.
Meaning representations:
- Logical form: saw(Mary, John)
- Dependency graph: saw → Mary, saw → John
- Predicate-argument structure: [PRED saw, AGENT Mary, PATIENT John]
Formalisms provide a language in which linguistic theories can be expressed and implemented.
Formalisms define elementary objects (trees, strings, feature structures) and recursive operations which generate complex objects from simple ones.
Formalisms may impose constraints (e.g. on the kinds of dependencies they can capture).
Formalisms define different representations
- Tree-Adjoining Grammar (TAG): fragments of phrase-structure trees
- Lexical-Functional Grammar (LFG): annotated phrase-structure trees (c-structure) linked to feature structures (f-structure)
- Combinatory Categorial Grammar (CCG): syntactic categories paired with meaning representations
- Head-Driven Phrase Structure Grammar (HPSG): complex feature structures (attribute-value matrices)
Arguments: heads subcategorize for their arguments.
Adjuncts/modifiers: adjectives modify nouns, adverbs modify VPs or adjectives, PPs modify NPs or VPs. Modifiers subcategorize for the head.
Typically, these are local dependencies: they can be expressed within individual CFG rules, e.g.
VP → Adv Verb NP
CFGs capture only nested dependencies: the dependency graph is a tree, and the dependencies do not cross.
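Dependency structures derivable by a CFG are exactly the projective ones. A minimal sketch (my own helper function, not from the lecture) that tests whether a set of dependency edges over word positions contains a crossing:

```python
def is_projective(edges):
    """edges: set of (head, dependent) pairs over 1-based word positions.
    Returns True iff no two dependency arcs cross."""
    spans = [(min(h, d), max(h, d)) for h, d in edges]
    for (l1, r1) in spans:
        for (l2, r2) in spans:
            # Two arcs cross if one starts strictly inside the other
            # and ends strictly outside it.
            if l1 < l2 < r1 < r2:
                return False
    return True

# 'Mary saw John': saw -> Mary, saw -> John (nested, no crossing)
print(is_projective({(2, 1), (2, 3)}))   # True
# A crossing configuration, as in the German example below:
print(is_projective({(1, 3), (2, 4)}))   # False
```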
Some dependency graphs are trees with crossing branches. These arise, for example, in German sentences such as:

Die Pizza hat Klaus versprochen zu bringen
('Klaus has promised to bring the pizza': the fronted object 'Die Pizza' is an argument of 'bringen', so its dependency crosses the others)
Some dependency graphs form a DAG (a node may have multiple incoming edges). These arise in constructions such as right-node raising, gapping, and argument-cluster coordination.
Bounded long-range dependencies: limited distance between the head and the argument.
Unbounded long-range dependencies: arbitrary distance (within the same sentence) between the head and the argument.
Unbounded long-range dependencies cannot (in general) be represented with CFGs. Chomsky's solution: add null elements (and coindexation).
Wh-questions and relative clauses contain unbounded nonlocal dependencies, where the missing NP may be arbitrarily deeply embedded:

'the sushi that [you told me [John saw [Mary eat]]]'
'what [did you tell me [John saw [Mary eat]]]?'

Linguists call this phenomenon wh-extraction (wh-movement).
[Phrase-structure tree for 'the sushi that you told me John saw Mary eat': the object NP of 'eat' is missing.]
[The same tree with a trace: the extracted NP is realized as a null element *T* (a trace) in the object position of 'eat', coindexed with 'the sushi'.]
Because only one element can be extracted, we can use slash categories. This is still a CFG: the set of nonterminals is finite.
Generalized Phrase Structure Grammar (GPSG), Gazdar et al. (1985)
[Derivation with slash categories: the gap is propagated through S/NP and VP/NP categories from the relative clause down to the missing object of 'eat'.]
German verb clusters show nested dependencies:

...daß ich [Hans schwimmen] sah
...that I Hans swim saw
'...that I saw [Hans swim]'

...daß ich [Maria [Hans schwimmen] helfen] sah
...that I Maria Hans swim help saw
'...that I saw [Mary help [Hans swim]]'

...daß ich [Anna [Maria [Hans schwimmen] helfen] lassen] sah
...that I Anna Maria Hans swim help let saw
'...that I saw [Anna let [Mary help [Hans swim]]]'
Dutch verb clusters show cross-serial dependencies:

...dat ik Hans zag zwemmen
...that I Hans saw swim
'...that I saw [Hans swim]'

...dat ik Maria Hans zag helpen zwemmen
...that I Maria Hans saw help swim
'...that I saw [Mary help [Hans swim]]'

...dat ik Anna Maria Hans zag laten helpen zwemmen
...that I Anna Maria Hans saw let help swim
'...that I saw [Anna let [Mary help [Hans swim]]]'

Such cross-serial dependencies require mildly context-sensitive grammars.
The Chomsky hierarchy:
Regular ⊂ Context-free ⊂ Mildly context-sensitive ⊂ Context-sensitive ⊂ Recursively enumerable
Mildly context-sensitive grammars:
- contain all context-free grammars/languages;
- can be parsed in polynomial time (TAG/CCG: O(n^6));
- (strong generative capacity) capture certain kinds of dependencies: nested (like CFGs) and cross-serial (like the Dutch example), but not the MIX language (MIX: the set of strings w ∈ {a, b, c}* that contain equal numbers of a's, b's and c's);
- have the constant growth property: string lengths grow linearly. The power-of-2 language {a^(2^n)} does not have the constant growth property.
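Both properties can be checked concretely. A sketch (function and variable names are mine) that tests MIX membership and shows why {a^(2^n)} violates constant growth:

```python
def in_mix(w):
    # MIX: strings over {a, b, c} with equal numbers of a's, b's and c's
    return (set(w) <= set("abc")
            and w.count("a") == w.count("b") == w.count("c"))

print(in_mix("abccba"))   # True
print(in_mix("aabbc"))    # False

# Constant growth: the sorted lengths of strings in the language grow by
# bounded increments. For {a^(2^n)} the gaps between consecutive lengths
# keep doubling, so no bound exists:
lengths = [2 ** n for n in range(6)]                 # 1, 2, 4, 8, 16, 32
gaps = [b - a for a, b in zip(lengths, lengths[1:])]
print(gaps)               # [1, 2, 4, 8, 16]
```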
Grammar formalisms specify:
The lexicon (e.g. subcategorization information)
The grammatical operations (how constituents are combined)
A. K. Joshi and Y. Schabes (1996). Tree-Adjoining Grammars. In G. Rozenberg and A. Salomaa (eds.), Handbook of Formal Languages.
TAG is a tree-rewriting formalism: TAG defines operations (substitution, adjunction) on trees. The elementary objects in TAG are trees (not strings).
TAG is lexicalized: each elementary tree is anchored to a lexical item (word). "Extended domain of locality": the elementary tree contains all arguments of the anchor. TAG requires a linguistic theory which specifies the shape of the elementary trees.
TAG is mildly context-sensitive: it can capture Dutch cross-serial dependencies but is still efficiently parseable.
We want to capture all arguments of a word in a single elementary object, and we also want to retain certain syntactic structures (e.g. VPs). Our elementary objects are tree fragments, e.g. for 'eats':

(S NP↓ (VP (VBZ eats) NP↓))
Substitution: an elementary tree rooted in Y replaces a substitution node Y↓ in another tree. The derivation tree records which elementary trees were combined; the derived tree is the resulting phrase-structure tree.
Adjunction: an auxiliary tree, whose root X and foot node X* carry the same label, is inserted at an X node of another tree: the auxiliary tree replaces that node, and the subtree originally rooted there moves to the foot node X*.
Variants of the formalism differ in the adjunction operation they allow:
- No adjunction: TSG (Tree Substitution Grammar). TSG is context-free.
- Sister adjunction: TIG (Tree Insertion Grammar). TIG is also context-free, but has a linguistically more adequate treatment of modifiers.
- Wrapping adjunction: TAG (Tree-Adjoining Grammar). TAG is mildly context-sensitive.
Elementary trees for 'John always eats tapas':
α1 ('eats'): (S NP↓ (VP (VBZ eats) NP↓))
α2 ('John'): (NP John)
α3 ('tapas'): (NP tapas)
β1 ('always'): (VP (RB always) VP*)
Substituting α2 and α3 at the NP↓ nodes of α1 derives 'John eats tapas'; in the derivation tree, α1 is the root with daughters α2 and α3.
Adjoining β1 at the VP node of the derived tree yields 'John always eats tapas'; β1 becomes another daughter of α1 in the derivation tree.
Derived tree: (S (NP John) (VP (RB always) (VP (VBZ eats) (NP tapas))))
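The whole 'John always eats tapas' derivation can be simulated in a few lines. This is a sketch in a minimal encoding of my own (not the lecture's notation): a tree is a (label, children) tuple, a bare string is a lexical anchor, a label ending in '!' marks a substitution site, and a label ending in '*' marks a foot node.

```python
def substitute(tree, arg):
    """Replace the leftmost substitution site matching arg's root category."""
    done = [False]
    def go(t):
        label, children = t
        if not done[0] and label == arg[0] + "!" and not children:
            done[0] = True
            return arg
        return (label, [go(c) if isinstance(c, tuple) else c for c in children])
    return go(tree)

def adjoin(tree, aux):
    """Adjoin auxiliary tree aux at the first node carrying aux's root label:
    that node is replaced by aux and re-inserted at aux's foot node."""
    root, aux_children = aux
    done = [False]
    def go(t):
        label, children = t
        if not done[0] and label == root:
            done[0] = True
            return (root, [t if c == (root + "*", []) else c
                           for c in aux_children])
        return (label, [go(c) if isinstance(c, tuple) else c for c in children])
    return go(tree)

def words(t):
    """The yield (terminal string) of a tree, left to right."""
    label, children = t
    return [w for c in children
            for w in (words(c) if isinstance(c, tuple) else [c])]

# Elementary trees of the example:
a1 = ("S", [("NP!", []), ("VP", [("VBZ", ["eats"]), ("NP!", [])])])
a2 = ("NP", ["John"])
a3 = ("NP", ["tapas"])
b1 = ("VP", [("RB", ["always"]), ("VP*", [])])

t = substitute(substitute(a1, a2), a3)   # 'John eats tapas'
t = adjoin(t, b1)                        # adjoin 'always' at the VP node
print(" ".join(words(t)))                # John always eats tapas
```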
Deriving 'aabb': starting from an initial tree with yield 'ab', adjoining the auxiliary tree (S a S* b) at the root S gives a derived tree with yield 'aabb'.
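On yields, adjoining an auxiliary tree of the shape S → a S* b at the root simply wraps the current string in a ... b. A tiny sketch (assuming, as is standard for this language, an initial tree with yield 'ab'):

```python
def adjoin_at_root(s):
    """Yield after adjoining the auxiliary tree S -> a S* b at the root:
    the old S-yield s ends up under the foot node, wrapped in a ... b."""
    return "a" + s + "b"

s = "ab"                  # yield of the initial tree
s = adjoin_at_root(s)
print(s)                  # aabb
s = adjoin_at_root(s)
print(s)                  # aaabbb: n adjunctions give a^(n+1) b^(n+1)
```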
Categories: specify subcategorization lists of words/constituents.
Combinatory rules: specify how constituents can combine.
The lexicon: specifies which categories a word can have.
Derivations: spell out the process of combining constituents.
Simple (atomic) categories: NP, S, PP.
Complex categories (functions) return a result when combined with an argument:
- VP, intransitive verb: S\NP
- Transitive verb: (S\NP)/NP
- Adverb: (S\NP)\(S\NP)
- Prepositions: ((S\NP)\(S\NP))/NP, (NP\NP)/NP, PP/NP
Function application combines a function X/Y or X\Y with its argument Y to yield the result X:

(S\NP)/NP  NP  ⇒  S\NP    ('eats' + 'tapas' ⇒ 'eats tapas')
NP  S\NP  ⇒  S            ('John' + 'eats tapas' ⇒ 'John eats tapas')
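This derivation can be replayed in a toy encoding of my own: atomic categories are strings, and a complex category is a (result, slash, argument) tuple.

```python
S, NP = "S", "NP"
iv = (S, "\\", NP)             # S\NP       (intransitive verb / VP)
tv = ((S, "\\", NP), "/", NP)  # (S\NP)/NP  (transitive verb)

def forward(f, a):             # > : X/Y  Y  =>  X
    result, slash, arg = f
    assert slash == "/" and arg == a
    return result

def backward(a, f):            # < : Y  X\Y  =>  X
    result, slash, arg = f
    assert slash == "\\" and arg == a
    return result

eats_tapas = forward(tv, NP)         # 'eats tapas'      : S\NP
sentence = backward(NP, eats_tapas)  # 'John eats tapas' : S
print(sentence)                      # S
```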
Function application is used in all variants of categorial grammar.
Harmonic forward composition (>B): X/Y  Y/Z  ⇒  X/Z
Harmonic backward composition (<B): Y\Z  X\Y  ⇒  X\Z
Forward crossing composition (>Bx): X/Y  Y\Z  ⇒  X\Z
Backward crossing composition (<Bx): Y/Z  X\Y  ⇒  X/Z
Forward type-raising (>T): X ⇒ T/(T\X)
Backward type-raising (<T): X ⇒ T\(T/X)
Type-raising (X ⇒ T/(T\X)) turns an argument into a function:
NP ⇒ S/(S\NP)  (subject)
NP ⇒ (S\NP)\((S\NP)/NP)  (object)

Harmonic composition (X/Y Y/Z ⇒ X/Z) composes two functions (complex categories):
(S\NP)/PP  PP/NP  ⇒  (S\NP)/NP
S/(S\NP)  (S\NP)/NP  ⇒  S/NP

Crossing composition (X/Y Y\Z ⇒ X\Z) also composes two functions:
(S\NP)/S  S\NP  ⇒  (S\NP)\NP
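These combinations can be checked mechanically, again in the toy encoding assumed here (atomic categories as strings, complex categories as (result, slash, argument) tuples):

```python
S, NP, PP = "S", "NP", "PP"

def compose(f, g):        # harmonic: X/Y  Y/Z  =>  X/Z
    (x, s1, y1), (y2, s2, z) = f, g
    assert s1 == "/" and s2 == "/" and y1 == y2
    return (x, "/", z)

def cross_compose(f, g):  # crossing: X/Y  Y\Z  =>  X\Z
    (x, s1, y1), (y2, s2, z) = f, g
    assert s1 == "/" and s2 == "\\" and y1 == y2
    return (x, "\\", z)

def type_raise(x, t):     # X  =>  T/(T\X)
    return (t, "/", (t, "\\", x))

vp = (S, "\\", NP)                                # S\NP
print(compose((vp, "/", PP), (PP, "/", NP)))      # (S\NP)/NP
print(compose(type_raise(NP, S), (vp, "/", NP)))  # S/NP
print(cross_compose((vp, "/", S), vp))            # (S\NP)\NP
```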
These combinatory rules let CCG analyze wh-movement (relative clauses) and right-node raising.
Function application (> and <):
X/Y  Y  ⇒  X
Y  X\Y  ⇒  X

Harmonic composition (>B and <B):
X/Y  Y/Z  ⇒  X/Z
Y\Z  X\Y  ⇒  X\Z

Crossing composition (>Bx and <Bx):
X/Y  Y\Z  ⇒  X\Z
Y/Z  X\Y  ⇒  X/Z

Generalized composition (>Bn and <Bn):
X/Y  (...(Y|Z1)|...)|Zn  ⇒  (...(X|Z1)|...)|Zn
(...(Y|Z1)|...)|Zn  X\Y  ⇒  (...(X|Z1)|...)|Zn

Type-raising (>T and <T):
X ⇒ T/(T\X)
X ⇒ T\(T/X)
CCG derivation of 'ik Maria Hans zag helpen zwemmen':

ik, Maria, Hans: NP
zag, helpen: ((S\NP)\NP)/(S\NP)
zwemmen: S\NP

helpen zwemmen  ⇒>  (S\NP)\NP
zag helpen zwemmen  ⇒>Bx  ((S\NP)\NP)\NP
Hans zag helpen zwemmen  ⇒<  (S\NP)\NP
Maria Hans zag helpen zwemmen  ⇒<  S\NP
ik Maria Hans zag helpen zwemmen  ⇒<  S
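The Dutch derivation can be replayed step by step. A sketch in a toy encoding (complex categories as nested (result, slash, argument) tuples; assigning ((S\NP)\NP)/(S\NP) to both verbs is my reading of the slide):

```python
S, NP = "S", "NP"
vp = (S, "\\", NP)                        # S\NP
zag = helpen = ((vp, "\\", NP), "/", vp)  # ((S\NP)\NP)/(S\NP)
zwemmen = vp

def fapply(f, a):         # >  : X/Y  Y  =>  X
    x, s, y = f
    assert s == "/" and y == a
    return x

def bapply(a, f):         # <  : Y  X\Y  =>  X
    x, s, y = f
    assert s == "\\" and y == a
    return x

def cross_compose(f, g):  # >Bx: X/Y  Y\Z  =>  X\Z
    (x, s1, y1), (y2, s2, z) = f, g
    assert s1 == "/" and s2 == "\\" and y1 == y2
    return (x, "\\", z)

c = fapply(helpen, zwemmen)  # helpen zwemmen     : (S\NP)\NP
c = cross_compose(zag, c)    # zag helpen zwemmen : ((S\NP)\NP)\NP
c = bapply(NP, c)            # Hans ...           : (S\NP)\NP
c = bapply(NP, c)            # Maria Hans ...     : S\NP
c = bapply(NP, c)            # ik Maria Hans ...  : S
print(c)                     # S
```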
CCG derivations are binary trees, so we can use standard chart-parsing techniques. CCG derivations represent long-range dependencies and complement-adjunct distinctions directly.
CCG is lexicalized: the "rules" of the grammar are completely general; all language-specific information is given in the lexicon.
CCG is mildly context-sensitive: it can capture Dutch crossing dependencies, but is still efficiently parseable.
CCG has a transparent syntax-semantics interface: every syntactic category and operation has a semantic counterpart.
CCG has no movement or traces.
Phenomena that require extensions of standard context-free grammars:
- non-local dependencies
- cross-serial dependencies
Two lexicalized formalisms:
- Tree-Adjoining Grammar (TAG)
- Combinatory Categorial Grammar (CCG)