Natural Language Processing
Spring 2017
Professor Liang Huang liang.huang.sh@gmail.com
Unit 3: Tree Models
Lectures 9-11: Context-Free Grammars and Parsing
required, hard, optional
Big Picture
only 2 ideas in this course: Noisy-Channel and Viterbi (DP)
CS 562 - CFGs and Parsing
(courtesy of Julia Hockenmaier)
the ball
the ball in the garden
the ball in the garden behind the house
the ball in the garden behind the house near the school
....
The mouse ate the corn.
The mouse that the snake ate ate the corn.
The mouse that the snake that the hawk ate ate ate the corn.
....
vs. The claim that the house he bought was valuable was wrong.
vs. I saw the ball in the garden behind the house near the school.
S → NP VP | ...
NP → Det N | ...
VP → V NP | ...
A CFG is a 4-tuple ⟨N, Σ, R, S⟩:
A set of nonterminals N (e.g. N = {S, NP, VP, PP, Noun, Verb, ...})
A set of terminals Σ (e.g. Σ = {I, you, he, eat, drink, sushi, ball, ...})
A set of rules R ⊆ {A → β with left-hand-side (LHS) A ∈ N and right-hand-side (RHS) β ∈ (N ∪ Σ)*}
A start symbol S (sentence)
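The 4-tuple definition can be written down directly as a data structure. The sketch below is illustrative only: the class name `CFG`, the helper `is_well_formed`, and the toy rules are hypothetical, and the check that N and Σ are disjoint is an added assumption not stated in the definition above.

```python
from typing import NamedTuple

class CFG(NamedTuple):
    nonterminals: frozenset   # N
    terminals: frozenset      # Σ
    rules: tuple              # R: pairs (lhs, rhs), rhs is a tuple of symbols
    start: str                # S

def is_well_formed(g: CFG) -> bool:
    """Check S ∈ N, and A ∈ N and β ∈ (N ∪ Σ)* for every rule A → β."""
    symbols = g.nonterminals | g.terminals
    return (
        g.start in g.nonterminals
        and g.nonterminals.isdisjoint(g.terminals)   # assumption: N ∩ Σ = ∅
        and all(lhs in g.nonterminals and all(x in symbols for x in rhs)
                for lhs, rhs in g.rules)
    )

# Hypothetical toy grammar, only to exercise the check.
toy = CFG(
    nonterminals=frozenset({"S", "NP", "VP"}),
    terminals=frozenset({"I", "eat", "sushi"}),
    rules=(
        ("S", ("NP", "VP")),
        ("VP", ("eat", "NP")),
        ("NP", ("I",)),
        ("NP", ("sushi",)),
    ),
    start="S",
)
print(is_well_formed(toy))   # True
```

A rule whose right-hand side uses a symbol outside N ∪ Σ makes the check fail.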
https://www.slideshare.net/kevinjmcmullin/computational-accounts-of-human-learning-bias
three models of computation:
2. Turing machine (A. Turing, 1935)
https://chomsky.info/wp-content/uploads/195609-.pdf
https://www.researchgate.net/publication/272082985_Principles_of_structure_building_in_music_language_and_animal_song
CS 498 JH: Introduction to NLP (Fall ʼ08)
NAACL 2009 Dynamic Programming
goal: (S, 0, n) over input w_0 w_1 ... w_{n-1}
flies like a flower
S → NP VP
NP → DT NN
NP → NNS
NP → NP PP
VP → VB NP
VP → VP PP
VP → VB
PP → P NP
VB → flies
NNS → flies
VB → like
P → like
DT → a
NN → flower
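A minimal CKY recognizer for this grammar and sentence might look like the sketch below. The function names (`cky`, `close_unary`) and the way unary rules are handled (closing each cell under NP → NNS and VP → VB after it is filled) are assumptions for illustration, not necessarily how the lecture implements it.

```python
from collections import defaultdict

# Grammar from the slide, split by rule shape.
BINARY = [("S", "NP", "VP"), ("NP", "DT", "NN"), ("NP", "NP", "PP"),
          ("VP", "VB", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP")]
UNARY = [("NP", "NNS"), ("VP", "VB")]
LEXICON = {"flies": {"VB", "NNS"}, "like": {"VB", "P"},
           "a": {"DT"}, "flower": {"NN"}}

def close_unary(cell):
    """Add A whenever A → B and B is in the cell, until nothing changes."""
    changed = True
    while changed:
        changed = False
        for a, b in UNARY:
            if b in cell and a not in cell:
                cell.add(a)
                changed = True

def cky(words):
    """Return chart[i, j] = set of labels that derive words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    for i, w in enumerate(words):
        chart[i, i + 1] = set(LEXICON.get(w, ()))
        close_unary(chart[i, i + 1])
    for width in range(2, n + 1):            # shorter spans first
        for i in range(n - width + 1):
            j = i + width
            cell = chart[i, j]
            for k in range(i + 1, j):        # split point
                for a, b, c in BINARY:
                    if b in chart[i, k] and c in chart[k, j]:
                        cell.add(a)
            close_unary(cell)
    return chart

chart = cky("flies like a flower".split())
print(sorted(chart[0, 4]))   # ['NP', 'S', 'VP']
```

The sentence is recognized because S appears in the cell covering the whole input, `chart[0, 4]`.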
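CKY chart (item (i, j) spans the words between positions i and j):
(0, 1) flies: NNS, NP, VB, VP
(1, 2) like: VB, P, VP
(2, 3) a: DT
(3, 4) flower: NN
(0, 2) flies like: S
(2, 4) a flower: NP
(1, 4) like a flower: VP, PP
(0, 4) flies like a flower: S, VP, NP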
flies like a flower
S → NP VP
S → VP
NP → DT NN
NP → NNS
NP → NP PP
VP → VB NP
VP → VP PP
VP → VB
PP → P NP
VB → flies
NNS → flies
VB → like
P → like
DT → a
NN → flower
(B, i, k) : a    (C, k, j) : b
-----------------------------------  A → B C
(A, i, j) : a × b × Pr(A → B C)
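The weighted deduction rule above turns into Viterbi CKY once each item keeps only its best weight. The sketch below assumes a grammar in Chomsky normal form; the rule probabilities are made up for illustration, and the names `viterbi_cky` and `tree` are hypothetical.

```python
# Hypothetical PCFG in Chomsky normal form; all probabilities are invented.
BINARY = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 0.6,
          ("VP", "VP", "PP"): 0.4, ("NP", "NP", "PP"): 0.2,
          ("NP", "DT", "NN"): 0.3, ("PP", "P", "NP"): 1.0}
LEXICAL = {("NP", "I"): 0.25, ("NP", "him"): 0.25, ("V", "saw"): 1.0,
           ("P", "with"): 1.0, ("DT", "a"): 1.0, ("NN", "mirror"): 1.0}

def viterbi_cky(words):
    n = len(words)
    best = {}   # (A, i, j) -> probability of the best derivation
    back = {}   # (A, i, j) -> (k, B, C) used by that derivation
    for i, w in enumerate(words):
        for (a, word), p in LEXICAL.items():
            if word == w:
                best[a, i, i + 1] = p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (a, b, c), p in BINARY.items():
                    if (b, i, k) in best and (c, k, j) in best:
                        # (B, i, k): a, (C, k, j): b => (A, i, j): a × b × Pr
                        score = best[b, i, k] * best[c, k, j] * p
                        if score > best.get((a, i, j), 0.0):
                            best[a, i, j] = score
                            back[a, i, j] = (k, b, c)
    return best, back

def tree(back, words, a, i, j):
    """Read the best tree off the backpointers as a bracketed string."""
    if (a, i, j) not in back:
        return f"({a} {words[i]})"
    k, b, c = back[a, i, j]
    return f"({a} {tree(back, words, b, i, k)} {tree(back, words, c, k, j)})"

words = "I saw him with a mirror".split()
best, back = viterbi_cky(words)
print(tree(back, words, "S", 0, len(words)))
```

With these invented probabilities the PP attaches to the verb phrase (0.4 × 0.15 × 0.3 = 0.018) rather than to the object noun phrase (0.6 × 1.0 × 0.015 = 0.009).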
(Klein and Manning, 2001; Huang and Chiang, 2005)
0 I 1 saw 2 him 3 with 4 a 5 mirror 6
nodes hyperedges
a hypergraph
(Nederhof, 2003)
hyperedge e with weight function fe:
tails (antecedents): u1 : a, u2 : b
head (consequent): v : fe(a, b)
example, for rule A → B C:
tails: (B, i, k) : a, (C, k, j) : b
head: (A, i, j) : a × b × Pr(A → B C)
[Figure: a hyperedge e from tails u1, u2 to head v, redrawn as an AND/OR graph: e becomes an AND-node; v and u1, u2 are OR-nodes]
edge u → v with weight w(u, v):
d(v) ⊕= d(u) ⊗ w(u, v)
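The update d(v) ⊕= d(u) ⊗ w(u, v) can be run over a DAG in topological order with the semiring passed in as a parameter. This is a sketch under assumptions: the `Semiring` tuple, the two example semirings, and the toy graph are all hypothetical, and the standard-library `graphlib` module (Python 3.9+) supplies the topological order.

```python
from collections import namedtuple
from graphlib import TopologicalSorter

Semiring = namedtuple("Semiring", "plus times zero one")
TROPICAL = Semiring(min, lambda a, b: a + b, float("inf"), 0.0)   # shortest path
VITERBI = Semiring(max, lambda a, b: a * b, 0.0, 1.0)             # best probability

def viterbi_dag(edges, source, sr):
    """edges: dict (u, v) -> w(u, v) on a DAG. Returns d(v) for every vertex."""
    preds = {}
    for (u, v) in edges:
        preds.setdefault(v, set()).add(u)
        preds.setdefault(u, set())
    d = {v: sr.zero for v in preds}
    d[source] = sr.one
    # static_order() yields each vertex after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        for u in preds[v]:
            d[v] = sr.plus(d[v], sr.times(d[u], edges[u, v]))   # d(v) ⊕= d(u) ⊗ w(u, v)
    return d

# Hypothetical toy graph.
edges = {("s", "a"): 2, ("s", "b"): 5, ("a", "b"): 1,
         ("a", "t"): 6, ("b", "t"): 1}
shortest = viterbi_dag(edges, "s", TROPICAL)
print(shortest["t"])   # 4.0
```

Swapping `TROPICAL` for `VITERBI` (with probabilities as edge weights) gives the best-probability path without changing the algorithm.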
time complexity: O(V + E) (assuming constant arity)
hyperedge e with tails u1, ..., u|e|, head v, and weight function fe:
d(v) ⊕= fe(d(u1), ..., d(u|e|))
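The same topological-order idea carries over to hypergraphs, with each hyperedge applying its function fe to the values of its tails. In the sketch below the hypergraph is a hand-built fragment for "I saw him with a mirror"; the item names, the helper `rule`, and every probability are hypothetical.

```python
from graphlib import TopologicalSorter

def viterbi_hypergraph(hyperedges, axioms, plus=max):
    """hyperedges: list of (head, tails, fe); axioms: dict leaf vertex -> value."""
    incoming = {}
    deps = {v: set() for v in axioms}
    for head, tails, fe in hyperedges:
        incoming.setdefault(head, []).append((tails, fe))
        deps.setdefault(head, set()).update(tails)
        for u in tails:
            deps.setdefault(u, set())
    d = dict(axioms)
    for v in TopologicalSorter(deps).static_order():   # tails before heads
        for tails, fe in incoming.get(v, []):
            if all(u in d for u in tails):             # skip underivable tails
                value = fe(*(d[u] for u in tails))
                d[v] = plus(d[v], value) if v in d else value   # d(v) ⊕= fe(...)
    return d

def rule(p):
    """fe for a binary rule with (made-up) probability p."""
    return lambda a, b: a * b * p

NP01, V12, NP23, PP36 = ("NP", 0, 1), ("V", 1, 2), ("NP", 2, 3), ("PP", 3, 6)
VP13, NP26, VP16, S06 = ("VP", 1, 3), ("NP", 2, 6), ("VP", 1, 6), ("S", 0, 6)
axioms = {NP01: 0.25, V12: 1.0, NP23: 0.25, PP36: 0.3}
hyperedges = [
    (VP13, (V12, NP23), rule(0.6)),
    (NP26, (NP23, PP36), rule(0.2)),
    (VP16, (V12, NP26), rule(0.6)),    # PP attached to the object
    (VP16, (VP13, PP36), rule(0.4)),   # PP attached to the verb phrase
    (S06, (NP01, VP16), rule(1.0)),
]
d = viterbi_hypergraph(hyperedges, axioms)
print(d[S06])   # about 0.0045
```

Every hyperedge is visited once and touches a bounded number of tails, which is where the O(V + E) bound under constant arity comes from.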
matched = 6, predicted = 7, gold = 7
precision = 6/7, recall = 6/7, F = 6/7
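These numbers follow from counting labeled brackets. The sketch below represents each bracket as a (label, i, j) triple; the two bracket sets are hypothetical examples chosen to reproduce the counts above, not the trees from the slide.

```python
from collections import Counter
from fractions import Fraction

def prf(predicted, gold):
    """Labeled-bracket precision, recall, and F1 over (label, i, j) triples."""
    matched = sum((Counter(predicted) & Counter(gold)).values())
    precision = Fraction(matched, len(predicted))
    recall = Fraction(matched, len(gold))
    f = 2 * precision * recall / (precision + recall) if matched else Fraction(0)
    return matched, precision, recall, f

# Hypothetical brackets: the two trees differ in one constituent.
gold = [("S", 0, 6), ("NP", 0, 1), ("VP", 1, 6), ("VP", 1, 3),
        ("NP", 2, 3), ("PP", 3, 6), ("NP", 4, 6)]
predicted = [("S", 0, 6), ("NP", 0, 1), ("VP", 1, 6), ("NP", 2, 6),
             ("NP", 2, 3), ("PP", 3, 6), ("NP", 4, 6)]
matched, precision, recall, f = prf(predicted, gold)
print(matched, precision, recall, f)   # 6 6/7 6/7 6/7
```

F is the harmonic mean 2PR / (P + R), so it equals 6/7 whenever precision and recall are both 6/7.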
, 0, n) or alpha(w_i, i, i+1) for any i
[Figure: items X, Y, Z over positions i, j, k, below TOP spanning up to n]
Y,i,k Z,k,j) is
Viterbi inside: best way to derive X,i,j
Viterbi outside: best way to go to TOP from X,i,j
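Both quantities can be computed on a hypergraph in the max-product semiring: inside bottom-up, outside top-down. The outside recurrence used here (outside of a tail = outside of the head × rule probability × inside of the sibling tails) is the usual one and is an assumption, since the slide text does not spell it out; the items and probabilities are hypothetical.

```python
from graphlib import TopologicalSorter
from math import prod

def inside_outside(hyperedges, axioms, top):
    """hyperedges: list of (head, tails, prob). Viterbi (max-product) version."""
    deps = {v: set() for v in axioms}
    incoming = {}
    for head, tails, p in hyperedges:
        incoming.setdefault(head, []).append((tails, p))
        deps.setdefault(head, set()).update(tails)
        for u in tails:
            deps.setdefault(u, set())
    order = list(TopologicalSorter(deps).static_order())   # tails before heads
    inside = {v: axioms.get(v, 0.0) for v in order}
    for v in order:                                        # bottom-up
        for tails, p in incoming.get(v, []):
            inside[v] = max(inside[v], p * prod(inside[u] for u in tails))
    outside = {v: 0.0 for v in order}
    outside[top] = 1.0
    for v in reversed(order):                              # top-down
        for tails, p in incoming.get(v, []):
            for i, u in enumerate(tails):
                rest = prod(inside[t] for j, t in enumerate(tails) if j != i)
                outside[u] = max(outside[u], outside[v] * p * rest)
    return inside, outside

NP01, V12, NP23, PP36 = ("NP", 0, 1), ("V", 1, 2), ("NP", 2, 3), ("PP", 3, 6)
VP13, NP26, VP16, TOP = ("VP", 1, 3), ("NP", 2, 6), ("VP", 1, 6), ("S", 0, 6)
axioms = {NP01: 0.25, V12: 1.0, NP23: 0.25, PP36: 0.3}   # made-up weights
hyperedges = [(VP13, (V12, NP23), 0.6), (NP26, (NP23, PP36), 0.2),
              (VP16, (V12, NP26), 0.6), (VP16, (VP13, PP36), 0.4),
              (TOP, (NP01, VP16), 1.0)]
inside, outside = inside_outside(hyperedges, axioms, TOP)
for v in (VP13, NP26):
    print(v, inside[v] * outside[v])   # best full derivation through v
```

The product inside × outside is the weight of the best complete derivation that uses the item: about 0.0045 for (VP, 1, 3), which lies on the best tree here, and about 0.00225 for (NP, 2, 6), which does not.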
(e.g., CKY)
time complexity: O(V + E)
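[Figure: hyperedge e with tails u1, u2, head h(e), and weight function fe, where the vertex just fixed is one of the tails (v = ui)]
Q: how to avoid repeated checking?
maintain a counter r[e] for each e: how many tails yet to be fixed?
fire this hyperedge only if r[e] = 0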
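The counter trick can be sketched as a forward-update pass: each time a vertex is fixed, decrement r[e] for the hyperedges it feeds, and fire a hyperedge when its counter reaches zero. The per-vertex `pending` counter (a head is fixed once all its incoming hyperedges have fired) is an added assumption, as are the acyclic toy hypergraph and its made-up weights.

```python
from collections import deque

def viterbi_forward(hyperedges, axioms, plus=max):
    """hyperedges: list of (head, tails, fe); axioms: dict source vertex -> value.
    Assumes an acyclic hypergraph in which every tail is derivable."""
    r = [len(set(tails)) for _, tails, _ in hyperedges]   # tails yet to be fixed
    out = {}        # vertex -> indices of hyperedges that have it as a tail
    pending = {}    # head -> number of incoming hyperedges not yet fired
    for idx, (head, tails, _) in enumerate(hyperedges):
        pending[head] = pending.get(head, 0) + 1
        for u in set(tails):
            out.setdefault(u, []).append(idx)
    d = dict(axioms)
    queue = deque(axioms)              # vertices whose value is fixed
    while queue:
        v = queue.popleft()
        for idx in out.get(v, []):
            r[idx] -= 1
            if r[idx] == 0:            # all tails fixed: fire this hyperedge
                head, tails, fe = hyperedges[idx]
                value = fe(*(d[u] for u in tails))
                d[head] = plus(d[head], value) if head in d else value
                pending[head] -= 1
                if pending[head] == 0:  # nothing left that could change d[head]
                    queue.append(head)
    return d

def rule(p):
    return lambda a, b: a * b * p      # made-up rule probability p

NP01, V12, NP23, PP36 = ("NP", 0, 1), ("V", 1, 2), ("NP", 2, 3), ("PP", 3, 6)
VP13, NP26, VP16, S06 = ("VP", 1, 3), ("NP", 2, 6), ("VP", 1, 6), ("S", 0, 6)
axioms = {NP01: 0.25, V12: 1.0, NP23: 0.25, PP36: 0.3}
hyperedges = [(VP13, (V12, NP23), rule(0.6)), (NP26, (NP23, PP36), rule(0.2)),
              (VP16, (V12, NP26), rule(0.6)), (VP16, (VP13, PP36), rule(0.4)),
              (S06, (NP01, VP16), rule(1.0))]
d = viterbi_forward(hyperedges, axioms)
print(d[S06])   # about 0.0045
```

Each hyperedge is examined once per tail and fired exactly once, so no hyperedge is re-checked after a tail has been fixed.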
V - S vertices
[Figure: source s, edge v → u with weight w(v, u)]
d(u) ⊕= d(v) ⊗ w(v, u)
time complexity: O((V + E) lg V) (binary heap), O(V lg V + E) (Fibonacci heap)
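A binary-heap version of this best-first update can be sketched with the standard `heapq` module. The toy graph is hypothetical, weights are assumed non-negative, and stale heap entries are skipped instead of being decreased in place, which is a common simplification.

```python
import heapq

def dijkstra(graph, s):
    """graph: dict v -> list of (u, w(v, u)) with w >= 0. Distances from s."""
    d = {s: 0.0}
    fixed = set()                       # S: vertices whose distance is final
    heap = [(0.0, s)]
    while heap:
        dist, v = heapq.heappop(heap)   # best vertex among V - S
        if v in fixed:
            continue                    # stale entry, already fixed
        fixed.add(v)
        for u, w in graph.get(v, []):
            if dist + w < d.get(u, float("inf")):   # d(u) ⊕= d(v) ⊗ w(v, u)
                d[u] = dist + w
                heapq.heappush(heap, (d[u], u))
    return d

# Hypothetical toy graph.
graph = {"s": [("a", 2), ("b", 5)], "a": [("b", 1), ("t", 6)], "b": [("t", 1)]}
print(dijkstra(graph, "s"))   # {'s': 0.0, 'a': 2.0, 'b': 3.0, 't': 4.0}
```

Here ⊕ is min and ⊗ is +; each edge causes at most one push, which gives the O((V + E) lg V) bound for the binary heap.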
V - S vertices
[Figure: source s, hyperedge e with tails u1 and v, head h(e), and weight function fe]
time complexity: O((V + E) lg V) (binary heap), O(V lg V + E) (Fibonacci heap)
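On a hypergraph the best-first idea combines the priority queue with the r[e] counters: a hyperedge fires only once all of its tails have been popped. This generalization of Dijkstra's algorithm is usually attributed to Knuth (1977). The sketch below assumes the max-product semiring with probabilities in [0, 1], so that a head never scores higher than any of its tails; the hypergraph and weights are hypothetical.

```python
import heapq

def best_first(hyperedges, axioms):
    """hyperedges: list of (head, tails, prob), 0 <= prob <= 1;
    axioms: dict vertex -> prob. Returns the best probability per vertex."""
    r = [len(set(tails)) for _, tails, _ in hyperedges]   # tails yet to be fixed
    out = {}
    for idx, (_, tails, _) in enumerate(hyperedges):
        for u in set(tails):
            out.setdefault(u, []).append(idx)
    best = dict(axioms)
    fixed = {}
    # Negate scores: heapq is a min-heap. repr(v) only breaks ties.
    heap = [(-p, repr(v), v) for v, p in axioms.items()]
    heapq.heapify(heap)
    while heap:
        neg, _, v = heapq.heappop(heap)
        if v in fixed:
            continue                   # stale entry
        fixed[v] = -neg
        for idx in out.get(v, []):
            r[idx] -= 1
            if r[idx] == 0:            # all tails fixed: fire the hyperedge
                head, tails, p = hyperedges[idx]
                score = p
                for u in tails:
                    score *= fixed[u]
                if head not in fixed and score > best.get(head, 0.0):
                    best[head] = score
                    heapq.heappush(heap, (-score, repr(head), head))
    return fixed

NP01, V12, NP23, PP36 = ("NP", 0, 1), ("V", 1, 2), ("NP", 2, 3), ("PP", 3, 6)
VP13, NP26, VP16, S06 = ("VP", 1, 3), ("NP", 2, 6), ("VP", 1, 6), ("S", 0, 6)
axioms = {NP01: 0.25, V12: 1.0, NP23: 0.25, PP36: 0.3}   # made-up weights
hyperedges = [(VP13, (V12, NP23), 0.6), (NP26, (NP23, PP36), 0.2),
              (VP16, (V12, NP26), 0.6), (VP16, (VP13, PP36), 0.4),
              (S06, (NP01, VP16), 1.0)]
fixed = best_first(hyperedges, axioms)
print(fixed[S06])   # about 0.0045
```

Unlike the topological-order version, this can stop as soon as the goal item is popped, at the cost of the lg V factor for the heap.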
items: PP_{1,3}, VP_{3,6}, VP_{1,6}
source: yu Shalong juxing le huitan
translations: PP_{1,3}: with Sharon; VP_{3,6}: held a talk; VP_{1,6}: held a talk with Sharon
synchronous rules:
VP → PP(1) VP(2), VP(2) PP(1)
VP → juxing le huitan, held a meeting
PP → yu Shalong, with Sharon
complexity: same as CKY parsing -- O(n^3)
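Translation with these synchronous rules is CKY on the source side, with each item carrying a target string. The sketch below uses the three rules exactly as listed (so the VP comes out as "held a meeting"), keeps one translation per item, and indexes the five-word fragment from position 0 instead of the slide's position 1; the names `PHRASES`, `BINARY`, and `translate` are hypothetical.

```python
# (lhs, source phrase) -> target phrase
PHRASES = {
    ("VP", ("juxing", "le", "huitan")): "held a meeting",
    ("PP", ("yu", "Shalong")): "with Sharon",
}
# (lhs, source-side children, order of those children on the target side)
BINARY = [("VP", ("PP", "VP"), (1, 0))]   # VP → PP(1) VP(2), VP(2) PP(1)

def translate(words):
    """Return chart[(label, i, j)] = one target string for source words[i:j]."""
    n = len(words)
    chart = {}
    for i in range(n):                       # phrase rules match source spans
        for j in range(i + 1, n + 1):
            for (lhs, src), tgt in PHRASES.items():
                if tuple(words[i:j]) == src:
                    chart[lhs, i, j] = tgt
    for width in range(2, n + 1):            # same three loops as CKY: O(n^3)
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for lhs, (b, c), order in BINARY:
                    if (b, i, k) in chart and (c, k, j) in chart:
                        parts = (chart[b, i, k], chart[c, k, j])
                        target = " ".join(parts[t] for t in order)
                        chart.setdefault((lhs, i, j), target)
    return chart

chart = translate("yu Shalong juxing le huitan".split())
print(chart["VP", 0, 5])   # held a meeting with Sharon
```

The target-side permutation (1, 0) is what moves the PP after the VP in English; the source-side parsing work is unchanged, hence the same O(n^3) bound.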
items annotated with target-side boundary words: PP_{1,3}, VP_{3,6}, VP_{1,6}
PP_{1,3}: with ... Sharon, along ... Sharon, with ... Shalong
VP_{3,6}: held ... talk, held ... meeting, hold ... talks
combining VP_{3,6} (held ... talk) with PP_{1,3} (with ... Sharon) gives VP_{1,6} (held ... Sharon); a bigram is scored where the two pieces meet
bigram complexity: O(n^3 V^{4(m-1)})