Parsing Probabilistic Context Free Grammars
CMSC 473/673 UMBC November 8th, 2017
Constituents Help Form Grammars
constituent: a span of words that acts (syntactically) as a group, an “X phrase” (e.g., noun phrase)

Baltimore is a great place to be.
This house is a great place to be.
This red house is a great place to be.
This red house on the hill is a great place to be.
This red house near the hill is a great place to be.
This red house atop the hill is a great place to be.
The hill is a great place to be.
S → NP VP
NP → Det Noun
NP → Noun
NP → Det AdjP
NP → NP PP
PP → P NP
AdjP → Adj Noun
VP → V NP
Noun → Baltimore
Context Free Grammar
Set of rewrite rules, comprised of terminals and non-terminals
Terminals: the words in the language (the lexicon), e.g., Baltimore
Non-terminals: symbols that can trigger rewrite rules, e.g., S, NP, Noun
(Sometimes) Pre-terminals: symbols that can only trigger lexical rewrites, e.g., Noun
Generate from a Context Free Grammar
Using the grammar above, generate “Baltimore is a great city”:

S → NP VP
NP → Noun, Noun → Baltimore
VP → Verb NP, Verb → is, NP → a great city
Assign Structure (Parse) with a Context Free Grammar
The same grammar assigns structure to “Baltimore is a great city”:

[S [NP [Noun Baltimore]] [VP [Verb is] [NP a great city]]]

bracket notation; equivalently, as an S-expression:
(S (NP (Noun Baltimore)) (VP (V is) (NP a great city)))
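Such a tree can be represented directly as nested tuples and printed back out in bracket/S-expression notation; a minimal sketch (the tuple encoding is an illustrative choice, not something the slides prescribe):

```python
# A parse tree as nested tuples: (label, child1, child2, ...);
# a leaf is just a word (or word span) string.
tree = ("S",
        ("NP", ("Noun", "Baltimore")),
        ("VP", ("Verb", "is"), ("NP", "a great city")))

def to_sexpr(node):
    """Render a tuple-encoded tree in S-expression (bracket) notation."""
    if isinstance(node, str):            # leaf: a word or word span
        return node
    label, *children = node
    return "(" + label + " " + " ".join(to_sexpr(c) for c in children) + ")"

print(to_sexpr(tree))
# (S (NP (Noun Baltimore)) (VP (Verb is) (NP a great city)))
```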
Parsing as a Core NLP Problem
[Figure: sentences 1-4 flow, together with a grammar, into a parser; the parser's output is scored either against gold (correct) reference trees or by an independent downstream NLP task (entity coref., MT, Q&A, …).]
Grammars Aren’t Just for Syntax
e.g., morphological derivation: general (A) → generalize (V) → generalization (N), with affixation rules playing the role of rewrite rules
Clearly Show Ambiguity… But Not Necessarily All Ambiguity
I ate the meal with friends
I ate the meal with gusto
I ate the meal with a fork
PP Attachment
(a common source of parser errors, even today)
Semantic Ambiguities
Issue 1: Which grammar? Issue 2: Discourse demands flexibility
How Do We Robustly Handle Ambiguities?
Add probabilities (to what?)
Probabilistic Context Free Grammar
Set of weighted (probabilistic) rewrite rules, comprised of terminals and non-terminals Terminals: the words in the language (the lexicon), e.g., Baltimore Non-terminals: symbols that can trigger rewrite rules, e.g., S, NP, Noun (Sometimes) Pre-terminals: symbols that can only trigger lexical rewrites, e.g., Noun
Q: What are the distributions? What must sum to 1?

1.0 S → NP VP
0.4 NP → Det Noun
0.3 NP → Noun
0.2 NP → Det AdjP
0.1 NP → NP PP
1.0 PP → P NP
0.34 AdjP → Adj Noun
0.26 VP → V NP
0.0003 Noun → Baltimore
…
A: P(X → Y Z | X): for each left-hand side X, the probabilities of all rules rewriting X must sum to 1
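This normalization is easy to sanity-check in code; a minimal sketch over the rules whose weights the slide gives completely (the elided AdjP, VP, and Noun rules are omitted, since their remaining mass isn't shown):

```python
from collections import defaultdict

# (lhs, rhs) -> probability, i.e., P(X -> Y Z | X), from the slide
rules = {
    ("S",  ("NP", "VP")):    1.0,
    ("NP", ("Det", "Noun")): 0.4,
    ("NP", ("Noun",)):       0.3,
    ("NP", ("Det", "AdjP")): 0.2,
    ("NP", ("NP", "PP")):    0.1,
    ("PP", ("P", "NP")):     1.0,
}

totals = defaultdict(float)
for (lhs, rhs), p in rules.items():
    totals[lhs] += p

for lhs, total in totals.items():
    # each conditional distribution P(. | lhs) must sum to 1
    assert abs(total - 1.0) < 1e-9, (lhs, total)
print(dict(totals))
```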
Probabilistic Context Free Grammar
For the tree [S [NP [Noun Baltimore]] [VP [Verb is] [NP a great city]]]:

p(tree) = p(S → NP VP) * p(NP → Noun) * p(Noun → Baltimore) * p(VP → Verb NP) * p(Verb → is) * p(NP → a great city)

the product of the probabilities of the individual rules used in the derivation
Log Probabilistic Context Free Grammar
For the same tree:

log p(tree) = log p(S → NP VP) + log p(NP → Noun) + log p(Noun → Baltimore) + log p(VP → Verb NP) + log p(Verb → is) + log p(NP → a great city)

the sum of the log probabilities of the individual rules used in the derivation
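The product (or, in log space, the sum) over a derivation's rules is straightforward to compute; a small sketch, where the S, NP, and Noun weights come from the weighted grammar above and the Verb and phrase-level NP values are hypothetical:

```python
import math

# rule probabilities for the derivation of "Baltimore is a great city";
# the last two values are made up for illustration
derivation = [
    ("S -> NP VP",         1.0),
    ("NP -> Noun",         0.3),
    ("Noun -> Baltimore",  0.0003),
    ("VP -> Verb NP",      0.26),
    ("Verb -> is",         0.01),    # hypothetical
    ("NP -> a great city", 0.001),   # hypothetical
]

p = math.prod(q for _, q in derivation)          # product of rule probs
logp = sum(math.log(q) for _, q in derivation)   # sum of log probs

# the same quantity; logs avoid underflow on long derivations
assert abs(math.log(p) - logp) < 1e-9
print(p, logp)
```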
Estimating PCFGs
Attempt 1: maximum likelihood (relative frequency) estimation: count rule uses in a treebank (a corpus of syntactically annotated sentences), e.g., the English Penn Treebank
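Relative-frequency estimation from a treebank can be sketched as follows; the tiny two-tree "treebank" is illustrative, not drawn from the Penn Treebank:

```python
from collections import Counter

# trees as nested tuples: (label, child, ...); leaves are word strings
treebank = [
    ("S", ("NP", ("Noun", "Papa")),
          ("VP", ("V", "ate"), ("NP", ("Det", "the"), ("N", "caviar")))),
    ("S", ("NP", ("Noun", "Papa")),
          ("VP", ("V", "slept"))),
]

rule_count = Counter()
lhs_count = Counter()

def count_rules(node):
    """Tally every rule used in the tree, and every left-hand side."""
    if isinstance(node, str):
        return
    lhs, *children = node
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rule_count[(lhs, rhs)] += 1
    lhs_count[lhs] += 1
    for c in children:
        count_rules(c)

for tree in treebank:
    count_rules(tree)

# MLE: P(X -> alpha | X) = count(X -> alpha) / count(X)
prob = {rule: c / lhs_count[rule[0]] for rule, c in rule_count.items()}
print(prob[("VP", ("V", "NP"))])   # 0.5: one of two observed VP expansions
```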
Probabilistic Context Free Grammar (PCFG) Tasks
Find the most likely parse (for an observed sequence)
Calculate the (log) likelihood of an observed sequence w1, …, wN
Learn the grammar parameters
Parsing with a CFG
Top-down backtracking (brute force)
CKY algorithm: dynamic programming, bottom-up
Earley's algorithm: dynamic programming, top-down
CKY Precondition
Grammar must be in Chomsky Normal Form (CNF):

X → Y Z (non-terminal → non-terminal non-terminal)
X → a (non-terminal → terminal)

binary rules can only involve non-terminals
unary rules can only involve terminals
no ternary (or longer) rules
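Whether a grammar is in CNF can be checked mechanically; a minimal sketch (the `(lhs, rhs)` rule encoding is an assumption, not from the slides), run here on the example grammar used on the following slides:

```python
def is_cnf(rules, terminals):
    """rules: iterable of (lhs, rhs) with rhs a tuple of symbols.
    CNF: every rule is X -> Y Z (two non-terminals) or X -> a (one terminal)."""
    for lhs, rhs in rules:
        if len(rhs) == 2 and all(s not in terminals for s in rhs):
            continue                      # binary: non-terminals only
        if len(rhs) == 1 and rhs[0] in terminals:
            continue                      # unary: a single terminal
        return False
    return True

# the Eisner example grammar from the next slides
rules = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("NP", ("NP", "PP")),
         ("VP", ("V", "NP")), ("VP", ("VP", "PP")), ("PP", ("P", "NP")),
         ("NP", ("Papa",)), ("N", ("caviar",)), ("N", ("spoon",)),
         ("V", ("spoon",)), ("V", ("ate",)), ("P", ("with",)),
         ("Det", ("the",)), ("Det", ("a",))]
terminals = {"Papa", "caviar", "spoon", "ate", "with", "the", "a"}

print(is_cnf(rules, terminals))   # True
```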
S → NP VP
NP → Det N
NP → NP PP
VP → V NP
VP → VP PP
PP → P NP
NP → Papa
N → caviar
N → spoon
V → spoon
V → ate
P → with
Det → the
Det → a
Example from Jason Eisner
Entire grammar; assume uniform weights.
“Papa ate the caviar with a spoon”
word positions: 0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6 spoon 7
Goal: (S, 0, 7)
Check 1: What are the non-terminals?
S, NP, VP, PP, N, V, P, Det
Check 2: What are the terminals?
Papa, caviar, spoon, ate, with, the, a
Check 3: What are the pre-terminals?
N, V, P, Det
Check 4: Is this in CNF?
Yes
First: Let’s find all NPs
(NP, 0, 1): Papa
(NP, 2, 4): the caviar
(NP, 5, 7): a spoon
(NP, 2, 7): the caviar with a spoon
Second: Let’s find all VPs
(VP, 1, 7): ate the caviar with a spoon
(VP, 1, 4): ate the caviar
Third: Let’s find all Ss
(S, 0, 7): Papa ate the caviar with a spoon
(S, 0, 4): Papa ate the caviar
The goal item is built by combining (NP, 0, 1) and (VP, 1, 7) into (S, 0, 7).
[Chart: these items laid out in a table with rows indexed by span start (0-6) and columns by span end (1-7); e.g., NP in cell (0, 1), VP in cell (1, 7), and S in cell (0, 7).]
CKY Recognizer
Input: a string of N words; a grammar in CNF
Output: True (with parse) / False
Data structure: N×N table T
rows indicate span start (0 to N-1)
columns indicate span end (1 to N)
T[i][j] lists the constituents spanning i → j
For Viterbi in HMMs we build the table left-to-right; for CKY we build it bottom-up, from narrower spans to wider ones.
CKY Recognizer

T = Cell[N][N+1]
for(j = 1; j ≤ N; ++j) {
    T[j-1][j].add(X for non-terminal X in G if X → word_j)
}
for(width = 2; width ≤ N; ++width) {
    for(start = 0; start ≤ N - width; ++start) {
        end = start + width
        for(mid = start+1; mid < end; ++mid) {
            for(non-terminal Y : T[start][mid]) {
                for(non-terminal Z : T[mid][end]) {
                    T[start][end].add(X for rule X → Y Z : G)
                }
            }
        }
    }
}
Q: What do we return?
A: whether S is in T[0][N]
Q: How do we get the parse? A: Follow backpointers (stored where?)
CKY Recognizer (equivalently, loop over the grammar rules)

T = Cell[N][N+1]
for(j = 1; j ≤ N; ++j) {
    T[j-1][j].add(X for non-terminal X in G if X → word_j)
}
for(width = 2; width ≤ N; ++width) {
    for(start = 0; start ≤ N - width; ++start) {
        end = start + width
        for(mid = start+1; mid < end; ++mid) {
            for(rule X → Y Z : G) {
                T[start][end].add(X if Y in T[start][mid] and Z in T[mid][end])
            }
        }
    }
}
CKY Recognizer (as a boolean table over the K non-terminals)

T = bool[K][N][N+1]
for(j = 1; j ≤ N; ++j) {
    for(non-terminal X in G if X → word_j) {
        T[X][j-1][j] = true
    }
}
for(width = 2; width ≤ N; ++width) {
    for(start = 0; start ≤ N - width; ++start) {
        end = start + width
        for(mid = start+1; mid < end; ++mid) {
            for(rule X → Y Z : G) {
                T[X][start][end] |= T[Y][start][mid] & T[Z][mid][end]
            }
        }
    }
}
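The recognizer pseudocode translates almost line for line into Python; a sketch over the Eisner example grammar (recognition only; storing a backpointer with each chart entry would additionally recover the parse):

```python
from collections import defaultdict

# Eisner example grammar in CNF: binary rules and lexical (unary) rules
binary = [("S", "NP", "VP"), ("NP", "Det", "N"), ("NP", "NP", "PP"),
          ("VP", "V", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP")]
lexical = {"Papa": ["NP"], "caviar": ["N"], "spoon": ["N", "V"],
           "ate": ["V"], "with": ["P"], "the": ["Det"], "a": ["Det"]}

def cky_recognize(words):
    n = len(words)
    # T[(start, end)] = set of non-terminals spanning words[start:end]
    T = defaultdict(set)
    for j in range(1, n + 1):                 # width-1 spans: the lexicon
        T[(j - 1, j)].update(lexical.get(words[j - 1], []))
    for width in range(2, n + 1):             # widest spans filled last
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for x, y, z in binary:
                    if y in T[(start, mid)] and z in T[(mid, end)]:
                        T[(start, end)].add(x)
    return "S" in T[(0, n)]

print(cky_recognize("Papa ate the caviar with a spoon".split()))  # True
print(cky_recognize("the Papa ate".split()))                      # False
```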