SLIDE 1

Effective Slot Filling Based on Shallow Distant Supervision Methods

Benjamin Roth, Tassilo Barth, Michael Wiegand, Mittul Singh, Dietrich Klakow

Spoken Language Systems (LSV), Saarland University

November 18, 2013

1/19

SLIDE 2

Outline

1. Task and System Overview
2. Candidate Generation
3. Candidate Validation
  • Distant Supervision SVMs
  • Distant Supervision Patterns
4. Per-Component Analysis
5. Conclusion

2/19

SLIDE 3

Task and System Overview

TAC KBP English Slot Filling

3/19

SLIDE 4

Task and System Overview

LSV / Saarland University 2013 Slot Filling System

  • Modular and easily extensible distant supervision relation extractor
  • Using shallow textual representations and features
  • Based on LSV 2012 system [Roth et al., 2012]

Compared with the 2012 system:

  • same training data
  • same architecture
  • improved algorithms & context modeling

4/19

SLIDE 5

Task and System Overview

Data Flow

5/19

SLIDE 9

Candidate Generation

Candidate Generation

Entity expansion based on Wikipedia anchor text language models

  • Query: “Badr Organization”
  • Expansion: “Badr Brigade”, “Badr Organisation”, “Badr Brigades”, “Badr”, “Badr Corps”
  • Also used for removing redundant answers (postprocessing)
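A minimal sketch of anchor-text-based entity expansion. The slides do not spell out the language model, so this assumes a simple relative-frequency estimate of P(anchor | page) over (anchor text, target page) link pairs. The function names, the `min_prob` cutoff, and the link counts are all hypothetical.

```python
from collections import Counter, defaultdict

def build_anchor_index(links):
    """Index (anchor_text, target_page) pairs in both directions."""
    by_page = defaultdict(Counter)    # page -> counts of anchor texts linking to it
    by_anchor = defaultdict(Counter)  # anchor text -> counts of pages it links to
    for anchor, page in links:
        by_page[page][anchor] += 1
        by_anchor[anchor][page] += 1
    return by_page, by_anchor

def expand(query, by_page, by_anchor, top_k=5, min_prob=0.05):
    """Return alternative names for `query` (assumed scoring, not the authors')."""
    if query not in by_anchor:
        return []
    # Resolve the query to the page it most often links to.
    page, _ = by_anchor[query].most_common(1)[0]
    anchors = by_page[page]
    total = sum(anchors.values())
    # Relative-frequency estimate of P(anchor | page).
    scored = [(a, c / total) for a, c in anchors.most_common() if a != query]
    return [a for a, p in scored if p >= min_prob][:top_k]

# Toy, made-up link counts for illustration only.
links = (
    [("Badr Organization", "Badr_Organization")] * 6
    + [("Badr Brigade", "Badr_Organization")] * 4
    + [("Badr Corps", "Badr_Organization")] * 2
    + [("Badr", "Badr_Organization")] * 1
    + [("Badr", "Battle_of_Badr")] * 9
)
by_page, by_anchor = build_anchor_index(links)
expansions = expand("Badr Organization", by_page, by_anchor)
```

The same index could in principle support the postprocessing step: two answers that resolve to the same page would be treated as redundant.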

Document retrieval

  • Lucene index
  • Selection of expansion terms based on pointwise mutual information
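A sketch of selecting expansion terms by pointwise mutual information, PMI(q, t) = log(P(q, t) / (P(q) P(t))), estimated from document co-occurrence counts. Representing documents as sets of terms, keeping only terms with positive PMI, and the toy collection are all assumptions made for illustration.

```python
import math

def pmi(n_joint, n_query, n_term, n_docs):
    """PMI from document counts; -inf when the two never co-occur."""
    if n_joint == 0:
        return float("-inf")
    return math.log((n_joint * n_docs) / (n_query * n_term))

def select_terms(docs, query, candidates, top_k=2):
    """Rank candidate terms by PMI with the query and keep positive scores."""
    n_docs = len(docs)
    n_query = sum(query in d for d in docs)
    scored = []
    for term in candidates:
        n_term = sum(term in d for d in docs)
        n_joint = sum(query in d and term in d for d in docs)
        if n_term and n_query:
            scored.append((pmi(n_joint, n_query, n_term, n_docs), term))
    scored.sort(reverse=True)
    return [t for s, t in scored[:top_k] if s > 0]

# Toy document collection (made up); each document is a set of terms.
docs = [
    {"badr organization", "iraq", "militia"},
    {"badr organization", "militia", "baghdad"},
    {"iraq", "election"},
    {"election", "parliament"},
    {"iraq", "baghdad", "weather"},
    {"football", "weather"},
]
selected = select_terms(docs, "badr organization", ["militia", "iraq", "weather"])
```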

Candidate matching

  • NE tagger [Chrupała and Klakow, 2010]
  • NE types from Freebase: CAUSE-OF-DEATH, JOB-TITLE, CRIMINAL-CHARGES, RELIGION

7/19

SLIDE 11

Candidate Validation

Candidate Validation Modules

  • Distant Supervision SVM Classifiers
  • Distant Supervision Patterns
  • Manual Patterns
  • Alternate Names from Query Expansion

9/19

SLIDE 16

Candidate Validation: Distant Supervision SVMs

Distant Supervision

Knowledge Base (per:city_of_birth)

  • (B. Obama, Honolulu)
  • ...
  • (M. Jackson, Gary)

Corpus → Training Data

  • B. Obama was born in Honolulu
  • B. Obama moved from Honolulu
  • Gary, M. Jackson's birthplace

Classifier

Corpus → Instance Candidates

  • F. Hollande visited Berlin
  • Born in Philadelphia, N. Chomsky ...

Extracted pair added to the knowledge base: (N. Chomsky, Philadelphia)

10/19
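The labeling step of distant supervision can be sketched as follows: every corpus sentence that mentions both arguments of a knowledge base pair becomes a (possibly noisy) positive training example for that relation. Plain substring matching stands in for real entity matching here, and `label_sentences` is a hypothetical helper name.

```python
# Knowledge base pairs and corpus sentences taken from the slide example.
kb = {"per:city_of_birth": {("B. Obama", "Honolulu"), ("M. Jackson", "Gary")}}

corpus = [
    "B. Obama was born in Honolulu",
    "B. Obama moved from Honolulu",
    "Gary, M. Jackson's birthplace",
    "F. Hollande visited Berlin",
]

def label_sentences(kb, corpus):
    """Pair each sentence with every KB fact whose two arguments it mentions."""
    training = []
    for relation, pairs in kb.items():
        for sentence in corpus:
            for arg1, arg2 in sorted(pairs):
                # Assumption: substring match instead of NE tagging and linking.
                if arg1 in sentence and arg2 in sentence:
                    training.append((relation, arg1, arg2, sentence))
    return training

training_data = label_sentences(kb, corpus)
for example in training_data:
    print(example)
```

Note that "B. Obama moved from Honolulu" is labeled as per:city_of_birth even though it does not express the relation. This is the noise that distant supervision has to cope with.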

SLIDE 20

Candidate Validation: Distant Supervision SVMs

Distant Supervision (DS) SVM Classifiers

“Workhorse” for candidate validation.

Argument pairs for training data

  • Freebase
  • Pattern matches

Minimalistic feature set

  • n-grams between relation arguments
  • n-grams outside relation arguments
  • sparse (or skip) n-grams
  • marking of argument order for every feature
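A sketch of what such a shallow feature extractor could look like. It assumes single-token arguments, a two-token context window outside the arguments, n-grams up to length 2, and skip bigrams with a wildcard gap. These settings and the feature name prefixes are illustrative, not the system's actual configuration.

```python
def ngrams(tokens, n):
    """Contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def skip_bigrams(tokens, max_gap=2):
    """Token pairs with 1..max_gap skipped tokens, written as 'a * b'."""
    feats = []
    for i in range(len(tokens)):
        for j in range(i + 2, min(i + 2 + max_gap, len(tokens))):
            feats.append(f"{tokens[i]} * {tokens[j]}")
    return feats

def extract_features(tokens, arg1_idx, arg2_idx, window=2, max_n=2):
    """Features between and outside the arguments, each marked with argument order."""
    order = "ARG1_FIRST" if arg1_idx < arg2_idx else "ARG2_FIRST"
    left, right = sorted((arg1_idx, arg2_idx))
    between = tokens[left + 1:right]
    before = tokens[max(0, left - window):left]
    after = tokens[right + 1:right + 1 + window]
    feats = []
    for n in range(1, max_n + 1):
        feats += [f"BETWEEN:{g}" for g in ngrams(between, n)]
        feats += [f"BEFORE:{g}" for g in ngrams(before, n)]
        feats += [f"AFTER:{g}" for g in ngrams(after, n)]
    feats += [f"SKIP:{g}" for g in skip_bigrams(between)]
    # Prefix every feature with the argument order.
    return {f"{order}|{f}" for f in feats}

tokens = ["ARG1", "was", "born", "in", "ARG2", "in", "1961"]
features = extract_features(tokens, arg1_idx=0, arg2_idx=4)
```

Marking the order on every feature lets a linear classifier distinguish "[ARG1] was born in [ARG2]" from the same words with the arguments swapped.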

Training scheme:

  • aggregate training
  • global parameter tuning

11/19

SLIDE 23

Candidate Validation: Distant Supervision SVMs

DS SVMs: Training

One binary SVM per relation

Aggregate training

  • Training sentences are aggregated per argument pair
  • Feature weights averaged
  • Better generalization than single-sentence training
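One possible reading of aggregate training, sketched under the assumption that all sentence-level feature vectors of an argument pair are averaged into a single training instance. The dict-based vectors and the feature names are made up for illustration.

```python
from collections import defaultdict

def aggregate(instances):
    """Average sentence-level feature dicts per argument pair.

    instances: list of (arg_pair, feature_dict) tuples.
    Returns {arg_pair: averaged_feature_dict}.
    """
    grouped = defaultdict(list)
    for pair, feats in instances:
        grouped[pair].append(feats)
    averaged = {}
    for pair, vectors in grouped.items():
        total = defaultdict(float)
        for vec in vectors:
            for name, value in vec.items():
                total[name] += value
        # Divide by the number of sentences for this pair.
        averaged[pair] = {name: v / len(vectors) for name, v in total.items()}
    return averaged

# Hypothetical sentence-level instances.
instances = [
    (("B. Obama", "Honolulu"), {"BETWEEN:was born in": 1.0}),
    (("B. Obama", "Honolulu"), {"BETWEEN:moved from": 1.0}),
    (("M. Jackson", "Gary"), {"BETWEEN:birthplace": 1.0}),
]
pair_instances = aggregate(instances)
```

Under this reading, a noisy sentence contributes only a fraction of the weight of its argument pair's instance instead of being a full training example of its own.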

Parameter tuning

  • Misclassification cost tuning is essential
  • Optimizing the per-relation cost parameter does not lead to a global optimum

⇒ Greedy parameter tuning algorithm for global F1 optimization
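The slides do not give the tuning algorithm itself. The following is a sketch of one greedy scheme (coordinate ascent on micro-averaged F1), assuming that development-set true and false positive counts are already available for every relation and every candidate cost value. The counts and the `greedy_tune` helper are hypothetical.

```python
def micro_f1(counts, n_gold):
    """Micro-averaged F1 from per-relation (tp, fp) counts and the gold total."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    if tp == 0:
        return 0.0
    p, r = tp / (tp + fp), tp / n_gold
    return 2 * p * r / (p + r)

def greedy_tune(dev_counts, n_gold, max_rounds=10):
    """dev_counts: {relation: {cost: (tp, fp)}} measured on development data."""
    # Start from the first listed cost value for every relation.
    choice = {rel: next(iter(opts)) for rel, opts in dev_counts.items()}

    def score(ch):
        return micro_f1([dev_counts[r][c] for r, c in ch.items()], n_gold)

    best = score(choice)
    for _ in range(max_rounds):
        improved = False
        for rel, opts in dev_counts.items():
            for cost in opts:
                # Change one relation's cost, keep all others fixed.
                trial = {**choice, rel: cost}
                s = score(trial)
                if s > best:
                    best, choice, improved = s, trial, True
        if not improved:
            break
    return choice, best

# Made-up development counts for two relations and three cost values.
dev_counts = {
    "per:city_of_birth": {0.1: (10, 2), 1.0: (14, 6), 10.0: (16, 20)},
    "org:alternate_names": {0.1: (3, 0), 1.0: (5, 1), 10.0: (6, 10)},
}
best_costs, best_f1 = greedy_tune(dev_counts, n_gold=40)
```

Each change is accepted only if it improves the global score, so the objective is the pooled F1 over all relations and not each relation's own F1.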

12/19

SLIDE 25

Candidate Validation: Distant Supervision Patterns

Distant Supervision Patterns

Surface patterns from DS data with “goodness” scores

org:alternate_names

  • 0.9784  [ARG1] , abbreviated [ARG2]
  • 0.4023  [ARG2] is the core division of [ARG1]
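A sketch of how scored surface patterns could be applied to a candidate sentence: the two arguments are replaced by placeholders and the best-scoring matching pattern decides. The acceptance threshold of 0.5, the substring matching, and the example sentence are assumptions. Only the two patterns and their scores come from the slide.

```python
patterns = {
    "org:alternate_names": [
        (0.9784, "[ARG1] , abbreviated [ARG2]"),
        (0.4023, "[ARG2] is the core division of [ARG1]"),
    ]
}

def score_candidate(relation, sentence, arg1, arg2, patterns, threshold=0.5):
    """Return (best pattern score, accepted?) for a tokenized candidate sentence."""
    # Mask the arguments so the sentence can be compared with the patterns.
    masked = sentence.replace(arg1, "[ARG1]").replace(arg2, "[ARG2]")
    scores = [s for s, p in patterns.get(relation, []) if p in masked]
    best = max(scores, default=0.0)
    return best, best >= threshold

# Made-up, pre-tokenized example sentence.
sentence = "The World Health Organization , abbreviated WHO , is based in Geneva"
score, accepted = score_candidate(
    "org:alternate_names", sentence, "World Health Organization", "WHO", patterns
)
```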

Combination of DS noise reduction models [Roth and Klakow, 2013]

  • discriminative at-least-one perceptron model: P(relation|pattern, θ)
  • generative hierarchical topic model: n(pattern, topic(relation))
  • relative frequency of pattern: n(pattern, relation) / n(pattern)
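Of the three models, the relative frequency n(pattern, relation) / n(pattern) is simple enough to sketch directly. The observation counts below are invented, and the combination with the perceptron and topic model scores is not shown.

```python
from collections import Counter

def relative_frequencies(observations):
    """Compute n(pattern, relation) / n(pattern) for every observed pair.

    observations: iterable of (pattern, relation) distant supervision matches.
    """
    observations = list(observations)
    n_pattern = Counter(p for p, _ in observations)
    n_joint = Counter(observations)
    return {(p, r): c / n_pattern[p] for (p, r), c in n_joint.items()}

# Made-up match counts for one pattern.
abbrev = "[ARG1] , abbreviated [ARG2]"
observations = (
    [(abbrev, "org:alternate_names")] * 9
    + [(abbrev, "org:parents")] * 1
)
freqs = relative_frequencies(observations)
```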

13/19

SLIDE 27

Per-Component Analysis

Effect of Removing Single Components (one at a time)

Component                 P     R     F1    F1 gain
LSV main run              42.5  33.2  37.3
−Query expansion          41.1  17.5  24.5  +12.8
−Distsup SVM classifier   53.3  21.8  30.9  +6.4
−Distsup patterns         39.6  28.6  33.2  +4.1
−Manual patterns          38.2  29.5  33.2  +4.1
−Alternate names          41.1  31.0  35.4  +1.9
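The "F1 gain" column is the main run's F1 minus the F1 obtained with the component removed. A short check using the numbers from the table is below. F1 recomputed from the rounded P and R values can differ from the reported F1 in the last digit.

```python
def f1(p, r):
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * p * r / (p + r) if p + r else 0.0

# (P, R, F1) as reported on the slide.
main = (42.5, 33.2, 37.3)
ablations = {
    "Query expansion": (41.1, 17.5, 24.5),
    "Distsup SVM classifier": (53.3, 21.8, 30.9),
    "Distsup patterns": (39.6, 28.6, 33.2),
    "Manual patterns": (38.2, 29.5, 33.2),
    "Alternate names": (41.1, 31.0, 35.4),
}

# Gain contributed by each component = main F1 - F1 without that component.
gains = {name: round(main[2] - f, 1) for name, (_, _, f) in ablations.items()}

for name, (p, r, reported) in ablations.items():
    print(f"{name}: reported F1 {reported}, recomputed {f1(p, r):.2f}, gain {gains[name]}")
```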

15/19

SLIDE 33

Per-Component Analysis

Single Component Performance

Component                P     R     F1
Distsup SVM classifier   34.7  23.6  28.1
Distsup patterns         42.7  15.6  22.9
Manual patterns          50.2  10.3  17.1
Alternate names          54.2  1.8   3.4

16/19

SLIDE 34

Per-Component Analysis

Bottleneck: Candidate Generation

  • Lost recall on candidate level cannot be undone by validation modules.
  • Query and argument matching is of crucial importance.
  • Recall analysis (on 2012 queries):
      • good recall on document level
      • big potential on candidate sentence extraction

Query expansion   Document recall   Candidate recall   End-to-end F1
yes               90.2              58.8               32.1
no                87.7              34.4               23.0
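Why candidate recall acts as a ceiling can be seen from a small stage-wise recall computation. Validation only filters the candidate set, so a gold answer missing from the candidates can never be recovered. The answer identifiers and set sizes below are made up.

```python
def recall(found, gold):
    """Fraction of gold answers that are present in `found`."""
    return len(found & gold) / len(gold) if gold else 0.0

# Hypothetical gold answers and what survives each pipeline stage.
gold = {f"a{i}" for i in range(1, 11)}
in_retrieved_docs = gold - {"a10"}                      # answers contained in retrieved documents
in_candidates = {"a1", "a2", "a3", "a4", "a5", "a6"}    # answers matched as candidates
validated = {"a1", "a2", "a3", "a5", "wrong1"}          # final output, including one error

# Validation can only select from the candidates (errors aside).
assert validated & gold <= in_candidates <= in_retrieved_docs

doc_recall = recall(in_retrieved_docs, gold)
cand_recall = recall(in_candidates, gold)
final_recall = recall(validated, gold)
print(doc_recall, cand_recall, final_recall)
```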

17/19

SLIDE 46

Conclusion

Conclusion

TAC KBP English Slot-Filling

  • Query-driven relation extraction
  • Subtasks
      • Candidate retrieval and matching
      • Relation modeling

LSV system

  • Modular, shallow approach
  • Query expansion: Wikipedia anchor text
  • N-gram-based distant supervision SVMs
  • Scored distant supervision surface patterns

More details and analysis in our workshop paper!

19/19