Computing with Natural Language (Percy Liang, ACL Workshop on Semantic Parsing, June 15, 2014, Stanford University)



slide-1
SLIDE 1

Computing with Natural Language

Percy Liang ACL Workshop on Semantic Parsing - June 15, 2014 Stanford University

slide-2
SLIDE 2

Paleobiology

[PaleoDeepDive (Shanan Peters, Chris Ré)] 1

slide-3
SLIDE 3

Paleobiology

[PaleoDeepDive (Shanan Peters, Chris Ré)] 1

slide-4
SLIDE 4

Paleobiology

paleobiodb.org

[PaleoDeepDive (Shanan Peters, Chris Ré)] 1

slide-5
SLIDE 5

Paleobiology

paleobiodb.org
Where was the last American Mastodon found?

[PaleoDeepDive (Shanan Peters, Chris Ré)] 1

slide-6
SLIDE 6

Paleobiology

paleobiodb.org
Where was the last American Mastodon found?
How long do species tend to exist before going extinct?

[PaleoDeepDive (Shanan Peters, Chris Ré)] 1

slide-7
SLIDE 7

Paleobiology

paleobiodb.org
Where was the last American Mastodon found?
How long do species tend to exist before going extinct?
Goal: help scientists answer macro-questions
Challenge: requires computation / aggregation

[PaleoDeepDive (Shanan Peters, Chris Ré)] 1

slide-8
SLIDE 8

Question answering via semantic parsing

Where was the last American Mastodon found?

2

slide-9
SLIDE 9

Question answering via semantic parsing

Where was the last American Mastodon found?

semantic parsing

LocationOf.argmax(Type.Occurrence ⊓ Genus.Mammut, Period)

2

slide-10
SLIDE 10

Question answering via semantic parsing

Where was the last American Mastodon found?

semantic parsing

LocationOf.argmax(Type.Occurrence ⊓ Genus.Mammut, Period)

execute

New Mexico
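The parse-then-execute pipeline above can be sketched with a toy lambda-DCS-style interpreter. Everything below is an illustrative assumption, not the real system or real paleobiology data: the knowledge base, the entity ids (`occ1`, `occ2`), and the `Period` values (treated as end dates, so larger means later).

```python
# Toy knowledge base: relation name -> set of (subject, object) pairs.
KB = {
    "Type":       {("occ1", "Occurrence"), ("occ2", "Occurrence")},
    "Genus":      {("occ1", "Mammut"), ("occ2", "Mammut")},
    "Period":     {("occ1", -9000), ("occ2", -11000)},  # end dates; larger = later
    "LocationOf": {("occ1", "NewMexico"), ("occ2", "Alaska")},
}

def join(relation, objects):
    """relation.objects: subjects related to some object in `objects`."""
    return {s for (s, o) in KB[relation] if o in objects}

def intersect(u, v):  # u ⊓ v
    return u & v

def argmax(entities, relation):
    """Singleton set holding the entity with the largest `relation` value."""
    values = dict(KB[relation])
    return {max(entities, key=lambda e: values[e])}

# LocationOf.argmax(Type.Occurrence ⊓ Genus.Mammut, Period)
mastodons = intersect(join("Type", {"Occurrence"}), join("Genus", {"Mammut"}))
answer = {o for (s, o) in KB["LocationOf"] if s in argmax(mastodons, "Period")}
print(answer)  # {'NewMexico'}
```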

2

slide-11
SLIDE 11

Question answering via semantic parsing

Where was the last American Mastadon found?

semantic parsing execute

New Mexico

2

slide-12
SLIDE 12

Email assistant via semantic parsing

Send a reminder to all authors who haven’t sent an abstract.

3

slide-13
SLIDE 13

Email assistant via semantic parsing

Send a reminder to all authors who haven’t sent an abstract.

semantic parsing

∀x ∈ (Author ⊓ ¬Sent.Subject.Abstract) : Remind(x)

3

slide-14
SLIDE 14

Email assistant via semantic parsing

Send a reminder to all authors who haven’t sent an abstract.

semantic parsing

∀x ∈ (Author ⊓ ¬Sent.Subject.Abstract) : Remind(x)

execute

[5 emails sent]

3

slide-15
SLIDE 15

Email assistant via semantic parsing

Send a reminder to all authors who haven’t sent an abstract.

semantic parsing execute

[5 emails sent]

3

slide-16
SLIDE 16

Semantic parsing

[utterance: user input]

semantic parsing

[program]

execute

[behavior: user output]

Programs affect the world

4

slide-17
SLIDE 17

Outline

  • Semantic parsing in 5 minutes
  • A closer look at the elements

    – Knowledge base incompleteness
    – Lexical coverage
    – Search over logical forms
    – Learning via bootstrapping
    – Leveraging denotations ("grounding")
    – Datasets

  • Final remarks

5

slide-18
SLIDE 18

Framework

utterance x: people who have lived in Chicago
logical form z: Type.Person ⊓ PlacesLived.Location.Chicago
world w: knowledge base
denotation y: {BarackObama, MichelleObama, ...}
parameters: θ

6

slide-19
SLIDE 19

World: Freebase

100M entities (nodes) 1B assertions (edges)

[Freebase graph fragment: BarackObama (Type: Person; Profession: Politician; DateOfBirth: 1961.08.04; PlaceOfBirth: Honolulu, ContainedBy Hawaii, ContainedBy UnitedStates; Marriage Event8: Spouse MichelleObama, StartDate 1992.10.03; PlacesLived Event3: Location Chicago)]

[Bollacker, 2008; Google, 2013] 7

slide-20
SLIDE 20

Logical forms

Type.Person ⊓ PlacesLived.Location.Chicago

[Liang, 2013] 8

slide-21
SLIDE 21

Logical forms

Type.Person ⊓ PlacesLived.Location.Chicago

[Graph pattern: node ? with edge Type to Person and path PlacesLived → Location to Chicago]

[Liang, 2013] 8

slide-22
SLIDE 22

Logical forms

Type.Person ⊓ PlacesLived.Location.Chicago

[Graph pattern (as above) matched against the Freebase graph fragment]

[Liang, 2013] 8

slide-23
SLIDE 23

Logical forms

Type.Person ⊓ PlacesLived.Location.Chicago

[Graph pattern (as above) matched against the Freebase graph fragment]

[Liang, 2013] 8

slide-24
SLIDE 24

Framework

utterance x: people who have lived in Chicago
logical form z: Type.Person ⊓ PlacesLived.Location.Chicago
world w: knowledge base
denotation y: {BarackObama, MichelleObama, ...}
parameters: θ

9

slide-25
SLIDE 25

Derivations

Derivation: construction of logical form given utterance

[Derivation: "Obama" ⇒ BarackObama, "born" ⇒ R[PlaceOfBirth]; join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ("where was ... ?") ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

10

slide-26
SLIDE 26

Derivations

Derivation: construction of logical form given utterance

[Derivation: "Obama" ⇒ BarackObama (lexicon), "born" ⇒ R[PlaceOfBirth] (lexicon); join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

10

slide-27
SLIDE 27

Derivations

Derivation: construction of logical form given utterance

[Derivation: "Obama" ⇒ BarackObama (lexicon), "born" ⇒ R[PlaceOfBirth] (lexicon); join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

10

slide-28
SLIDE 28

Derivations

Derivation: construction of logical form given utterance

[Derivation: "Obama" ⇒ BarackObama (lexicon), "born" ⇒ R[PlaceOfBirth] (lexicon); join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

10

slide-29
SLIDE 29

Grammar

[utterance → Grammar → derivation 1, derivation 2, ...]

11

slide-30
SLIDE 30

Grammar

[utterance → Grammar → derivation 1, derivation 2, ...]

A Really Dumb Grammar
(lexicon)   Obama ⇒ Unary : BarackObama
(lexicon)   born ⇒ Binary : PlaceOfBirth
...
(join)      Unary : u + Binary : b ⇒ Unary : b.u
(intersect) Unary : u + Unary : v ⇒ Unary : u ⊓ v
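A runnable sketch of how such a grammar over-generates candidate logical forms: look up lexicon entries per word, then close the candidates under the join and intersect rules. The two-entry lexicon and string-based logical forms are assumptions for illustration, not the actual parser.

```python
# Tiny assumed lexicon: word -> list of (category, logical form) entries.
LEXICON = {
    "obama": [("Unary", "BarackObama")],
    "born":  [("Binary", "PlaceOfBirth")],
}

def candidates(utterance):
    """Generate candidate logical forms via lexicon, join, and intersect."""
    entries = [e for w in utterance.lower().split()
                 for e in LEXICON.get(w.strip("?"), [])]
    unaries  = {lf for cat, lf in entries if cat == "Unary"}
    binaries = {lf for cat, lf in entries if cat == "Binary"}
    derived = set(unaries)
    for b in binaries:                  # (join) Unary u + Binary b => b.u
        for u in unaries:
            derived.add(f"{b}.{u}")
    for u in list(derived):             # (intersect) Unary u + Unary v => u ⊓ v
        for v in list(derived):
            if u != v:
                derived.add(f"{u} ⊓ {v}")
    return derived

cands = candidates("Where was Obama born?")
print(sorted(cands))
```

Note that even this tiny lexicon already yields several candidates beyond the intended `PlaceOfBirth.BarackObama`, which is the point of the next slides.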

11

slide-31
SLIDE 31

Many possible derivations!

Where was Obama born?

12

slide-32
SLIDE 32

Many possible derivations!

Where was Obama born?

?

set of candidate derivations D(x)

12

slide-33
SLIDE 33

Many possible derivations!

Where was Obama born?

?

set of candidate derivations D(x)

[Derivation: "Obama" ⇒ BarackObama (lexicon), "born" ⇒ R[PlaceOfBirth] (lexicon); join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

12

slide-34
SLIDE 34

Many possible derivations!

Where was Obama born?

?

set of candidate derivations D(x)

[Derivation: "Obama" ⇒ BarackObama (lexicon), "born" ⇒ R[PlaceOfBirth] (lexicon); join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

...

[Spurious derivation: "Obama" ⇒ ObamaJapan (lexicon), "born" ⇒ R[Founded] (lexicon); join ⇒ R[Founded].ObamaJapan; intersect with Type.Date ⇒ Type.Date ⊓ R[Founded].ObamaJapan]

12

slide-35
SLIDE 35

x: utterance d: derivation

[Derivation: "Obama" ⇒ BarackObama (lexicon), "born" ⇒ R[PlaceOfBirth] (lexicon); join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

Feature vector φ(x, d) ∈ ℝ^F:

13

slide-36
SLIDE 36

x: utterance d: derivation

[Derivation: "Obama" ⇒ BarackObama (lexicon), "born" ⇒ R[PlaceOfBirth] (lexicon); join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

Feature vector φ(x, d) ∈ ℝ^F:

apply join                          1
apply intersect                     1
apply lexicon                       3
skipped VBD-AUX                     1
skipped NN
born maps to PlaceOfBirth           1
born maps to PlacesLived.Location   0
alignmentScore                      1.52
denotation-size=1                   1
...
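The feature vector above is sparse, so it is natural to store it as a map from feature name to value. A minimal sketch, scoring it against a small made-up weight vector θ (the weights are assumptions, not learned values):

```python
# Sparse feature map mirroring the feature table for this derivation.
phi = {
    "apply join": 1,
    "apply intersect": 1,
    "apply lexicon": 3,
    "skipped VBD-AUX": 1,
    "born maps to PlaceOfBirth": 1,
    "born maps to PlacesLived.Location": 0,
    "alignmentScore": 1.52,
    "denotation-size=1": 1,
}
theta = {"apply join": 0.4, "born maps to PlaceOfBirth": 2.0}  # assumed weights

# Score_θ(x, d) = φ(x, d) · θ; features absent from θ contribute 0.
score = sum(value * theta.get(feat, 0.0) for feat, value in phi.items())
print(score)  # 1 * 0.4 + 1 * 2.0 = 2.4
```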

13

slide-37
SLIDE 37

Scoring derivations

Feature vector: φ(x, d) = [1.3, 2, 0, 1, 0, 0, ...] ∈ ℝ^F
Parameter vector: θ = [1.2, −2.7, 3.4, ...] ∈ ℝ^F
Scoring function: Score_θ(x, d) = φ(x, d) · θ

14

slide-38
SLIDE 38

Log-linear model

Candidate derivations: D(x)

15

slide-39
SLIDE 39

Log-linear model

Candidate derivations: D(x)
Model: distribution over derivations d given utterance x

p(d | x, θ) = exp(Score_θ(x, d)) / Σ_{d′ ∈ D(x)} exp(Score_θ(x, d′))

15

slide-40
SLIDE 40

Learning

Training data:

What’s Bulgaria’s capital? → Sofia
When was Walmart started? → 1962
What movies has Tom Cruise been in? → TopGun, VanillaSky, ...
...

16

slide-41
SLIDE 41

Learning

Training data:

What’s Bulgaria’s capital? → Sofia
When was Walmart started? → 1962
What movies has Tom Cruise been in? → TopGun, VanillaSky, ...
...

Objective: maximum likelihood

arg max_θ Σ_{i=1}^{n} log p_θ(y^(i) | x^(i))

16

slide-42
SLIDE 42

Learning

Training data:

What’s Bulgaria’s capital? → Sofia
When was Walmart started? → 1962
What movies has Tom Cruise been in? → TopGun, VanillaSky, ...
...

Objective: maximum likelihood

arg max_θ Σ_{i=1}^{n} log p_θ(y^(i) | x^(i))

Algorithm: AdaGrad (SGD with per-feature step size)
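A sketch of a single AdaGrad ascent step (SGD with a per-feature step size). The gradient values below are invented; in training they would come from ∇ log p_θ(y | x) of the objective above, and `eta` is an assumed base step size.

```python
import math

def adagrad_step(theta, grad, hist, eta=0.1, eps=1e-8):
    """Update θ in place; `hist` accumulates squared gradients per feature."""
    for j, g in enumerate(grad):
        hist[j] += g * g                              # running sum of g^2
        theta[j] += eta * g / (math.sqrt(hist[j]) + eps)
    return theta

theta, hist = [0.0, 0.0], [0.0, 0.0]
adagrad_step(theta, [1.0, -2.0], hist)
print(theta)  # the larger-gradient feature takes a relatively smaller step
```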

16

slide-43
SLIDE 43

Training intuition

Where did Mozart tupress? Vienna

17

slide-44
SLIDE 44

Training intuition

Where did Mozart tupress? → Vienna
  PlaceOfBirth.Mozart
  PlaceOfDeath.Mozart
  PlaceOfMarriage.Mozart

17

slide-45
SLIDE 45

Training intuition

Where did Mozart tupress? → Vienna
  PlaceOfBirth.Mozart ⇒ Salzburg
  PlaceOfDeath.Mozart ⇒ Vienna
  PlaceOfMarriage.Mozart ⇒ Vienna

17

slide-46
SLIDE 46

Training intuition

Where did Mozart tupress? → Vienna
  PlaceOfBirth.Mozart ⇒ Salzburg
  PlaceOfDeath.Mozart ⇒ Vienna
  PlaceOfMarriage.Mozart ⇒ Vienna

17

slide-47
SLIDE 47

Training intuition

Where did Mozart tupress? → Vienna
  PlaceOfBirth.Mozart ⇒ Salzburg
  PlaceOfDeath.Mozart ⇒ Vienna
  PlaceOfMarriage.Mozart ⇒ Vienna
Where did William Hogarth tupress?

17

slide-48
SLIDE 48

Training intuition

Where did Mozart tupress? → Vienna
  PlaceOfBirth.Mozart ⇒ Salzburg
  PlaceOfDeath.Mozart ⇒ Vienna
  PlaceOfMarriage.Mozart ⇒ Vienna
Where did William Hogarth tupress? → London
  PlaceOfBirth.WilliamHogarth
  PlaceOfDeath.WilliamHogarth
  PlaceOfMarriage.WilliamHogarth

17

slide-49
SLIDE 49

Training intuition

Where did Mozart tupress? → Vienna
  PlaceOfBirth.Mozart ⇒ Salzburg
  PlaceOfDeath.Mozart ⇒ Vienna
  PlaceOfMarriage.Mozart ⇒ Vienna
Where did William Hogarth tupress? → London
  PlaceOfBirth.WilliamHogarth ⇒ London
  PlaceOfDeath.WilliamHogarth ⇒ London
  PlaceOfMarriage.WilliamHogarth ⇒ Paddington

17

slide-50
SLIDE 50

Training intuition

Where did Mozart tupress? → Vienna
  PlaceOfBirth.Mozart ⇒ Salzburg
  PlaceOfDeath.Mozart ⇒ Vienna
  PlaceOfMarriage.Mozart ⇒ Vienna
Where did William Hogarth tupress? → London
  PlaceOfBirth.WilliamHogarth ⇒ London
  PlaceOfDeath.WilliamHogarth ⇒ London
  PlaceOfMarriage.WilliamHogarth ⇒ Paddington

17

slide-51
SLIDE 51

Training intuition

Where did Mozart tupress? → Vienna
  PlaceOfBirth.Mozart ⇒ Salzburg
  PlaceOfDeath.Mozart ⇒ Vienna
  PlaceOfMarriage.Mozart ⇒ Vienna
Where did William Hogarth tupress? → London
  PlaceOfBirth.WilliamHogarth ⇒ London
  PlaceOfDeath.WilliamHogarth ⇒ London
  PlaceOfMarriage.WilliamHogarth ⇒ Paddington

17

slide-52
SLIDE 52

Outline

  • Semantic parsing in 5 minutes
  • A closer look at the elements

    – Knowledge base incompleteness
    – Lexical coverage
    – Search over logical forms
    – Learning via bootstrapping
    – Leveraging denotations ("grounding")
    – Datasets

  • Final remarks

18

slide-53
SLIDE 53

Outline

  • Semantic parsing in 5 minutes
  • A closer look at the elements

    – Knowledge base incompleteness
    – Lexical coverage
    – Search over logical forms
    – Learning via bootstrapping
    – Leveraging denotations ("grounding")
    – Datasets

  • Final remarks

19

slide-54
SLIDE 54

Challenge: incomplete knowledge base

hiking trails

hiking trails in Baltimore:
  Avalon Super Loop
  Patapsco Valley State Park
  Gunpowder Falls State Park
  Union Mills Hike
  Greenbury Point
  ...

What are the longest in Baltimore?

Data Source

20

slide-55
SLIDE 55

[Freebase graph fragment, as above]

21

slide-56
SLIDE 56

[Freebase graph fragment, as above]

Fewer than 10% of general web questions can be answered via Freebase

21

slide-57
SLIDE 57

22

slide-58
SLIDE 58

Semantic parsing on the web

Input:

  • query x

hiking trails near Baltimore

  • web page w

[Pasupat & Liang, 2014] 23

slide-59
SLIDE 59

Semantic parsing on the web

Input:

  • query x

hiking trails near Baltimore

  • web page w

[Pasupat & Liang, 2014] 23

slide-60
SLIDE 60

Semantic parsing on the web

Input:

  • query x

hiking trails near Baltimore

  • web page w

[Pasupat & Liang, 2014] 23

slide-61
SLIDE 61

Semantic parsing on the web

Input:

  • query x

hiking trails near Baltimore

  • web page w

Output:

  • list of entities y

[Avalon Super Loop, Patapsco Valley State Park, ...]

[Pasupat & Liang, 2014] 23

slide-62
SLIDE 62

Logical forms: XPath expressions

[DOM tree of the web page: html → head, body; body → nav table, h1, data table with rows of th/td cells]

z = /html[1]/body[1]/table[2]/tr/td[1]

[Sahuguet and Azavant, 1999; Liu et al., 2000; Crescenzi et al., 2001] 24
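Executing such an XPath logical form can be sketched with Python's standard library. The page below is a made-up stand-in; note that `xml.etree.ElementTree` supports only a subset of XPath (relative paths and positional predicates), so the expression is written relative to the `<html>` root.

```python
import xml.etree.ElementTree as ET

# A tiny invented web page: a navigation table followed by a data table.
page = """
<html><body>
  <table><tr><td>nav</td></tr></table>
  <table>
    <tr><td>Avalon Super Loop</td><td>10 mi</td></tr>
    <tr><td>Patapsco Valley State Park</td><td>8 mi</td></tr>
  </table>
</body></html>
"""

root = ET.fromstring(page)
# z = /html[1]/body[1]/table[2]/tr/td[1], written relative to the root:
entities = [td.text for td in root.findall("./body/table[2]/tr/td[1]")]
print(entities)  # ['Avalon Super Loop', 'Patapsco Valley State Park']
```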

slide-63
SLIDE 63

Framework

query x: hiking trails near Baltimore
web page w: [DOM tree]
25

slide-64
SLIDE 64

Framework

query x: hiking trails near Baltimore
web page w: [DOM tree]
Generation ⇒ candidate set Z (|Z| ≈ 8500)

25

slide-65
SLIDE 65

Framework

query x: hiking trails near Baltimore
web page w: [DOM tree]
Generation ⇒ candidate set Z (|Z| ≈ 8500)
Model ⇒ z = /html[1]/body[1]/table[2]/tr/td[1]

25

slide-66
SLIDE 66

Framework

query x: hiking trails near Baltimore
web page w: [DOM tree]
Generation ⇒ candidate set Z (|Z| ≈ 8500)
Model ⇒ z = /html[1]/body[1]/table[2]/tr/td[1]
Execution ⇒ y = [Avalon Super Loop, Patapsco Valley State Park, ...]

25

slide-67
SLIDE 67

Outline

  • Semantic parsing in 5 minutes
  • A closer look at the elements

    – Knowledge base incompleteness
    – Lexical coverage
    – Search over logical forms
    – Learning via bootstrapping
    – Leveraging denotations ("grounding")
    – Datasets

  • Final remarks

26

slide-68
SLIDE 68

Challenge: lexical coverage

born ⇒ Type.City, PeopleBornHere, Profession.Lawyer, ...

?

27

slide-69
SLIDE 69

Solution: alignment

Open information extraction on ClueWeb09:

(Barack Obama, was born in, Honolulu)
(Albert Einstein, was born in, Ulm)
(Barack Obama, lived in, Chicago)
...
15M triples

[Fader et al. 2011] 28

slide-70
SLIDE 70

Solution: alignment

Open information extraction on ClueWeb09:

(Barack Obama, was born in, Honolulu)
(Albert Einstein, was born in, Ulm)
(Barack Obama, lived in, Chicago)
...
15M triples

Freebase:

[Freebase graph fragment, as above]

(BarackObama, PlaceOfBirth, Honolulu)
(AlbertEinstein, PlaceOfBirth, Ulm)
(BarackObama, PlacesLived.Location, Chicago)
...
400M triples

[Fader et al. 2011] 28

slide-71
SLIDE 71

Match text and Freebase predicates

[Alignment: phrases {grew up in, born in, married in} ↔ predicates {DateOfBirth, PlaceOfBirth, Marriage.StartDate, PlacesLived.Location}]

Similar schema matching / alignment ideas: [Cai & Yates, 2013; Fader et al., 2013; Yao & van Durme, 2014; etc.]
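The core of the alignment can be sketched as a co-occurrence count: a text phrase and a Freebase predicate align when they connect the same entity pair. The triples below are tiny invented stand-ins for the 15M/400M triple collections.

```python
from collections import Counter

text_triples = [
    ("BarackObama", "was born in", "Honolulu"),
    ("AlbertEinstein", "was born in", "Ulm"),
    ("BarackObama", "lived in", "Chicago"),
]
kb_triples = [
    ("BarackObama", "PlaceOfBirth", "Honolulu"),
    ("AlbertEinstein", "PlaceOfBirth", "Ulm"),
    ("BarackObama", "PlacesLived.Location", "Chicago"),
]

# Index the KB by entity pair, then count (phrase, predicate) co-occurrences.
kb_by_pair = {(s, o): p for s, p, o in kb_triples}
alignment = Counter()
for s, phrase, o in text_triples:
    if (s, o) in kb_by_pair:                 # same entity pair in both sources
        alignment[(phrase, kb_by_pair[(s, o)])] += 1

print(alignment.most_common())
```

Here "was born in" co-occurs with PlaceOfBirth twice, making it the strongest alignment; real systems refine such counts with typed entity arguments and larger corpora.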

29

slide-72
SLIDE 72

Challenge: variability in language

What is the currency in the US?

30

slide-73
SLIDE 73

Challenge: variability in language

What is the currency in the US?
What money do they use in the states?
How do you pay in America?
What’s the currency of the US?
What money is accepted in the United States?
What money to take to the US?
...

30

slide-74
SLIDE 74

A solution: paraphrasing

How many people live in Seattle?
  → (paraphrase) What is the population of Seattle?
  → PopulationOf(Seattle)
  → 850,000

Convert to a text-only problem

[Berant & Liang, 2014] 31

slide-75
SLIDE 75

Challenge: "sub-lexical compositionality"

grandmother

λx.Gender.Female ⊓ Parent.Parent.x

mayor

λx.GovtPositionsHeld.(Title.Mayor ⊓ OfficeOfJurisdiction.x)

32

slide-76
SLIDE 76

Challenge: "sub-lexical compositionality"

grandmother

λx.Gender.Female ⊓ Parent.Parent.x

mayor

λx.GovtPositionsHeld.(Title.Mayor ⊓ OfficeOfJurisdiction.x)

presidents who have served two non-consecutive terms [requires higher-order quantification]
presidents who were previously vice-presidents [anaphora]
every other president [weird quantification, anaphora]

32

slide-77
SLIDE 77

Outline

  • Semantic parsing in 5 minutes
  • A closer look at the elements

    – Knowledge base incompleteness
    – Lexical coverage
    – Search over logical forms
    – Learning via bootstrapping
    – Leveraging denotations ("grounding")
    – Datasets

  • Final remarks

33

slide-78
SLIDE 78

Many possible derivations!

Where was Obama born?

A Really Dumb Grammar
(lexicon)   Obama ⇒ Unary : BarackObama
(lexicon)   born ⇒ Binary : PlaceOfBirth
...
(join)      Unary : u + Binary : b ⇒ Unary : b.u
(intersect) Unary : u + Unary : v ⇒ Unary : u ⊓ v

set of candidate derivations D(x)

[Derivation: "Obama" ⇒ BarackObama (lexicon), "born" ⇒ R[PlaceOfBirth] (lexicon); join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

...

[Spurious derivation: "Obama" ⇒ ObamaJapan (lexicon), "born" ⇒ R[Founded] (lexicon); join ⇒ R[Founded].ObamaJapan; intersect with Type.Date ⇒ Type.Date ⊓ R[Founded].ObamaJapan]

34

slide-79
SLIDE 79

Bridging

Which college did Obama go to?
  "college" ⇒ Type.University (alignment)
  "Obama" ⇒ BarackObama (alignment)

[Berant et al., 2013] 35

slide-80
SLIDE 80

Bridging

Which college did Obama go to?
  "college" ⇒ Type.University (alignment)
  "Obama" ⇒ BarackObama (alignment)
  Education inserted between them (bridging)

Bridging: use neighboring predicates / type constraints

[Berant et al., 2013] 35

slide-81
SLIDE 81

Bridging

Which college did Obama go to?
  "college" ⇒ Type.University (alignment)
  "Obama" ⇒ BarackObama (alignment)
  Education inserted between them (bridging)

Bridging: use neighboring predicates / type constraints Start building from parts with more certainty

[Berant et al., 2013] 35

slide-82
SLIDE 82

Bridging to nowhere

Search logical forms based on "prior":

What countries in the world speak Arabic?

[Berant & Liang, 2014] 36

slide-83
SLIDE 83

Bridging to nowhere

Search logical forms based on "prior":

What countries in the world speak Arabic?

ArabicAlphabet
ArabicLang

[Berant & Liang, 2014] 36

slide-84
SLIDE 84

Bridging to nowhere

Search logical forms based on "prior":

What countries in the world speak Arabic?

ArabicAlphabet
ArabicLang
LangSpoken.ArabicLang
LangFamily.Arabic

[Berant & Liang, 2014] 36

slide-85
SLIDE 85

Bridging to nowhere

Search logical forms based on "prior":

What countries in the world speak Arabic?

ArabicAlphabet
ArabicLang
LangSpoken.ArabicLang
Type.Country ⊓ LangSpoken.ArabicLang
Count(Type.Country ⊓ LangSpoken.ArabicLang)
LangFamily.Arabic

[Berant & Liang, 2014] 36

slide-86
SLIDE 86

Bridging to nowhere

Search logical forms based on "prior":

What countries in the world speak Arabic?

ArabicAlphabet
ArabicLang
LangSpoken.ArabicLang
Type.Country ⊓ LangSpoken.ArabicLang
Count(Type.Country ⊓ LangSpoken.ArabicLang)
LangFamily.Arabic

Start building from parts with more certainty

[Berant & Liang, 2014] 36

slide-87
SLIDE 87

Oracle on WebQuestions

For what fraction of utterances was a candidate logical form correct?

[Bar chart: oracle accuracy (%) of [Berant et al., 2013] vs. paraphrasing-based candidate generation]

37

slide-88
SLIDE 88

Overapproximation via simple grammars

  • Modeling correct derivations requires complex rules

38

slide-89
SLIDE 89

Overapproximation via simple grammars

  • Modeling correct derivations requires complex rules
  • Simple rules generate an overapproximation of good derivations

38

slide-90
SLIDE 90

Overapproximation via simple grammars

  • Modeling correct derivations requires complex rules
  • Simple rules generate an overapproximation of good derivations

  • Hard grammar rules ⇒ soft/overlapping features

38

slide-91
SLIDE 91

Outline

  • Semantic parsing in 5 minutes
  • A closer look at the elements

    – Knowledge base incompleteness
    – Lexical coverage
    – Search over logical forms
    – Learning via bootstrapping
    – Leveraging denotations ("grounding")
    – Datasets

  • Final remarks

39

slide-92
SLIDE 92

Bootstrapping from easy examples

Iteration 1

Example 1 Example 2 Example 3 Example 4 Example 5

... ... ... ... ...

40

slide-93
SLIDE 93

Bootstrapping from easy examples

Iteration 2

Example 1 Example 2 Example 3 Example 4 Example 5

... ... ... ... ...

40

slide-94
SLIDE 94

Bootstrapping from easy examples

Iteration 3

Example 1 Example 2 Example 3 Example 4 Example 5

... ... ... ... ...

40

slide-95
SLIDE 95

Bootstrapping from easy examples

Iteration 4

Example 1 Example 2 Example 3 Example 4 Example 5

... ... ... ... ...

40

slide-96
SLIDE 96

Bootstrapping from easy examples

On GeoQuery [Liang et al., 2011]:

[Plot: % of train examples parsed correctly vs. iteration (1–4)]

41

slide-97
SLIDE 97

Outline

  • Semantic parsing in 5 minutes
  • A closer look at the elements

    – Knowledge base incompleteness
    – Lexical coverage
    – Search over logical forms
    – Learning via bootstrapping
    – Leveraging denotations ("grounding")
    – Datasets

  • Final remarks

42

slide-98
SLIDE 98

x: utterance d: derivation

[Derivation: "Obama" ⇒ BarackObama (lexicon), "born" ⇒ R[PlaceOfBirth] (lexicon); join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

Feature vector φ(x, d) ∈ ℝ^F:

43

slide-99
SLIDE 99

x: utterance d: derivation

[Derivation: "Obama" ⇒ BarackObama (lexicon), "born" ⇒ R[PlaceOfBirth] (lexicon); join ⇒ R[PlaceOfBirth].BarackObama; intersect with Type.Location ⇒ Type.Location ⊓ R[PlaceOfBirth].BarackObama]

Feature vector φ(x, d) ∈ ℝ^F:

apply join                          1
apply intersect                     1
apply lexicon                       3
skipped VBD-AUX                     1
skipped NN
born maps to PlaceOfBirth           1
born maps to PlacesLived.Location   0
alignmentScore                      1.52
denotation-size=1                   1
...

43

slide-100
SLIDE 100

Denotation features for entity extraction

query: hiking trails near Baltimore
/html[1]/body[1]/table[2]/tr/td[1] ⇒ [Avalon Super Loop, Patapsco Valley State Park, Gunpowder Falls State Park, Rachel Carson Conservation Park, Union Mills Hike, ...]
  >  (preferred over)
/html[1]/body[1]/div[2]/a ⇒ [Home, About, Baltimore Tour, Pricing, Contact, Online Support, ...]

44

slide-101
SLIDE 101

Impact of denotation features

[Bar chart: accuracy without vs. with denotation features on Free917]

45

slide-102
SLIDE 102

Impact of denotation features

[Bar charts: accuracy without vs. with denotation features on Free917 and WebQuestions]

45

slide-103
SLIDE 103

Impact of denotation features

[Bar chart: accuracy without vs. with denotation features on OpenWeb]

46

slide-104
SLIDE 104

Impact of denotation features

[Bar chart: accuracy without vs. with denotation features on OpenWeb]

Working with denotations actually provides more information than just logical forms

46

slide-105
SLIDE 105

Outline

  • Semantic parsing in 5 minutes
  • A closer look at the elements

    – Knowledge base incompleteness
    – Lexical coverage
    – Search over logical forms
    – Learning via bootstrapping
    – Leveraging denotations ("grounding")
    – Datasets

  • Final remarks

47

slide-106
SLIDE 106

Dataset collection

Obtain naturally occurring questions (inputs)

48

slide-107
SLIDE 107

Dataset collection

Obtain naturally occurring questions (inputs)
Strategy: breadth-first search over Google Suggest graph

48

slide-108
SLIDE 108

Dataset collection

Obtain naturally occurring questions (inputs)
Strategy: breadth-first search over Google Suggest graph
Where was Barack Obama born?

48

slide-109
SLIDE 109

Dataset collection

Obtain naturally occurring questions (inputs)
Strategy: breadth-first search over Google Suggest graph
Where was Barack Obama born?
Where was ___ born? → Google Suggest → {Barack Obama, Lady Gaga, Steve Jobs}

48

slide-110
SLIDE 110

Dataset collection

Obtain naturally occurring questions (inputs)
Strategy: breadth-first search over Google Suggest graph
Where was Barack Obama born?
Where was ___ born? → Google Suggest → {Barack Obama, Lady Gaga, Steve Jobs}
Where was Steve Jobs born?

48

slide-111
SLIDE 111

Dataset collection

Obtain naturally occurring questions (inputs)
Strategy: breadth-first search over Google Suggest graph
Where was Barack Obama born?
Where was ___ born? → Google Suggest → {Barack Obama, Lady Gaga, Steve Jobs}
Where was Steve Jobs born?
Where was Steve Jobs ___? → Google Suggest → {born, raised, on the Forbes list}

48

slide-112
SLIDE 112

Dataset collection

Obtain naturally occurring questions (inputs)
Strategy: breadth-first search over Google Suggest graph
Where was Barack Obama born?
Where was ___ born? → Google Suggest → {Barack Obama, Lady Gaga, Steve Jobs}
Where was Steve Jobs born?
Where was Steve Jobs ___? → Google Suggest → {born, raised, on the Forbes list}
Where was Steve Jobs raised?

48

slide-113
SLIDE 113

Dataset collection

Obtain naturally occurring questions (inputs)
Strategy: breadth-first search over Google Suggest graph
Where was Barack Obama born?
Where was ___ born? → Google Suggest → {Barack Obama, Lady Gaga, Steve Jobs}
Where was Steve Jobs born?
Where was Steve Jobs ___? → Google Suggest → {born, raised, on the Forbes list}
Where was Steve Jobs raised?
...
AMT annotation ⇒ 6.6K question/answer pairs
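The breadth-first search over the suggestion graph can be sketched as follows. `suggest` is a hypothetical stand-in for the live autocomplete API, backed here by a canned dictionary so the sketch runs offline; the template-blanking step of the real crawl is elided.

```python
from collections import deque

# Canned stand-in for Google Suggest: template with a blank -> completions.
CANNED = {
    "where was _ born?": ["barack obama", "lady gaga", "steve jobs"],
    "where was steve jobs _?": ["born", "raised"],
}

def suggest(template):
    return CANNED.get(template, [])

def collect_questions(seed_templates, limit=10):
    """Expand templates breadth-first, deduplicating full questions."""
    questions, seen = [], set()
    queue = deque(seed_templates)
    while queue and len(questions) < limit:
        template = queue.popleft()
        for fill in suggest(template):
            question = template.replace("_", fill)
            if question not in seen:
                seen.add(question)
                questions.append(question)
                # the real crawl would blank out phrases of `question`
                # to form new templates and enqueue them; elided here
    return questions

qs = collect_questions(["where was _ born?", "where was steve jobs _?"])
print(qs)
```

Note how "where was steve jobs born?" is reachable from both templates but recorded only once.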

48

slide-114
SLIDE 114

Question answering on

WebQuestions dataset (6K questions) [Berant et al., 2013]
  what did obama study in school
  where to fly into bali
  what was tupac name in juice

[Freebase graph fragment, as above]

49

slide-115
SLIDE 115

Question answering on

WebQuestions dataset (6K questions) [Berant et al., 2013]
  what did obama study in school
  where to fly into bali
  what was tupac name in juice

[Freebase graph fragment, as above]

[Yao & van Durme, 2014]: 35.4
[Berant et al., 2013]: 35.7
[Bao et al., 2014]: 37.5
[Berant & Liang, 2014]: 39.9

49

slide-116
SLIDE 116

OpenWeb dataset

airlines of italy
natural causes of global warming
lsu football coaches
bf3 submachine guns
badminton tournaments
foods high in dha
technical colleges in south carolina
songs on glee season 5
singers who use auto tune
san francisco radio stations

50

slide-117
SLIDE 117

OpenWeb dataset

airlines of italy
natural causes of global warming
lsu football coaches

50

slide-118
SLIDE 118

Results on OpenWeb

Baseline (most frequent extraction predicates): 10.3
[Pasupat & Liang, 2014]: 40.5

51

slide-119
SLIDE 119

A new dataset?

compositional AND open-domain

52

slide-120
SLIDE 120

A new dataset?

compositional AND open-domain How old are presidents when they take office on average?

52

slide-121
SLIDE 121

A new dataset?

compositional AND open-domain How old are presidents when they take office on average?

52

slide-122
SLIDE 122

A new dataset?

compositional AND open-domain How old are presidents when they take office on average?

52

slide-123
SLIDE 123

Other tasks

Playing computer games [Branavan et al., 2010, 2011]
Following navigational instructions [Tellex et al., 2011; Chen et al., 2012; Artzi et al., 2013]
Understanding the visual world [Matuszek et al., 2012; Krishnamurthy & Kollar, 2013]
Solving algebra word problems [Kushman et al., 2013]

53

slide-124
SLIDE 124

Outline

  • Semantic parsing in 5 minutes
  • A closer look at the elements

    – Knowledge base incompleteness
    – Lexical coverage
    – Search over logical forms
    – Learning via bootstrapping
    – Leveraging denotations ("grounding")
    – Datasets

  • Final remarks

54

slide-125
SLIDE 125

SHRDLU [1971]

55

slide-126
SLIDE 126

SHRDLU [1971]

Person: Pick up a big red block.
Computer: OK.

55

slide-127
SLIDE 127

SHRDLU [1971]

Person: Pick up a big red block.
Computer: OK.
Person: Grasp the pyramid.
Computer: I don’t understand which pyramid you mean.

55

slide-128
SLIDE 128

SHRDLU [1971]

Person: Pick up a big red block.
Computer: OK.
Person: Grasp the pyramid.
Computer: I don’t understand which pyramid you mean.
Person (changing their mind): Find a block which is taller than the one you are holding and put it into the box.
Computer: By "it", I assume you mean the block which is taller than the one I am holding.

55

slide-129
SLIDE 129

SHRDLU [1971]

Person: Pick up a big red block.
Computer: OK.
Person: Grasp the pyramid.
Computer: I don’t understand which pyramid you mean.
Person (changing their mind): Find a block which is taller than the one you are holding and put it into the box.
Computer: By "it", I assume you mean the block which is taller than the one I am holding.
Computer: OK.
Person: What does the box contain?
Computer: The blue pyramid and the blue block.
Person: What is the pyramid supported by?
Computer: The box.

55

slide-130
SLIDE 130

Takeaway 1/3: denotations

[utterance: user input]

semantic parsing

[intermediate semantic representations (text? logical forms? vectors?)]

execute

[denotation: user output]

Semantic representations as a means to an end

56

slide-131
SLIDE 131

Takeaway 2/3: learning and search

Challenging search / learning problem

57

slide-132
SLIDE 132

Takeaway 2/3: learning and search

Challenging search / learning problem
Non-convex optimization

57

slide-133
SLIDE 133

Takeaway 2/3: learning and search

Challenging search / learning problem
Non-convex optimization
Exponential search space

57

slide-134
SLIDE 134

Takeaway 2/3: learning and search

Challenging search / learning problem
Non-convex optimization
Exponential search space
Need to create better abstractions for people to work on the core search/learning issues

57

slide-135
SLIDE 135

Takeaway 3/3: data and users

Semantic parsing provides utility to users
Users provide realistic datasets in return
How long do species tend to exist before going extinct?
Semantic parsing is useful

58

slide-136
SLIDE 136

Code and data online

http://www-nlp.stanford.edu/software/sempre/
http://www-nlp.stanford.edu/software/web-entity-extractor-ACL2014/

59

slide-137
SLIDE 137

Code and data online

http://www-nlp.stanford.edu/software/sempre/
http://www-nlp.stanford.edu/software/web-entity-extractor-ACL2014/

Collaborators

Jonathan Berant (post-doc)
Andrew Chou (masters)
Roy Frostig (Ph.D.)
Panupong Pasupat (Ph.D.)

59

slide-138
SLIDE 138

Code and data online

http://www-nlp.stanford.edu/software/sempre/
http://www-nlp.stanford.edu/software/web-entity-extractor-ACL2014/

Collaborators

Jonathan Berant (post-doc)
Andrew Chou (masters)
Roy Frostig (Ph.D.)
Panupong Pasupat (Ph.D.)

Thank you!

59