Foundations of Language Science and Technology Discourse: Co-Reference. Caroline Sporleder (csporled@coli.uni-sb.de), Universität des Saarlandes, Wintersemester 2009/10, 11.01.2010.


slide-1
SLIDE 1

Foundations of Language Science and Technology Discourse: Co-Reference

Caroline Sporleder

Universität des Saarlandes

Wintersemester 2009/10 11.01.2010

Caroline Sporleder csporled@coli.uni-sb.de FSLT: Discourse

slide-2
SLIDE 2

Discourse

slide-5
SLIDE 5

Why Discourse Processing?

Natural language rarely comes in isolated sentences:

  • newspaper articles
  • novels
  • dialogues
  • speeches by politicians
  • ...

NLP applications need to be able to deal with discourse:

  • dialogue systems
  • question answering
  • text summarisation
  • information extraction
  • natural language generation
  • natural language understanding
  • ...

slide-6
SLIDE 6

Discourse Context Matters

Example: Co-reference

Campaigning has closed in Argentina ahead of Sunday’s election to elect a successor to President Nestor Kirchner. The front-runner in the opinion polls is the current first lady, Senator Cristina Fernandez de Kirchner. She praised the economic record of her husband’s government during a rally in Buenos Aires.

slide-8
SLIDE 8

Discourse Context Matters

Example: Co-reference

She praised the economic record of her husband’s government during a rally in Buenos Aires.

Task: Co-reference Resolution

slide-10
SLIDE 10

Discourse Context Matters

Example: Question Answering (“Why was Caesar killed?”)

Caesar was proclaimed dictator for life, and he heavily centralised the bureaucracy of the Republic. These events provoked a hitherto friend of Caesar, Marcus Junius Brutus, and a group of other senators, to assassinate the dictator on the Ides of March (March 15) in 44 BC.

Task: Inferring Discourse Relations (Discourse Parsing)

slide-11
SLIDE 11

Discourse Context Matters

Example: Coherence/Text Generation

The Eurostar service between Britain, France and Belgium ran a limited service on Saturday, with a reduced service planned for Sunday and Monday. Many countries across Europe have been hit by the bitter weather conditions. Passengers have been urged to cancel or postpone their journeys if they do not have to travel.

slide-13
SLIDE 13

Discourse Context Matters

Example: Coherence/Text Generation

Many countries across Europe have been hit by the bitter weather conditions. The Eurostar service between Britain, France and Belgium ran a limited service on Saturday, with a reduced service planned for Sunday and Monday. Passengers have been urged to cancel or postpone their journeys if they do not have to travel.

Task: Judging Text Coherence

slide-14
SLIDE 14

Discourse Context Matters

Example: Referring Expressions

He claims record

The 22-year-old computer science undergraduate from Bath is claiming a world record for the longest distance ridden on a unicycle in 24 hours. A unicycling student covered exactly 282 miles at Aberystwyth University’s athletics track. Sam Wakeling was aiming to beat the existing record of 235.3 miles.

slide-16
SLIDE 16

Discourse Context Matters

Example: Referring Expressions

Unicycling student claims record

A student is claiming a world record for the longest distance ridden on a unicycle in 24 hours. Sam Wakeling covered exactly 282 miles at Aberystwyth University’s athletics track. The 22-year-old computer science undergraduate from Bath was aiming to beat the existing record of 235.3 miles.

Task: Generating Referring Expressions

slide-20
SLIDE 20

Discourse Context Matters

Example: Temporal Order

1. John arrived at an oasis. He saw the camels around the waterhole.

2. John arrived at an oasis. He left the camels around the waterhole.

Task: Temporal Ordering

slide-21
SLIDE 21

Co-reference Resolution

slide-22
SLIDE 22

Co-reference

Example

I asked Georg Bernreuter about the EU. The Bavarian brewer likes the family of nations - but not the bureaucracy. “We are paying for Europe, not getting that much, but paying for it. Bureaucracy is growing faster than the European Union itself.”

So I ask him whether he still has faith in Europe. “Absolutely,” he cuts across me, before I can finish the sentence. “The only way to go in Europe is this coming together of the nations.” Later we head off to a beer tent. People are sitting at long tables drinking enormous glasses of Georg’s beer . . . it’s all quite mad. Nearly everyone says they’ll vote in the elections. Some have complaints, of course, but ask them how the relationship is between Europe and its biggest member, and everyone is singing from the same hymn sheet. “Europe is the future.”

Adapted from http://news.bbc.co.uk/2/hi/europe/8084685.stm

slide-23
SLIDE 23

Co-reference

Example: pronoun resolution (relatively straightforward)

I asked Georg Bernreuter about the EU. The Bavarian brewer likes the family of nations - but not the bureaucracy. “We are paying for Europe, not getting that much, but paying for it. Bureaucracy is growing faster than the European Union itself.”

So I ask him whether he still has faith in Europe. “Absolutely,” he cuts across me, before I can finish the sentence. “The only way to go in Europe is this coming together of the nations.” Later we head off to a beer tent. People are sitting at long tables drinking enormous glasses of Georg’s beer . . . it’s all quite mad. Nearly everyone says they’ll vote in the elections. Some have complaints, of course, but ask them how the relationship is between Europe and its biggest member, and everyone is singing from the same hymn sheet. “Europe is the future.”

Adapted from http://news.bbc.co.uk/2/hi/europe/8084685.stm

slide-32
SLIDE 32

Co-reference

Example: pronoun resolution (trickier)

I asked Georg Bernreuter about the EU. The Bavarian brewer likes the family of nations - but not the bureaucracy. “We are paying for Europe, not getting that much, but paying for it. Bureaucracy is growing faster than the European Union itself.”

So I ask him whether he still has faith in Europe. “Absolutely,” he cuts across me, before I can finish the sentence. “The only way to go in Europe is this coming together of the nations.” Later we head off to a beer tent. People are sitting at long tables drinking enormous glasses of Georg’s beer . . . it’s all quite mad. Nearly everyone says they’ll vote in the elections. Some have complaints, of course, but ask them how the relationship is between Europe and its biggest member, and everyone is singing from the same hymn sheet. “Europe is the future.”

Adapted from http://news.bbc.co.uk/2/hi/europe/8084685.stm

slide-37
SLIDE 37

Co-reference

Example: NP co-reference resolution (also tricky)

I asked Georg Bernreuter about the EU. The Bavarian brewer likes the family of nations - but not the bureaucracy. “We are paying for Europe, not getting that much, but paying for it. Bureaucracy is growing faster than the European Union itself.”

So I ask him whether he still has faith in Europe. “Absolutely,” he cuts across me, before I can finish the sentence. “The only way to go in Europe is this coming together of the nations.” Later we head off to a beer tent. People are sitting at long tables drinking enormous glasses of Georg’s beer . . . it’s all quite mad. Nearly everyone says they’ll vote in the elections. Some have complaints, of course, but ask them how the relationship is between Europe and its biggest member, and everyone is singing from the same hymn sheet. “Europe is the future.”

Adapted from http://news.bbc.co.uk/2/hi/europe/8084685.stm

slide-42
SLIDE 42

Co-reference

Co-reference and Anaphora

  • Co-reference chain: a set of co-referent referring expressions in a discourse
  • Anaphora: co-reference of one referring expression with its antecedent
  • Anaphor: a referring expression (often a pronoun) which refers back to something mentioned previously (e.g. she, this day, the cat . . . but not Peter etc.)
  • analogous: cataphor for expressions referring forward (e.g., While he was in office, Bill Clinton . . . )
  • co-reference vs. anaphora
      • cross-document co-reference (= not anaphoric)
      • some anaphora are not strictly co-referent (Everybody has his own destiny.)

slide-44
SLIDE 44

Co-reference Resolution vs. Anaphora Resolution

Co-reference Resolution: find the co-reference chains in a text.

Anaphora Resolution: find the antecedent of an anaphor.
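The relation between the two tasks can be illustrated with a minimal Python sketch: if anaphora resolution has produced (anaphor, antecedent) links, the co-reference chains are the groups of mentions connected by those links. The function name `build_chains`, the mention strings and the links below are illustrative assumptions based on the Bernreuter example, not part of the lecture; mentions are assumed to be unique strings listed in document order.

```python
from collections import defaultdict


def build_chains(mentions, links):
    """Group mentions into co-reference chains from (anaphor, antecedent) links.

    Assumption: mentions are unique strings given in document order.
    """
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the representative of m's group.
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path halving
            m = parent[m]
        return m

    for anaphor, antecedent in links:
        parent[find(anaphor)] = find(antecedent)

    chains = defaultdict(list)
    for m in mentions:  # document order keeps each chain ordered
        chains[find(m)].append(m)
    # Mentions that are not linked to anything do not form a chain.
    return [chain for chain in chains.values() if len(chain) > 1]


# Hypothetical output of an anaphora resolver for the first sentences.
mentions = ["Georg Bernreuter", "the EU", "The Bavarian brewer", "him", "he", "Europe"]
links = [
    ("The Bavarian brewer", "Georg Bernreuter"),
    ("him", "The Bavarian brewer"),
    ("he", "him"),
]
chains = build_chains(mentions, links)
print(chains)
```

Each link connects one anaphor to one antecedent, while a chain collects all expressions that refer to the same entity. Whether "the EU" and "Europe" belong in one chain is left open here, in line with the "tricky" cases above.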

slide-45
SLIDE 45

Co-Reference Resolution

How would you model anaphora / co-reference resolution? Which linguistic factors provide clues?

slide-58
SLIDE 58

Ambiguity and Disambiguating Factors

Jane told Peter he was in danger. ⇒ Agreement (gender, number etc.): he = Peter Peter said that John is running the business for himself. ⇒ syntactic constraints: himself = John The cat did not come down from the tree. It was scared. ⇒ selectional preferences: it = the cat Jane told Mary she was in danger. ⇒ salience (e.g., subject position): she = Jane Jane told Mary SHE was in danger. ⇒ prosody: she = Mary Jane warned Mary she was in danger. ⇒ lexical semantics (warned): she = Mary

Caroline Sporleder csporled@coli.uni-sb.de FSLT: Discourse

slide-59
SLIDE 59

Ambiguity and Disambiguating Factors

Jane told Peter he was in danger. ⇒ Agreement (gender, number etc.): he = Peter Peter said that John is running the business for himself. ⇒ syntactic constraints: himself = John The cat did not come down from the tree. It was scared. ⇒ selectional preferences: it = the cat Jane told Mary she was in danger. ⇒ salience (e.g., subject position): she = Jane Jane told Mary SHE was in danger. ⇒ prosody: she = Mary Jane warned Mary she was in danger. ⇒ lexical semantics (warned): she = Mary Tony Blair met President Yeltsin. The old man had just recovered from a heart attack.

Caroline Sporleder csporled@coli.uni-sb.de FSLT: Discourse

slide-60
SLIDE 60

Ambiguity and Disambiguating Factors

Jane told Peter he was in danger. ⇒ Agreement (gender, number etc.): he = Peter Peter said that John is running the business for himself. ⇒ syntactic constraints: himself = John The cat did not come down from the tree. It was scared. ⇒ selectional preferences: it = the cat Jane told Mary she was in danger. ⇒ salience (e.g., subject position): she = Jane Jane told Mary SHE was in danger. ⇒ prosody: she = Mary Jane warned Mary she was in danger. ⇒ lexical semantics (warned): she = Mary Tony Blair met President Yeltsin. The old man had just recovered from a heart attack. ⇒ world knowledge: the old man = Yeltsin

Caroline Sporleder csporled@coli.uni-sb.de FSLT: Discourse

slide-61
SLIDE 61

Ambiguity and Disambiguating Factors

Jane told Peter he was in danger. ⇒ Agreement (gender, number etc.): he = Peter Peter said that John is running the business for himself. ⇒ syntactic constraints: himself = John The cat did not come down from the tree. It was scared. ⇒ selectional preferences: it = the cat Jane told Mary she was in danger. ⇒ salience (e.g., subject position): she = Jane Jane told Mary SHE was in danger. ⇒ prosody: she = Mary Jane warned Mary she was in danger. ⇒ lexical semantics (warned): she = Mary Tony Blair met President Yeltsin. The old man had just recovered from a heart attack. ⇒ world knowledge: the old man = Yeltsin Georg Bernreuter ... Mr. Bernreuter

Caroline Sporleder csporled@coli.uni-sb.de FSLT: Discourse

slide-62
SLIDE 62

Ambiguity and Disambiguating Factors

Jane told Peter he was in danger. ⇒ agreement (gender, number etc.): he = Peter
Peter said that John is running the business for himself. ⇒ syntactic constraints: himself = John
The cat did not come down from the tree. It was scared. ⇒ selectional preferences: it = the cat
Jane told Mary she was in danger. ⇒ salience (e.g., subject position): she = Jane
Jane told Mary SHE was in danger. ⇒ prosody: she = Mary
Jane warned Mary she was in danger. ⇒ lexical semantics (warned): she = Mary
Tony Blair met President Yeltsin. The old man had just recovered from a heart attack. ⇒ world knowledge: the old man = Yeltsin
Georg Bernreuter ... Mr. Bernreuter ⇒ surface string similarity

slide-65
SLIDE 65

Co-Reference Resolution

Difficulties:
different form ⇏ different referents (Georg Bernreuter vs. the Bavarian brewer vs. he)
same form ⇏ same referents (the cat; Michael Jackson the singer vs. Michael Jackson the British general)

slide-66
SLIDE 66

Co-Reference Resolution Steps

1 identify anaphor / markable
  difficulties: NPs which aren’t referring expressions; pleonastic it (It’s raining.) etc.
2 identify potential antecedents
3 find correct co-referent for each anaphor / markable

slide-67
SLIDE 67

Co-Reference Resolution Approaches

Before 1990 . . .
reference resolution = pronoun resolution
rule-based (manually created rules)
Examples:
SHRDLU (Winograd, 1972): complex heuristics (focus, obliqueness etc.)
Hobbs (1976, 1978): heuristically directed search in parse trees
centering-based (Brennan et al., 1987)
Lappin & Leass (1994): agreement, syntax, salience

After 1990 . . .
corpus-based (co-occurrence statistics, machine learning) ⇒ Message Understanding Conference (MUC): annotated data
reference resolution for non-pronominal expressions (definite NPs, bridging; e.g., Vieira & Poesio, 2000)

slide-68
SLIDE 68

Rule-based Approaches

slide-69
SLIDE 69

RAP (Lappin & Leass, 1994)

Resolution of Anaphora Procedure
Scope
third person pronouns
lexical anaphors (reflexives and reciprocals)
Software
numerous (re-)implementations, e.g., http://wing.comp.nus.edu.sg/~qiu/NLPTools/JavaRAP.html

slide-70
SLIDE 70

RAP (Lappin & Leass, 1994)

Components
procedure for identifying pleonastic/expletive pronouns
morpho-syntactic filters
salience weighting
a resolution procedure

slide-71
SLIDE 71

RAP: Pleonastic Pronoun Filter

pre-specified list of modal adjectives (necessary, certain, good, possible . . . )
pre-specified list of cognitive verbs (recommend, think, believe, expect . . . )
manually built rules, e.g.:
It is modaladj that S.
It is cogv-ed that S.
It is time to VP.
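
The three rule templates above can be approximated with regular expressions. The following is only a minimal sketch, not RAP's actual implementation: the word lists contain just the items named on the slide, the participle list is written out by hand, and the function name `is_pleonastic` is made up for illustration.

```python
import re

# Only the words named on the slide; the real lists are longer.
MODAL_ADJECTIVES = ["necessary", "certain", "good", "possible"]
# Past participles of the slide's cognitive verbs, listed by hand (assumption).
COGNITIVE_PARTICIPLES = ["recommended", "thought", "believed", "expected"]


def _alternatives(words):
    """Build a regex alternation such as 'necessary|certain|...'."""
    return "|".join(re.escape(w) for w in words)


PLEONASTIC_PATTERNS = [
    # It is modaladj that S.
    re.compile(rf"\bit\s+is\s+(?:{_alternatives(MODAL_ADJECTIVES)})\s+that\b", re.IGNORECASE),
    # It is cogv-ed that S.
    re.compile(rf"\bit\s+is\s+(?:{_alternatives(COGNITIVE_PARTICIPLES)})\s+that\b", re.IGNORECASE),
    # It is time to VP.
    re.compile(r"\bit\s+is\s+time\s+to\b", re.IGNORECASE),
]


def is_pleonastic(sentence: str) -> bool:
    """True if the sentence matches one of the three rule templates."""
    return any(p.search(sentence) for p in PLEONASTIC_PATTERNS)


print(is_pleonastic("It is necessary that we leave."))  # matches template 1
print(is_pleonastic("It was scared."))                  # matches no template
```

Surface patterns like these only cover the listed templates; other pleonastic uses would need further rules.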

slide-72
SLIDE 72

RAP: Morpho-Syntactic Filters

expressions that don’t agree in person, number and gender are not co-referent
manually built syntactic filter rules (e.g., in “John seems to want to see him.” and “His portrait of John is interesting.” the pronoun cannot refer to John)
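
The agreement part of the filter can be sketched as follows. This is an illustration under assumptions, not RAP's code: the `Mention` class and its feature encoding are hypothetical, and treating an unknown feature value as compatible is a design choice made here. The syntactic filter rules are not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    """A referring expression with optional agreement features (hypothetical encoding)."""
    text: str
    person: Optional[int] = None   # 1, 2, 3; None = unknown
    number: Optional[str] = None   # "sg" or "pl"; None = unknown
    gender: Optional[str] = None   # "m", "f", "n"; None = unknown


def agrees(a: Mention, b: Mention) -> bool:
    """Two mentions clash only if a feature is known for both and differs."""
    for feature in ("person", "number", "gender"):
        x, y = getattr(a, feature), getattr(b, feature)
        if x is not None and y is not None and x != y:
            return False
    return True


def filter_candidates(pronoun: Mention, candidates: List[Mention]) -> List[Mention]:
    """Keep only the candidates that agree with the pronoun."""
    return [c for c in candidates if agrees(pronoun, c)]


# "Jane told Peter he was in danger."
he = Mention("he", person=3, number="sg", gender="m")
jane = Mention("Jane", person=3, number="sg", gender="f")
peter = Mention("Peter", person=3, number="sg", gender="m")
print([c.text for c in filter_candidates(he, [jane, peter])])
```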

slide-73
SLIDE 73

RAP: Salience Weighting

Salience Factors
associated with one or more discourse referents (which are in its scope)
each factor is weighted
all weights decay as discourse goes on (reduced by 2 for each new sentence)
factor is removed when weight reaches zero

slide-74
SLIDE 74

RAP: Salience Weighting

Salience Factors
sentence recency
subject emphasis: [The postman] delivered a parcel to Peter.
existential emphasis: There are only a few restrictions on the courses one can choose.
accusative emphasis: The postman delivered [a parcel] to Peter.
indirect object and oblique complement emphasis: The postman delivered a parcel to [Peter].
head noun emphasis: embedded NPs don’t receive this factor (e.g., Experts still discuss the impact of Opel’s restructuring plans)
non-adverbial emphasis: any NP not contained in an adverbial PP demarcated by a separator (e.g., not: In the first year, the company made a healthy profit.)

slide-75
SLIDE 75

RAP: Salience Weighting

Initial Weights
sentence recency                                 100
subject emphasis                                  80
existential emphasis                              70
accusative emphasis                               50
indirect object and oblique compl. emphasis       40
head noun emphasis                                80
non-adverbial emphasis                            50
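
As a small illustration of how these weights combine, the sketch below sums the factors that apply to an expression and applies the per-sentence decay used on these slides (2 per new sentence). The dictionary keys and the function name `salience` are made up for this example.

```python
# Initial weights from the table above (key names are illustrative).
INITIAL_WEIGHTS = {
    "recency": 100,
    "subject": 80,
    "existential": 70,
    "accusative": 50,
    "indirect_oblique": 40,
    "head_noun": 80,
    "non_adverbial": 50,
}

DECAY_PER_SENTENCE = 2  # decay step as stated on the slides


def salience(factors, sentences_ago=0):
    """Sum the weights of the given factors after `sentences_ago` decay steps.

    A factor whose weight has decayed to zero no longer contributes.
    """
    total = 0
    for factor in factors:
        weight = INITIAL_WEIGHTS[factor] - DECAY_PER_SENTENCE * sentences_ago
        total += max(weight, 0)
    return total


john_smith = ["recency", "subject", "head_noun", "non_adverbial"]
print(salience(john_smith))                   # 310
print(salience(john_smith, sentences_ago=1))  # 302
```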

slide-76
SLIDE 76

RAP: Salience Weighting

Equivalence classes
referring expressions are grouped into equivalence classes (note: no co-reference between definite NPs)
each equivalence class has a salience weight (= the sum of the weights of all salience factors associated with the most recent expression in the class)

slide-77
SLIDE 77

RAP: Resolution Procedure

In a nutshell:

1 classify referring NPs in current sentence (definite NP, indefinite NP, pleonastic pronoun, other pronoun)
2 for all non-pleonastic pronouns apply morpho-syntactic filters and compute remaining potential antecedents
3 modify salience scores for possible anaphor-antecedent pairs:
  if antecedent follows anaphor, decrease weight by 175 (i.e., cataphora are penalised)
  if grammatical roles of anaphor and antecedent are parallel, increase weight by 35 (i.e., parallelism is rewarded)
4 rank possible antecedents by salience score
5 apply salience threshold
6 of antecedents above the threshold choose highest scoring one; in case of a tie select the antecedent closest to the anaphor
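
Steps 3 to 6 can be sketched in a few lines. This is a simplified illustration, not the original procedure: the `Candidate` class, token positions and role labels are hypothetical, the candidates are assumed to have already passed the filters of step 2, and the threshold value is not given on the slide, so it is a parameter with an arbitrary default.

```python
from dataclasses import dataclass

CATAPHORA_PENALTY = 175   # antecedent follows the anaphor
PARALLELISM_BONUS = 35    # same grammatical role as the anaphor


@dataclass
class Candidate:
    text: str
    salience: int   # salience weight of the candidate's equivalence class
    position: int   # token offset in the discourse (hypothetical encoding)
    role: str       # grammatical role, e.g. "subj" or "acc"


def resolve(anaphor_position, anaphor_role, candidates, threshold=0):
    """Return (text, adjusted score) of the chosen antecedent, or None."""
    scored = []
    for c in candidates:
        score = c.salience
        if c.position > anaphor_position:      # step 3: penalise cataphora
            score -= CATAPHORA_PENALTY
        if c.role == anaphor_role:             # step 3: reward parallelism
            score += PARALLELISM_BONUS
        if score >= threshold:                 # step 5: salience threshold
            scored.append((score, c))
    if not scored:
        return None
    # steps 4 and 6: highest score wins; ties go to the closest candidate
    score, best = max(scored, key=lambda sc: (sc[0], -abs(sc[1].position - anaphor_position)))
    return best.text, score


# "John Smith talks about the EU. He likes the family of nations. It is a good thing."
candidates = [
    Candidate("the EU", 264, 5, "acc"),
    Candidate("the family of nations", 272, 9, "acc"),
    Candidate("nations", 194, 12, "acc"),
    Candidate("a good thing", 280, 16, "acc"),   # follows "it": 280 - 175 = 105
]
print(resolve(13, "subj", candidates))
```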

slide-78
SLIDE 78

RAP: Pronoun Resolution

Example
John Smith talks about the EU.
Weights:
John Smith: 100 (recency) + 80 (subj) + 80 (head noun) + 50 (non-adv) = 310
the EU: 100 (recency) + 50 (acc) + 80 (head noun) + 50 (non-adv) = 280

slide-80
SLIDE 80

RAP: Pronoun Resolution

Example
John Smith talks about the EU. He likes the family of nations.
Weights:
John Smith: 98 (recency) + 78 (subj) + 78 (head noun) + 48 (non-adv) = 302
the EU: 98 (recency) + 48 (acc) + 78 (head noun) + 48 (non-adv) = 272
the family of nations: 100 (recency) + 50 (acc) + 80 (head noun) + 50 (non-adv) = 280
nations: 100 (recency) + 50 (acc) + 50 (non-adv) = 200
Resolving “he”:
“he” = “John Smith” by morpho-syntactic filter

slide-81
SLIDE 81

RAP: Pronoun Resolution

Example
John Smith talks about the EU. He likes the family of nations.
Weights (the class of “John Smith” now takes the weights of its most recent expression, “He”):
John Smith: 100 (recency) + 80 (subj) + 80 (head noun) + 50 (non-adv) = 310
the EU: 98 (recency) + 48 (acc) + 78 (head noun) + 48 (non-adv) = 272
the family of nations: 100 (recency) + 50 (acc) + 80 (head noun) + 50 (non-adv) = 280
nations: 100 (recency) + 50 (acc) + 50 (non-adv) = 200
Resolving “he”:
“he” = “John Smith” by morpho-syntactic filter

slide-83
SLIDE 83

RAP: Pronoun Resolution

Example
John Smith talks about the EU. He likes the family of nations. It is a good thing.
Weights:
John Smith: 98 (recency) + 78 (subj) + 78 (head noun) + 48 (non-adv) = 302
the EU: 96 (recency) + 46 (acc) + 76 (head noun) + 46 (non-adv) = 264
the family of nations: 98 (recency) + 48 (acc) + 78 (head noun) + 48 (non-adv) = 272
nations: 98 (recency) + 48 (acc) + 48 (non-adv) = 194
a good thing: 100 (recency) + 50 (acc) + 80 (head noun) + 50 (non-adv) = 280
Resolving “it”:
“the family of nations” (272) > “the EU” (264) > “nations” (194) > “a good thing” (280 - 175 = 105, cataphor)

slide-84
SLIDE 84

RAP: Evaluation

Set-Up
unseen test set of 345 randomly selected sentence pairs (sentence with pronoun plus preceding sentence)
subject to constraints:
RAP generates a candidate list of at least two elements
correct antecedent is on that list
Result
86% accuracy

slide-85
SLIDE 85

RAP

Can you think of any cases that RAP would not do well on?

slide-86
SLIDE 86

Machine Learning Approaches

slide-87
SLIDE 87

Hybrid RAP

slide-91
SLIDE 91

RAPSTAT (Dagan & Itai (1990, 1991)): RAP Hybrid with Statistics

Motivation
RAP disregards selectional preferences.
Example
We gave the bananas to the monkeys because they were hungry.
Salience Scores
the bananas: 100 (recency) + 50 (acc) + 80 (head) + 50 (non-adv) = 280
the monkeys: 100 (recency) + 40 (ind. obj) + 80 (head) + 50 (non-adv) = 270
Resolving “they”
RAP selects “they” = “the bananas” (280 > 270)
however: p(areHungry(bananas)) << p(areHungry(monkeys))

slide-92
SLIDE 92

Modelling Selectional Preferences

Any ideas how to do this?

slide-93
SLIDE 93

RAPSTAT

Use statistics to improve anaphora resolution
selectional preferences are automatically computed from a corpus (co-occurrence statistics)
if the statistics point to a different antecedent than RAP and the salience difference between the two potential antecedents is not too high, select the statistically more plausible antecedent

slide-98
SLIDE 98

RAPSTAT

Example
They held tax money aside on the basis that the government said it was going to collect it.
Subject(it, collect)
Object(it, collect)
co-occurrence statistics:
Subject(money, collect) = 5
Subject(government, collect) = 198
Object(money, collect) = 149
Object(government, collect) = 0
⇒ subject it = government
⇒ object it = money
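
The decision rule can be sketched with the counts from this example. The code below is an illustration only: the function names and the `max_salience_gap` parameter are hypothetical, and since the slides only say the salience difference must be “not too high”, the default value is an arbitrary placeholder.

```python
# Co-occurrence counts from the slide: (role, noun, verb) -> frequency
COUNTS = {
    ("subj", "money", "collect"): 5,
    ("subj", "government", "collect"): 198,
    ("obj", "money", "collect"): 149,
    ("obj", "government", "collect"): 0,
}


def statistically_preferred(role, verb, nouns, counts=COUNTS):
    """Noun that co-occurs most often with the verb in the given role."""
    return max(nouns, key=lambda noun: counts.get((role, noun, verb), 0))


def rapstat_choice(rap_choice, salience, role, verb, counts=COUNTS, max_salience_gap=100):
    """Override RAP's choice if the statistics disagree and the salience gap is small.

    `salience` maps each candidate noun to its RAP salience score.
    `max_salience_gap` is a placeholder value, not taken from the slides.
    """
    stat_choice = statistically_preferred(role, verb, list(salience), counts)
    gap = salience[rap_choice] - salience[stat_choice]
    if stat_choice != rap_choice and gap <= max_salience_gap:
        return stat_choice
    return rap_choice


print(statistically_preferred("subj", "collect", ["money", "government"]))  # government
print(statistically_preferred("obj", "collect", ["money", "government"]))   # money
```

With made-up salience scores of 280 for money and 270 for government, `rapstat_choice("money", {"money": 280, "government": 270}, "subj", "collect")` would switch the subject pronoun to government.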

slide-99
SLIDE 99

RAPSTAT

Comparison RAP vs. RAPSTAT
RAPSTAT has 89% accuracy (vs. 86% for RAP)
overrules RAP’s decision in 22% of the cases, 61% of these are correctly resolved by RAPSTAT

slide-100
SLIDE 100

From Anaphora to Co-reference Resolution

slide-101
SLIDE 101

Moving on to co-reference resolution

Co-Reference Resolution
identity of reference between two markables (definite NPs, proper names, demonstrative NPs, appositives, embedded NPs, pronouns etc.)
annotated data from Message Understanding Conferences (MUC-6, MUC-7)
Example
Ms Washington’s candidacy is being championed by several powerful lawmakers including her boss, Chairman John Dingell. She is currently a counsel to the committee.

slide-102
SLIDE 102

Moving on to co-reference resolution

Example: markables
[[Ms Washington]’s candidacy] is being championed by [several powerful lawmakers] including [[her] boss], [Chairman John Dingell]. [She] is currently [a counsel] to [the committee].

slide-104
SLIDE 104

Soon et al. (2001): Overview

supervised machine learning (C5 decision tree) on MUC-6 and MUC-7 data
12 shallow features

slide-106
SLIDE 106

Different methods for extracting training data

Generous
all pairs in a co-reference chain are positive examples
all other pairs are negative examples
More selective (Soon et al., 2001)
adjacent pairs in a co-reference chain are positive training data
for all markables between the two co-referent expressions, pair the markable with either expression and label as ’negative’
Note: in both cases (especially the first one) the training set will be imbalanced.

slide-107
SLIDE 107

Soon et al. (2001): Training Data

Example
[[Ms Washington]’s candidacy] is being championed by [several powerful lawmakers] including [[her] boss], [Chairman John Dingell]. [She] is currently [a counsel] to [the committee].
Training Data
(Ms Washington, her): pos
(her, she): pos
(her boss, Chairman John Dingell): pos
(Ms Washington, several powerful lawmakers): neg
(her, several powerful lawmakers): neg
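
The more selective extraction scheme can be sketched as below. This follows the description on these slides (each intervening markable is paired with either expression), which may differ in detail from the original system. The function name, the ordering of nested markables, and the index-based chain encoding are assumptions made for the example.

```python
def training_pairs(markables, chains):
    """Generate (first, second, label) training examples.

    markables: expressions in textual order (ordering of nested NPs is an assumption)
    chains:    co-reference chains as sorted lists of indices into `markables`
    """
    examples = []
    for chain in chains:
        # adjacent expressions in a chain form positive examples
        for i, j in zip(chain, chain[1:]):
            examples.append((markables[i], markables[j], "pos"))
            # every markable in between is paired with either expression
            for m in range(i + 1, j):
                examples.append((markables[i], markables[m], "neg"))
                examples.append((markables[m], markables[j], "neg"))
    return examples


markables = [
    "Ms Washington's candidacy", "Ms Washington", "several powerful lawmakers",
    "her boss", "her", "Chairman John Dingell", "She", "a counsel", "the committee",
]
chains = [[1, 4, 6], [3, 5]]   # {Ms Washington, her, She}, {her boss, Chairman John Dingell}

for example in training_pairs(markables, chains):
    print(example)
```

Even on this short text the negatives outnumber the positives, which illustrates the class imbalance noted earlier.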

slide-108
SLIDE 108

Soon et al. (2001): Features

Twelve shallow features:
distance (in terms of sentences): numeric
pronoun features (i-pronoun, j-pronoun): boolean
string match (excluding determiners): boolean
j type features (def. NP, dem. NP): boolean
number agreement: boolean
semantic class agreement (WordNet, most frequent sense): true, false, unknown
gender agreement: true, false, unknown
both proper names (i and j): boolean
alias feature (“Mr. Simpson” - “Bent Simpson”): boolean
appositive feature: boolean
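
A few of these features can be computed from surface strings alone. The sketch below covers only distance, the pronoun features, string match and the two j-type features. The word lists and the function `pair_features` are hypothetical simplifications, and features that need a parser, WordNet or a gender lexicon are left out.

```python
DETERMINERS = {"a", "an", "the", "this", "that", "these", "those"}
DEMONSTRATIVES = {"this", "that", "these", "those"}
# Small illustrative pronoun list, not exhaustive.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them", "his", "its", "their",
            "himself", "herself", "itself", "themselves"}


def strip_determiners(np_text):
    """Lower-case the NP and drop determiners before comparing strings."""
    return " ".join(t for t in np_text.lower().split() if t not in DETERMINERS)


def pair_features(i_text, j_text, i_sentence, j_sentence):
    """Shallow features for antecedent candidate i and anaphor candidate j."""
    j_tokens = j_text.lower().split()
    return {
        "dist": j_sentence - i_sentence,
        "i_pronoun": i_text.lower() in PRONOUNS,
        "j_pronoun": j_text.lower() in PRONOUNS,
        "str_match": strip_determiners(i_text) == strip_determiners(j_text),
        "def_np": bool(j_tokens) and j_tokens[0] == "the",
        "dem_np": bool(j_tokens) and j_tokens[0] in DEMONSTRATIVES,
    }


print(pair_features("Ms Washington", "She", 0, 1))
```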

slide-109
SLIDE 109

Soon et al. (2001): Decision Tree Learnt

[Figure: decision tree learnt by Soon et al. (2001). Internal nodes test Str-Match, J-Pronoun, Gender, I-Pronoun, Dist (> 0 vs. <= 0), Number, Appositive and Alias; leaves are labelled pos or neg.]

slide-110
SLIDE 110

Soon et al. (2001): Building Co-reference Chains

Greedy chain building algorithm

1 compare each markable j with each preceding markable i, starting from the closest
2 apply decision tree to the pair (j, i)
3 stop as soon as decision tree returns ’true’
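
The three steps above amount to a closest-first search. In the sketch below the trained decision tree is replaced by an arbitrary callable `is_coreferent`, which is a stand-in for illustration; the function name and the toy classifier are assumptions.

```python
def closest_first_links(markables, is_coreferent):
    """Link each markable j to the closest preceding markable i accepted by the classifier.

    Returns a dict {j_index: i_index}; markables without an antecedent are absent.
    """
    links = {}
    for j in range(1, len(markables)):
        for i in range(j - 1, -1, -1):          # step 1: closest preceding markable first
            if is_coreferent(markables[i], markables[j]):   # step 2: classify the pair
                links[j] = i
                break                           # step 3: stop at the first 'true'
    return links


# Toy classifier: a lookup table standing in for the decision tree.
accepted = {("Ms Washington", "her"), ("Ms Washington", "She"), ("her", "She"),
            ("her boss", "Chairman John Dingell")}
markables = ["Ms Washington", "several powerful lawmakers", "her",
             "her boss", "Chairman John Dingell", "She"]

print(closest_first_links(markables, lambda a, b: (a, b) in accepted))
```

Following the links transitively yields the co-reference chains.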

slide-111
SLIDE 111

Soon et al. (2001): Evaluation

Scores for MUC-6 and MUC-7
Recall: 56-59%
Precision: 66-67%
F-Score: 60-63%
⇒ competitive with other systems

slide-112
SLIDE 112

How would you try to improve on Soon et al. (2001)?

slide-113
SLIDE 113

Beyond Soon et al. (2001). . .

Ng and Cardie (2002): improve on Soon et al. through:

extra-linguistic changes to the learning framework
large-scale expansion of the feature set, incorporating “more sophisticated linguistic knowledge”

MUC-6 and MUC-7 F-Scores: 70.4% and 63.4% (Soon et al.: 62.6% and 60.4%)

slide-114
SLIDE 114

Changes to the Learning Framework

Best-first instead of greedy clustering:
Soon et al. search right-to-left for a possible antecedent and select the first (i.e., rightmost) expression which is classified as co-referent
Ng and Cardie search right-to-left and select the best expression that is classified as co-referent (i.e., the one that scores highest)
Split string match feature:
implement separate string match features for different types of expressions (pronouns, proper names, non-pronominal NPs)
Results (C4.5 and Ripper)
statistically significant gains in precision over Soon et al. baseline
no drop in recall
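
The difference between the two search strategies shows up as soon as the classifier returns a confidence score. The sketch below is an illustration under assumptions: `score` stands in for the learned classifier, a pair counts as co-referent when its score exceeds a hypothetical `threshold`, and both function names are made up.

```python
def best_first_links(markables, score, threshold=0.5):
    """Link each markable j to the highest-scoring preceding markable above the threshold."""
    links = {}
    for j in range(1, len(markables)):
        best_i, best_score = None, threshold
        for i in range(j - 1, -1, -1):               # search right-to-left
            s = score(markables[i], markables[j])
            if s > best_score:                       # keep the best candidate seen so far
                best_i, best_score = i, s
        if best_i is not None:
            links[j] = best_i
    return links


def closest_first_links(markables, score, threshold=0.5):
    """Baseline: stop at the first (rightmost) candidate above the threshold."""
    links = {}
    for j in range(1, len(markables)):
        for i in range(j - 1, -1, -1):
            if score(markables[i], markables[j]) > threshold:
                links[j] = i
                break
    return links


# Toy scores: both candidates pass the threshold, the more distant one scores higher.
toy_scores = {("A", "it"): 0.9, ("B", "it"): 0.6}
toy_score = lambda a, b: toy_scores.get((a, b), 0.0)

print(closest_first_links(["A", "B", "it"], toy_score))  # links "it" to B
print(best_first_links(["A", "B", "it"], toy_score))     # links "it" to A
```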

slide-115
SLIDE 115

Expanding the feature set

41 new features, e.g.:
more complex string matching
more semantic features (e.g., testing for ancestor-descendant relationships in WordNet, graph-distance in WordNet)
26 new grammatical features
hard-coded linguistic constraints, indicator features (agreement, binding etc.)
output of rule-based pronoun resolution system

slide-116
SLIDE 116

Expanding the feature set

Results
significant increases in recall
even bigger decreases in precision ⇒ F-Score goes down
Error Analysis
drop in precision due to bad precision on common nouns
counter-intuitive rules were learnt
Example
(i, j) = coreferent iff properName(i) ∧ definiteNP(j) ∧ subject(j) ∧ semClass(i) = semClass(j) ∧ distance(i, j) ≤ 1
⇒ rule covers 38 examples with 18 exceptions
⇒ this is a data sparseness problem!

slide-117
SLIDE 117

Expanding the feature set

Solution: manual feature selection
on data overall: increase in F-Score
but large drop in precision for pronouns
Conclusion
pronoun and common noun resolution remain challenging

slide-118
SLIDE 118

Further Approaches

Don’t treat co-reference resolution as a classification task!
Intuitively pairwise decisions are not what one wants
⇒ ranking instead of classification (e.g., Yang et al., 2003; Denis and Baldridge, 2007)
⇒ graph partitioning to convert pairwise scores into a final coherent clustering (McCallum and Wellner, 2004)

slide-119
SLIDE 119

Summary

Co-reference Resolution . . .
. . . is a heterogeneous task (pronoun resolution, proper name matching, co-reference resolution for definite NPs) ⇒ one-size-fits-all may not be the best strategy
. . . is a complex task, many factors are involved (focus structure, similarity of surface strings, grammatical constraints, semantic constraints etc.)
. . . maybe shouldn’t be modelled as a classification task (artificial pairwise decisions, class imbalance etc.)

slide-120
SLIDE 120

Bibliography I

Brennan, Susan E., Marilyn W. Friedman and Carl J. Pollard, A centering approach to pronouns, Proceedings of the 25th Annual Meeting of the Association for Computational Linguistics, pp. 155-162, July 06-09, 1987, Stanford, California.
Dagan, Ido and Alon Itai, Automatic Acquisition of Constraints for the Resolution of Anaphora References and Syntactic Ambiguities, Proceedings of COLING, pp. 330-332, 1990.
Dagan, Ido and Alon Itai, A Statistical Filter for Resolving Pronoun References, In: Y. A. Feldman and A. Bruckstein (Eds.), Artificial Intelligence and Computer Vision, Elsevier Science Publishers B.V., 1991, pp. 125-135.
Denis, Pascal and Jason Baldridge, A ranking approach to pronoun resolution, In Proceedings of IJCAI-2007, Hyderabad, India, 2007.

slide-121
SLIDE 121

Bibliography II

Hobbs, Jerry R., Pronoun Resolution, Research Report 76-1, Department of Computer Sciences, City College, City University of New York, 1976.
Hobbs, Jerry R., Resolving Pronoun References, Lingua, Vol. 44, pp. 311-338, 1978.
Lappin, Shalom and Herbert J. Leass, An Algorithm for Pronominal Anaphora Resolution, Computational Linguistics, 20(4), pp. 535-561, 1994.
McCallum, Andrew and Ben Wellner, Conditional Models of Identity Uncertainty with Application to Noun Coreference, Neural Information Processing Systems (NIPS), 2004.

slide-122
SLIDE 122

Bibliography III

Ng, Vincent and Claire Cardie, Improving Machine Learning Approaches to Coreference Resolution, Proceedings of the ACL, 2002.
Soon, Wee Meng, Daniel Chung Yong Lim and Hwee Tou Ng, A Machine Learning Approach to Coreference Resolution of Noun Phrases, Computational Linguistics, 27(4), pp. 521-544, 2001.
Vieira, Renata and Massimo Poesio, An Empirically-Based System for Processing Definite Descriptions, Computational Linguistics, 26(4), pp. 539-593, 2000.
Winograd, Terry, Understanding Natural Language, Academic Press, 1972.
Yang, X., G. Zhou, J. Su, and C.L. Tan, Coreference resolution using competitive learning approach, In Proceedings of ACL, pp. 176-183, 2003.