SLIDE 1

Towards Probabilistic Acceptors and Transducers for Feature Structures

Daniel Quernheim

Institute for Natural Language Processing, University of Stuttgart

daniel@ims.uni-stuttgart.de

Kevin Knight

Information Sciences Institute, University of Southern California

knight@isi.edu

Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-6)

July 12, 2012

Quernheim and Knight Towards Probabilistic Acceptors and Transducers for Feature Structures July 12, 2012 1 / 21

SLIDE 5

Linguistic structures

Strings

surface forms, phonology, morphology

Trees

syntax

Feature structures (= directed acyclic graphs)

deep syntax (LFG etc.)

semantics (abstract meaning representations)

SLIDE 6

Feature structures

[ INSTANCE  charge
  THEME     [1] [ INSTANCE  person
                  NAME      “Pascale” ]
  PRED      [ INSTANCE  and
              OP1       [ INSTANCE  resist
                          AGENT     [1]
                          THEME     [ INSTANCE  arrest
                                      THEME     [1] ] ]
              OP2       [ INSTANCE  intoxicate
                          THEME     [1]
                          LOCATION  [ INSTANCE  public ] ] ] ]
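The tag 1 marks a single substructure that several features point to, which is what makes the structure a dag rather than a tree. A minimal sketch, assuming nested Python dicts stand in for feature structures and object identity stands in for the tag (the helper name `incoming_edges` is made up for this illustration):

```python
# Shared substructure: the one object that tag 1 refers to.
person = {"INSTANCE": "person", "NAME": "Pascale"}

charge = {
    "INSTANCE": "charge",
    "THEME": person,
    "PRED": {
        "INSTANCE": "and",
        "OP1": {
            "INSTANCE": "resist",
            "AGENT": person,
            "THEME": {"INSTANCE": "arrest", "THEME": person},
        },
        "OP2": {
            "INSTANCE": "intoxicate",
            "THEME": person,
            "LOCATION": {"INSTANCE": "public"},
        },
    },
}


def incoming_edges(root):
    """Count, per substructure (keyed by id), how many features point at it."""
    counts, seen, stack = {}, set(), [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue  # expand each shared node only once
        seen.add(id(node))
        for value in node.values():
            if isinstance(value, dict):
                counts[id(value)] = counts.get(id(value), 0) + 1
                stack.append(value)
    return counts


counts = incoming_edges(charge)
print(counts[id(person)])  # the person node has four parents
```

Every other substructure has exactly one parent, so copying `person` four times would give a tree with the same paths but without the information that all four roles are filled by the same entity.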

SLIDE 7

Directed acyclic graphs

[Figure: dag with nodes CHARGE, AND, RESIST, INTOXICATE, ARREST, PERSON, PUBLIC, PASCALE]

CHARGE → charge(theme, pred)
AND → and(op1, op2)
RESIST → resist(agent, theme)
ARREST → arrest(theme)
INTOXICATE → intoxicate(theme, location)
PUBLIC → public()
PERSON → person(name)
PASCALE → "Pascale"

SLIDE 10

Translation pipelines

Syntax-based MT pipeline

fstring → translate → etree → language model → estring

Formal device for each part:

fstring: FSA
translate: nl-XTOPs⁻¹
etree: RTG
language model: FSA
estring: FSA

◮ The individual components are efficiently represented as weighted tree acceptors and transducers.

estring = BESTPATH(INTERSECT(language model, YIELD(BACKWARDS(translate, fstring)))).
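The nesting of the four operations can be illustrated with a toy in which every automaton is replaced by a small weighted dictionary. This is only a sketch of how the operations feed into each other, not of the automata algorithms themselves; the Spanish input, the candidate trees and all weights are made up for the illustration.

```python
def leaves(tree):
    """Yield of a tree given as (label, children) with words as plain strings."""
    if isinstance(tree, tuple):
        return [w for child in tree[1] for w in leaves(child)]
    return [tree]


def backwards(translate, fstring):
    """Toy BACKWARDS: weighted etrees that the translation model pairs with fstring."""
    return dict(translate.get(fstring, {}))


def yield_(forest):
    """Toy YIELD: turn weighted trees into weighted strings, summing over trees."""
    strings = {}
    for tree, weight in forest.items():
        s = " ".join(leaves(tree))
        strings[s] = strings.get(s, 0.0) + weight
    return strings


def intersect(lm, strings):
    """Toy INTERSECT: keep strings known to both sides, multiplying weights."""
    return {s: w * lm[s] for s, w in strings.items() if s in lm}


def bestpath(strings):
    """Toy BESTPATH: the highest-weighted string."""
    return max(strings, key=strings.get)


# Hypothetical data: one foreign sentence with two candidate English trees.
TRANSLATE = {
    "el chico duerme": {
        ("S", (("NP", ("the", "boy")), ("VP", ("sleeps",)))): 0.6,
        ("S", (("NP", ("the", "boy")), ("VP", ("sleep",)))): 0.4,
    }
}
LANGUAGE_MODEL = {"the boy sleeps": 0.9, "the boy sleep": 0.1}

estring = bestpath(
    intersect(LANGUAGE_MODEL, yield_(backwards(TRANSLATE, "el chico duerme")))
)
print(estring)
```

In a real system each dictionary would be a weighted acceptor or transducer and each function the corresponding automaton construction.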

SLIDE 12

Translation pipelines (2)

Semantics-based MT pipeline

fstring → understand → AMR → rank → AMR → generate → etree → rank → estring

◮ No suitable automaton framework is known!


SLIDE 14

Algorithms and automata

                        string automata               tree automata                 graph automata
k-best                  paths through a WFSA          trees in a weighted forest    ?
EM training             Forward-backward EM           Tree transducer EM training   ?
Determinization         of weighted string acceptors  of weighted tree acceptors    ?
Transducer composition  WFST composition              Many transducers not closed   ?
                                                      under composition
General tools           AT&T FSM, Carmel, OpenFST     Tiburon                       ?

Table: General-purpose algorithms for strings, trees and feature structures.

SLIDE 17

Algorithms and automata (2)

Our goal

◮ Find an adequate automaton model for the pipeline parts
◮ Investigate algorithms and fill all the blanks!

Candidates

◮ Treating everything as a tree (too weak?)
◮ Unification grammars (HPSG, LFG) (too powerful?)
◮ Hyperedge replacement grammar (too powerful?)
◮ Some straightforward extension of string/tree automata?

SLIDE 20

Dag automata

finite string automaton (FSA): one input state, one input symbol, one output state

[Figure: . . . p σ q . . .]

finite tree automaton (FTA): one input state, one input symbol, many output states

[Figure: . . . p σ (q1, q2, q3) . . .]

finite dag automaton (FDA?): many input states, one input symbol, many output states

[Figure: . . . (p1, p2) σ (q1, q2, q3) . . .]
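The three transition shapes differ only in how many states may sit on the incoming and outgoing side of a symbol. A minimal sketch, assuming a transition is stored as a pair of state tuples around one symbol (the class and function names are made up for this illustration):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    inputs: tuple   # states on the incoming edges
    symbol: str
    outputs: tuple  # states on the outgoing edges


def smallest_model(t):
    """Name the most restrictive automaton type that allows this transition shape."""
    if len(t.inputs) == 1 and len(t.outputs) == 1:
        return "string"
    if len(t.inputs) == 1:
        return "tree"
    return "dag"


fsa_step = Transition(("p",), "sigma", ("q",))
fta_step = Transition(("p",), "sigma", ("q1", "q2", "q3"))
fda_step = Transition(("p1", "p2"), "sigma", ("q1", "q2", "q3"))

for step in (fsa_step, fta_step, fda_step):
    print(step, "->", smallest_model(step))
```

Seen this way, a string transition is a tree transition with one output state, and a tree transition is a dag transition with one input state.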

SLIDE 24

Dag automata (2)

KAMIMURA and SLUTZKI (1981, 1982)

◮ Dag acceptors and dag-to-tree transducers
◮ They proved a couple of technical properties, no algorithms
◮ We investigate their model with some adjustments:
    ◮ not only adjacent leaves can be connected
    ◮ top-down transducers instead of bottom-up
    ◮ we introduce weights (probabilities)

[Figure: dag with nodes WANT, BELIEVE, BOY, GIRL]

SLIDE 25

Example dag automaton

q → WANT(r, q)  0.3
q → BELIEVE(r, q)  0.2
q → r  0.4  |  ∅  0.1
r → BOY  0.3  |  GIRL  0.3  |  ∅  0.1
[r, r] → r  0.2
[r, r, r] → r  0.1

WANT → want(agent, theme)
BELIEVE → believe(agent, theme)
BOY → boy()
GIRL → girl()

SLIDE 32

Example dag generation

q
⇒ (0.3)  WANT r q
⇒ (0.2)  WANT r BELIEVE r q
⇒ (0.2)  WANT BELIEVE r q
⇒ (0.4)  WANT BELIEVE r r
⇒ (0.3)  WANT BELIEVE BOY r
⇒ (0.3)  WANT BELIEVE BOY GIRL

[Figure: each step is a partial dag, listed here by its node and state labels only; the number in parentheses is the weight of the rule applied]
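If the rule weights are read as probabilities and the weight of a derivation is the product of the weights of the rules it uses (an assumption here; the slide shows only the per-step weights), the six steps can be scored as below. The third step is read as the merge rule [r, r] → r because two r's become one; the dictionary keys are labels made up for this sketch.

```python
import math

# Rule weights of the example dag automaton; keys are hypothetical labels.
RULE_WEIGHTS = {
    "q -> WANT(r, q)": 0.3,
    "q -> BELIEVE(r, q)": 0.2,
    "q -> r": 0.4,
    "q -> empty": 0.1,
    "r -> BOY": 0.3,
    "r -> GIRL": 0.3,
    "r -> empty": 0.1,
    "[r, r] -> r": 0.2,
    "[r, r, r] -> r": 0.1,
}

# Rules in the order the steps apply them (weights 0.3, 0.2, 0.2, 0.4, 0.3, 0.3).
DERIVATION = [
    "q -> WANT(r, q)",
    "q -> BELIEVE(r, q)",
    "[r, r] -> r",
    "q -> r",
    "r -> BOY",
    "r -> GIRL",
]


def derivation_weight(steps, weights=RULE_WEIGHTS):
    """Multiply the weights of all rules used in a derivation."""
    return math.prod(weights[step] for step in steps)


weight = derivation_weight(DERIVATION)
print(f"{weight:.6f}")  # 0.000432
```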

SLIDE 35

Example dag transducer rules

◮ Rules have m incoming edges with states and produce m trees
◮ Rules have n outgoing edges and n variables to pass states down

[qnomb, qaccb].BOY → NP(the boy), NP(him)
qaccg.GIRL → NP(the girl)
qs.WANT(x, y) → S(qnomb.x, wants, qinfb.y)
qinfb.BELIEVE(x, y) → INF(qaccg.x, to believe, qaccb.y)

SLIDE 39

Example dag transduction

WANT BELIEVE BOY GIRL
⇒  S qnomb wants qinfb BELIEVE BOY GIRL
⇒  S qnomb wants INF qaccg to believe qaccb BOY GIRL
⇒  S INF NP NP NP the boy wants the girl to believe him

[Figure: each step is a partially transduced graph, listed here by its node, state and word labels only]
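The transduction can be sketched in two passes: first push states down the edges so that a node with several parents sees all its incoming states together, then assemble the output tree. This is an unweighted illustration, not the toolkit's implementation. It assumes the input dag is want(agent: boy, theme: believe(agent: girl, theme: boy)), that the incoming states of a node are distinct, and that each variable is used exactly once per rule; the encoding of rules and all helper names are made up for this sketch, and a missing rule simply raises `KeyError`.

```python
from collections import defaultdict
from typing import NamedTuple


class Call(NamedTuple):
    state: str  # state passed down
    child: int  # index of the outgoing edge (variable x = 0, y = 1)


# Input dag: node id -> (label, ordered child ids). BOY ("b") has two parents.
DAG = {
    "w": ("WANT", ["b", "e"]),
    "e": ("BELIEVE", ["g", "b"]),
    "b": ("BOY", []),
    "g": ("GIRL", []),
}

# (label, incoming states) -> {incoming state: output template}.
# A template is a word, a Call, or (symbol, [templates]).
RULES = {
    ("BOY", frozenset({"qnomb", "qaccb"})): {
        "qnomb": ("NP", ["the", "boy"]),
        "qaccb": ("NP", ["him"]),
    },
    ("GIRL", frozenset({"qaccg"})): {"qaccg": ("NP", ["the", "girl"])},
    ("WANT", frozenset({"qs"})): {
        "qs": ("S", [Call("qnomb", 0), "wants", Call("qinfb", 1)]),
    },
    ("BELIEVE", frozenset({"qinfb"})): {
        "qinfb": ("INF", [Call("qaccg", 0), "to", "believe", Call("qaccb", 1)]),
    },
}


def calls(template):
    """Yield every Call inside a template (Call is checked before plain tuples)."""
    if isinstance(template, Call):
        yield template
    elif isinstance(template, tuple):
        for kid in template[1]:
            yield from calls(kid)


def topological_order(dag, root):
    """Parents before children: reverse postorder of a depth-first search."""
    seen, post = set(), []

    def visit(node):
        if node not in seen:
            seen.add(node)
            for child in dag[node][1]:
                visit(child)
            post.append(node)

    visit(root)
    return post[::-1]


def transduce(dag, root, start_state, rules):
    # Pass 1: collect the states arriving at each node and pick its rule.
    incoming, chosen = defaultdict(set), {}
    incoming[root].add(start_state)
    for node in topological_order(dag, root):
        label, children = dag[node]
        chosen[node] = rules[(label, frozenset(incoming[node]))]
        for template in chosen[node].values():
            for call in calls(template):
                incoming[children[call.child]].add(call.state)

    # Pass 2: build the output tree by resolving each Call at the child node.
    def fill(template, node):
        if isinstance(template, Call):
            child = dag[node][1][template.child]
            return fill(chosen[child][template.state], child)
        if isinstance(template, tuple):
            return (template[0], [fill(kid, node) for kid in template[1]])
        return template

    return fill(chosen[root][start_state], root)


def yield_of(tree):
    """Leaf words of an output tree, left to right."""
    if isinstance(tree, tuple):
        return [w for kid in tree[1] for w in yield_of(kid)]
    return [tree]


tree = transduce(DAG, "w", "qs", RULES)
print(" ".join(yield_of(tree)))
```

The BOY node is reached once with qnomb and once with qaccb, so its single rule contributes both "the boy" and "him" to different places in the output tree.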

SLIDE 42

What’s it for?

What’s wrong with phrase-based and syntax-based MT?

◮ We want the “who did what to whom, when, where, and why”
◮ Preservation of meaning can be more important than grammaticality/fluency
◮ We are aiming for useful translation!

But haven’t people tried and failed?

Yes, but. . .

◮ that was before statistics
◮ small-scale, hand-crafted
◮ people said the same about syntax-based MT and look where it is now!

SLIDE 45

Different MT paradigms

[Figure: levels Phrase, Syntax, Semantics between Foreign and English]

phrase-based MT: n-grammatical
syntax-based MT: grammatical
semantics-based MT: sensible and grammatical

SLIDE 49

Building an NLP system

With the theoretical background, it should be possible to carry out the same program that worked for syntax-based MT:

◮ Collect lots of training data

[Figure: dag with nodes WANT, BELIEVE, BOY, GIRL ⇐⇒ tree with nodes S, INF, NP, NP, NP and yield "the boy wants the girl to believe him"]

◮ Train models for parts of the translation pipeline
◮ Use them in a bucket-brigade approach or in an integrated decoder

SLIDE 55

Toolkit (FSMNLP 2012)

We implemented (in Python):

◮ unweighted and weighted membership checking
◮ unweighted and weighted dag-to-tree transductions
◮ packing the set of derivations into a dag acceptor
◮ packing the set of output trees into an RTG
◮ unweighted and weighted n-best generation
◮ backward application (tree to dag)
◮ product construction: intersection and union
◮ visualization of trees and graphs using GraphViz

SLIDE 61

Future work

◮ Rule extraction
◮ Training
◮ Composition with tree transducers
◮ Horizontal processing and sparse structures
◮ Annotated gold-standard data
◮ . . .

Download

http://www.ims.uni-stuttgart.de/~daniel/dagger/

SLIDE 62

The end beginning

Thank you for your attention! – Questions?

What are you in for?

(c / charge-05
   :theme (m / me)
   :predicate (a / and
      :op1 (r / resist-01
         :agent m
         :theme (a2 / arrest-01 :theme m))
      :op2 (i / intoxicate-01
         :theme m
         :location (p2 / public))))

You got arrested for resisting arrest?

I know, right? This policeman grabs me, and I’m like what the f--

Sounds like you are playing four different roles here.

It’s just semantics.

© KEVIN KNIGHT

SLIDE 63

References

Tsutomu Kamimura and Giora Slutzki. 1981. Parallel and two-way automata on directed ordered acyclic graphs. Inf. Control, 49(1):10–51.

Tsutomu Kamimura and Giora Slutzki. 1982. Transductions of dags and trees. Math. Syst. Theory, 15(3):225–249.