Open System Categorical Quantum Semantics in Natural Language Processing




SLIDE 1

Open System Categorical Quantum Semantics in Natural Language Processing

R. Piedeleu¹, D. Kartsaklis², B. Coecke¹, M. Sadrzadeh²

¹ Department of Computer Science, University of Oxford

² School of Electronic Engineering and Computer Science, Queen Mary University of London

CALCO 2015

d.kartsaklis@qmul.ac.uk Open System Categorical Quantum Semantics in NLP 1/28

SLIDE 2

In a nutshell

Categorical compositional distributional semantics unifies two orthogonal semantic paradigms:

  • The type-logical compositional approach of formal semantics
  • The quantitative perspective of vector space models of meaning

The goal is to represent sentences as points in some high-dimensional metric space.

In this work: Inspired by categorical quantum mechanics, we extend the model in order to explicitly take into account lexical ambiguity during the compositional process.

SLIDE 3

Outline

1. Categorical compositional distributional models
2. Composition and lexical ambiguity
3. Open system quantum semantics
4. From theory to practice

SLIDE 4

The meaning of words

Distributional hypothesis: Words that occur in similar contexts have similar meanings [Harris, 1958].

The functional interplay of philosophy and ? should, as a minimum, guarantee...
...and among works of dystopian ? fiction...
The rapid advance in ? today suggests...
...calculus, which are more popular in ?-oriented schools.
But because ? is based on mathematics...
...the value of opinions formed in ? as well as in the religions...
...if ? can discover the laws of human nature...
...is an art, not an exact ?.
...factors shaping the future of our civilization: ? and religion.
...certainty which every new discovery in ? either replaces or reshapes.
...if the new technology of computer ? is to grow significantly
He got a ? scholarship to Yale.
...frightened by the powers of destruction ? has given...
...but there is also specialization in ? and technology...

SLIDE 5

The meaning of words

Distributional hypothesis: Words that occur in similar contexts have similar meanings [Harris, 1958].

The functional interplay of philosophy and science should, as a minimum, guarantee...
...and among works of dystopian science fiction...
The rapid advance in science today suggests...
...calculus, which are more popular in science-oriented schools.
But because science is based on mathematics...
...the value of opinions formed in science as well as in the religions...
...if science can discover the laws of human nature...
...is an art, not an exact science.
...factors shaping the future of our civilization: science and religion.
...certainty which every new discovery in science either replaces or reshapes.
...if the new technology of computer science is to grow significantly
He got a science scholarship to Yale.
...frightened by the powers of destruction science has given...
...but there is also specialization in science and technology...

SLIDE 6

Distributional models of meaning

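A word is a vector of co-occurrence statistics with every other word in a selected subset of the vocabulary:

[Figure: co-occurrence vector for "cat" over the context words milk, cute, dog, bank, money (counts shown: 12, 8, 5, 1), next to a plot of a word space containing cat, dog, pet, account, money]

Semantic relatedness is usually based on cosine similarity:

$$\mathrm{sim}(\vec{v}, \vec{u}) = \cos\theta_{\vec{v},\vec{u}} = \frac{\vec{v} \cdot \vec{u}}{\|\vec{v}\| \, \|\vec{u}\|}$$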
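As a minimal sketch of the cosine measure, assuming NumPy and made-up co-occurrence counts (the function name `cosine_similarity` and the values for `dog` and `account` are illustrative, not taken from the slides):

```python
import numpy as np

def cosine_similarity(v, u):
    """Cosine of the angle between two word vectors."""
    v = np.asarray(v, dtype=float)
    u = np.asarray(u, dtype=float)
    return float(v @ u / (np.linalg.norm(v) * np.linalg.norm(u)))

# Hypothetical counts over the basis (milk, cute, dog, bank, money).
cat = [12, 8, 5, 0, 1]
dog = [5, 9, 10, 0, 1]       # made-up values
account = [0, 0, 0, 9, 11]   # made-up values

print(cosine_similarity(cat, dog))      # words sharing contexts score high
print(cosine_similarity(cat, account))  # words with disjoint contexts score low
```

With these toy counts, "cat" comes out much closer to "dog" than to "account", which is the behaviour the distributional hypothesis predicts.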

SLIDE 7

Moving to phrases and sentences

We would like to generalize this idea to phrases and sentences.

However, it’s not clear how:

  • There are practical problems: there is not enough data.
  • But even if we had a very large corpus, what would the context of a sentence be?

SLIDE 8

Moving to phrases and sentences

We would like to generalize this idea to phrases and sentences.

However, it’s not clear how:

  • There are practical problems: there is not enough data.
  • But even if we had a very large corpus, what would the context of a sentence be?

A solution: For a sentence $w_1 w_2 \ldots w_n$, find a function $f$ such that:

$$\vec{s} = f(\vec{w_1}, \vec{w_2}, \ldots, \vec{w_n})$$

SLIDE 9

Categorical compositional distributional semantics

Coecke, Sadrzadeh and Clark (2010): Let syntax drive the semantic derivation, as in formal semantics.

  • Pregroup grammars are structurally homomorphic with the category of finite-dimensional Hilbert spaces and linear maps (both share compact closure)
  • In abstract terms, there exists a structure-preserving passage from grammar to meaning: $F : \text{Grammar} \to \text{Meaning}$
  • The meaning of a sentence $w_1 w_2 \ldots w_n$ with grammatical derivation $\alpha$ is defined as:

$$\overrightarrow{w_1 w_2 \ldots w_n} := F(\alpha)(\vec{w_1} \otimes \vec{w_2} \otimes \ldots \otimes \vec{w_n})$$

SLIDE 10

A multi-linear model

The grammatical type of a word defines the vector space in which the word lives:

  • Nouns are vectors in $N$;
  • adjectives are linear maps $N \to N$, i.e. elements in $N \otimes N$;
  • intransitive verbs are linear maps $N \to S$, i.e. elements in $N \otimes S$;
  • transitive verbs are bi-linear maps $N \otimes N \to S$, i.e. elements of $N \otimes S \otimes N$;
  • and so on.

The composition operation is tensor contraction, based on inner product.
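The contractions described above can be sketched with `numpy.einsum`. This is only an illustration under stated assumptions: the dimensions `dN` and `dS`, the random tensors, and the verb `sleep` are hypothetical stand-ins, not values or words from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
dN, dS = 4, 3  # hypothetical dimensions of the noun space N and sentence space S

kids = rng.random(dN)             # noun: vector in N
games = rng.random(dN)            # noun: vector in N
happy = rng.random((dN, dN))      # adjective: element of N ⊗ N
sleep = rng.random((dN, dS))      # intransitive verb: element of N ⊗ S
play = rng.random((dN, dS, dN))   # transitive verb: element of N ⊗ S ⊗ N

# Adjective-noun: contract the adjective's right index with the noun.
happy_kids = np.einsum('ij,j->i', happy, kids)

# Subject-intransitive verb: contract the subject with the verb's N index.
kids_sleep = np.einsum('i,is->s', kids, sleep)

# Subject-transitive verb-object: contract both noun indices of the verb.
sentence = np.einsum('i,isj,j->s', happy_kids, play, games)

print(happy_kids.shape, kids_sleep.shape, sentence.shape)
```

Every composed sentence ends up as a vector in S regardless of its length, which is what makes sentences of different structure comparable.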

SLIDE 11

Categorical composition: example

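[Figure: parse tree for "happy kids play games": S → NP VP, NP → Adj N (happy kids), VP → V N (play games)]

Pregroup types: happy: $n \cdot n^l$; kids: $n$; play: $n^r \cdot s \cdot n^l$; games: $n$

Type reduction morphism:

$$(\epsilon^r_n \cdot 1_s) \circ (1_n \cdot \epsilon^l_n \cdot 1_{n^r \cdot s} \cdot \epsilon^l_n) : n \cdot n^l \cdot n \cdot n^r \cdot s \cdot n^l \cdot n \to s$$

$$F\left[(\epsilon^r_n \cdot 1_s) \circ (1_n \cdot \epsilon^l_n \cdot 1_{n^r \cdot s} \cdot \epsilon^l_n)\right]\left(\mathit{happy} \otimes \overrightarrow{\mathit{kids}} \otimes \mathit{play} \otimes \overrightarrow{\mathit{games}}\right)$$

$$= (\epsilon_N \otimes 1_S) \circ (1_N \otimes \epsilon_N \otimes 1_{N \otimes S} \otimes \epsilon_N)\left(\mathit{happy} \otimes \overrightarrow{\mathit{kids}} \otimes \mathit{play} \otimes \overrightarrow{\mathit{games}}\right)$$

$$= \mathit{happy} \times \overrightarrow{\mathit{kids}} \times \mathit{play} \times \overrightarrow{\mathit{games}}$$

where $\overrightarrow{\mathit{kids}}, \overrightarrow{\mathit{games}} \in N$, $\mathit{happy} \in N \otimes N$, and $\mathit{play} \in N \otimes S \otimes N$.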

SLIDE 12

Outline

1. Categorical compositional distributional models
2. Composition and lexical ambiguity
3. Open system quantum semantics
4. From theory to practice

SLIDE 13

Ambiguity in word spaces

Compositional distributional models of meaning are mainly based on ambiguous semantic spaces:

[Figure: real word vectors projected onto a 2-dimensional space using MDS (both axes range from -0.8 to 0.8). The plot shows the vector for "organ" together with two groups of neighbours: organ (medicine), with donor, transplant, liver, transplantation, kidney, lung; and organ (music), with accompaniment, bass, orchestra, hymn, recital, violin, concert.]

SLIDE 14

Homonymy and polysemy (1/2)

We distinguish between two types of lexical ambiguity:

  • In cases of homonymy (organ, bank, vessel etc.), due to some historical accident the same word is used to describe two (or more) completely unrelated concepts.
  • Polysemy relates to subtle deviations between the different senses of the same word.

Example:

  • The distinction between the financial sense and the river sense of bank is a case of homonymy;
  • Within the financial sense, a distinction between the abstract concept of bank as an institution and the concrete building is a case of polysemy.

SLIDE 15

Homonymy and polysemy (2/2)

Example #1: “I went to the bank to open a savings account”

  • The word bank is used with its financial sense
  • The speaker refers to both of the polysemous meanings of bank_fin (institution and building) at the same time

Example #2: “I went to the bank”

  • The word bank is probably used with the financial sense in mind (because most of the time this is the case)
  • However, a small possibility that the speaker has actually visited a river bank still exists

SLIDE 16

Homonymy and polysemy (2/2)

Example #1: “I went to the bank to open a savings account”

  • The word bank is used with its financial sense
  • The speaker refers to both of the polysemous meanings of bank_fin (institution and building) at the same time

Example #2: “I went to the bank”

  • The word bank is probably used with the financial sense in mind (because most of the time this is the case)
  • However, a small possibility that the speaker has actually visited a river bank still exists

Main point:

  • Polysemy: Relatively coherent and self-contained concepts
  • Homonymy: Lack of specification

SLIDE 17

Setting our goals

The problem: How can we formalize the explicit treatment of lexical ambiguity in the categorical compositional model?

We seek a model that will allow us:

1. to express homonymous words as probabilistic mixings of their individual meanings;
2. to retain the ambiguity until the presence of sufficient context that will eventually resolve it during composition time;
3. to achieve all the above in the multi-linear setting imposed by the vector space semantics of our original model.

SLIDE 18

Outline

1. Categorical compositional distributional models
2. Composition and lexical ambiguity
3. Open system quantum semantics
4. From theory to practice

SLIDE 19

A little quantum theory

Quantum mechanics and distributional models of meaning are both based on vector space semantics.

  • The state of a quantum system is represented by a vector in a Hilbert space $H$.
  • Fixing a basis for $H$:

$$|\psi\rangle = c_1 |k_1\rangle + c_2 |k_2\rangle + \ldots + c_n |k_n\rangle$$

    we take $|\psi\rangle$ to be a quantum superposition of the basis states $\{|k_i\rangle\}_i$, i.e. the quantum system co-exists in all basis states in parallel, with strengths denoted by the corresponding weights.
  • Such a state is called a pure state.

SLIDE 20

Word vectors as quantum states

We take words to be quantum systems, and word vectors specific states of these systems:

$$|w\rangle = c_1 |k_1\rangle + c_2 |k_2\rangle + \ldots + c_n |k_n\rangle$$

  • Each element of the ONB $\{|k_i\rangle\}_i$ is essentially an atomic symbol:

$$|\mathit{cat}\rangle = 12\,|\mathit{milk}'\rangle + 8\,|\mathit{cute}'\rangle + \ldots + 0\,|\mathit{bank}'\rangle$$

  • In other words, a word vector is a probability distribution over atomic symbols.
  • $|w\rangle$ is a pure state: when word $w$ is seen alone, it is like co-occurring with all the basis words with strengths denoted by the various coefficients.

SLIDE 21

Encoding homonymy with mixed states

Ideally, every disjoint meaning of a homonymous word must be represented by a distinct pure state:

$$|\mathit{bank}_{fin}\rangle = a_1 |k_1\rangle + a_2 |k_2\rangle + \ldots + a_n |k_n\rangle$$

$$|\mathit{bank}_{riv}\rangle = b_1 |k_1\rangle + b_2 |k_2\rangle + \ldots + b_n |k_n\rangle$$

  • $\{a_i\}_i \neq \{b_i\}_i$, since the financial sense and the river sense are expected to be seen in drastically different contexts.
  • So we have two distinct states referring to the same system.
  • We cannot be certain under which state our system may be found; we only know that the former state is more probable than the latter.
  • In other words, the system is better described by a probabilistic mixture of pure states, i.e. a mixed state.

SLIDE 22

Density operators

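Mathematically, a mixed state is represented by a density operator:

$$\rho(w) = \sum_i p_i\, |s_i\rangle\langle s_i|$$

For example:

$$\rho(\mathit{bank}) = 0.80\, |\mathit{bank}_{fin}\rangle\langle \mathit{bank}_{fin}| + 0.20\, |\mathit{bank}_{riv}\rangle\langle \mathit{bank}_{riv}|$$

A density operator is a probability distribution over vectors.

Properties of a density operator $\rho$:

  • Positive semi-definite: $\langle v|\rho|v\rangle \geq 0 \;\; \forall v \in H$
  • Of trace one: $\mathrm{Tr}(\rho) = 1$
  • Self-adjoint: $\rho = \rho^\dagger$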
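A small sketch of how such a density operator could be built and checked numerically, assuming NumPy and real-valued vectors. The helper names `pure_state` and `mixed_state` and the two sense vectors are hypothetical; only the 0.80/0.20 weights come from the example above.

```python
import numpy as np

def pure_state(v):
    """Projector |v><v| onto the normalised vector v."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

def mixed_state(probs, vectors):
    """Density operator sum_i p_i |s_i><s_i| for sense vectors s_i."""
    return sum(p * pure_state(v) for p, v in zip(probs, vectors))

# Made-up sense vectors over a 4-word basis (illustrative only).
bank_fin = [9, 7, 0, 1]
bank_riv = [0, 1, 8, 6]
rho_bank = mixed_state([0.80, 0.20], [bank_fin, bank_riv])

# The three properties listed above, checked numerically.
print(np.isclose(np.trace(rho_bank), 1.0))          # trace one
print(np.allclose(rho_bank, rho_bank.T))            # self-adjoint (real case)
print(np.linalg.eigvalsh(rho_bank).min() >= -1e-12) # positive semi-definite
```

Normalising each sense vector before taking the outer product is what guarantees unit trace once the mixing weights sum to one.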

SLIDE 23

Complete positivity: The CPM construction

We need:

  • to replace word vectors with density operators
  • to replace linear maps with completely positive linear maps, i.e. maps that send density operators to density operators while respecting the monoidal structure.

Selinger (2007): Any dagger compact closed category is associated with a category in which the objects are the objects of the original category, but the maps are completely positive maps.

For $f_1 : A \otimes A^* \to B \otimes B^*$ and $f_2 : C \otimes C^* \to D \otimes D^*$:

$$f_1 \otimes_{\mathrm{CPM}} f_2 : A \otimes C \otimes C^* \otimes A^* \xrightarrow{\;\cong\;} A \otimes A^* \otimes C \otimes C^* \xrightarrow{\;f_1 \otimes f_2\;} B \otimes B^* \otimes D \otimes D^* \xrightarrow{\;\cong\;} B \otimes D \otimes D^* \otimes B^*$$

SLIDE 24

Categorical model of meaning: Reprise

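The passage from a grammar to distributional meaning is defined according to the following composition:

$$\mathbf{Preg} \xrightarrow{\;F\;} \mathbf{FHilb} \xrightarrow{\;L\;} \mathrm{CPM}(\mathbf{FHilb})$$

The meaning of a sentence $w_1 w_2 \ldots w_n$ with grammatical derivation $\alpha$ becomes:

$$L(F(\alpha))\left(\rho(w_1) \otimes_{\mathrm{CPM}} \rho(w_2) \otimes_{\mathrm{CPM}} \ldots \otimes_{\mathrm{CPM}} \rho(w_n)\right)$$

Composition takes this form:

  • Subject-intransitive verb: $\rho_{IN} = \mathrm{Tr}_N(\rho(v) \circ (\rho(s) \otimes 1_S))$
  • Adjective-noun: $\rho_{AN} = \mathrm{Tr}_N(\rho(adj) \circ (1_N \otimes \rho(n)))$
  • Subj-trans. verb-Obj: $\rho_{TS} = \mathrm{Tr}_{N,N}(\rho(v) \circ (\rho(s) \otimes 1_S \otimes \rho(o)))$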
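The subject-intransitive verb case can be sketched numerically as a partial trace. This is an illustration under assumptions, not the authors' implementation: operators on N ⊗ S are stored as matrices in the `numpy.kron` index ordering, and the helper names `partial_trace_first` and `compose_intransitive` as well as the toy operators are hypothetical.

```python
import numpy as np

def partial_trace_first(op, d1, d2):
    """Trace out the first factor of an operator on a (d1*d2)-dimensional space."""
    return np.einsum('isit->st', op.reshape(d1, d2, d1, d2))

def compose_intransitive(rho_v, rho_s, dS):
    """Tr_N( rho(v) o (rho(s) ⊗ 1_S) ) for a verb operator on N ⊗ S."""
    dN = rho_s.shape[0]
    return partial_trace_first(rho_v @ np.kron(rho_s, np.eye(dS)), dN, dS)

# Toy operators (made up): a product-form verb keeps the result easy to check.
rho_a = np.diag([0.7, 0.3])                # verb's component on N
rho_b = np.array([[0.6, 0.2], [0.2, 0.4]]) # verb's component on S
rho_v = np.kron(rho_a, rho_b)              # verb operator on N ⊗ S
rho_s = np.diag([1.0, 0.0])                # subject: a pure state in N

rho_in = compose_intransitive(rho_v, rho_s, dS=2)
print(rho_in)  # for a product verb this equals Tr(rho_a rho_s) * rho_b
```

The result is a positive operator on S whose trace is generally below one, so a renormalisation step would be needed before reading it as a density operator.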

SLIDE 25

Using Frobenius algebras

Every vector space with a fixed basis has Frobenius maps over it:

$$\Delta :: |i\rangle \mapsto |i\rangle \otimes |i\rangle \qquad \mu :: |i\rangle \otimes |i\rangle \mapsto |i\rangle$$

These are useful for:

  • reducing space and time complexity (Kartsaklis et al., COLING 2012);
  • encoding the meaning of functional words, such as relative pronouns (Sadrzadeh et al., MoL 2013);
  • modelling various linguistic phenomena, such as intonation (Kartsaklis and Sadrzadeh, MoL 2015).

The new formulation allows for non-commutative versions of Frobenius algebras:

$$\mu := (1_A \otimes \epsilon_A \otimes 1_A^*) \circ (1_{A \otimes A} \otimes \sigma_{A,A^*}) : A \otimes A \to A$$

$$\iota := \eta_{A^*} : I \to A$$

$$\mu(\rho(w_1) \otimes \rho(w_2)) = \rho(w_1) \circ \rho(w_2)$$
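To illustrate the contrast, here is a sketch assuming NumPy and a fixed orthonormal basis: on vectors, the basis-dependent maps Δ and μ amount to placing a vector on a diagonal and reading a diagonal off, so μ merges two vectors by pointwise multiplication, which is commutative. On density operators the last equation above gives a matrix product, which is not. The names `copy_map` and `merge_map` and all numbers are hypothetical.

```python
import numpy as np

def copy_map(v):
    """Delta: |i> -> |i> ⊗ |i>, extended linearly (vector placed on the diagonal)."""
    return np.diag(np.asarray(v, dtype=float))

def merge_map(t):
    """mu: |i> ⊗ |j> -> delta_ij |i>, extended linearly (reads off the diagonal)."""
    return np.diagonal(np.asarray(t, dtype=float)).copy()

v = np.array([1.0, 2.0, 3.0])
u = np.array([0.5, 0.0, 2.0])

# On vectors, merging v ⊗ u is pointwise multiplication: order is irrelevant.
print(merge_map(np.outer(v, u)), merge_map(np.outer(u, v)))

# On density operators, mu(rho1 ⊗ rho2) = rho1 o rho2 is a matrix product.
rho1 = np.array([[0.8, 0.0], [0.0, 0.2]])
rho2 = np.array([[0.5, 0.5], [0.5, 0.5]])
print(rho1 @ rho2)
print(rho2 @ rho1)  # differs from the line above: order now matters
```

The order sensitivity is what opens up the variety of options mentioned in the conclusion, since word order can influence the merged meaning.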

SLIDE 26

Outline

1. Categorical compositional distributional models
2. Composition and lexical ambiguity
3. Open system quantum semantics
4. From theory to practice

SLIDE 27

Measuring Von Neumann entropy

Relative clauses

noun: verb1/verb2        noun    noun that verb1    noun that verb2
organ: enchant/ache      0.18    0.11               0.08
vessel: swell/sail       0.25    0.16               0.01
queen: fly/rule          0.28    0.14               0.16
nail: gleam/grow         0.19    0.06               0.14
bank: overflow/loan      0.21    0.19               0.18

Adjectives

noun: adj1/adj2          noun    adj1 noun          adj2 noun
organ: music/body        0.18    0.10               0.13
vessel: blood/naval      0.25    0.05               0.07
queen: fair/chess        0.28    0.05               0.16
nail: rusty/finger       0.19    0.04               0.11
bank: water/financial    0.21    0.20               0.16

An important aspect of the proposed model: Disambiguation = Purification
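For reference, Von Neumann entropy is $S(\rho) = -\mathrm{Tr}(\rho \ln \rho)$, which can be computed from the eigenvalues of ρ. The sketch below assumes NumPy and the natural logarithm (the slides do not state the base or the normalisation used), and the function name `von_neumann_entropy` is hypothetical. It does not reproduce the figures in the tables.

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """S(rho) = -Tr(rho ln rho), computed from the eigenvalues of rho."""
    eig = np.linalg.eigvalsh(rho)
    eig = eig[eig > tol]  # zero eigenvalues contribute nothing (0 ln 0 = 0)
    return float(-np.sum(eig * np.log(eig)))

# A pure state: one sense only, no ambiguity.
pure = np.array([[1.0, 0.0], [0.0, 0.0]])

# An 0.80/0.20 mixture of two orthogonal senses, as in the bank example.
mixed = np.array([[0.8, 0.0], [0.0, 0.2]])

print(von_neumann_entropy(pure))   # no uncertainty about the sense
print(von_neumann_entropy(mixed))  # strictly positive: the word is ambiguous
```

Read this way, the drop in entropy from the noun column to the composed columns in the tables is the sense in which disambiguation acts as purification.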

SLIDE 28

Conclusion and future work

  • Density operators offer richer semantic representations for distributional models of meaning
  • From probability distributions over symbols we advance to probability distributions over vectors
  • Many opportunities for further research:
      • The non-commutative algebras offer a variety of options, the linguistic intuition of which needs to be explored
      • Iterated use of the CPM construction is an intriguing feature that deserves separate treatment
      • Density operators support a form of logic whose distributional and compositional properties remain to be examined
  • Large-scale experimental evaluation currently in progress

SLIDE 29

Thank you for listening!

SLIDE 30

References I

Abramsky, S. and Coecke, B. (2004). A categorical semantics of quantum protocols. In 19th Annual IEEE Symposium on Logic in Computer Science, pages 415–425.

Balkır, E. (2014). Using density matrices in a compositional distributional model of meaning. Master’s thesis, University of Oxford.

Coecke, B., Sadrzadeh, M., and Clark, S. (2010). Mathematical Foundations for a Compositional Distributional Model of Meaning. Lambek Festschrift. Linguistic Analysis, 36:345–384.

Kartsaklis, D., Kalchbrenner, N., and Sadrzadeh, M. (2014). Resolving lexical ambiguity in tensor regression models of meaning. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 212–217, Baltimore, Maryland. Association for Computational Linguistics.

Kartsaklis, D. and Sadrzadeh, M. (2013). Prior disambiguation of word tensors for constructing sentence vectors. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1590–1601, Seattle, Washington, USA. Association for Computational Linguistics.

Kartsaklis, D. and Sadrzadeh, M. (2015). A Frobenius model of information structure in categorical compositional distributional semantics. In Proceedings of the 14th Meeting on Mathematics of Language.

SLIDE 31

References II

Kartsaklis, D., Sadrzadeh, M., and Pulman, S. (2012). A unified sentence space for categorical distributional-compositional semantics: Theory and experiments. In Proceedings of 24th International Conference on Computational Linguistics (COLING 2012): Posters, pages 549–558, Mumbai, India. The COLING 2012 Organizing Committee.

Kartsaklis, D., Sadrzadeh, M., Pulman, S., and Coecke, B. (2015). Reasoning about meaning in natural language with compact closed categories and Frobenius algebras. In Chubb, J., Eskandarian, A., and Harizanov, V., editors, Logic and Algebraic Structures in Quantum Computing and Information, Association for Symbolic Logic Lecture Notes in Logic. Cambridge University Press.

Piedeleu, R., Kartsaklis, D., Coecke, B., and Sadrzadeh, M. (2015). Open system categorical quantum semantics in natural language processing. arXiv preprint arXiv:1502.00831.

Sadrzadeh, M., Clark, S., and Coecke, B. (2013). The Frobenius anatomy of word meanings I: subject and object relative pronouns. Journal of Logic and Computation, Advance Access.

Sadrzadeh, M., Clark, S., and Coecke, B. (2014). The Frobenius anatomy of word meanings II: Possessive relative pronouns. Journal of Logic and Computation.

Selinger, P. (2007). Dagger compact closed categories and completely positive maps. Electronic Notes in Theoretical Computer Science, 170:139–163.