From standard reasoning problems to non-standard reasoning problems and one step further
slide-1
SLIDE 1

From standard reasoning problems to non-standard reasoning problems and one step further

Uli Sattler
University of Manchester, University of Oslo
uli.sattler@manchester.ac.uk


slide-4
SLIDE 4

Some Advertisement

Baader, Horrocks, Lutz, Sattler: An Introduction to Description Logic

Description Logics (DLs) have a long tradition in computer science and knowledge representation, being designed so that domain knowledge can be described and so that computers can reason about this knowledge. DLs have recently gained increased importance since they form the logical basis of widely used ontology languages, in particular the web ontology language OWL. Written by four renowned experts, this is the first textbook on Description Logic. It is suitable for self-study by graduates and as the basis for a university course. Starting from a basic DL, the book introduces the reader to their syntax, semantics, reasoning problems and model theory, and discusses the computational complexity of these reasoning problems and algorithms to solve them. It then explores a variety of reasoning techniques, knowledge-based applications and tools, and describes the relationship between DLs and OWL. Franz Baader is a professor in the Institute of Theoretical Computer Science at TU Dresden. Ian Horrocks is a professor in the Department of Computer Science at the University of Oxford. Carsten Lutz is a professor in the Department of Computer Science at the University of Bremen. Uli Sattler is a professor in the Information Management Group within the School of Computer Science at the University of Manchester.

Cover illustration: the Description Logic logo, courtesy of Enrico Franconi. Cover designed by Zoe Naylor.

Contents (excerpt): … algorithms; 5. Complexity; 6. Reasoning in the EL family of Description Logics; 7. Query answering; 8. Ontology languages and applications; Appendix A. Description Logic terminology; References; Index.

Get a 20% discount: www.cambridge.org/9780521695428, code BAADER2017 at the checkout.

Hardback 978-0-521-87361-1: £59.99 → £47.99 / $79.99 → $63.99
Paperback 978-0-521-69542-8: £29.99 → £23.99 / $39.99 → $31.99
April 2017, 228 x 152 mm, 260pp, 30 b/w illustrations

slide-5
SLIDE 5

Standard & Non-Standard Reasoning Problems

slide-6
SLIDE 6

Standard Reasoning Problems

we all know them: given C, D, O, T, A, …, decide/compute

  • consistency/satisfiability
  • subsumption
  • classification
  • query answering

…all only involve entailment checks O ⊨ α
…possibly many of them (classification!)


slide-8
SLIDE 8

Non-Standard Reasoning Problems

we all know them: given C, D, O, T, A, …, compute Justs(α, O), PinPoint(α, O), match(C, P, O), unify(P1, P2, O), x-mod(Σ, O), msc(a, O), lcs(C, D, O), …

  • …involve finding extreme X such that …
    – subset-minimal or
    – maximally/minimally strong
  • …possibly many such Xs

Are

  • (conservative) rewritability
  • (query) inseparability

also standard reasoning problems?

slide-9
SLIDE 9

(Non-)Standard Reasoning: we know how to

understand problems:

  • decidability & computational complexity
    – worst case
    – data
    – parametrised
    – …

understand solutions:

  • soundness, completeness, termination
  • relations between them
  • complexity (see above)
  • practicability
    – worst-case complexity ≠ best-case complexity
    – amenable to optimisation
    – empirical evaluation

slide-10
SLIDE 10

[Plot: number of subsumption tests against number of tests per ontology/reasoner pair (fact, hermit, jfact, pellet), log axes from 10 to 10,000,000, with N·log(N) and N² reference curves]

An interesting side note from our empirical evaluation:
how many subsumption tests does classification involve?


slide-12
SLIDE 12

Not always that straightforward

  • Which problem/solution to consider when? e.g., for x-mod(Σ, O): minimal/top/bottom/semantic/… modules
    – depends on size, signature, application, …
    – but we know properties of/relations between solutions: smallest, self-contained, unique, depleting
  • How to measure practicability?
    – benchmarks, ORE, …

Extensions/variants of DLs:

  • probabilistic
  • non-monotonic
  • rules
slide-13
SLIDE 13

Subjective Ontology-Based Problems

slide-17
SLIDE 17

Subjective Ontology-Based Problems

are problems that are based on C, D, O, T, A, ⊨, … plus additional parameter(s)

  • because the objective solution is not feasible/computable, or makes little sense
    – e.g. in SROIQ:
      ComSubs(C, D, { C ⊑ ∀R.(A ⊓ C), D ⊑ ∀R.(A ⊓ D) }) = { ∀R.A, ∀R.∀R.A, ∀R.∀R.∀R.A, … }
  • or we want to capture quality criteria
    – interestingness
    – readability
    – relevance …
slide-18
SLIDE 18

A subjective OB problem: Mining TBox Axioms from KBs

Finding Interesting Correlations

slide-19
SLIDE 19
Mining TBox axioms from KBs

  • learn (implicit) correlations in our data
  • get interesting insights into the domain

[Diagram: a Learner takes TBox + ABox and outputs candidate axioms]
Do not confuse with (exact) learning of TBoxes (via probing queries)

slide-20
SLIDE 20

Mining TBox axioms from KBs

  • Correlations in KB = classical machine learning
  • automatic generation of knowledge from data
    – taking background knowledge in KB into account
    – unbiased: let the data speak!
    – unsupervised (no positive/negative examples)
    – “Semantic Data Mining”

[Diagram: a Learner takes TBox + ABox and outputs hypotheses (sets of axioms)]


slide-22
SLIDE 22

Mining TBox axioms from KBs

  • Which kind of hypotheses to capture correlations in the KB?
    1. expressive: GCIs, role inclusions
    2. readable
    3. logically sound
    4. statistically sound

slide-23
SLIDE 23
  • 2. Readable Hypotheses

A hypothesis is

  • a small set of short axioms
    – fewer than n_max axioms
    – with concepts shorter than ℓ_max
  • in a suitable DL: ALCHI … SROIQ
  • free of redundancy
    – no superfluous parts
    – ✓ preferred, laconic justifications


slide-25
SLIDE 25
  • 3. Logically Sound Hypotheses

A hypothesis H should be

  ✓ informative: ∀α ∈ H : O ⊭ α
    – we want to mine new axioms
  ✓ consistent: O ∪ H ⊭ ⊤ ⊑ ⊥
  ✓ non-redundant among all hypotheses: there are no H′, H ∈ H with H ≠ H′ and H′ ≡ H

Different hypotheses can be compared wrt. their

  ✓ logical strength:
    ? maximally strong? no: overfitting!
    ? minimally strong? no: under-fitting!
  ✓ reconciliatory power
    – brings together terms so far only loosely related

slide-26
SLIDE 26
  • 4. Statistically Sound Hypotheses

  • we need to assess the data support of a hypothesis
  • introduce metrics that capture the quality of an axiom, learned from association rule mining (ARM): coverage, support, …, lift
    – count individuals that support a GCI: instances, negated instances, non-instances
    – using standard DL semantics, OWA, TBox, entailments, …: no ‘artificial closure’
    – make sure you treat a GCI as an axiom and not as a rule: contrapositive!
slide-27
SLIDE 27

  • 4. Statistically Sound Hypotheses

Some useful notation:

  • instances: Inst(C, O) := {a | O ⊨ C(a)}
  • unknown: UnKn(C, O) := Inst(⊤, O) \ (Inst(C, O) ∪ Inst(¬C, O))
  • relativized: P(C, O) := #Inst(C, O) / #Inst(⊤, O)
  • projection tables (✓ = known instance, ? = unknown):

          C1  C2  C3  C4  …
    Ind1  ✓   ✓   ✓   ?   …
    Ind2  ✓   ✓           …
    Ind3  ?   ?   ✓   ?   …
    Ind4  ?   ?   ?       …

slide-28
SLIDE 28

  • 4. Statistically Sound Hypotheses: Axioms

some axiom measures are easily adapted from ARM; for a GCI C ⊑ D, define its metrics as follows:

                  basic                         relativized
  Coverage        #Inst(C, O)                   P(C, O)
  Support         #Inst(C ⊓ D, O)               P(C ⊓ D, O)
  Contradiction   #Inst(C ⊓ ¬D, O)              P(C ⊓ ¬D, O)
  Assumption      #(Inst(C, O) ∩ UnKn(D, O))
  …
  Confidence                                    P(C ⊓ D, O) / P(C, O)
  Lift                                          P(C ⊓ D, O) / (P(C, O) · P(D, O))
  …

where P(X, O) = #Inst(X, O) / #Inst(⊤, O)

slide-29
SLIDE 29
  • 4. Statistically Sound Hypotheses: Example

                 A  B  C1 C2
  Ind1–Ind180    ✓  ✓  ✓  ✓
  Ind181–Ind200  ✓  ?  ✓  ?
  Ind201–Ind400  ?  ?  ?  ?

for the axioms A ⊑ B, B ⊑ C1, B ⊑ C2:

  (relativized)                          A ⊑ B     B ⊑ C1    B ⊑ C2
  Coverage    P(C, O)                    200/400   180/400   180/400
  Support     P(C ⊓ D, O)                180/400   180/400   180/400
  Assumption                             20/400    0         0
  Confidence  P(C ⊓ D, O)/P(C, O)        180/200   180/180   180/180
  Lift        P(C⊓D,O)/(P(C,O)·P(D,O))   400/200   400/200   400/180


slide-33
SLIDE 33
  • 4. Statistically Sound Hypotheses: Example

                 A  B  C1 C2
  Ind1–Ind180    ✓  ✓  ✓  ✓
  Ind181–Ind200  ✓  ?  ✓  ?
  Ind201–Ind400  ?  ?  ?  ?

  (relativized)                          A ⊑ B   B ⊑ C1   B ⊑ C2
  Coverage    P(C, O)                    0.5     0.45     0.45
  Support     P(C ⊓ D, O)                0.45    0.45     0.45
  Assumption                             0.05    0        0
  Confidence  P(C ⊓ D, O)/P(C, O)        0.9     1        1
  Lift        P(C⊓D,O)/(P(C,O)·P(D,O))   2       2        2.22
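The relativized metrics are easy to check mechanically on this example. Below is a minimal Python sketch (function and variable names are ours, not from the deck) that computes coverage, support, confidence and lift from the sets of known instances:

```python
def axiom_metrics(inst, n, lhs, rhs):
    """Relativized ARM-style metrics for a GCI lhs ⊑ rhs, computed from the
    sets of known instances of each concept; n is the number of individuals."""
    p = lambda s: len(s) / n            # P(C, O) = #Inst(C, O) / #Inst(⊤, O)
    both = inst[lhs] & inst[rhs]        # known instances of lhs ⊓ rhs
    return {
        "coverage":   p(inst[lhs]),
        "support":    p(both),
        "confidence": p(both) / p(inst[lhs]),
        "lift":       p(both) / (p(inst[lhs]) * p(inst[rhs])),
    }

# The deck's example: Ind 0-179 are known instances of A, B, C1, C2;
# Ind 180-199 of A and C1 only; Ind 200-399 are entirely unknown.
inst = {
    "A": set(range(200)), "B": set(range(180)),
    "C1": set(range(200)), "C2": set(range(180)),
}
m = axiom_metrics(inst, 400, "A", "B")   # coverage 0.5, support 0.45, lift 2.0
```

Lift for B ⊑ C2 comes out as (180/400)/((180/400)·(180/400)) = 400/180 ≈ 2.22, matching the table.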


slide-36
SLIDE 36

  • 4. Statistically Sound Hypotheses: Axioms

Oooops!

  • axiom measures as defined so far are not semantically faithful, e.g.,
    Support(A ⊑ B, O) ≠ Support(⊤ ⊑ ¬A ⊔ B, O)
  • make sure we treat GCIs as axioms and not as rules
    – contrapositive!
  • so: turn each GCI X ⊑ Y into the equivalent X ⊔ ¬Y ⊑ Y ⊔ ¬X
    – read C in the metrics table as ‘the resulting LHS’ and D as ‘the resulting RHS’

Then axiom measures are semantically faithful, i.e., Ass(A ⊑ B, O) = Ass(¬B ⊑ ¬A, O)
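The non-faithfulness of the naive measures can be seen with a toy open-world evaluation. The sketch below is our own simplification (only atomic negation, and "O ⊨ C(a)" approximated by what is explicitly known about each individual); it shows Support(A ⊑ B, O) ≠ Support(⊤ ⊑ ¬A ⊔ B, O) even though the two GCIs are logically equivalent:

```python
def known(ind, concept):
    """Is the individual known to satisfy the concept? Three-valued reading:
    ind maps a name to True (known instance) or False (known non-instance);
    an absent name means unknown. Only atomic negation is handled."""
    op = concept[0]
    if op == "top":
        return True
    if op == "atom":
        return ind.get(concept[1]) is True
    if op == "not":
        return ind.get(concept[1]) is False
    if op == "and":
        return all(known(ind, c) for c in concept[1:])
    if op == "or":
        return any(known(ind, c) for c in concept[1:])

def support(abox, lhs, rhs):
    """Support(lhs ⊑ rhs): fraction of individuals known to satisfy lhs ⊓ rhs."""
    return sum(1 for i in abox if known(i, ("and", lhs, rhs))) / len(abox)

# 180 individuals known to be A and B, one known non-A, 219 fully unknown:
abox = [{"A": True, "B": True}] * 180 + [{"A": False}] + [{}] * 219

s1 = support(abox, ("atom", "A"), ("atom", "B"))                   # 180/400
s2 = support(abox, ("top",), ("or", ("not", "A"), ("atom", "B")))  # 181/400
```

The known non-A individual counts towards ⊤ ⊑ ¬A ⊔ B but not towards A ⊑ B; measuring the rewritten form X ⊔ ¬Y ⊑ Y ⊔ ¬X removes exactly this asymmetry for contrapositives.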

slide-37
SLIDE 37

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Goal: mine small sets of (short) axioms

  • more readable
  • close to what people write
  • synergy between axioms should lead to better quality
  • how to measure their qualities?
slide-38
SLIDE 38

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Goal: learn small sets of (short) axioms

  • more readable
  • close to what people write
  • synergy between axioms should lead to better quality
  • how to measure their qualities? …easy:
    1. rewrite the set into a single axiom as usual
    2. measure the resulting axiom
slide-39
SLIDE 39

  • 4. Stat. Sound Hypotheses: Sets of Axioms

H1 = {A ⊑ B, B ⊑ C1} ≡ {⊤ ⊑ (¬A ⊔ B) ⊓ (¬B ⊔ C1)}

(projection table as before: Ind1–180 instances of A, B, C1, C2; Ind181–200 of A, C1; Ind201–400 unknown)

              A ⊑ B   B ⊑ C1   B ⊑ C2   H1
  Coverage    0.5     0.45     0.45     1      always!
  Support     0.45    0.45     0.45     0.45   min
  Assumption  0.05    0        0        0.55?
  Confidence  0.9     1        1        0.45   = support!
  Lift        2       2        2.22     1      always!


slide-41
SLIDE 41

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Goal: learn small sets of (short) axioms

  • more readable
  • close to what people write
  • synergy between axioms should lead to better quality
  • how to measure their qualities? sum/average the quality of their axioms!

slide-44
SLIDE 44

  • 4. Stat. Sound Hypotheses: Sets of Axioms

H1 = {A ⊑ B, B ⊑ C1}, H2 = {A ⊑ B, B ⊑ C2}

(projection table as before: Ind1–180 instances of A, B, C1, C2; Ind181–200 of A, C1; Ind201–400 unknown)

              A ⊑ B   B ⊑ C1   B ⊑ C2   H1       H2
  Coverage    0.5     0.45     0.45     0.475?   0.475?
  Support     0.45    0.45     0.45     0.45     0.45
  Assumption  0.05                      0.05     0.05
  Confidence  0.9     1        1        ?        ?
  Lift        2       2        2.22     ?        ?
slide-46
SLIDE 46

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Goal: learn small sets of (short) axioms

  • more readable
  • close to what people write
  • synergy between axioms should lead to better quality
  • how to measure their qualities?
  • observe that a good hypothesis allows us to shrink our ABox since it captures recurring patterns
    – (minimum description length induction)
slide-47
SLIDE 47

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Goal: learn small sets of (short) axioms

  • more readable
  • close to what people write
  • synergy between axioms should lead to better quality
  • how to measure their qualities?
  • observe that a good hypothesis allows us to shrink our ABox since it captures recurring patterns
  • use this shrinkage factor to measure a hypothesis’
    – fitness: support by data
    – braveness: number of assumptions
slide-51
SLIDE 51

Capturing shrinkage… for fitness

  • Fix a finite set of
    – concepts C, closed under negation
    – roles R
  • Define a projection:
    π(O, C, R) = {C(a) | O ⊨ C(a) ∧ C ∈ C} ∪ {R(a, b) | O ⊨ R(a, b) ∧ R ∈ R}
  • For an ABox, define its description length:
    dLen(A, O) = min{ℓ(A′) | A′ ∪ O ≡ A ∪ O}
  • Define the fitness of a hypothesis H:
    fitn(H, O, C, R) = dLen(π(O, C, R), T) − dLen(π(O, C, R), T ∪ H)

slide-55
SLIDE 55

Capturing shrinkage… for braveness

  • Fix a finite set of concepts C (closed under negation) and roles R, and the projection
    π(O, C, R) = {C(a) | O ⊨ C(a) ∧ C ∈ C} ∪ {R(a, b) | O ⊨ R(a, b) ∧ R ∈ R}
  • Define a hypothesis’ assumptions:
    Ass(O, H, C, R) = π(O ∪ H, C, R) \ π(O, C, R)
  • Define the braveness of a hypothesis H:
    brave(H, O, C, R) = dLen(Ass(O, H, C, R), O)

Axiom set measures are semantically faithful, i.e., H ≡ H′ implies
  fitn(H, O, C, R) = fitn(H′, O, C, R) and brave(H, O, C, R) = brave(H′, O, C, R)


slide-59
SLIDE 59

  • 4. Stat. Sound Hypotheses: Sets of Axioms

                 A  B  C1 C2
  Ind1–Ind180    ✓  ✓  ✓  ✓
  Ind181–Ind200  ✓  ?  ✓  ?
  Ind201–Ind400  ?  ?  ?  ?

H1 = {A ⊑ B, B ⊑ C1}, H2 = {A ⊑ B, B ⊑ C2}

fitn(H1, A, …) = dLen(π(A, …), ∅) − dLen(π(A, …), H1) = 760 − 380 = 380
fitn(H2, A, …) = dLen(π(A, …), ∅) − dLen(π(A, …), H2) = 760 − 400 = 360
brave(H1, A, …) = dLen(Ass(A, H1, …), A) = 20
brave(H2, A, …) = dLen(Ass(A, H2, …), A) = 40

⇒ H1 ≫ H2
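The fitn/brave numbers on this slide can be reproduced in a small propositional sketch: concepts are plain names, the TBox holds only atomic inclusions, entailment is forward closure, and dLen is approximated by greedily dropping entailed assertions (all names here are ours, and the greedy dLen is only an upper bound in general):

```python
def closure(concepts, tbox):
    """Saturate a set of concept names under atomic inclusions (A, B), read A ⊑ B."""
    concepts, changed = set(concepts), True
    while changed:
        changed = False
        for lhs, rhs in tbox:
            if lhs in concepts and rhs not in concepts:
                concepts.add(rhs)
                changed = True
    return concepts

def projection(abox, tbox, focus):
    """π(O, C): every entailed assertion C(a) with C in the focus set."""
    return {(a, c) for a, cs in abox.items()
                   for c in closure(cs, tbox) & focus}

def dlen(assertions, tbox):
    """Greedy approximation of dLen: drop an assertion if it is already
    entailed by the remaining assertions plus the TBox."""
    kept = set(assertions)
    for a, c in sorted(assertions):
        rest = kept - {(a, c)}
        if c in closure({x for y, x in rest if y == a}, tbox):
            kept = rest
    return len(kept)

def fitness(H, abox, tbox, focus):
    pi = projection(abox, tbox, focus)
    return dlen(pi, tbox) - dlen(pi, tbox | H)

def braveness(H, abox, tbox, focus):
    ass = projection(abox, tbox | H, focus) - projection(abox, tbox, focus)
    return dlen(ass, tbox)

# The deck's example, with H1 = {A ⊑ B, B ⊑ C1} and H2 = {A ⊑ B, B ⊑ C2}:
abox = {i: {"A", "B", "C1", "C2"} for i in range(180)}
abox.update({i: {"A", "C1"} for i in range(180, 200)})
focus = {"A", "B", "C1", "C2"}
H1, H2 = {("A", "B"), ("B", "C1")}, {("A", "B"), ("B", "C2")}
```

On this data fitness(H1, abox, set(), focus) is 380 versus 360 for H2, and braveness is 20 versus 40, reproducing the slide's preference H1 ≫ H2.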


slide-62
SLIDE 62

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Example: empty TBox, ABox A given as an R-graph of individuals labelled with X, A, B, C (as sketched on the slide):

fitn({X ⊑ ∀R.A}, A, …) = dLen(π(A, …), ∅) − dLen(π(A, …), {X ⊑ ∀R.A}) = 12 − 9 = 3
brave({X ⊑ ∀R.A}, A, …) = dLen(Ass(A, {X ⊑ ∀R.A}, …), A) = 1

slide-63
SLIDE 63

phew…

slide-64
SLIDE 64

Remember:

[Diagram: a Learner takes TBox + ABox and outputs hypotheses (sets of axioms)]

We wanted to mine axioms!


slide-66
SLIDE 66

So, what have we got?

  • (Sets of) axioms as Hypotheses
  • Loads of measures to capture
    1. an axiom hypothesis’ coverage, support, assumption, lift, …
    2. a set-of-axioms hypothesis’ fitness, braveness
  • with a focus on concept/role spaces C, R
  • What are their properties?
    – semantically faithful: H ≡ H′ ⇒ fitn(O, H, C, R) = fitn(O, H′, C, R)
    – O ⊨ H ⇒ Ass(O, H, C, R) = ∅
    – …
  • Can we compute these measures?
    – easy for (1), tricky for (2): dLen(A, O) = min{ℓ(A′) | A′ ∪ O ≡ A ∪ O}

slide-67
SLIDE 67

So, what have we got? (2)

  • If we can compute these measures, how feasible is this?
  • If “feasible”,
    – do these measures correlate?
    – how independent are they?
  • For which DLs & inputs can we create & evaluate hypotheses?
  • Which measures indicate interesting hypotheses?
  • What is the shape of interesting hypotheses?
    – are longer/bigger hypotheses better?
  • What do we do with them?
    – how do we guide users through these?


slide-69
SLIDE 69

Slava implements: DL Miner

[Pipeline diagram: TBox + ABox and parameters feed an Ontology Cleaner (O), a Hypothesis Constructor (L, Σ), a Hypothesis Evaluator (Q), and a Hypothesis Sorter (rf(H), qf(H, q)); the output is hypotheses, i.e. axiom(s) annotated with measures m1, m2, m3, …: a Subjective Solution]


slide-72
SLIDE 72

DL Miner: Hypothesis Constructor

Easy:

  • construct all concepts C1, C2, …
    – finitely many thanks to language bias
  • check for each GCI Ci ⊑ Cj whether it’s logically ok:
    – informative: O ⊭ Ci ⊑ Cj
    – consistent: O ∪ {Ci ⊑ Cj} ⊭ ⊤ ⊑ ⊥
  • if yes, add it to H
  • remove redundant hypotheses from H

Bonkers! Even for EL, with n concept/role names and concepts Ci of maximal length k, there are ~n^k concepts Ci and hence ~(n^k)² GCIs to test; for n = 100 and k = 4, that is ~100,000,000 concepts Ci and ~100,000,000² GCIs.


slide-74
SLIDE 74

DL Miner: Hypothesis Constructor

Use a refinement operator to build the Ci, informed by the ABox
  – used in concept learning, conceptual blending

  • Given a logic L, define a refinement operator as
    – a function ρ : Conc(L) → P(Conc(L)) such that, for each C ∈ L, C′ ∈ ρ(C): C′ ⊑ C
  • A refinement operator is
    – proper if, for all C ∈ L, C′ ∈ ρ(C): C′ ≢ C
    – complete if, for all C ∈ L, there is some n and C1 ∈ ρⁿ(⊤) with C1 ≡ C and ℓ(C1) ≤ ℓ(C)
    – suitable if, for all C, C1 ∈ L: if C1 ⊏ C then there is some n and C2 ≡ C with C1 ∈ ρⁿ(C2)

Great: there are known refinement operators (proper, complete, suitable, …) for ALC [LehmHitzler2010]
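A downward refinement operator of this kind is straightforward to sketch for a small EL-style fragment (conjunctions of concept names and existential restrictions); the representation and the tiny signature below are our own illustration, not DL Miner's operator:

```python
TOP = frozenset()            # the empty conjunction stands for ⊤
CONCEPT_NAMES = ["A", "B"]   # toy signature Σ
ROLE_NAMES = ["R"]

def length(c):
    """Concept length: a name counts 1, an existential ∃r.F counts 2 + length(F)."""
    return sum(1 if isinstance(x, str) else 2 + length(x[2]) for x in c) or 1

def rho(c):
    """Downward refinement: every returned concept is subsumed by c, obtained
    by adding a conjunct, adding ∃r.⊤, or refining a filler of an existential."""
    out = []
    for a in CONCEPT_NAMES:
        if a not in c:
            out.append(c | {a})
    for r in ROLE_NAMES:
        if ("exists", r, TOP) not in c:
            out.append(c | {("exists", r, TOP)})
    for x in c:
        if not isinstance(x, str):
            _, r, filler = x
            out.extend((c - {x}) | {("exists", r, f2)} for f2 in rho(filler))
    return out

def enumerate_concepts(max_len):
    """Close ⊤ under rho, keeping only concepts up to the given length."""
    seen, frontier = {TOP}, [TOP]
    while frontier:
        for c2 in rho(frontier.pop()):
            if c2 not in seen and length(c2) <= max_len:
                seen.add(c2)
                frontier.append(c2)
    return seen
```

With this signature, enumerate_concepts(3) yields ⊤, A, B, A ⊓ B, ∃R.⊤, ∃R.A and ∃R.B. The operator is not proper in general (refinement steps can produce equivalent concepts once a TBox is taken into account), which is exactly why the concept constructor below discards variants.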


slide-76
SLIDE 76

DL Miner: Concept Constructor


  • specialise concepts only if they have instances!
  • respect ℓmax
slide-77
SLIDE 77

DL Miner: Concept Constructor


  • specialise concepts only if they have instances, and only up to length ℓmax

Don't even construct most of the nᵏ concepts Ci

slide-78
SLIDE 78

Slava implements: DL Miner

[Pipeline diagram: TBox + ABox → Ontology Cleaner (O) → Hypothesis Constructor (L, Σ; DL-Apriori(·), buildRolesTopDown(·), generateHypotheses(·)) → Hypothesis Evaluator (Q; qf(H, q)) → Hypothesis Sorter (rf(H)) → hypotheses: axiom(s) with measures m1, m2, m3, …]

slide-79
SLIDE 79

Slava implements: DL Miner


Complete (for the parameters provided).

slide-80
SLIDE 80

DL Miner: Hypothesis Evaluator

  • Relatively straightforward for axiom measures

– hard test case for instance retrieval

  • Hard for set-of-axiom measures (fitness & braveness)

– due to dLen(A, O) = min{ℓ(A′) | A′ ∪ O ≡ A ∪ O}
– DL Miner implements the approximation dLen∗(A, O) = ℓ(A) − ℓ(Redund(A, O)) that

  • identifies redundant assertions in the ABox
  • does consider 1-step interactions between individuals
  • ignores ‘longer’ interactions
  • underestimates fitness, overestimates braveness

– great test case for incremental reasoning: Pellet!

slide-81
SLIDE 81

DL Miner: Hypothesis Sorter

  • Last step in DL Miner’s workflow
  • Easy:

– throw away all hypotheses that are dominated by another one
– i.e., compute the Pareto front w.r.t. the measures provided
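The Pareto-front step can be sketched directly: keep a hypothesis only if no other hypothesis is at least as good on every measure and strictly better on at least one. The hypotheses below are just tuples of invented, higher-is-better measure values:

```python
# Pareto-front sketch for the Hypothesis Sorter: hypotheses are tuples of
# measure values, all to be maximised.  The example values are made up.

def dominates(a, b):
    """True if a is >= b on every measure and > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(hypotheses):
    """Hypotheses not dominated by any other hypothesis."""
    return [h for h in hypotheses
            if not any(dominates(g, h) for g in hypotheses if g != h)]

# e.g. (confidence, support) pairs for four hypotheses:
hs = [(0.9, 10), (0.8, 40), (0.7, 5), (0.9, 40)]
front = pareto_front(hs)        # only (0.9, 40) survives
```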

slide-82
SLIDE 82

DL Miner: Example

Given a Kinship Ontology (TBox + ABox, adapted from the UCI Machine Learning Repository), DL Miner mines 536 hypotheses with confidence above 0.9, e.g.

 Woman ⊓ ∃hasChild.⊤ ⊑ Mother
 Man ⊓ ∃hasChild.⊤ ⊑ Father
 ∃hasChild.⊤ ⊑ ∃marriedTo.⊤
 ∃marriedTo.⊤ ⊑ ∃hasChild.⊤
 ∃marriedTo.Woman ⊑ Man
 ∃marriedTo.Mother ⊑ Father
 Father ⊑ ∃marriedTo.(∃hasChild.⊤)
 Mother ⊑ ∃marriedTo.(∃hasChild.⊤)
 ∃hasChild.⊤ ⊑ Mother ⊔ Father
 ∃hasChild.⊤ ⊑ Man ⊔ Woman
 ∃hasChild.⊤ ⊑ Father ⊔ Woman
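The confidence figures above can be illustrated with a small sketch: for a candidate axiom C ⊑ D, support counts the instances of C and confidence is the fraction of those that are also instances of D. The ABox fragment below is invented for illustration, not the Kinship data, and the instance retrieval is a toy closed-world lookup:

```python
# Support/confidence sketch for a candidate axiom C ⊑ D:
#   conf(C ⊑ D) = |instances of C ⊓ D| / |instances of C|.
# The assertions are made up; this is not DL Miner's evaluator.

ABOX = {
    "Woman":  {"ann", "beth", "cara"},
    "Mother": {"ann", "beth"},
    "hasChild": {("ann", "dan"), ("beth", "eve"), ("fred", "gil")},
}

def instances(concept):
    """Instances of an atomic concept, or of ∃hasChild.⊤ (toy retrieval)."""
    if concept == "∃hasChild.⊤":
        return {s for (s, _) in ABOX["hasChild"]}
    return ABOX.get(concept, set())

def confidence(lhs_concepts, rhs):
    """conf(C1 ⊓ … ⊓ Cn ⊑ D) by closed-world instance counting."""
    lhs = set.intersection(*(set(instances(c)) for c in lhs_concepts))
    if not lhs:
        return 0.0
    return len(lhs & instances(rhs)) / len(lhs)

# Woman ⊓ ∃hasChild.⊤ ⊑ Mother holds for both ann and beth here:
c = confidence(["Woman", "∃hasChild.⊤"], "Mother")
```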

slide-83
SLIDE 83

DL Miner: Example

Great: it works really well, on a toy ontology!

slide-84
SLIDE 84

Still: many open questions

  • If we can compute these measures, how feasible is it?
  • If “feasible”,

– do these measures correlate?
– how independent are they?

  • For which DLs & inputs can we create & evaluate hypotheses?
  • Which measures indicate interesting hypotheses?
  • What is the shape of interesting hypotheses?

– are longer/bigger hypotheses better?

  • What do we do with them?

– how do we guide users through these?

slide-85
SLIDE 85

Design, run, analyse experiments

slide-86
SLIDE 86

Design, run, analyse experiments

  • A corpus, or two:

1. a handpicked corpus from related work: 16 ontologies
2. a principled one: all BioPortal ontologies with ≥ 100 individuals and ≥ 100 role assertions (RAs): 21 ontologies


slide-87
SLIDE 87

Design, run, analyse experiments



  • Settings for hypothesis parameters:

– L is SHI
– RIAs with inverse, composition
– min. support = 10
– max. concept length in GCIs = 4

slide-88
SLIDE 88

Design, run, analyse experiments

  • Corpus & settings as above
  • generate & evaluate up to 500 hypotheses per ontology

slide-89
SLIDE 89

Design, run, analyse experiments

  • What kind of axioms do people write?

– re. readability of hypotheses:
– what kind of axioms should we roughly aim for?

         mean  mode  5%  25%  50%  75%  95%  99%  99.9%
 length  2.63  3     2   2    3    3    3    3    5
 depth   0.69  1              1    1    1    1    3

Length & role depth of axioms in Bioportal - Taxonomies

 DL constructor  C      ∃R.C   C ⊓ D  ∀R.C  C ⊔ D  ¬C
 Axioms, %       99.73  67.82  1.15   0.46  0.09   0.01

Use of DL constructors in Bioportal - Taxonomies

slide-90
SLIDE 90

Design, run, analyse experiments


Restricting the length of concepts in axioms to 4 (of axioms to 8) is fine!

slide-91
SLIDE 91

How do the measures correlate?

Design, run, analyse experiments

[Correlation heatmaps over the measures BSUPP, BASSUM, BCONF, BLIFT, BCONVN, BCONVQ, SUPP, ASSUM, CONF, LIFT, CONVN, CONVQ, CONTR, FITN, BRAV, COMPL, DISSIM, LENGTH, DEPTH: (a) Handpicked corpus, (b) Principled corpus]


slide-95
SLIDE 95

Design, run, analyse experiments

How feasible is hypothesis mining?

[Bar chart: runtime (%), 10–70, of OC (parsing & classification), HC (hypothesis construction), HP (preparatory entailment checks), HE (hypothesis evaluation), for the handpicked and principled corpora]

slide-96
SLIDE 96

Design, run, analyse experiments

How feasible is hypothesis mining? Works fine for classifiable ontologies.

Incremental reasoning in Pellet works very well for ABoxes.

slide-97
SLIDE 97

Design, run, analyse experiments

How costly are the different measures?

[Bar chart: runtime (%), 5–40, per measure: AXM1, AXM2, FITN, BRAV, CONS, INFOR, STREN, REDUN, DISSIM, COMPL, for the handpicked and principled corpora]

slide-98
SLIDE 98

Design, run, analyse experiments

How costly are the different measures? Consistency is the most costly measure.

slide-99
SLIDE 99

But - what about the semantic mining?

[Diagram: TBox + ABox → DL Miner → hypotheses, each an axiom (or set of axioms) with measures m1, m2, m3, …]

slide-100
SLIDE 100

So, what have we got? (new version)

✓ Loads of measures to capture aspects of hypotheses

– mostly independent
– some superfluous on positive data (unsurprisingly)

✓ Hypothesis generation & evaluation is feasible

– provided our ontology is classifiable
– provided our search space isn’t too massive

  • …focus!
  • Which measures indicate interesting hypotheses?
  • What is the shape of interesting hypotheses?

– are longer/bigger hypotheses better?

  • What do we do with them?

– how do we guide users through these?

slide-101
SLIDE 101

Design, run, analyse survey

Can we learn which hypotheses are useful/interesting?
…and how does this correlate with the measures?

[Diagram: TBox (SROIQ, |sig| = 522) + ABox (169K CAs, 405K RAs) → DL Miner (SHI, |Ci| ≤ 4) → S1: 60 hypotheses, unfocused; S2: 60 hypotheses, focused; each an axiom (or axioms) with measures m1, m2, m3, …]

slide-102
SLIDE 102

Design, run, analyse survey

Can we learn which hypotheses are useful/interesting? As before, with 30 high-confidence and 30 low-confidence hypotheses per survey.

slide-103
SLIDE 103

Design, run, analyse survey

Can we learn which hypotheses are useful/interesting? Participants judged each of the 30 high-confidence and 30 low-confidence hypotheses: Valid? Interesting?

slide-104
SLIDE 104

Design, run, analyse survey

How good/valid are the mined hypotheses?

[Table: counts of hypotheses judged Wrong / Don’t know / Correct, by interestingness rating 1–4, for Survey 1 (unfocused) and Survey 2 (focused)]


slide-106
SLIDE 106

Design, run, analyse survey

How does validity/interestingness correlate with our metrics?

slide-107
SLIDE 107

Design, run, analyse survey

How does validity/interestingness correlate with our metrics?

[Bar charts of correlation coefficients (−0.7 to 0.7, negative and positive) between each measure and (c) Survey 2: Validity, (d) Survey 2: Interestingness]


slide-113
SLIDE 113

Design, run, analyse survey

How does validity/interestingness correlate with our metrics?

−0.7 −0.6 −0.5 −0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 BLIFT LIFT LENGTH DEPTH DISSIM BCONVQ CONVQ BCONF CONF BASSUM ASSUM FITN BRAV BSUPP SUPP COMPL

Correlation coefficient

negative positive

(c) Survey 2: Validity

−0.7 −0.6 −0.5 −0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 LENGTH DEPTH DISSIM BLIFT LIFT BCONF CONF BCONVQ CONVQ FITN BSUPP SUPP BRAV BASSUM ASSUM COMPL

Correlation coefficient

negative positive

(d) Survey 2: Interestingness

slide-114
SLIDE 114

What we learned: 3 kinds of hypotheses!

[Diagram: TBox + ABox → DL Miner → hypotheses with measures; highlighted: high confidence/lift/…, low assumptions/braveness]

  1. An interesting hypothesis can give new insights into the domain

slide-115
SLIDE 115

Semantic Mining


slide-116
SLIDE 116

What we learned: 3 kinds of hypotheses!

  2. An interesting hypothesis can reveal axioms missing from the TBox!

slide-117
SLIDE 117

What we learned: 3 kinds of hypotheses!

  2. An interesting hypothesis can reveal axioms missing from the TBox: TBox completion, ontology learning from data!

slide-118
SLIDE 118

What we learned: 3 kinds of hypotheses!

  3. An interesting hypothesis can reveal bias & errors in the ontology (high confidence/lift/…, low assumptions/braveness)

slide-119
SLIDE 119

What we learned: 3 kinds of hypotheses!

  3. An interesting hypothesis can reveal bias & errors in the ontology: Semantic Data Analysis

slide-120
SLIDE 120

3 kinds of hypotheses - can we predict?

[Diagram: TBox + ABox → DL Miner → hypotheses with high confidence/lift/… and low assumptions/braveness]

No: they look alike.

slide-121
SLIDE 121

3 kinds of hypotheses - can we predict?

No: they look alike. Perhaps, with different ABoxes/other sources.

slide-122
SLIDE 122

Summary & Outlook

  • Mining rich axioms from ontologies is possible

– gives us more than we thought
– expressive axioms are better!

  • Fine test case for incremental/ABox reasoning
  • More surveys

– to better understand the relevance of metrics
– but we’ve got the shape now

  • Redundancy in general is tricky & costly

– stripping superfluous parts from concepts, (sets of) axioms

  • We need even better refinement operators:

– for more expressive DLs
– redundancy-free
– ontology-aware

slide-123
SLIDE 123

Subjective ontology-based problems

  • are great fun

– design of experiments & surveys
– but also rather complex: sooo many design choices

  • specifying & implementing good parameters is tricky

– metrics make “ontology mining” subjective
– requires understanding of logic & reasoners & …

  • are plentiful/numerous

– abduction
– similarity
– good explanations/proofs for entailments, justifications
– good counter-models for non-entailments
– good repair of inconsistent/incoherent ontologies
– …

slide-124
SLIDE 124

Special Thanks to Slava Sazonau

slide-125
SLIDE 125

Some Advertisement

Cover illustration: The Description Logic logo. Courtesy of Enrico Franconi.

Baader, Horrocks, Lutz, Sattler

An Introduction to Description Logic

Description Logics (DLs) have a long tradition in computer science and knowledge representation, being designed so that domain knowledge can be described and so that computers can reason about this knowledge. DLs have recently gained increased importance since they form the logical basis of widely used ontology languages, in particular the web ontology language OWL. Written by four renowned experts, this is the first textbook on Description Logic. It is suitable for self-study by graduates and as the basis for a university course. Starting from a basic DL, the book introduces the reader to their syntax, semantics, reasoning problems and model theory, and discusses the computational complexity of these reasoning problems and algorithms to solve them. It then explores a variety of reasoning techniques, knowledge-based applications and tools, and describes the relationship between DLs and OWL. Franz Baader is a professor in the Institute of Theoretical Computer Science at TU Dresden. Ian Horrocks is a professor in the Department of Computer Science at the University of Oxford. Carsten Lutz is a professor in the Department of Computer Science at the University of Bremen. Uli Sattler is a professor in the Information Management Group within the School of Computer Science at the University of Manchester.

Designed by Zoe Naylor.


Get 20% Discount:

www.cambridge.org/9780521695428

and enter the code BAADER2017 at the checkout


algorithms; 5. Complexity; 6. Reasoning in the EL family of Description Logics; 7. Query answering; 8. Ontology languages and applications; Appendix A. Description Logic terminology; References; Index.

20% Discount

Hardback 978-0-521-87361-1: £59.99 £47.99 / $79.99 $63.99
Paperback 978-0-521-69542-8: £29.99 £23.99 / $39.99 $31.99

April

228 x 152 mm 260pp 30 b/w illus.