From standard reasoning problems to non-standard reasoning problems and one step further
slide-1
SLIDE 1

From standard reasoning problems to non-standard reasoning problems and one step further

Uli Sattler
University of Manchester, University of Oslo
uli.sattler@manchester.ac.uk


slide-4
SLIDE 4

Some Advertisement

Baader, Horrocks, Lutz, Sattler: An Introduction to Description Logic

Description Logics (DLs) have a long tradition in computer science and knowledge representation, being designed so that domain knowledge can be described and so that computers can reason about this knowledge. DLs have recently gained increased importance since they form the logical basis of widely used ontology languages, in particular the web ontology language OWL. Written by four renowned experts, this is the first textbook on Description Logic. It is suitable for self-study by graduates and as the basis for a university course. Starting from a basic DL, the book introduces the reader to their syntax, semantics, reasoning problems and model theory, and discusses the computational complexity of these reasoning problems and algorithms to solve them. It then explores a variety of reasoning techniques, knowledge-based applications and tools, and describes the relationship between DLs and OWL. Franz Baader is a professor in the Institute of Theoretical Computer Science at TU Dresden. Ian Horrocks is a professor in the Department of Computer Science at the University of Oxford. Carsten Lutz is a professor in the Department of Computer Science at the University of Bremen. Uli Sattler is a professor in the Information Management Group within the School of Computer Science at the University of Manchester.

Cover illustration: the Description Logic logo, courtesy of Enrico Franconi. Cover designed by Zoe Naylor.

Contents (excerpt): … algorithms; 5. Complexity; 6. Reasoning in the EL family of Description Logics; 7. Query answering; 8. Ontology languages and applications; Appendix A. Description Logic terminology; References; Index.

Get a 20% discount: www.cambridge.org/9780521695428, code BAADER2017 at the checkout.

Hardback 978-0-521-87361-1: £59.99 → £47.99 / $79.99 → $63.99
Paperback 978-0-521-69542-8: £29.99 → £23.99 / $39.99 → $31.99
April 2017, 228 x 152 mm, 260pp, 30 b/w illustrations

slide-5
SLIDE 5

Standard & Non-Standard Reasoning Problems

slide-6
SLIDE 6

Standard Reasoning Problems

we all know them: given C, D, O, T, A, …, decide/compute

  • consistency/satisfiability
  • subsumption
  • classification
  • query answering

…all only involve entailment checks O ⊨ α
…possibly many of them (classification!)


slide-8
SLIDE 8

Non-Standard Reasoning Problems

we all know them: given C, D, O, T, A, …, compute Justs(α, O), PinPoint(α, O), match(C, P, O), unify(P1, P2, O), x-mod(Σ, O), msc(a, O), lcs(C, D, O), …

  • …involve finding extreme X such that …
    – subset-minimal or
    – maximally/minimally strong
  • …possibly many such Xs

Are

  • (conservative) rewritability
  • (query) inseparability

also standard reasoning problems?

slide-9
SLIDE 9

(Non-)Standard Reasoning: we know how to

understand problems:

  • decidability & computational complexity
    – worst case
    – data
    – parametrised
    – …

understand solutions:

  • soundness, completeness, termination
  • relations between them
  • complexity (see above)
  • practicability
    – worst-case complexity ≠ best-case complexity
    – amenable to optimisation
    – empirical evaluation

slide-10
SLIDE 10

[Plot: number of subsumption tests against number of tests per ontology/reasoner pair (fact, hermit, jfact, pellet), log axes from 10 to 10,000,000, with N·log(N) and N² reference curves]

An interesting side note from our empirical evaluation:
how many subsumption tests does classification involve?


slide-12
SLIDE 12

Not always that straightforward

  • Which problem/solution to consider when? e.g., for x-mod(Σ, O): minimal/top/bottom/semantic/… modules
    – depends on size, signature, application, …
    – but we know properties of/relations between solutions: smallest, self-contained, unique, depleting
  • How to measure practicability?
    – benchmarks, ORE, …

Extensions/variants of DLs:

  • probabilistic
  • non-monotonic
  • rules
slide-13
SLIDE 13

Subjective Ontology-Based Problems

slide-17
SLIDE 17

Subjective Ontology-Based Problems

are problems that are based on C, D, O, T, A, ⊨, … plus additional parameter(s)

  • because the objective solution is not feasible/computable, or makes little sense
    – e.g. in SROIQ:
      ComSubs(C, D, { C ⊑ ∀R.(A ⊓ C), D ⊑ ∀R.(A ⊓ D) }) = { ∀R.A, ∀R.∀R.A, ∀R.∀R.∀R.A, … }
  • or we want to capture quality criteria
    – interestingness
    – readability
    – relevance …
slide-18
SLIDE 18

A subjective OB problem: Mining TBox Axioms from KBs

Finding Interesting Correlations

slide-19
SLIDE 19
Mining TBox axioms from KBs

  • learn (implicit) correlations in our data
  • get interesting insights into the domain

[Diagram: a Learner takes TBox + ABox and outputs candidate axioms]
Do not confuse with (exact) learning of TBoxes (via probing queries)

slide-20
SLIDE 20

Mining TBox axioms from KBs

  • Correlations in KB = classical machine learning
  • automatic generation of knowledge from data
    – taking background knowledge in KB into account
    – unbiased: let the data speak!
    – unsupervised (no positive/negative examples)
    – “Semantic Data Mining”

[Diagram: a Learner takes TBox + ABox and outputs hypotheses (sets of axioms)]


slide-22
SLIDE 22

Mining TBox axioms from KBs

  • Which kind of hypotheses to capture correlations in the KB?
    1. expressive: GCIs, role inclusions
    2. readable
    3. logically sound
    4. statistically sound

slide-23
SLIDE 23
  • 2. Readable Hypotheses

A hypothesis is

  • a small set of short axioms
    – fewer than n_max axioms
    – with concepts shorter than ℓ_max
  • in a suitable DL: ALCHI … SROIQ
  • free of redundancy
    – no superfluous parts
    – ✓ preferred, laconic justifications


slide-25
SLIDE 25
  • 3. Logically Sound Hypotheses

A hypothesis H should be

  ✓ informative: ∀α ∈ H : O ⊭ α
    – we want to mine new axioms
  ✓ consistent: O ∪ H ⊭ ⊤ ⊑ ⊥
  ✓ non-redundant among all hypotheses: there are no H′, H ∈ H with H ≠ H′ and H′ ≡ H

Different hypotheses can be compared wrt. their

  ✓ logical strength:
    ? maximally strong? no: overfitting!
    ? minimally strong? no: under-fitting!
  ✓ reconciliatory power
    – brings together terms so far only loosely related

slide-26
SLIDE 26
  • 4. Statistically Sound Hypotheses

  • we need to assess the data support of a hypothesis
  • introduce metrics that capture the quality of an axiom, learned from association rule mining (ARM): coverage, support, …, lift
    – count individuals that support a GCI: instances, negated instances, non-instances
    – using standard DL semantics, OWA, TBox, entailments, …: no ‘artificial closure’
    – make sure you treat a GCI as an axiom and not as a rule: contrapositive!
slide-27
SLIDE 27

  • 4. Statistically Sound Hypotheses

Some useful notation:

  • instances: Inst(C, O) := {a | O ⊨ C(a)}
  • unknown: UnKn(C, O) := Inst(⊤, O) \ (Inst(C, O) ∪ Inst(¬C, O))
  • relativized: P(C, O) := #Inst(C, O) / #Inst(⊤, O)
  • projection tables (✓ = known instance, ? = unknown):

          C1  C2  C3  C4  …
    Ind1  ✓   ✓   ✓   ?   …
    Ind2  ✓   ✓           …
    Ind3  ?   ?   ✓   ?   …
    Ind4  ?   ?   ?       …

slide-28
SLIDE 28

  • 4. Statistically Sound Hypotheses: Axioms

some axiom measures are easily adapted from ARM; for a GCI C ⊑ D, define its metrics as follows:

                  basic                         relativized
  Coverage        #Inst(C, O)                   P(C, O)
  Support         #Inst(C ⊓ D, O)               P(C ⊓ D, O)
  Contradiction   #Inst(C ⊓ ¬D, O)              P(C ⊓ ¬D, O)
  Assumption      #(Inst(C, O) ∩ UnKn(D, O))
  …
  Confidence                                    P(C ⊓ D, O) / P(C, O)
  Lift                                          P(C ⊓ D, O) / (P(C, O) · P(D, O))
  …

where P(X, O) = #Inst(X, O) / #Inst(⊤, O)

slide-29
SLIDE 29
  • 4. Statistically Sound Hypotheses: Example

                 A  B  C1 C2
  Ind1–Ind180    ✓  ✓  ✓  ✓
  Ind181–Ind200  ✓  ?  ✓  ?
  Ind201–Ind400  ?  ?  ?  ?

for the axioms A ⊑ B, B ⊑ C1, B ⊑ C2:

  (relativized)                          A ⊑ B     B ⊑ C1    B ⊑ C2
  Coverage    P(C, O)                    200/400   180/400   180/400
  Support     P(C ⊓ D, O)                180/400   180/400   180/400
  Assumption                             20/400    0         0
  Confidence  P(C ⊓ D, O)/P(C, O)        180/200   180/180   180/180
  Lift        P(C⊓D,O)/(P(C,O)·P(D,O))   400/200   400/200   400/180


slide-33
SLIDE 33
  • 4. Statistically Sound Hypotheses: Example

                 A  B  C1 C2
  Ind1–Ind180    ✓  ✓  ✓  ✓
  Ind181–Ind200  ✓  ?  ✓  ?
  Ind201–Ind400  ?  ?  ?  ?

  (relativized)                          A ⊑ B   B ⊑ C1   B ⊑ C2
  Coverage    P(C, O)                    0.5     0.45     0.45
  Support     P(C ⊓ D, O)                0.45    0.45     0.45
  Assumption                             0.05    0        0
  Confidence  P(C ⊓ D, O)/P(C, O)        0.9     1        1
  Lift        P(C⊓D,O)/(P(C,O)·P(D,O))   2       2        2.22
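The relativized metrics are easy to check mechanically on this example. Below is a minimal Python sketch (function and variable names are ours, not from the deck) that computes coverage, support, confidence and lift from the sets of known instances:

```python
def axiom_metrics(inst, n, lhs, rhs):
    """Relativized ARM-style metrics for a GCI lhs ⊑ rhs, computed from the
    sets of known instances of each concept; n is the number of individuals."""
    p = lambda s: len(s) / n            # P(C, O) = #Inst(C, O) / #Inst(⊤, O)
    both = inst[lhs] & inst[rhs]        # known instances of lhs ⊓ rhs
    return {
        "coverage":   p(inst[lhs]),
        "support":    p(both),
        "confidence": p(both) / p(inst[lhs]),
        "lift":       p(both) / (p(inst[lhs]) * p(inst[rhs])),
    }

# The deck's example: Ind 0-179 are known instances of A, B, C1, C2;
# Ind 180-199 of A and C1 only; Ind 200-399 are entirely unknown.
inst = {
    "A": set(range(200)), "B": set(range(180)),
    "C1": set(range(200)), "C2": set(range(180)),
}
m = axiom_metrics(inst, 400, "A", "B")   # coverage 0.5, support 0.45, lift 2.0
```

Lift for B ⊑ C2 comes out as (180/400)/((180/400)·(180/400)) = 400/180 ≈ 2.22, matching the table.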


slide-36
SLIDE 36

  • 4. Statistically Sound Hypotheses: Axioms

Oooops!

  • axiom measures as defined so far are not semantically faithful, e.g.,
    Support(A ⊑ B, O) ≠ Support(⊤ ⊑ ¬A ⊔ B, O)
  • make sure we treat GCIs as axioms and not as rules
    – contrapositive!
  • so: turn each GCI X ⊑ Y into the equivalent X ⊔ ¬Y ⊑ Y ⊔ ¬X
    – read C in the metrics table as ‘the resulting LHS’ and D as ‘the resulting RHS’

Then axiom measures are semantically faithful, i.e., Ass(A ⊑ B, O) = Ass(¬B ⊑ ¬A, O)
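The non-faithfulness of the naive measures can be seen with a toy open-world evaluation. The sketch below is our own simplification (only atomic negation, and "O ⊨ C(a)" approximated by what is explicitly known about each individual); it shows Support(A ⊑ B, O) ≠ Support(⊤ ⊑ ¬A ⊔ B, O) even though the two GCIs are logically equivalent:

```python
def known(ind, concept):
    """Is the individual known to satisfy the concept? Three-valued reading:
    ind maps a name to True (known instance) or False (known non-instance);
    an absent name means unknown. Only atomic negation is handled."""
    op = concept[0]
    if op == "top":
        return True
    if op == "atom":
        return ind.get(concept[1]) is True
    if op == "not":
        return ind.get(concept[1]) is False
    if op == "and":
        return all(known(ind, c) for c in concept[1:])
    if op == "or":
        return any(known(ind, c) for c in concept[1:])

def support(abox, lhs, rhs):
    """Support(lhs ⊑ rhs): fraction of individuals known to satisfy lhs ⊓ rhs."""
    return sum(1 for i in abox if known(i, ("and", lhs, rhs))) / len(abox)

# 180 individuals known to be A and B, one known non-A, 219 fully unknown:
abox = [{"A": True, "B": True}] * 180 + [{"A": False}] + [{}] * 219

s1 = support(abox, ("atom", "A"), ("atom", "B"))                   # 180/400
s2 = support(abox, ("top",), ("or", ("not", "A"), ("atom", "B")))  # 181/400
```

The known non-A individual counts towards ⊤ ⊑ ¬A ⊔ B but not towards A ⊑ B; measuring the rewritten form X ⊔ ¬Y ⊑ Y ⊔ ¬X removes exactly this asymmetry for contrapositives.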

slide-37
SLIDE 37

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Goal: mine small sets of (short) axioms

  • more readable
  • close to what people write
  • synergy between axioms should lead to better quality
  • how to measure their qualities?
slide-38
SLIDE 38

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Goal: learn small sets of (short) axioms

  • more readable
  • close to what people write
  • synergy between axioms should lead to better quality
  • how to measure their qualities? …easy:
    1. rewrite the set into a single axiom as usual
    2. measure the resulting axiom
slide-39
SLIDE 39

  • 4. Stat. Sound Hypotheses: Sets of Axioms

H1 = {A ⊑ B, B ⊑ C1} ≡ {⊤ ⊑ (¬A ⊔ B) ⊓ (¬B ⊔ C1)}

(projection table as before: Ind1–180 instances of A, B, C1, C2; Ind181–200 of A, C1; Ind201–400 unknown)

              A ⊑ B   B ⊑ C1   B ⊑ C2   H1
  Coverage    0.5     0.45     0.45     1      always!
  Support     0.45    0.45     0.45     0.45   min
  Assumption  0.05    0        0        0.55?
  Confidence  0.9     1        1        0.45   = support!
  Lift        2       2        2.22     1      always!


slide-41
SLIDE 41

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Goal: learn small sets of (short) axioms

  • more readable
  • close to what people write
  • synergy between axioms should lead to better quality
  • how to measure their qualities? sum/average the quality of their axioms!

slide-44
SLIDE 44

  • 4. Stat. Sound Hypotheses: Sets of Axioms

H1 = {A ⊑ B, B ⊑ C1}, H2 = {A ⊑ B, B ⊑ C2}

(projection table as before: Ind1–180 instances of A, B, C1, C2; Ind181–200 of A, C1; Ind201–400 unknown)

              A ⊑ B   B ⊑ C1   B ⊑ C2   H1       H2
  Coverage    0.5     0.45     0.45     0.475?   0.475?
  Support     0.45    0.45     0.45     0.45     0.45
  Assumption  0.05                      0.05     0.05
  Confidence  0.9     1        1        ?        ?
  Lift        2       2        2.22     ?        ?
slide-46
SLIDE 46

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Goal: learn small sets of (short) axioms

  • more readable
  • close to what people write
  • synergy between axioms should lead to better quality
  • how to measure their qualities?
  • observe that a good hypothesis allows us to shrink our ABox since it captures recurring patterns
    – (minimum description length induction)
slide-47
SLIDE 47

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Goal: learn small sets of (short) axioms

  • more readable
  • close to what people write
  • synergy between axioms should lead to better quality
  • how to measure their qualities?
  • observe that a good hypothesis allows us to shrink our ABox since it captures recurring patterns
  • use this shrinkage factor to measure a hypothesis’
    – fitness: support by data
    – braveness: number of assumptions
slide-51
SLIDE 51

Capturing shrinkage… for fitness

  • Fix a finite set of
    – concepts C, closed under negation
    – roles R
  • Define a projection:
    π(O, C, R) = {C(a) | O ⊨ C(a) ∧ C ∈ C} ∪ {R(a, b) | O ⊨ R(a, b) ∧ R ∈ R}
  • For an ABox, define its description length:
    dLen(A, O) = min{ℓ(A′) | A′ ∪ O ≡ A ∪ O}
  • Define the fitness of a hypothesis H:
    fitn(H, O, C, R) = dLen(π(O, C, R), T) − dLen(π(O, C, R), T ∪ H)

slide-55
SLIDE 55

Capturing shrinkage… for braveness

  • Fix a finite set of concepts C (closed under negation) and roles R, and the projection
    π(O, C, R) = {C(a) | O ⊨ C(a) ∧ C ∈ C} ∪ {R(a, b) | O ⊨ R(a, b) ∧ R ∈ R}
  • Define a hypothesis’ assumptions:
    Ass(O, H, C, R) = π(O ∪ H, C, R) \ π(O, C, R)
  • Define the braveness of a hypothesis H:
    brave(H, O, C, R) = dLen(Ass(O, H, C, R), O)

Axiom set measures are semantically faithful, i.e., H ≡ H′ implies
  fitn(H, O, C, R) = fitn(H′, O, C, R) and brave(H, O, C, R) = brave(H′, O, C, R)


slide-59
SLIDE 59

  • 4. Stat. Sound Hypotheses: Sets of Axioms

                 A  B  C1 C2
  Ind1–Ind180    ✓  ✓  ✓  ✓
  Ind181–Ind200  ✓  ?  ✓  ?
  Ind201–Ind400  ?  ?  ?  ?

H1 = {A ⊑ B, B ⊑ C1}, H2 = {A ⊑ B, B ⊑ C2}

fitn(H1, A, …) = dLen(π(A, …), ∅) − dLen(π(A, …), H1) = 760 − 380 = 380
fitn(H2, A, …) = dLen(π(A, …), ∅) − dLen(π(A, …), H2) = 760 − 400 = 360
brave(H1, A, …) = dLen(Ass(A, H1, …), A) = 20
brave(H2, A, …) = dLen(Ass(A, H2, …), A) = 40

⇒ H1 ≫ H2
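The fitn/brave numbers on this slide can be reproduced in a small propositional sketch: concepts are plain names, the TBox holds only atomic inclusions, entailment is forward closure, and dLen is approximated by greedily dropping entailed assertions (all names here are ours, and the greedy dLen is only an upper bound in general):

```python
def closure(concepts, tbox):
    """Saturate a set of concept names under atomic inclusions (A, B), read A ⊑ B."""
    concepts, changed = set(concepts), True
    while changed:
        changed = False
        for lhs, rhs in tbox:
            if lhs in concepts and rhs not in concepts:
                concepts.add(rhs)
                changed = True
    return concepts

def projection(abox, tbox, focus):
    """π(O, C): every entailed assertion C(a) with C in the focus set."""
    return {(a, c) for a, cs in abox.items()
                   for c in closure(cs, tbox) & focus}

def dlen(assertions, tbox):
    """Greedy approximation of dLen: drop an assertion if it is already
    entailed by the remaining assertions plus the TBox."""
    kept = set(assertions)
    for a, c in sorted(assertions):
        rest = kept - {(a, c)}
        if c in closure({x for y, x in rest if y == a}, tbox):
            kept = rest
    return len(kept)

def fitness(H, abox, tbox, focus):
    pi = projection(abox, tbox, focus)
    return dlen(pi, tbox) - dlen(pi, tbox | H)

def braveness(H, abox, tbox, focus):
    ass = projection(abox, tbox | H, focus) - projection(abox, tbox, focus)
    return dlen(ass, tbox)

# The deck's example, with H1 = {A ⊑ B, B ⊑ C1} and H2 = {A ⊑ B, B ⊑ C2}:
abox = {i: {"A", "B", "C1", "C2"} for i in range(180)}
abox.update({i: {"A", "C1"} for i in range(180, 200)})
focus = {"A", "B", "C1", "C2"}
H1, H2 = {("A", "B"), ("B", "C1")}, {("A", "B"), ("B", "C2")}
```

On this data fitness(H1, abox, set(), focus) is 380 versus 360 for H2, and braveness is 20 versus 40, reproducing the slide's preference H1 ≫ H2.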


slide-62
SLIDE 62

  • 4. Stat. Sound Hypotheses: Sets of Axioms

Example: empty TBox, ABox A given as an R-graph of individuals labelled with X, A, B, C (as sketched on the slide):

fitn({X ⊑ ∀R.A}, A, …) = dLen(π(A, …), ∅) − dLen(π(A, …), {X ⊑ ∀R.A}) = 12 − 9 = 3
brave({X ⊑ ∀R.A}, A, …) = dLen(Ass(A, {X ⊑ ∀R.A}, …), A) = 1

slide-63
SLIDE 63

phew…

slide-64
SLIDE 64

Remember:

[Diagram: a Learner takes TBox + ABox and outputs hypotheses (sets of axioms)]

We wanted to mine axioms!


slide-66
SLIDE 66

So, what have we got?

  • (Sets of) axioms as Hypotheses
  • Loads of measures to capture
    1. an axiom hypothesis’ coverage, support, assumption, lift, …
    2. a set-of-axioms hypothesis’ fitness, braveness
  • with a focus on concept/role spaces C, R
  • What are their properties?
    – semantically faithful: H ≡ H′ ⇒ fitn(O, H, C, R) = fitn(O, H′, C, R)
    – O ⊨ H ⇒ Ass(O, H, C, R) = ∅
    – …
  • Can we compute these measures?
    – easy for (1), tricky for (2): dLen(A, O) = min{ℓ(A′) | A′ ∪ O ≡ A ∪ O}

slide-67
SLIDE 67

So, what have we got? (2)

  • If we can compute these measures, how feasible is this?
  • If “feasible”,
    – do these measures correlate?
    – how independent are they?
  • For which DLs & inputs can we create & evaluate hypotheses?
  • Which measures indicate interesting hypotheses?
  • What is the shape of interesting hypotheses?
    – are longer/bigger hypotheses better?
  • What do we do with them?
    – how do we guide users through these?


slide-69
SLIDE 69

Slava implements: DL Miner

[Pipeline diagram: TBox + ABox and parameters feed an Ontology Cleaner (O), a Hypothesis Constructor (L, Σ), a Hypothesis Evaluator (Q), and a Hypothesis Sorter (rf(H), qf(H, q)); the output is hypotheses, i.e. axiom(s) annotated with measures m1, m2, m3, …: a Subjective Solution]


slide-72
SLIDE 72

DL Miner: Hypothesis Constructor

Easy:

  • construct all concepts C1, C2, …
    – finitely many thanks to language bias
  • check for each GCI Ci ⊑ Cj whether it’s logically ok:
    – informative: O ⊭ Ci ⊑ Cj
    – consistent: O ∪ {Ci ⊑ Cj} ⊭ ⊤ ⊑ ⊥
  • if yes, add it to H
  • remove redundant hypotheses from H

Bonkers! Even for EL, with n concept/role names and concepts Ci of maximal length k, there are ~n^k concepts Ci and hence ~(n^k)² GCIs to test; for n = 100 and k = 4, that is ~100,000,000 concepts Ci and ~100,000,000² GCIs.


slide-74
SLIDE 74

DL Miner: Hypothesis Constructor

Use a refinement operator to build the Ci, informed by the ABox
  – used in concept learning, conceptual blending

  • Given a logic L, define a refinement operator as
    – a function ρ : Conc(L) → P(Conc(L)) such that, for each C ∈ L, C′ ∈ ρ(C): C′ ⊑ C
  • A refinement operator is
    – proper if, for all C ∈ L, C′ ∈ ρ(C): C′ ≢ C
    – complete if, for all C ∈ L, there is some n and C1 ∈ ρⁿ(⊤) with C1 ≡ C and ℓ(C1) ≤ ℓ(C)
    – suitable if, for all C, C1 ∈ L: if C1 ⊏ C then there is some n and C2 ≡ C with C1 ∈ ρⁿ(C2)

Great: there are known refinement operators (proper, complete, suitable, …) for ALC [LehmHitzler2010]
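A downward refinement operator of this kind is straightforward to sketch for a small EL-style fragment (conjunctions of concept names and existential restrictions); the representation and the tiny signature below are our own illustration, not DL Miner's operator:

```python
TOP = frozenset()            # the empty conjunction stands for ⊤
CONCEPT_NAMES = ["A", "B"]   # toy signature Σ
ROLE_NAMES = ["R"]

def length(c):
    """Concept length: a name counts 1, an existential ∃r.F counts 2 + length(F)."""
    return sum(1 if isinstance(x, str) else 2 + length(x[2]) for x in c) or 1

def rho(c):
    """Downward refinement: every returned concept is subsumed by c, obtained
    by adding a conjunct, adding ∃r.⊤, or refining a filler of an existential."""
    out = []
    for a in CONCEPT_NAMES:
        if a not in c:
            out.append(c | {a})
    for r in ROLE_NAMES:
        if ("exists", r, TOP) not in c:
            out.append(c | {("exists", r, TOP)})
    for x in c:
        if not isinstance(x, str):
            _, r, filler = x
            out.extend((c - {x}) | {("exists", r, f2)} for f2 in rho(filler))
    return out

def enumerate_concepts(max_len):
    """Close ⊤ under rho, keeping only concepts up to the given length."""
    seen, frontier = {TOP}, [TOP]
    while frontier:
        for c2 in rho(frontier.pop()):
            if c2 not in seen and length(c2) <= max_len:
                seen.add(c2)
                frontier.append(c2)
    return seen
```

With this signature, enumerate_concepts(3) yields ⊤, A, B, A ⊓ B, ∃R.⊤, ∃R.A and ∃R.B. The operator is not proper in general (refinement steps can produce equivalent concepts once a TBox is taken into account), which is exactly why the concept constructor below discards variants.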


slide-76
SLIDE 76

DL Miner: Concept Constructor


  • specialise concepts only if they have instances!
  • respect ℓmax
slide-77
SLIDE 77

DL Miner: Concept Constructor


  • specialise concepts only if they have instances, and only up to length ℓmax

Don't even construct most of the nᵏ concepts Ci

slide-78
SLIDE 78

Slava implements: DL Miner

[Pipeline diagram: TBox + ABox → Ontology Cleaner (O) → Hypothesis Constructor (L, Σ; DL-Apriori(·), buildRolesTopDown(·), generateHypotheses(·)) → Hypothesis Evaluator (Q; qf(H, q)) → Hypothesis Sorter (rf(H)) → hypotheses: axiom(s) with measures m1, m2, m3, …]

slide-79
SLIDE 79

Slava implements: DL Miner


Complete (for the parameters provided).

slide-80
SLIDE 80

DL Miner: Hypothesis Evaluator

  • Relatively straightforward for axiom measures

– hard test case for instance retrieval

  • Hard for set-of-axiom measures (fitness & braveness)

– due to dLen(A, O) = min{ℓ(A′) | A′ ∪ O ≡ A ∪ O}
– DL Miner implements the approximation dLen∗(A, O) = ℓ(A) − ℓ(Redund(A, O)) that

  • identifies redundant assertions in the ABox
  • does consider 1-step interactions between individuals
  • ignores ‘longer’ interactions
  • underestimates fitness, overestimates braveness

– great test case for incremental reasoning: Pellet!

slide-81
SLIDE 81

DL Miner: Hypothesis Sorter

  • Last step in DL Miner’s workflow
  • Easy:

– throw away all hypotheses that are dominated by another one
– i.e., compute the Pareto front w.r.t. the measures provided
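The Pareto-front step can be sketched directly: keep a hypothesis only if no other hypothesis is at least as good on every measure and strictly better on at least one. The hypotheses below are just tuples of invented, higher-is-better measure values:

```python
# Pareto-front sketch for the Hypothesis Sorter: hypotheses are tuples of
# measure values, all to be maximised.  The example values are made up.

def dominates(a, b):
    """True if a is >= b on every measure and > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(hypotheses):
    """Hypotheses not dominated by any other hypothesis."""
    return [h for h in hypotheses
            if not any(dominates(g, h) for g in hypotheses if g != h)]

# e.g. (confidence, support) pairs for four hypotheses:
hs = [(0.9, 10), (0.8, 40), (0.7, 5), (0.9, 40)]
front = pareto_front(hs)        # only (0.9, 40) survives
```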

slide-82
SLIDE 82

DL Miner: Example

Given a Kinship Ontology (TBox + ABox, adapted from the UCI Machine Learning Repository), DL Miner mines 536 hypotheses with confidence above 0.9, e.g.

 Woman ⊓ ∃hasChild.⊤ ⊑ Mother
 Man ⊓ ∃hasChild.⊤ ⊑ Father
 ∃hasChild.⊤ ⊑ ∃marriedTo.⊤
 ∃marriedTo.⊤ ⊑ ∃hasChild.⊤
 ∃marriedTo.Woman ⊑ Man
 ∃marriedTo.Mother ⊑ Father
 Father ⊑ ∃marriedTo.(∃hasChild.⊤)
 Mother ⊑ ∃marriedTo.(∃hasChild.⊤)
 ∃hasChild.⊤ ⊑ Mother ⊔ Father
 ∃hasChild.⊤ ⊑ Man ⊔ Woman
 ∃hasChild.⊤ ⊑ Father ⊔ Woman
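The confidence figures above can be illustrated with a small sketch: for a candidate axiom C ⊑ D, support counts the instances of C and confidence is the fraction of those that are also instances of D. The ABox fragment below is invented for illustration, not the Kinship data, and the instance retrieval is a toy closed-world lookup:

```python
# Support/confidence sketch for a candidate axiom C ⊑ D:
#   conf(C ⊑ D) = |instances of C ⊓ D| / |instances of C|.
# The assertions are made up; this is not DL Miner's evaluator.

ABOX = {
    "Woman":  {"ann", "beth", "cara"},
    "Mother": {"ann", "beth"},
    "hasChild": {("ann", "dan"), ("beth", "eve"), ("fred", "gil")},
}

def instances(concept):
    """Instances of an atomic concept, or of ∃hasChild.⊤ (toy retrieval)."""
    if concept == "∃hasChild.⊤":
        return {s for (s, _) in ABOX["hasChild"]}
    return ABOX.get(concept, set())

def confidence(lhs_concepts, rhs):
    """conf(C1 ⊓ … ⊓ Cn ⊑ D) by closed-world instance counting."""
    lhs = set.intersection(*(set(instances(c)) for c in lhs_concepts))
    if not lhs:
        return 0.0
    return len(lhs & instances(rhs)) / len(lhs)

# Woman ⊓ ∃hasChild.⊤ ⊑ Mother holds for both ann and beth here:
c = confidence(["Woman", "∃hasChild.⊤"], "Mother")
```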

slide-83
SLIDE 83

DL Miner: Example

Great: it works really well, on a toy ontology!

slide-84
SLIDE 84

Still: many open questions

  • If we can compute these measures, how feasible is it?
  • If “feasible”,

– do these measures correlate?
– how independent are they?

  • For which DLs & inputs can we create & evaluate hypotheses?
  • Which measures indicate interesting hypotheses?
  • What is the shape of interesting hypotheses?

– are longer/bigger hypotheses better?

  • What do we do with them?

– how do we guide users through these?

slide-85
SLIDE 85

Design, run, analyse experiments

slide-86
SLIDE 86

Design, run, analyse experiments

  • A corpus, or two:

1. a handpicked corpus from related work: 16 ontologies
2. a principled one: all BioPortal ontologies with ≥ 100 individuals and ≥ 100 role assertions (RAs): 21 ontologies


slide-87
SLIDE 87

Design, run, analyse experiments



  • Settings for hypothesis parameters:

– L is SHI
– RIAs with inverse, composition
– min. support = 10
– max. concept length in GCIs = 4

slide-88
SLIDE 88

Design, run, analyse experiments

  • Corpus & settings as above
  • generate & evaluate up to 500 hypotheses per ontology

slide-89
SLIDE 89

Design, run, analyse experiments

  • What kind of axioms do people write?

– re. readability of hypotheses:
– what kind of axioms should we roughly aim for?

         mean  mode  5%  25%  50%  75%  95%  99%  99.9%
 length  2.63  3     2   2    3    3    3    3    5
 depth   0.69  1              1    1    1    1    3

Length & role depth of axioms in Bioportal - Taxonomies

 DL constructor  C      ∃R.C   C ⊓ D  ∀R.C  C ⊔ D  ¬C
 Axioms, %       99.73  67.82  1.15   0.46  0.09   0.01

Use of DL constructors in Bioportal - Taxonomies

slide-90
SLIDE 90

Design, run, analyse experiments


Restricting the length of concepts in axioms to 4 (of axioms to 8) is fine!

slide-91
SLIDE 91

How do the measures correlate?

Design, run, analyse experiments

[Correlation heatmaps over the measures BSUPP, BASSUM, BCONF, BLIFT, BCONVN, BCONVQ, SUPP, ASSUM, CONF, LIFT, CONVN, CONVQ, CONTR, FITN, BRAV, COMPL, DISSIM, LENGTH, DEPTH: (a) Handpicked corpus, (b) Principled corpus]


slide-95
SLIDE 95

Design, run, analyse experiments

How feasible is hypothesis mining?

[Bar chart: runtime (%), 10–70, of OC (parsing & classification), HC (hypothesis construction), HP (preparatory entailment checks), HE (hypothesis evaluation), for the handpicked and principled corpora]

slide-96
SLIDE 96

Design, run, analyse experiments

How feasible is hypothesis mining? Works fine for classifiable ontologies.

Incremental reasoning in Pellet works very well for ABoxes.

slide-97
SLIDE 97

Design, run, analyse experiments

How costly are the different measures?

[Bar chart: runtime (%), 5–40, per measure: AXM1, AXM2, FITN, BRAV, CONS, INFOR, STREN, REDUN, DISSIM, COMPL, for the handpicked and principled corpora]

slide-98
SLIDE 98

Design, run, analyse experiments

How costly are the different measures? Consistency is the most costly measure.

slide-99
SLIDE 99

But - what about the semantic mining?

[Diagram: TBox + ABox → DL Miner → hypotheses, each an axiom (or set of axioms) with measures m1, m2, m3, …]

slide-100
SLIDE 100

So, what have we got? (new version)

✓ Loads of measures to capture aspects of hypotheses

– mostly independent
– some superfluous on positive data (unsurprisingly)

✓ Hypothesis generation & evaluation is feasible

– provided our ontology is classifiable
– provided our search space isn’t too massive

  • …focus!
  • Which measures indicate interesting hypotheses?
  • What is the shape of interesting hypotheses?

– are longer/bigger hypotheses better?

  • What do we do with them?

– how do we guide users through these?

slide-101
SLIDE 101

Design, run, analyse survey

Can we learn which hypotheses are useful/interesting?
…and how does this correlate with the measures?

[Diagram: TBox (SROIQ, |sig| = 522) + ABox (169K CAs, 405K RAs) → DL Miner (SHI, |Ci| ≤ 4) → S1: 60 hypotheses, unfocused; S2: 60 hypotheses, focused; each an axiom (or axioms) with measures m1, m2, m3, …]

slide-102
SLIDE 102

Design, run, analyse survey

Can we learn which hypotheses are useful/interesting? As before, with 30 high-confidence and 30 low-confidence hypotheses per survey.

slide-103
SLIDE 103

Design, run, analyse survey

Can we learn which hypotheses are useful/interesting? Participants judged each of the 30 high-confidence and 30 low-confidence hypotheses: Valid? Interesting?

slide-104
SLIDE 104

Design, run, analyse survey

How good/valid are the mined hypotheses?

[Table: counts of hypotheses judged Wrong / Don’t know / Correct, by interestingness rating 1–4, for Survey 1 (unfocused) and Survey 2 (focused)]


slide-106
SLIDE 106

Design, run, analyse survey

How does validity/interestingness correlate with our metrics?

slide-107
SLIDE 107

Design, run, analyse survey

How does validity/interestingness correlate with our metrics?

[Bar charts of correlation coefficients (−0.7 to 0.7, negative and positive) between each measure and (c) Survey 2: Validity, (d) Survey 2: Interestingness]


slide-113
SLIDE 113

Design, run, analyse survey

How does validity/interestingness correlate with our metrics?

−0.7 −0.6 −0.5 −0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 BLIFT LIFT LENGTH DEPTH DISSIM BCONVQ CONVQ BCONF CONF BASSUM ASSUM FITN BRAV BSUPP SUPP COMPL

Correlation coefficient

negative positive

(c) Survey 2: Validity

−0.7 −0.6 −0.5 −0.4 −0.3 −0.2 −0.1 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 LENGTH DEPTH DISSIM BLIFT LIFT BCONF CONF BCONVQ CONVQ FITN BSUPP SUPP BRAV BASSUM ASSUM COMPL

Correlation coefficient

negative positive

(d) Survey 2: Interestingness

slide-114
SLIDE 114

What we learned: 3 kinds of hypotheses!

[Diagram: TBox + ABox → DL Miner → hypotheses with measures; highlighted: high confidence/lift/…, low assumptions/braveness]

  1. An interesting hypothesis can give new insights into the domain

slide-115
SLIDE 115

Semantic Mining


slide-116
SLIDE 116

What we learned: 3 kinds of hypotheses!

  2. An interesting hypothesis can reveal axioms missing from the TBox!

slide-117
SLIDE 117

What we learned: 3 kinds of hypotheses!

  2. An interesting hypothesis can reveal axioms missing from the TBox: TBox completion, ontology learning from data!

slide-118
SLIDE 118

What we learned: 3 kinds of hypotheses!

  3. An interesting hypothesis can reveal bias & errors in the ontology (high confidence/lift/…, low assumptions/braveness)

slide-119
SLIDE 119

What we learned: 3 kinds of hypotheses!

  3. An interesting hypothesis can reveal bias & errors in the ontology: Semantic Data Analysis

slide-120
SLIDE 120

3 kinds of hypotheses - can we predict?

[Diagram: TBox + ABox → DL Miner → hypotheses with high confidence/lift/… and low assumptions/braveness]

No: they look alike.

slide-121
SLIDE 121

3 kinds of hypotheses - can we predict?

No: they look alike. Perhaps, with different ABoxes/other sources.

slide-122
SLIDE 122

Summary & Outlook

  • Mining rich axioms from ontologies is possible

– gives us more than we thought
– expressive axioms are better!

  • Fine test case for incremental/ABox reasoning
  • More surveys

– to better understand the relevance of metrics
– but we’ve got the shape now

  • Redundancy in general is tricky & costly

– stripping superfluous parts from concepts, (sets of) axioms

  • We need even better refinement operators:

– for more expressive DLs
– redundancy-free
– ontology-aware

slide-123
SLIDE 123

Subjective ontology-based problems

  • are great fun

– design of experiments & surveys
– but also rather complex: sooo many design choices

  • specifying & implementing good parameters is tricky

– metrics make “ontology mining” subjective
– requires understanding of logic & reasoners & …

  • are plentiful/numerous

– abduction
– similarity
– good explanations/proofs for entailments, justifications
– good counter-models for non-entailments
– good repair of inconsistent/incoherent ontologies
– …

slide-124
SLIDE 124

Special Thanks to Slava Sazonau

slide-125
SLIDE 125

Some Advertisement

Cover illustration: The Description Logic logo. Courtesy of Enrico Franconi.

Baader, Horrocks, Lutz, Sattler

An Introduction to Description Logic

Description Logics (DLs) have a long tradition in computer science and knowledge representation, being designed so that domain knowledge can be described and so that computers can reason about this knowledge. DLs have recently gained increased importance since they form the logical basis of widely used ontology languages, in particular the web ontology language OWL. Written by four renowned experts, this is the first textbook on Description Logic. It is suitable for self-study by graduates and as the basis for a university course. Starting from a basic DL, the book introduces the reader to their syntax, semantics, reasoning problems and model theory, and discusses the computational complexity of these reasoning problems and algorithms to solve them. It then explores a variety of reasoning techniques, knowledge-based applications and tools, and describes the relationship between DLs and OWL. Franz Baader is a professor in the Institute of Theoretical Computer Science at TU Dresden. Ian Horrocks is a professor in the Department of Computer Science at the University of Oxford. Carsten Lutz is a professor in the Department of Computer Science at the University of Bremen. Uli Sattler is a professor in the Information Management Group within the School of Computer Science at the University of Manchester.

Designed by Zoe Naylor.


Get 20% Discount:

www.cambridge.org/9780521695428

and enter the code BAADER2017 at the checkout


algorithms; 5. Complexity; 6. Reasoning in the EL family of Description Logics; 7. Query answering; 8. Ontology languages and applications; Appendix A. Description Logic terminology; References; Index.

20% Discount

Hardback 978-0-521-87361-1: £59.99 £47.99 / $79.99 $63.99
Paperback 978-0-521-69542-8: £29.99 £23.99 / $39.99 $31.99

April

228 x 152 mm 260pp 30 b/w illus.