EXTENDING LOGIC PROGRAMMING FOR LIFE SCIENCES APPLICATIONS
Despoina Magka
Department of Computer Science, University of Oxford
November 16, 2012
BIOINFORMATICS AND SEMANTIC TECHNOLOGIES
Life sciences data deluge
Hierarchical organisation of biochemical knowledge
Fast, automatic and repeatable classification driven by Semantic technologies
Web Ontology Language, a W3C standard family of logic-based formalisms
OWL bio- and chemo-ontologies widely adopted
THE CHEBI ONTOLOGY
OWL ontology Chemical Entities of Biological Interest
Dictionary of molecules with taxonomical information, e.g.:
    caffeine is a cyclic molecule
    serotonin is an organic molecule
    ascorbic acid is a carboxylic ester
Pharmaceutical design and study of biological pathways
ChEBI is manually incremented
Currently ~30,000 chemical entities, expands at 3,500/yr
Existing chemical databases describe millions of molecules
Speed up growth by automating chemical classification
EXPRESSIVITY LIMITATIONS OF OWL
1. At least one tree-shaped model for each consistent OWL ontology: problematic representation of cycles
2. No minimality condition on the models: hard to axiomatise classes based on the absence of attributes
EXAMPLE
Cyclobutane ⊑ ∃(= 4)hasAtom.(Carbon ⊓ ∃(= 2)hasBond.Carbon)
[Figure: four carbon atoms (C C C C), shown in a later build with an additional Oxygen atom]
OWL-based reasoning support
1. Is cyclobutane a cyclic molecule? ✘
2. Is cyclobutane a hydrocarbon? ✘
Required reasoning support
1. Is cyclobutane a cyclic molecule? ✓
2. Is cyclobutane a hydrocarbon? ✓
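The two questions on this slide can be illustrated with a minimal Python sketch. It is not the formalism from the talk: it assumes the molecule is given as an explicit, complete graph (a closed-world reading), and the names `atoms`, `bonds`, `is_cyclic` and `is_hydrocarbon` are hypothetical. Treating "hydrocarbon" as "no atom other than C or H" is a simplification made here.

```python
# Hypothetical encoding of the slide's cyclobutane example: four carbon atoms
# joined in a ring (hydrogens omitted, as on the slide).
atoms = {1: "C", 2: "C", 3: "C", 4: "C"}
bonds = [(1, 2), (2, 3), (3, 4), (4, 1)]  # each bond is listed once


def is_cyclic(atoms, bonds):
    """Return True if the undirected bond graph contains a cycle (union-find)."""
    parent = {a: a for a in atoms}

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    for u, v in bonds:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            # u and v were already connected, so this bond closes a ring.
            return True
        parent[root_u] = root_v
    return False


def is_hydrocarbon(atoms):
    """Closed-world check: the listed atoms are all there is, and none is
    anything other than carbon or hydrogen (a simplification)."""
    return all(element in {"C", "H"} for element in atoms.values())


print(is_cyclic(atoms, bonds), is_hydrocarbon(atoms))
```

Both checks depend on the graph being the whole story: the ring needs a non-tree structure, and the hydrocarbon test relies on the absence of further atoms, which is what the two limitations above are about.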
RESULTS OVERVIEW
1. Expressive and decidable formalism for modelling structured domains: Description Graphs Logic Programs (DGLPs)
2. Acyclicity conditions for existential rules that extend previously suggested criteria
    Model-faithful acyclicity: 2EXPTIME-complete to check
    Model-summarising acyclicity: EXPTIME-complete to check
3. Implementation that draws upon DLV and performs structure-based classification with a significant speedup
4. Evaluation over part of the manually curated ChEBI ontology revealed modelling errors
Language for representing complex objects with a favourable performance/expressivity trade-off
CLASSIFYING STRUCTURED OBJECTS
[Figure: description graph of ascorbicAcid with atom nodes 1 to 13 (labelled o, c or h) and edges labelled hasAtom, single and double]

ascorbicAcid(x) → hasAtom(x, f1(x)) ∧ ... ∧ hasAtom(x, f13(x)) ∧
    o(f1(x)) ∧ ... ∧ c(f7(x)) ∧ ... ∧
    single(f1(x), f7(x)) ∧ double(f7(x), f2(x)) ∧ ...

hasAtom(x, y1) ∧ hasAtom(x, y2) ∧ y1 ≠ y2 → polyatomicEntity(x)

⋀(i=1..5) hasAtom(x, yi) ∧ c(y1) ∧ o(y2) ∧ o(y3) ∧ c(y4) ∧ horc(y5) ∧
    double(y1, y2) ∧ single(y1, y3) ∧ single(y3, y4) ∧ single(y1, y5) → carboxylicEster(x)
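As a rough illustration of how the polyatomicEntity and carboxylicEster rules select molecules, the following Python sketch evaluates them by brute force over a small atom graph. It is not the DLV encoding used in the talk. The molecule data (`methyl_acetate`, `acetic_acid`) and all function names are hypothetical, bonds are assumed to be symmetric, and the matcher additionally assumes that y1..y5 bind to pairwise distinct atoms, which the rule on the slide does not state.

```python
from itertools import permutations


def symmetric(pairs):
    """Assumption of this sketch: a bond holds in both directions."""
    return set(pairs) | {(b, a) for a, b in pairs}


def is_polyatomic(atoms):
    """polyatomicEntity rule, read as: at least two distinct atoms."""
    return len(atoms) >= 2


def is_carboxylic_ester(atoms, single, double):
    """Search for atoms y1..y5 satisfying the body of the carboxylicEster rule.
    permutations() enforces distinct bindings (an assumption, see lead-in)."""
    single, double = symmetric(single), symmetric(double)
    for y1, y2, y3, y4, y5 in permutations(atoms, 5):
        if (atoms[y1] == "c" and atoms[y2] == "o" and atoms[y3] == "o"
                and atoms[y4] == "c" and atoms[y5] in ("h", "c")  # horc(y5)
                and (y1, y2) in double
                and (y1, y3) in single
                and (y3, y4) in single
                and (y1, y5) in single):
            return True
    return False


# Hypothetical test molecules, hydrogens mostly omitted.
# Methyl acetate: C5-C1(=O2)-O3-C4
methyl_acetate = {1: "c", 2: "o", 3: "o", 4: "c", 5: "c"}
# Acetic acid: C5-C1(=O2)-O3-H4
acetic_acid = {1: "c", 2: "o", 3: "o", 4: "h", 5: "c"}
single_bonds = [(1, 3), (3, 4), (1, 5)]
double_bonds = [(1, 2)]

print(is_carboxylic_ester(methyl_acetate, single_bonds, double_bonds))
print(is_carboxylic_ester(acetic_acid, single_bonds, double_bonds))
```

The distinctness assumption matters in this sketch: with symmetric bond facts and no such constraint, y4 could bind back to y1 and the acid would match as well.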
Input fact: ascorbicAcid(a)
Stable model:
    ascorbicAcid(a),
    hasAtom(a, a^f_i) for 1 ≤ i ≤ 13,
    o(a^f_i) for 1 ≤ i ≤ 6, c(a^f_i) for 7 ≤ i ≤ 12, h(a^f_13),
    single(a^f_8, a^f_3), single(a^f_9, a^f_4),
    single(a^f_12, a^f_i) for i ∈ {5, 11}, single(a^f_11, a^f_6),
    single(a^f_10, a^f_i) for i ∈ {1, 9, 11, 13}, single(a^f_7, a^f_i) for i ∈ {1, 8},
    double(a^f_2, a^f_7), double(a^f_8, a^f_9),
    horc(a^f_i) for 7 ≤ i ≤ 13,
    polyatomicEntity(a), carboxylicEster(a), cyclic(a)
Ascorbic acid is a cyclic polyatomic entity and a carboxylic ester
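The stable model above can be reproduced in outline with a short Python sketch. It assumes that a^f_i stands for the Skolem term f_i(a), that horc follows from hypothetical rules h(y) → horc(y) and c(y) → horc(y), and it uses a plain ring check as a stand-in for whatever rules derive cyclic(a), since those are not shown on the slide. The names `skolem`, `expand` and `has_ring` are illustrative only.

```python
# Atom labels and bonds copied from the stable model on the slide.
ELEMENT = {**{i: "o" for i in range(1, 7)},
           **{i: "c" for i in range(7, 13)},
           13: "h"}
SINGLE = [(8, 3), (9, 4), (12, 5), (12, 11), (11, 6), (10, 1), (10, 9),
          (10, 11), (10, 13), (7, 1), (7, 8)]
DOUBLE = [(2, 7), (8, 9)]


def skolem(i, x):
    """Skolem term f_i(x); assumed to be what the slide writes as a^f_i."""
    return f"f{i}({x})"


def expand(x):
    """Apply the ascorbicAcid rule to individual x and return the facts."""
    facts = {("ascorbicAcid", x)}
    for i, element in ELEMENT.items():
        facts.add(("hasAtom", x, skolem(i, x)))
        facts.add((element, skolem(i, x)))
        if element in ("h", "c"):  # assumed rules: h(y) -> horc(y), c(y) -> horc(y)
            facts.add(("horc", skolem(i, x)))
    for relation, pairs in (("single", SINGLE), ("double", DOUBLE)):
        for i, j in pairs:
            facts.add((relation, skolem(i, x), skolem(j, x)))
    return facts


def has_ring(facts, x):
    """Stand-in for cyclic(x): union-find over the bonds between atoms of x.
    Assumes each bond appears in one direction only, as in the listing."""
    parent = {f[2]: f[2] for f in facts if f[0] == "hasAtom" and f[1] == x}

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    for f in facts:
        if f[0] in ("single", "double") and f[1] in parent and f[2] in parent:
            r1, r2 = find(f[1]), find(f[2])
            if r1 == r2:
                return True
            parent[r1] = r2
    return False


facts = expand("a")
print(len(facts), has_ring(facts, "a"))
```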
ACYCLICITY CONDITIONS
Rules with function symbols in the head can axiomatise infinitely large structures
Reasoning with unrestricted DGLP ontologies is undecidable
Acyclicity checks are sufficient but not necessary conditions for chase termination
Model-faithful and model-summarising acyclicity (MFA and MSA): capture as general a class as possible of programs with models of finite size
Cost for checking MFA and MSA:

        bounded arity        no restriction
MFA     2EXPTIME-complete    2EXPTIME-complete
MSA     coNP-complete        EXPTIME-complete

Both subsume previously suggested polynomial conditions
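To give a feel for what an MFA-style check does, here is a simplified Python sketch: it runs a Skolem chase from a critical instance (every predicate filled with a single constant `*`) and reports failure as soon as a term nests a function symbol inside itself. This is an illustration under assumptions, not the definition or implementation from the talk: it handles only positive rules without equality or negation, and the rule encoding (uppercase strings as variables, tuples as Skolem terms) and the example rule sets are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t[0].isupper()


def subst(t, env):
    """Instantiate a head argument: variable, constant, or Skolem term."""
    if is_var(t):
        return env[t]
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(a, env) for a in t[1:])
    return t


def match(pattern, fact, env):
    """Extend env so that the body atom `pattern` equals `fact`, else None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    env = dict(env)
    for p, v in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if env.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return env


def is_cyclic_term(t, seen=()):
    """True if some function symbol occurs nested inside itself in t."""
    if not isinstance(t, tuple):
        return False
    if t[0] in seen:
        return True
    return any(is_cyclic_term(a, seen + (t[0],)) for a in t[1:])


def passes_mfa_style_check(rules, predicates):
    """Chase the critical instance; fail on the first cyclic term."""
    facts = {(p,) + ("*",) * arity for p, arity in predicates.items()}
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            envs = [{}]
            for atom in body:
                envs = [m for e in envs for f in facts
                        if (m := match(atom, f, e)) is not None]
            for env in envs:
                for atom in head:
                    new = (atom[0],) + tuple(subst(a, env) for a in atom[1:])
                    if any(is_cyclic_term(t) for t in new[1:]):
                        return False
                    if new not in facts:
                        facts.add(new)
                        changed = True
    return True


# molecule(X) -> hasAtom(X, f(X)), atom(f(X)): finite structure.
finite_rules = [([("molecule", "X")],
                 [("hasAtom", "X", ("f", "X")), ("atom", ("f", "X"))])]
# atom(X) -> bondedTo(X, g(X)), atom(g(X)): an unbounded chain of atoms.
infinite_rules = [([("atom", "X")],
                   [("bondedTo", "X", ("g", "X")), ("atom", ("g", "X"))])]

print(passes_mfa_style_check(finite_rules, {"molecule": 1, "hasAtom": 2, "atom": 1}))
print(passes_mfa_style_check(infinite_rules, {"atom": 1, "bondedTo": 2}))
```

The loop always terminates because terms without nested repetition of a function symbol have bounded depth, so only finitely many facts can be added before a cyclic term appears or a fixpoint is reached.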
IMPLEMENTATION
Draws upon DLV, a deductive database engine
Evaluation with data extracted from ChEBI
500 molecules under 51 chemical classes in 40 secs
Quicker than other approaches:
    [Hastings et al., 2010] 140 molecules in 4 hours
    [Magka et al., 2012] 70 molecules in 450 secs
Subsumptions exposed by our prototype:
    ascorbic acid is a polyatomic entity, a carboxylic ester and a cyclic molecule (missing from the ChEBI OWL ontology)
Contradictory subclass relation from ChEBI:
    Ascorbic acid is asserted to be a carboxylic acid (release 95)
    Not listed among the subsumptions derived by our prototype
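The timings above can be put on a common scale with a few lines of Python. This is only a back-of-the-envelope comparison: the three experiments used different molecule sets and class definitions, so the ratios are indicative at best. The dictionary keys are labels chosen for this sketch.

```python
# (molecules classified, elapsed seconds) as quoted on the slide.
runs = {
    "DGLP prototype": (500, 40),
    "Hastings et al., 2010": (140, 4 * 3600),
    "Magka et al., 2012": (70, 450),
}

# Throughput in molecules per second for each reported run.
rates = {name: molecules / seconds for name, (molecules, seconds) in runs.items()}
baseline = rates["DGLP prototype"]

for name, rate in rates.items():
    print(f"{name}: {rate:.4f} molecules/s "
          f"(prototype throughput is {baseline / rate:.1f}x this)")
```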
CONCLUSIONS
Results
1. Expressive and decidable formalism for structured domains
2. Novel acyclicity conditions for existential rules
3. DLV-based implementation exhibits a significant speedup
4. Evaluation over ChEBI ontology revealed modelling errors
Language for representing complex objects with a favourable performance/expressivity trade-off
Future directions
SMILES-based surface syntax, replacing a rule such as
    ⋀(i=1..5) hasAtom(x, yi) ∧ c(y1) ∧ o(y2) ∧ o(y3) ∧ c(y4) ∧
        double(y1, y2) ∧ single(y1, y3) ∧ single(y3, y4) ∧ single(y1, y5) → carboxylicEster(x)
with
    define carboxylicEster some hasAtom SMILES(C-O-C(=O)-*) end.
Detect subsumptions between classes
    E.g., Carboxylic ester is an organic molecular entity
Extensions with numerical datatypes
Define a mapping of DGLPs to RDF
Thank you! Questions?