Language Technologies and the Semantic Web: An Essential - - PowerPoint PPT Presentation
Language Technologies and the Semantic Web: An Essential Relationship
Enrico Motta
Professor of Knowledge Technologies
Knowledge Media Institute, The Open University
Content of the Talk
- Update on the Semantic Web
– Beyond the hype
- What it is
- Why it is interesting
- What’s its status?
- Semantic Web and AI
- Semantic Web Applications
– Key features
– Reasoning on the Semantic Web
– Key role of Language Technologies
- Conclusions
The Semantic Web in 2 minutes…
<foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/enrico-motta/">
  <foaf:name>Enrico Motta</foaf:name>
  <foaf:firstName>Enrico</foaf:firstName>
  <foaf:surname>Motta</foaf:surname>
  <foaf:phone rdf:resource="tel:+44-(0)1908-653506"/>
  <foaf:homepage rdf:resource="http://kmi.open.ac.uk/people/motta/"/>
  <foaf:workplaceHomepage rdf:resource="http://kmi.open.ac.uk/"/>
  <foaf:depiction rdf:resource="http://kmi.open.ac.uk/img/members/enrico.jpg"/>
  <foaf:topic_interest>Knowledge Technologies</foaf:topic_interest>
  <foaf:topic_interest>Semantic Web</foaf:topic_interest>
  <foaf:topic_interest>Ontologies</foaf:topic_interest>
  <foaf:topic_interest>Problem Solving Methods</foaf:topic_interest>
  <foaf:topic_interest>Knowledge Modelling</foaf:topic_interest>
  <foaf:topic_interest>Knowledge Management</foaf:topic_interest>
  <foaf:based_near>
    <geo:Point>
      <geo:lat>52.024868</geo:lat>
      <geo:long>-0.707143</geo:long>
      <contact:nearestAirport>
        <airport:name>London Luton Airport</airport:name>
        <airport:iataCode>LTN</airport:iataCode>
        <airport:location>Luton, United Kingdom</airport:location>
        <geo:lat>51.866666666667</geo:lat>
        <geo:long>-0.36666666666667</geo:long>
        <rdfs:seeAlso rdf:resource="http://www.daml.org/cgi-bin/airport?LTN"/>
  <foaf:currentProject>
    <foaf:Project>
      <foaf:name>AquaLog</foaf:name>
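The FOAF description above is data that a program can read directly. As a minimal sketch, the Python below pulls a few facts out of a trimmed, well-formed copy of that description. The enclosing rdf:RDF element, the namespace URIs, and the function name person_facts are assumptions added for illustration (the slide shows only a fragment), and this is plain XML parsing with the standard library, not full RDF processing.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for FOAF and RDF (not shown on the slide).
FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A trimmed copy of the slide's fragment, wrapped so that it is well-formed.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}">
  <foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/enrico-motta/">
    <foaf:name>Enrico Motta</foaf:name>
    <foaf:homepage rdf:resource="http://kmi.open.ac.uk/people/motta/"/>
    <foaf:topic_interest>Semantic Web</foaf:topic_interest>
    <foaf:topic_interest>Ontologies</foaf:topic_interest>
  </foaf:Person>
</rdf:RDF>"""


def person_facts(xml_text):
    """Extract identifier, name, homepage and interests of the first foaf:Person."""
    root = ET.fromstring(xml_text)
    person = root.find(f"{{{FOAF}}}Person")
    return {
        "subject": person.get(f"{{{RDF}}}about"),
        "name": person.findtext(f"{{{FOAF}}}name"),
        "homepage": person.find(f"{{{FOAF}}}homepage").get(f"{{{RDF}}}resource"),
        "interests": [e.text for e in person.findall(f"{{{FOAF}}}topic_interest")],
    }


facts = person_facts(DOC)
print(facts["name"], facts["interests"])
```

A real application would more likely use an RDF library, so that other serialisations of the same graph are handled as well.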
The foaf ontology
The SW as ‘Web of Data’
Current status of the semantic web
- 10-20 million semantic web documents
– Expressed in RDF, OWL, DAML+OIL
- 7K-10K ontologies
– These cover a variety of domains - multimedia, computing, management, bio-medical sciences, geography, entertainment, upper level concepts, etc…
The above figures refer to resources which are publicly accessible on the web
The Semantic Web today
- To a significant extent the Semantic Web is already in place and is characterized by a widespread production of formalized knowledge models (ontologies and metadata), from a variety of different groups and individuals
– “The Next Knowledge Medium - An information network with semi-automated services for the generation, distribution, and consumption of knowledge”
- Stefik, 1986
– “Knowledge modelling to become a new form of literacy?”
- Stutt and Motta, 1997
- Still primarily a research enterprise; however, interest is rapidly increasing in both governmental and business organizations
- “early adopters” phase
- The result is slowly emerging as an unprecedented knowledge resource, which can enable a new generation of intelligent applications on the web
Semantic Web Applications
What can you do with the Semantic Web?
- A ‘corporate ontology’ is used to provide a homogeneous view over heterogeneous data sources
- Often tackle Enterprise Information Integration scenarios
- Hailed by Gartner as one of the key emerging strategic technology trends
– E.g., see personal information management in Garlik
“Corporate Semantic Webs”
Exploiting large scale semantics Next Generation SW Applications Semantic Web
NGSW Applications in the context of AI research
Knowledge-Based Systems
[Diagram labels: Large Body of Knowledge, Intelligent Behaviour]
“Today there has been a shift in paradigm. The fundamental problem of understanding intelligence is not the identification of a few powerful techniques, but rather the question of how to represent large amounts of knowledge in a fashion that permits their effective use”
- Goldstein and Papert, 1977
The Knowledge Acquisition Bottleneck
[Diagram labels: Knowledge, KA Bottleneck, Large Body of Knowledge, Intelligent Behaviour]
SW as Enabler of Intelligent Behaviour
[Diagram label: Intelligent Behaviour]
Both a platform for knowledge publishing and a large scale source of knowledge
KBS vs SW Systems
                    Classic KBS      SW Systems
Provenance          Centralized      Distributed
Size                Small/Medium     Extra Huge
Repr. Schema        Homogeneous      Heterogeneous
Quality             High             Very Variable
Degree of trust     High             Very Variable
Key Paradigm Shift
Intelligent Behaviour
– Classic KBS: A function of sophisticated, logical, task-centric problem solving
– SW Systems: A side-effect of being able to integrate different types of reasoning to handle size and heterogeneous quality and representation
Next Generation SW Applications: Examples
Case Study 1: Automatic Alignment of Thesauri in the Agricultural/ Fishery Domain
Method
[Figure: Scarlet takes Concept_A (e.g., Supermarket) and Concept_B (e.g., Building), accesses the Semantic Web, and deduces a semantic relation (⊆) between them.]
- SCARLET - matching by Harvesting the SW
- Automatically select and combine multiple online ontologies to derive a relation
Two strategies
[Figure: Deriving relations from (A) one ontology and (B) across ontologies. Panel (A): Scarlet relates Supermarket and Building; ontology concepts shown are Supermarket, Shop, PublicBuilding and Building, linked by ⊆. Panel (B): Scarlet relates Cholesterol and OrganicChemical; ontology concepts shown are Cholesterol, Steroid, Lipid and OrganicChemical, linked by ⊆, with Steroid matched (≡) across ontologies.]
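The two strategies can be illustrated with a small sketch. This is not Scarlet's implementation: the ontology names, the toy subclass edges (loosely based on the labels in the figure), and the function names are all assumptions made for illustration. Strategy (A) looks for a single ontology in which the relation can be derived; strategy (B) chains subclass links taken from several ontologies.

```python
from collections import deque

# Toy ontologies: each maps a concept to its direct superclasses.
# The contents are illustrative assumptions, not the ontologies Scarlet used.
ONTOLOGIES = {
    "onto_buildings": {
        "Supermarket": {"Shop"},
        "Shop": {"PublicBuilding"},
        "PublicBuilding": {"Building"},
    },
    "onto_chem_1": {"Cholesterol": {"Steroid"}},
    "onto_chem_2": {"Steroid": {"Lipid"}, "Lipid": {"OrganicChemical"}},
}


def ancestors(edges, concept):
    """Return every concept reachable from `concept` via superclass edges."""
    seen, queue = set(), deque([concept])
    while queue:
        for parent in edges.get(queue.popleft(), ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def strategy_a(ontologies, a, b):
    """Strategy (A): names of single ontologies in which a ⊆ b can be derived."""
    return sorted(name for name, edges in ontologies.items()
                  if b in ancestors(edges, a))


def strategy_b(ontologies, a, b):
    """Strategy (B): derive a ⊆ b by chaining edges from several ontologies.

    Concepts with the same label in different ontologies are treated as
    equivalent, which is a strong simplifying assumption.
    """
    merged = {}
    for edges in ontologies.values():
        for child, parents in edges.items():
            merged.setdefault(child, set()).update(parents)
    return b in ancestors(merged, a)


print(strategy_a(ONTOLOGIES, "Supermarket", "Building"))
print(strategy_b(ONTOLOGIES, "Cholesterol", "OrganicChemical"))
```

In this toy data no single ontology relates Cholesterol to OrganicChemical, so only the cross-ontology strategy finds the relation. Matching concepts purely by label is where language technologies would be needed in practice.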
Matching:
- AGROVOC
– UN’s Food and Agriculture Organisation (FAO) thesaurus
– 28,174 descriptor terms
– 10,028 non-descriptor terms
- NALT
– US National Agricultural Library Thesaurus
– 41,577 descriptor terms
– 24,525 non-descriptor terms
Experiment
226 Used Ontologies
http://139.91.183.30:9090/RDF/VRP/Examples/tap.rdf
http://reliant.teknowledge.com/DAML/SUMO.daml
http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
http://reliant.teknowledge.com/DAML/Economy.daml
http://gate.ac.uk/projects/htechsight/Technologies.daml
Evaluation 1 - Precision
- Manual assessment of 1000 mappings (15%)
- Evaluators:
– Researchers in the area of the Semantic Web
– 6 people split in two groups
- Results:
– Comparable to best results for background knowledge based matchers.
Evaluation 2 – Error Analysis
Other Case Studies…
Giving meaning to tags
Example
Cluster_1: {college commerce corporate course education high instructing learn learning lms school student}
education, training [1,4], qualification, corporate [1], institution, university [2,3], college [2], postSecondarySchool [2], school [2], student [3], studiesAt, course [3], offersCourse, takesCourse, activities [4], learning [4], teaching [4]
[1] http://gate.ac.uk/projects/htechsight/Employment.daml
[2] http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
[3] http://www.mondeca.com/owl/moses/ita.owl
[4] http://www.cs.utexas.edu/users/mfkb/RKF/tree/CLib-core-office.owl
Conclusions
Typical misconceptions…
- “The SW is a long-term vision…”
– Ehm… actually… it already exists…
- “The SW will never work because nobody is going to annotate their web pages”
– The SW is not about annotating web pages, the SW is a web of data, most of which are generated from DBs, or from web mining software, or from applications which produce SW data as a side effect of supporting users’ tasks
- “The idea of a universal ontology has failed before and will fail again. Hence the SW is doomed”
– The SW is not about a single universal ontology. Already there are around 10K ontologies and the number is growing…
– SW applications may use 1, 2, 3, or even hundreds of ontologies.
SW and Language Technologies
- All the applications mentioned here combine language, web, statistical and semantic technologies
- Heterogeneity and sloppy modelling imply that language and statistical technologies are almost always needed when building NGSW apps
- In contrast with traditional KBS, intelligent