Language Technologies and the Semantic Web: An Essential Relationship



slide-1
SLIDE 1

Language Technologies and the Semantic Web: An Essential Relationship.

Enrico Motta Professor of Knowledge Technologies Knowledge Media Institute The Open University

slide-2
SLIDE 2

Content of the Talk

  • Update on the Semantic Web

– Beyond the hype

  • What it is
  • Why it is interesting
  • What’s its status?
  • Semantic Web and AI
  • Semantic Web Applications

– Key features
– Reasoning on the Semantic Web
– Key role of Language Technologies

  • Conclusions
slide-3
SLIDE 3

The Semantic Web in 2 minutes…

slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6

<foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/enrico-motta/">
  <foaf:name>Enrico Motta</foaf:name>
  <foaf:firstName>Enrico</foaf:firstName>
  <foaf:surname>Motta</foaf:surname>
  <foaf:phone rdf:resource="tel:+44-(0)1908-653506"/>
  <foaf:homepage rdf:resource="http://kmi.open.ac.uk/people/motta/"/>
  <foaf:workplaceHomepage rdf:resource="http://kmi.open.ac.uk/"/>
  <foaf:depiction rdf:resource="http://kmi.open.ac.uk/img/members/enrico.jpg"/>
  <foaf:topic_interest>Knowledge Technologies</foaf:topic_interest>
  <foaf:topic_interest>Semantic Web</foaf:topic_interest>
  <foaf:topic_interest>Ontologies</foaf:topic_interest>
  <foaf:topic_interest>Problem Solving Methods</foaf:topic_interest>
  <foaf:topic_interest>Knowledge Modelling</foaf:topic_interest>
  <foaf:topic_interest>Knowledge Management</foaf:topic_interest>
  <foaf:based_near>
  <geo:Point>
  <geo:lat>52.024868</geo:lat>
  <geo:long>-0.707143</geo:long>
  <contact:nearestAirport>
  <airport:name>London Luton Airport</airport:name>
  <airport:iataCode>LTN</airport:iataCode>
  <airport:location>Luton, United Kingdom</airport:location>
  <geo:lat>51.866666666667</geo:lat>
  <geo:long>-0.36666666666667</geo:long>
  <rdfs:seeAlso rdf:resource="http://www.daml.org/cgi-bin/airport?LTN"/>
  <foaf:currentProject>
  <foaf:Project>
  <foaf:name>AquaLog</foaf:name>
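The FOAF description on this slide can be read by software as a set of subject-predicate-object statements. The following is a minimal sketch of that idea in Python, using only the standard library. It assumes a small, well-formed excerpt of the slide's data wrapped in an `rdf:RDF` element with the standard RDF and FOAF namespace declarations (the slide itself does not show them). The helper name `person_triples` is hypothetical, and this is not a full RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Assumption: a shortened, well-formed excerpt of the slide's FOAF data.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:foaf="{FOAF_NS}">
  <foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/enrico-motta/">
    <foaf:name>Enrico Motta</foaf:name>
    <foaf:homepage rdf:resource="http://kmi.open.ac.uk/people/motta/"/>
    <foaf:topic_interest>Semantic Web</foaf:topic_interest>
    <foaf:topic_interest>Ontologies</foaf:topic_interest>
  </foaf:Person>
</rdf:RDF>"""


def person_triples(xml_text):
    """Return (subject, predicate, object) for each direct property of a foaf:Person."""
    root = ET.fromstring(xml_text)
    triples = []
    for person in root.findall(f"{{{FOAF_NS}}}Person"):
        subject = person.get(f"{{{RDF_NS}}}about")
        for prop in person:
            # Shorten the expanded FOAF namespace back to the familiar prefix.
            predicate = prop.tag.replace(f"{{{FOAF_NS}}}", "foaf:")
            # The object is either a linked resource or a literal value.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


for triple in person_triples(DOC):
    print(triple)
```

Nested descriptions such as `foaf:based_near` are deliberately left out of the excerpt; handling them would need a real RDF library rather than this flat walk over child elements.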

slide-7
SLIDE 7

The foaf ontology

slide-8
SLIDE 8

The SW as ‘Web of Data’

slide-9
SLIDE 9

Current status of the semantic web

  • 10-20 million semantic web documents

– Expressed in RDF, OWL, DAML+OIL

  • 7K-10K ontologies

– These cover a variety of domains - multimedia, computing, management, bio-medical sciences, geography, entertainment, upper level concepts, etc…

The above figures refer to resources which are publicly accessible on the web

slide-10
SLIDE 10

The Semantic Web today

  • To a significant extent the Semantic Web is already in place and is characterized by a widespread production of formalized knowledge models (ontologies and metadata) by a variety of groups and individuals

– “The Next Knowledge Medium - An information network with semi-automated services for the generation, distribution, and consumption of knowledge” (Stefik, 1986)
– “Knowledge modelling to become a new form of literacy?” (Stutt and Motta, 1997)

  • Still primarily a research enterprise; however, interest is rapidly increasing in both governmental and business organizations

– “early adopters” phase

  • The result is slowly emerging as an unprecedented knowledge resource, which can enable a new generation of intelligent applications on the web

slide-11
SLIDE 11

Semantic Web Applications

What can you do with the Semantic Web?

slide-12
SLIDE 12

“Corporate Semantic Webs”

  • A ‘corporate ontology’ is used to provide a homogeneous view over heterogeneous data sources
  • Often tackle Enterprise Information Integration scenarios
  • Hailed by Gartner as one of the key emerging strategic technology trends

– E.g., see personal information management in Garlik

slide-13
SLIDE 13
slide-14
SLIDE 14

Exploiting large scale semantics Next Generation SW Applications Semantic Web

slide-15
SLIDE 15

Exploiting large scale semantics Next Generation SW Applications Semantic Web

slide-16
SLIDE 16

NGSW Applications in the context of AI research

slide-17
SLIDE 17

Knowledge-Based Systems

[Diagram labels: Large Body of Knowledge; Intelligent Behaviour]

“Today there has been a shift in paradigm. The fundamental problem of understanding intelligence is not the identification of a few powerful techniques, but rather the question of how to represent large amounts of knowledge in a fashion that permits their effective use” (Goldstein and Papert, 1977)

slide-18
SLIDE 18

The Knowledge Acquisition Bottleneck

[Diagram labels: Knowledge; KA Bottleneck; Large Body of Knowledge; Intelligent Behaviour]

slide-19
SLIDE 19

SW as Enabler of Intelligent Behaviour

[Diagram label: Intelligent Behaviour]

Both a platform for knowledge publishing and a large scale source of knowledge

slide-20
SLIDE 20

KBS vs SW Systems

                   Classic KBS     SW Systems
Provenance         Centralized     Distributed
Size               Small/Medium    Extra Huge
Repr. Schema       Homogeneous     Heterogeneous
Quality            High            Very Variable
Degree of trust    High            Very Variable

slide-21
SLIDE 21

Key Paradigm Shift

Intelligent Behaviour

– Classic KBS: a function of sophisticated, logical, task-centric problem solving
– SW Systems: a side-effect of being able to integrate different types of reasoning to handle size and heterogeneous quality and representation

slide-22
SLIDE 22

Next Generation SW Applications: Examples

Case Study 1: Automatic Alignment of Thesauri in the Agricultural/Fishery Domain

slide-23
SLIDE 23

Method

  • SCARLET - matching by Harvesting the SW
  • Automatically select and combine multiple online ontologies to derive a relation

[Diagram: Scarlet accesses the Semantic Web to deduce a semantic relation between Concept_A (e.g., Supermarket) and Concept_B (e.g., Building)]

slide-24
SLIDE 24

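Two strategies

Deriving relations from (A) one ontology and (B) across ontologies.

[Diagram, panels (A) and (B), both drawing on the Semantic Web. First example: Scarlet relates Supermarket and Building; concepts shown are Supermarket, Shop, PublicBuilding and Building, connected by ⊆ links. Second example: Scarlet relates Cholesterol and OrganicChemical; concepts shown are Cholesterol, Steroid, Lipid and OrganicChemical, connected by ⊆ links, with Steroid marked as a matched (≡) concept.]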
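The two strategies can be illustrated with a small sketch. This is not the SCARLET implementation: it is a toy Python version under stated assumptions. The three "ontologies" are hand-written dictionaries, their subclass edges are illustrative guesses based on the concept names on this slide, and concepts are matched across ontologies by identical name. The function names `strategy_one` and `strategy_two` are hypothetical.

```python
from collections import deque

# Assumption: toy stand-ins for online ontologies. Each maps a concept to its
# direct superclasses. The edges are illustrative, not taken from real ontologies.
ONTOLOGIES = {
    "onto_1": {
        "Supermarket": {"Shop"},
        "Shop": {"PublicBuilding"},
        "PublicBuilding": {"Building"},
    },
    "onto_2": {"Cholesterol": {"Steroid"}},
    "onto_3": {"Steroid": {"Lipid"}, "Lipid": {"OrganicChemical"}},
}


def ancestors(onto, concept):
    """All strict superclasses of `concept` reachable within one ontology."""
    seen, queue = set(), deque([concept])
    while queue:
        for parent in onto.get(queue.popleft(), ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def strategy_one(a, b):
    """(A) Look for a single ontology in which a is subsumed by b."""
    for name, onto in ONTOLOGIES.items():
        if b in ancestors(onto, a):
            return name
    return None


def strategy_two(a, b):
    """(B) Look for a ⊆ c in one ontology and c ⊆ b in a different one."""
    for name1, onto1 in ONTOLOGIES.items():
        for c in ancestors(onto1, a):
            for name2, onto2 in ONTOLOGIES.items():
                if name2 != name1 and b in ancestors(onto2, c):
                    return (name1, c, name2)
    return None


print(strategy_one("Supermarket", "Building"))
print(strategy_two("Cholesterol", "OrganicChemical"))
```

In this toy data the first pair is resolved inside one ontology, while the second pair has no single ontology covering both concepts and is only resolved by chaining two ontologies through the shared concept Steroid.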

slide-25
SLIDE 25

Experiment

Matching:

  • AGROVOC

– UN’s Food and Agriculture Organisation (FAO) thesaurus
– 28,174 descriptor terms
– 10,028 non-descriptor terms

  • NALT

– US National Agricultural Library Thesaurus
– 41,577 descriptor terms
– 24,525 non-descriptor terms

slide-26
SLIDE 26

226 Used Ontologies

http://139.91.183.30:9090/RDF/VRP/Examples/tap.rdf
http://reliant.teknowledge.com/DAML/SUMO.daml
http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
http://reliant.teknowledge.com/DAML/Economy.daml
http://gate.ac.uk/projects/htechsight/Technologies.daml

slide-27
SLIDE 27

Evaluation 1 - Precision

  • Manual assessment of 1000 mappings (15%)
  • Evaluators:

– Researchers in the area of the Semantic Web
– 6 people split in two groups

  • Results:

– Comparable to best results for background knowledge based matchers.

slide-28
SLIDE 28

Evaluation 2 – Error Analysis

slide-29
SLIDE 29

Other Case Studies…

slide-30
SLIDE 30

Giving meaning to tags

slide-31
SLIDE 31

Example

Cluster_1: {college commerce corporate course education high instructing learn learning lms school student}

[Diagram: concepts and relations found for the cluster; bracketed numbers name the source ontologies listed below] education, training [1,4], qualification, corporate [1], institution, university [2,3], college [2], postSecondarySchool [2], school [2], student [3], studiesAt, course [3], offersCourse, takesCourse, activities [4], learning [4], teaching [4]

[1] http://gate.ac.uk/projects/htechsight/Employment.daml
[2] http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
[3] http://www.mondeca.com/owl/moses/ita.owl
[4] http://www.cs.utexas.edu/users/mfkb/RKF/tree/CLib-core-office.owl
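As a rough illustration of what "giving meaning to tags" involves, the sketch below grounds the tags of the cluster in the concepts shown on this slide. It is a simplification under stated assumptions: tags are matched to concept labels by exact, case-insensitive name, the source numbers are read off the slide's superscripts (each taken to belong to the label it follows), and the function name `ground_tags` is hypothetical. The slide does not describe the actual matching procedure.

```python
CLUSTER_1 = ["college", "commerce", "corporate", "course", "education", "high",
             "instructing", "learn", "learning", "lms", "school", "student"]

# Concept label -> numbers of the source ontologies, as read from the slide.
# An empty tuple means the slide shows no source number for that concept.
CONCEPT_SOURCES = {
    "education": (),
    "training": (1, 4),
    "qualification": (),
    "corporate": (1,),
    "institution": (),
    "university": (2, 3),
    "college": (2,),
    "postSecondarySchool": (2,),
    "school": (2,),
    "student": (3,),
    "course": (3,),
    "activities": (4,),
    "learning": (4,),
    "teaching": (4,),
}


def ground_tags(tags, concept_sources):
    """Map each tag to the sources of the concept sharing its name, if any."""
    index = {label.lower(): label for label in concept_sources}
    grounded = {}
    for tag in tags:
        label = index.get(tag.lower())
        if label is not None:
            grounded[tag] = concept_sources[label]
    return grounded


grounded = ground_tags(CLUSTER_1, CONCEPT_SOURCES)
ungrounded = [tag for tag in CLUSTER_1 if tag not in grounded]
print(grounded)
print(ungrounded)
```

Tags such as "learn" or "lms" find no concept under exact matching, which hints at why language technologies (stemming, synonym lookup, and similar) are needed alongside the ontologies.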

slide-32
SLIDE 32
slide-33
SLIDE 33
slide-34
SLIDE 34
slide-35
SLIDE 35

Conclusions

slide-36
SLIDE 36

Typical misconceptions…

  • “The SW is a long-term vision…”

– Ehm… actually… it already exists…

  • “The SW will never work because nobody is going to annotate their web pages”

– The SW is not about annotating web pages; the SW is a web of data, most of which are generated from DBs, or from web mining software, or from applications which produce SW data as a side effect of supporting users’ tasks

  • “The idea of a universal ontology has failed before and will fail again. Hence the SW is doomed”

– The SW is not about a single universal ontology. Already there are around 10K ontologies and the number is growing…
– SW applications may use 1, 2, 3, or even hundreds of ontologies.

slide-37
SLIDE 37

SW and Language Technologies

  • All the applications mentioned here combine language, web, statistical and semantic technologies
  • Heterogeneity and sloppy modelling imply that language and statistical technologies are almost always needed when building NGSW apps
  • In contrast with traditional KBS, intelligent behaviour is more a side-effect of integrating multiple techniques to handle scale and heterogeneity, rather than a function of powerful deductive reasoning

slide-38
SLIDE 38