Frameworks, Implementation & Open Problems for the Collaborative Building of a Multilingual Lexical Database

SLIDE 1

SEMANET Workshop, Sat. 31 Aug. 2002

Frameworks, Implementation & Open Problems for the Collaborative Building of a Multilingual Lexical Database

Mathieu Mangeot & Gilles Sérasset

NII, Tokyo, Japan mangeot@nii.ac.jp
GETA-CLIPS, Grenoble, France Gilles.Serasset@imag.fr

SLIDE 2

Outline

♦ Presentation of Papillon Project
♦ Macrostructure of the Dictionary
♦ Microstructure of the Entries
♦ Bootstrapping & Contribution Process
  ♦ Limbo, Purgatory & Paradise
  ♦ Bootstrapping with Conceptual Vectors
  ♦ Contributions & Validation Process
♦ Lexico-Semantical Network
  ♦ Monolingual with Lexical Functions
  ♦ Multilingual with Axies (Interlingual Links)
♦ Conclusion & References

SLIDE 3

Motivations

♦ Initial Goal
  ♦ Build a French-Japanese electronic dictionary for humans
♦ Very Few Existing Resources
  ♦ French-Japanese, Free, Electronic
♦ Construction Costs Too High
  ♦ EDR English-Japanese Dictionary
  ♦ 1,200 person-years; 300,000 entries; price: 14.3 million ¥
♦ Ongoing Collaborative Construction Projects
  ♦ Edict Japanese->English, SAIKAM Japanese-Thai
♦ Lack of Information
  ♦ Numerical Specifiers, kanji+kana+romaji

SLIDE 4

Extended Goals

♦ Build a More Complete Dictionary
  ♦ Multilingual (English, French, German, Japanese, Lao, Malay, Thai, Vietnamese)
  ♦ Multiuser (beginners, experts, applications)
♦ Community Development
  ♦ LINUX Construction Paradigm
  ♦ Voluntary Contributors
  ♦ Pooling of the Resources
  ♦ User Preferences & Profiles

SLIDE 6

Bilingual Dictionaries

[Figure: diagram of the languages French, English, Thai, Japanese, Malay, Vietnamese and Lao.]

SLIDE 7

Pivot Dictionary

[Figure: diagram of the languages French, English, Thai, Japanese, Malay, Vietnamese and Lao, plus a node labelled Int.]

SLIDE 8

Detailed Pivot Structure

[Figure: three monolingual dictionaries and the links between them. Recoverable labels:]

French DiCo
  Vocable affection n.f.
    lexie affection.1 (tendresse)
    lexie affection.2 (médecine)
  Vocable maladie n.f.
    lexie maladie

English DiCo
  Vocable disease N
    lexie disease
  Vocable affection N
    lexie affection

Japanese DiCo
  病気 byouki 【びょうき】

Interlingual Links (Axies)
Refinement Links

Ref: Work done by Gilles Sérasset

SLIDE 10

Combinatorial Lexicography

♦ From Meaning-Text Theory
  ♦ Alain Polguère & Igor Mel’tchuk (U. de Montréal)
  ♦ Gives the necessary information to go from an idea (the meaning) to its realisation in a given language (the text).
♦ Existing Dictionaries: DEC, DiCo database & LAF
♦ Same Structure for Every Language
♦ 56 Basic Lexical Functions

SLIDE 11

French Lexie (DiCo Entry)

Name of the Lexical Unit: MEURTRE

Grammatical Properties: nom, masc

Semantical Formula: action de tuer: ~ PAR L'individu X DE L'individu Y

Government Pattern: X = I = de N, A-poss; Y = II = de N, A-poss

Lexical Functions:

{QSyn} assassinat,homicide#1;crime /*Quasi synonyms*/

{Oper1} accomplir, commettre, perpétrer [ART ~]; tremper [dans ART ~] /*Causes that X does a M.*/

{S1} auteur [de ART Ø]//meurtrier-n /*Name for X*/

{S2} victime [de ART Ø] /*Name for Y*/

Example: La mésentente pourrait être le mobile du meurtre.

Idioms:

_appel au meurtre_

_crier au meurtre_
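
The entry above can be pictured as a small record. The following is only a sketch under assumptions: the class name `Lexie`, its field names and the `apply` helper are hypothetical and are not the DiCo or Papillon schema. The bracketed syntactic annotations (such as `[ART ~]`) are dropped for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Lexie:
    """One word sense (lexie); field names mirror the labels on the slide."""
    name: str
    grammatical_properties: str
    semantic_formula: str
    government_pattern: str
    # Lexical function name -> recorded values, e.g. "QSyn" -> quasi-synonyms.
    lexical_functions: Dict[str, List[str]] = field(default_factory=dict)
    examples: List[str] = field(default_factory=list)
    idioms: List[str] = field(default_factory=list)

    def apply(self, function_name: str) -> List[str]:
        """Return the values of a lexical function, or [] if none is recorded."""
        return self.lexical_functions.get(function_name, [])


# Data copied from the MEURTRE entry; syntactic annotations are omitted.
meurtre = Lexie(
    name="MEURTRE",
    grammatical_properties="nom, masc",
    semantic_formula="action de tuer: ~ PAR L'individu X DE L'individu Y",
    government_pattern="X = I = de N, A-poss; Y = II = de N, A-poss",
    lexical_functions={
        "QSyn": ["assassinat", "homicide#1", "crime"],
        "Oper1": ["accomplir", "commettre", "perpétrer", "tremper"],
        "S1": ["auteur", "meurtrier-n"],
        "S2": ["victime"],
    },
    examples=["La mésentente pourrait être le mobile du meurtre."],
    idioms=["appel au meurtre", "crier au meurtre"],
)

print(meurtre.apply("S1"))  # names for X, the author of the act
```

Because every language uses the same structure, the Japanese entry for 殺人 could be stored in the same record type with its own values.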

SLIDE 12

Japanese Lexie

Name of the Lexical Unit: 殺人 【さつじん】

Reading: satsujin

Grammatical Properties: 名詞 【めいし】

Semantical Formula: どうさ: 人 Y の 人 X の ~

Government Pattern: X = I = N, Y = II = N の

Lexical Functions:

{QSyn} 殺戮【さつりく】, 殺害【さつがい】 /*Quasi synonyms*/

{Oper1} [~ を] する; [~ を] 犯す /*Causes that X does a M.*/

{S1} 殺人者 【さつじんしゃ】, 殺人鬼 【さつじんき】 /*Name for X*/

{S2} 被害者【ひがいしゃ】 /*Name for Y*/

Example: 喧嘩【けんか】は殺人【さつじん】の動機【どうき】になり得【え】るだろう。

Idioms:

_殺人剣 【さつじんけん】_

_嘱託殺人 【しょくたくさつじん】_

SLIDE 13

Interlingual Links (Axies)

♦ Links to other Axies
  ♦ Synonyms, Refinement, Generalizations
♦ Motivated by existing translation links.
♦ Not like concepts
♦ Links to External References
  ♦ To be independent from any existing theory
  ♦ WordNet synsets, NTT Semantic category, ONTOS or LexiGuide ontologies, UNL UWs & Graphs, etc.
♦ Linking Monolingual Lexies

SLIDE 14

Structure of an Axie

Unique ID: a000023

Semantic Tag (entity, process, state, result): process

Links to lexies: fra: meurtre.1 eng: murder.1 jpn: satsujin.1

Links to other axies

♦ synonym axies: a000024 (assassination)
♦ generic axies: a00002
♦ refined axies: a000025

References to External Resources:

♦ WordNet Synset: 00143589 unlawful premeditated killing of a human being
♦ UNL UW: murder(icl>action,agt>human,obj>human)
♦ NTT Semantic Category
♦ ONTOS Concept
♦ LexiGuide concept
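
To make the pivot role of an axie concrete, here is a minimal sketch that stores the fields listed above and follows an axie from a lexie in one language to the lexies of another. The class `Axie`, its field names and the `translate` function are hypothetical names chosen for illustration; they are not the Papillon format or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Axie:
    """An interlingual link; fields follow the list on the slide."""
    axie_id: str
    semantic_tag: str  # one of: entity, process, state, result
    # Language code -> identifiers of the lexies attached to this axie.
    lexies: Dict[str, List[str]] = field(default_factory=dict)
    synonym_axies: List[str] = field(default_factory=list)
    generic_axies: List[str] = field(default_factory=list)
    refined_axies: List[str] = field(default_factory=list)
    external_refs: Dict[str, str] = field(default_factory=dict)


def translate(axies: List[Axie], lexie_id: str, source: str, target: str) -> List[str]:
    """Collect target-language lexies from every axie linked to the source lexie."""
    results: List[str] = []
    for axie in axies:
        if lexie_id in axie.lexies.get(source, []):
            for candidate in axie.lexies.get(target, []):
                if candidate not in results:
                    results.append(candidate)
    return results


# Values copied from the slide; identifiers are kept exactly as shown there.
murder_axie = Axie(
    axie_id="a000023",
    semantic_tag="process",
    lexies={"fra": ["meurtre.1"], "eng": ["murder.1"], "jpn": ["satsujin.1"]},
    synonym_axies=["a000024"],
    generic_axies=["a00002"],
    refined_axies=["a000025"],
    external_refs={
        "WordNet Synset": "00143589",
        "UNL UW": "murder(icl>action,agt>human,obj>human)",
    },
)

print(translate([murder_axie], "meurtre.1", "fra", "jpn"))  # ['satsujin.1']
```

This sketch only follows direct links. Falling back to synonym, generic or refined axies when the target language has no lexie on the axie would be a natural extension, but the slides do not specify that behaviour.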

SLIDE 16

Preparation of the Existing Data

[Figure: data flow for preparing the existing resources. Recoverable labels:]

♦ Storage stages: Limbo (Original Format), Purgatory (XML Format), Paradise (Papillon Format)
♦ Operations: Recuperation, Import, Integration, Export, Consultation
♦ Dictionaries and contributions: DicDist, DicOrig, DicGen, Contrib1, Contrib2, Contrib3, EDR, FeM, JMDict, SAIKAM, DiCo, ELRA
♦ Other labels: Local Resources, Spap

SLIDE 17

Introduction to Conceptual Vectors

♦ An idea = a concept = a conceptual vector
♦ The vector space is of K dimensions
  ♦ K = number of concepts in a thesaurus hierarchy
  ♦ E.g., for French, Thesaurus Larousse = 873 concepts
  ♦ One independent vector space for each language
♦ Distance between 2 vectors = angular distance
  ♦ DA(x, y) = acos(sim(x, y))
  ♦ DA(x, y) = acos(x·y / (|x||y|))
♦ Ref: Work done by Mathieu Lafourcade
  ♦ http://www.lirmm.fr/~lafourca/
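
The angular distance defined above can be sketched in a few lines. The function name `angular_distance` and the toy 3-dimensional vectors are assumptions for illustration; a real conceptual vector would have K components (873 for the French thesaurus mentioned above). The clamping step is a numerical safeguard added here, not something stated on the slide.

```python
import math
from typing import Sequence


def angular_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Angle in radians between two vectors: acos(x.y / (|x||y|))."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("angular distance is undefined for a zero vector")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    similarity = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(similarity)


# Toy vectors: the first pair shares no concept, the second pair is collinear.
print(angular_distance([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
print(angular_distance([1.0, 2.0, 0.0], [2.0, 4.0, 0.0]))
```

Vectors with no concept in common are at distance π/2, and collinear vectors are at a distance of (numerically almost) 0, regardless of their lengths.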

SLIDE 18

Linking Word Senses with Vectors

[Figure (slide from Mathieu Lafourcade): vector space for English, using the word "demand". Recoverable labels:]

♦ English vectorized monolingual dictionary: demand.1, demand.2, demand.3, each with a definition and a vector v
♦ English-French bilingual dictionary: equivalents of demand.1, demand.2, demand.3 and demand.4, each with a vector v
♦ Association
♦ Left over meaning

SLIDE 19

Contributions & Validation

[Figure: contribution and validation workflow. Recoverable labels: Papillon Server; Paradise (Papillon Format); Spap; Voluntary Contributors; Contributions; User Space; Paid Specialists; Validation; Integration.]

SLIDE 21

Lexico Semantical Multilingual Network (1)

[Figure: network connecting DiCo French and DiCo English through Interlingual Links. Recoverable labels:]

♦ DiCo French: Assassinat, Meurtre, _Lancer un appel au meurtre_
♦ DiCo English: Assassination, Murder, _To call for sb's assassination_
♦ Link labels: {Qsyn} (three times) and "?"

SLIDE 22

Lexico Semantical Multilingual Network (2)

[Figure: network connecting DiCo French and DiCo Japanese through Interlingual Links. Recoverable labels:]

♦ DiCo French: Meurtrier, Meurtre, _Lancer un appel au meurtre_
♦ DiCo Japanese: 殺人者 【さつじんしゃ】 Satsujinsha (Murderer), 殺人 【さつじん】 Satsujin (Murder)
♦ Link labels: {S0} (three times) and "?"

SLIDE 24

Conclusion

♦ Framework for Experimenting with Networks
♦ Research Issues Remaining
  ♦ Social issues: how to motivate people?
  ♦ Contribution Interfaces
  ♦ Checking Interfaces
♦ The Project Cannot Succeed without the Help of the Public (Voluntary Contributors)

SLIDE 25

References & Contacts

♦ Web Site (information & consultation)
  ♦ http://www.papillon-dictionary.org
♦ Steering Committee President
  ♦ Gilles Sérasset Gilles.Serasset@imag.fr
♦ Technical Lead in Japan
  ♦ Mathieu Mangeot mangeot@nii.ac.jp