Extending Models for Controlled Vocabularies to Classification Systems: Modelling DDC with FRSAD



slide-1
SLIDE 1

Extending Models for Controlled Vocabularies to Classification Systems: Modelling DDC with FRSAD

Joan S. Mitchell OCLC, Inc. Marcia Lei Zeng Kent State University Maja Žumer University of Ljubljana, Slovenia

slide-2
SLIDE 2

The big question

Can the FRSAD conceptual model be extended beyond subject authority data (its original focus) to model classification data?

slide-3
SLIDE 3

Outline

  • 1. From Knowledge Organisation Systems (KOS) to data and conceptual models

  • 2. FRSAD conceptual model
  • 3. FRSAD model for classification systems
  • 4. DDC case study
  • 5. Findings and limitations
  • 6. Future work
slide-4
SLIDE 4

1. From Knowledge Organisation Systems to Data and Conceptual Models: Timeline

  • 1876: DDC
  • 1898: LCSH
  • 1905: UDC
  • 1967: TEST*
  • 1974: ISO 2788*
  • 1985: ISO 5964*
  • 1998: FRBR
  • 2004-2009: SKOS, OWL
  • 2009: FRAD
  • 2010: FRSAD

*TEST: Thesaurus of Engineering and Scientific Terms. ISO 2788 (1974): Guidelines for the Establishment and Development of Monolingual Thesauri. ISO 5964 (1985): Guidelines for the Establishment and Development of Multilingual Thesauri.

slide-5
SLIDE 5

From Knowledge Organisation Systems to Data and Conceptual Models: Modelling efforts

[Figure: the timeline from the previous slide, with the systems labelled by type: classification, subject headings, thesauri, KOS, ontology]

  • Thesauri: mostly comply with ISO 2788 and ISO 5964.
  • Subject heading schemes: adopted the basic structure of the thesaurus since the 1990s.
  • Classification systems: implemented different practices and are usually constructed according to specific conventions and examples.

slide-6
SLIDE 6

The “FRBR family”

  • FRBR: the original framework
      • All entities, focusing on Group 1 entities: work, expression, manifestation, item
      • Published 1998
  • FRAD: Functional Requirements for Authority Data
      • Focusing on Group 2 entities: person, corporate body, family
      • Published 2009
  • FRSAD: Functional Requirements for Subject Authority Data
      • Focusing on Group 3 entities
      • FRSAR WG established in 2005
      • Published 2010

slide-7
SLIDE 7

The FRBR family models: main entities and relationships

FRBR FRAD FRSAD

slide-8
SLIDE 8
  • 2. FRSAD Conceptual Model
  • 2.1 The core of the FRSAD conceptual model
slide-9
SLIDE 9

FRSAD – generalisation of FRBR

slide-10
SLIDE 10

The core of the FRSAD conceptual model

  • FRSAD Part 1: WORK has as subject THEMA / THEMA is subject of WORK
  • FRSAD Part 2: THEMA has appellation NOMEN / NOMEN is appellation of THEMA

NOMEN = any sign or sequence of signs (alphanumeric characters, symbols, sound, etc.) that a thema is known by, referred to or addressed as
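The two relationships above can be sketched as a small data model. This is only an illustration under stated assumptions, not something defined by the FRSAD report: the class and field names (`Thema`, `Nomen`, `Work`, `label`, `value`) and the helper `works_about` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Nomen:
    """Any sign or sequence of signs that a thema is known by."""
    value: str


@dataclass
class Thema:
    """Anything that can be the subject of a work."""
    label: str  # hypothetical internal handle, not itself a nomen
    nomens: List[Nomen] = field(default_factory=list)

    def add_appellation(self, nomen: Nomen) -> None:
        # THEMA has appellation NOMEN
        self.nomens.append(nomen)


@dataclass
class Work:
    title: str
    subjects: List[Thema] = field(default_factory=list)

    def add_subject(self, thema: Thema) -> None:
        # WORK has as subject THEMA
        self.subjects.append(thema)


def works_about(thema: Thema, works: List[Work]) -> List[Work]:
    """Inverse direction: THEMA is subject of WORK."""
    return [w for w in works if any(t is thema for t in w.subjects)]
```

Keeping `Thema` and `Nomen` as separate types mirrors the separation the model makes between a subject and the signs used to refer to it.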

slide-11
SLIDE 11

The ‘has appellation’ relationship between thema and nomen in a controlled vocabulary

Note: in a given controlled vocabulary and within a domain, a nomen should be an appellation of only one thema.
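The note above amounts to a constraint that can be checked mechanically. A minimal sketch, assuming the vocabulary is available as (thema identifier, nomen) pairs; the function name `ambiguous_nomens` and the string identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def ambiguous_nomens(
    appellations: Iterable[Tuple[str, str]]
) -> Dict[str, Set[str]]:
    """Return the nomens that name more than one thema.

    `appellations` holds (thema_id, nomen) pairs taken from ONE
    controlled vocabulary; both values are hypothetical strings.
    """
    themas_by_nomen: Dict[str, Set[str]] = defaultdict(set)
    for thema_id, nomen in appellations:
        themas_by_nomen[nomen].add(thema_id)
    # A thema may have many nomens; only the reverse direction is restricted.
    return {n: ts for n, ts in themas_by_nomen.items() if len(ts) > 1}
```

An empty result means every nomen in the vocabulary is an appellation of exactly one thema.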

slide-12
SLIDE 12

An example of nomens in an authority record for a chemical compound

NOMEN = any sign or sequence of signs (alphanumeric characters, symbols, sound, etc.) that a thema is known by, referred to or addressed as.

[Figure: authority record with the labels "Nomen 1-8" and "Nomen 9"]

Source: STN Database Summary Sheet: USAN (The USP Dictionary of U.S. Adopted Names and International Drug Names)

slide-13
SLIDE 13

Nomens in different types of KOS

Themas are represented by:

  • thesauri: terms (preferred & non-preferred)
  • classification schemes: notations
  • subject heading systems: terms of pre-coordinated strings
  • taxonomies: category labels (with or without notations)
  • controlled lists: terms or identifiers
  • … …

slide-14
SLIDE 14

2.2 Relationships (1) Thema-to-thema relationships

  • Hierarchical
      • The generic relationship
      • The hierarchical whole-part relationship
      • The instance relationship
      • Other hierarchical relationships
  • Associative
      • [most commonly considered categories are listed in the report]

Other thema-to-thema relationships are domain- or implementation-dependent
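One way to record these relationship types is as typed links between thema identifiers. The sketch below is an assumption-laden illustration: `RelationType`, `Relation`, `broader_themas` and the identifiers `t1` to `t4` are hypothetical and not taken from the FRSAD report.

```python
from enum import Enum
from typing import Iterable, List, NamedTuple


class RelationType(Enum):
    GENERIC = "generic"
    WHOLE_PART = "whole-part"
    INSTANCE = "instance"
    OTHER_HIERARCHICAL = "other hierarchical"
    ASSOCIATIVE = "associative"


# Every type except ASSOCIATIVE counts as hierarchical.
HIERARCHICAL = frozenset(RelationType) - {RelationType.ASSOCIATIVE}


class Relation(NamedTuple):
    source: str          # narrower thema, part, or instance
    kind: RelationType
    target: str          # broader thema, whole, or class


def broader_themas(thema_id: str, relations: Iterable[Relation]) -> List[str]:
    """List every hierarchical parent; more than one means polyhierarchy."""
    return [
        r.target
        for r in relations
        if r.source == thema_id and r.kind in HIERARCHICAL
    ]
```

Because the function returns a list rather than a single parent, a thema with several broader themas is handled without special cases.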

slide-15
SLIDE 15

2.2 Relationships (2) Nomen-to-nomen relationships

  • Equivalence: two nomens are considered equivalent only if they are appellations of the same thema in a controlled vocabulary.
  • Partitive: an instance of a nomen may have parts. A whole-part relationship may exist between a nomen and its components.

slide-16
SLIDE 16

2.3 Attributes

Some general attributes of thema and nomen are proposed:

  • (1) Thema attributes: type of thema, scope note
      • In an implementation, themas can be organized based on category, kind, or type
  • (2) Nomen attributes: see next slide
      • In an implementation, additional attributes may be recorded

slide-17
SLIDE 17

Nomen attributes

Include but are not limited to:

  • Type of nomen (identifier, controlled name, …)
  • Scheme (LCSH, DDC, UDC, ULAN, ISO 8601…)
  • Reference source of nomen (Encyclopaedia Britannica…)
  • Representation of nomen (alphanumeric, sound, visual, ...)
  • Language of nomen (English, Japanese, Slovenian, …)
  • Script of nomen (Cyrillic, Thai, Chinese-simplified, …)
  • Script conversion (Pinyin, ISO 3602 Romanisation of Japanese…)
  • Form of nomen (full name, abbreviation, formula…)
  • Time of validity of nomen (until xxxx, after xxxx, from… to …)
  • Audience (English-speaking users, scientists, children …)
  • Status of nomen (provisional, accepted, official, ...)

Note: examples of attribute values are given in parentheses.
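The attribute list above maps naturally onto a record with optional fields. A minimal sketch, assuming one field per listed attribute; the names `NomenRecord`, `select` and all field names are hypothetical, and the sample values in the usage note are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NomenRecord:
    value: str                               # the sign(s) themselves
    nomen_type: Optional[str] = None         # identifier, controlled name, ...
    scheme: Optional[str] = None             # LCSH, DDC, UDC, ...
    reference_source: Optional[str] = None
    representation: Optional[str] = None     # alphanumeric, sound, visual, ...
    language: Optional[str] = None
    script: Optional[str] = None
    script_conversion: Optional[str] = None
    form: Optional[str] = None               # full name, abbreviation, ...
    validity: Optional[str] = None           # free text: "until xxxx", ...
    audience: Optional[str] = None
    status: Optional[str] = None             # provisional, accepted, ...


def select(nomens: List[NomenRecord], **criteria: str) -> List[NomenRecord]:
    """Keep the nomens whose attributes equal every given criterion."""
    return [
        n for n in nomens
        if all(getattr(n, name) == wanted for name, wanted in criteria.items())
    ]
```

For example, `select(records, scheme="DDC", language="English")` would return only the English-language nomens recorded for the DDC scheme. All attributes default to `None` because the slide presents them as optional, implementation-dependent properties.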

slide-18
SLIDE 18

2.4 The importance of the THEMA-NOMEN model to the subject authority data

  • Separating what are usually called concepts (or topics, subjects, classes [of concepts]) from what they are known by, referred to, or addressed as
  • A general abstract model, not limited to any particular domain or implementation
  • Potential for interoperability within the library field and beyond

slide-19
SLIDE 19
  • 3. FRSAD model for classification systems
  • Each class corresponds to a thema
  • Notation associated with the class is the nomen
  • Thema is the full category description of the class
  • Nomen is the symbol (or surrogate) used to represent the full category description

slide-20
SLIDE 20
  • 4. DDC case study
slide-21
SLIDE 21

Thema: Class 025.04

slide-22
SLIDE 22

Nomens: DDC number, Full caption, URI

  • DDC number: 025.04
  • Full caption: Computer science, information & general works/Library & information sciences/Operations of libraries, archives, information centers/Information storage and retrieval systems
  • URI: http://dewey.info/class/025.04/
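The three nomens on this slide can be held together as appellations of one thema, so that any of them resolves to the same class. This is a sketch only: the dictionary keys, `build_index` and `caption_parts` are hypothetical names, and treating "/" as the separator between captions along the hierarchy is an assumption read off the caption string.

```python
from typing import Dict, List

# Nomens for the thema "Class 025.04", copied from the slide.
NOMENS_025_04: Dict[str, str] = {
    "ddc_number": "025.04",
    "full_caption": (
        "Computer science, information & general works/"
        "Library & information sciences/"
        "Operations of libraries, archives, information centers/"
        "Information storage and retrieval systems"
    ),
    "uri": "http://dewey.info/class/025.04/",
}


def build_index(thema: str, nomens: Dict[str, str]) -> Dict[str, str]:
    """Map every nomen value back to the thema it names."""
    return {value: thema for value in nomens.values()}


def caption_parts(full_caption: str) -> List[str]:
    """Split a full caption into its components.

    Assumption: "/" separates the captions along the hierarchy path.
    """
    return full_caption.split("/")


INDEX = build_index("Class 025.04", NOMENS_025_04)
```

Looking up `INDEX["025.04"]` or `INDEX["http://dewey.info/class/025.04/"]` gives the same thema, which is what makes the three nomens equivalent in the sense of the nomen-to-nomen relationships. Splitting the caption shows one reading of the partitive relationship between a nomen and its components.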

slide-23
SLIDE 23

Thema: Any topic co-extensive with the full meaning of the class

topics that are functionally equivalent to the class

slide-24
SLIDE 24

Scope note: Text describing or defining thema or specifying scope within particular system

[Figure labels: Scope note (≠ thema/class)]

slide-25
SLIDE 25

Thema-to-thema relationships

[Figure labels: associative relationship (two examples); (poly)hierarchical relationship]

slide-26
SLIDE 26

Alternative nomens: Relative Index terms with equivalence relationship to class

slide-27
SLIDE 27

[Figure: Relative Index terms linked to the class. Legend: equivalence relationship; scope note (SN); unknown relationship (?)]

slide-28
SLIDE 28

Derived alternative nomens

150 ## $a Databanks
260 ## $i see also $a Databases

slide-29
SLIDE 29

[Figure: Relative Index terms linked to the class, now including derived nomens. Legend: equivalence relationship; scope note (SN); unknown relationship (?); Derived]

slide-30
SLIDE 30
  • 5. Findings and limitations
  • FRSAD conceptual model appears to accommodate DDC data at a broad level
  • Topic-to-topic relationships require further study
  • The study did not consider the usefulness of classification data modelled using FRSAD in real-world applications

slide-31
SLIDE 31
  • 6. Future work
  • Specify all relationships between Relative Index terms and classes (see earlier work by Green, Mitchell)

slide-32
SLIDE 32

[Figure repeated from slide 29: Relative Index terms linked to the class. Legend: equivalence relationship; scope note (SN); unknown relationship (?); Derived]

slide-33
SLIDE 33
  • 6. Future work
  • Specify all relationships between Relative Index terms and classes (see earlier work by Green, Mitchell)
  • Investigate DDC translations and mappings in context of model

slide-34
SLIDE 34

[Figure: DDC translations]

  • DDC 22: French, German, Italian, Swedish Mixed
  • A14: Italian, Vietnamese, French, Spanish, Hebrew
  • 200 Religion Class Guide (French)
  • DDC Sachgruppen (German)
  • DDC Summaries
  • Languages shown: English, French, Italian, Rhaeto-Romansch, Afrikaans, Arabic, Chinese, French, German, Norwegian, Portuguese, Russian, Scots Gaelic, Spanish, Swedish

slide-35
SLIDE 35

Mappings and crosswalks

[Figure: mappings between DDC and LCSH, MeSH, SWD, RAMEAU, SAB, BISAC, SEARS, CSH, UDC, LCC, SAO, Nuovo Soggettario]

slide-36
SLIDE 36

Thema-to-thema relationships across languages: Class 025.04 (22/swe) = Class 025.04 (22)

slide-37
SLIDE 37

Thema-to-thema relationships (Complex case): T2—43414 (22) = T2—43414 (22/ger), but . . .

English (22):

  • T2—43414 Giessen district (Giessen Regierungsbezirk)
      Including *Lahn River (not equivalent to thema/class T2—43414)

German (22/ger):

  • T2—43414 Regierungsbezirk Gießen
  • T2—434147 Lahn-Dill-Kreis
      Hier auch: der Fluss *Lahn (literally "Here also: the river *Lahn"; functionally equivalent to thema/class T2—434147)
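The complex case can be made concrete by recording, per edition, which class each topic is noted under and with what kind of note. The sketch below uses only the values on this slide; the data layout, the topic key `lahn_river`, and the functions `placement` and `placed_differently` are hypothetical.

```python
from typing import Dict, List, Tuple

# notation -> list of (topic_id, note_type) attached to that class
Edition = Dict[str, List[Tuple[str, str]]]

# "lahn_river" is a made-up language-neutral key for the topic.
DDC22_EN: Edition = {
    "T2—43414": [("lahn_river", "including")],
}
DDC22_GER: Edition = {
    "T2—43414": [],
    "T2—434147": [("lahn_river", "hier auch")],
}


def placement(topic_id: str, edition: Edition) -> List[Tuple[str, str]]:
    """Return (notation, note_type) for every class that notes the topic."""
    return sorted(
        (notation, note_type)
        for notation, notes in edition.items()
        for topic, note_type in notes
        if topic == topic_id
    )


def placed_differently(topic_id: str, a: Edition, b: Edition) -> bool:
    """True if the topic sits under different notations in the two editions."""
    in_a = {notation for notation, _ in placement(topic_id, a)}
    in_b = {notation for notation, _ in placement(topic_id, b)}
    return in_a != in_b
```

Here the class T2—43414 exists in both editions, yet `placed_differently("lahn_river", DDC22_EN, DDC22_GER)` is true: matching classes by notation alone does not guarantee that topic-to-class relationships carry over between translations.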

slide-38
SLIDE 38
  • 6. Future work
  • Specify all relationships between Relative Index terms and classes (see earlier work by Green, Mitchell)
  • Investigate DDC translations and mappings in context of model
  • Investigate modelling the Relative Index as a separate controlled vocabulary to provide a topic-centered view
  • Experiment with modelling other classification schemes
  • Investigate usefulness of classification data modelled using FRSAD