Finnish National Bibliography Fennica as Linked Data - Osma Suominen



slide-1
SLIDE 1

Finnish National Bibliography Fennica as Linked Data

Osma Suominen

SWIB17, Hamburg 6 Dec 2017

slide-2
SLIDE 2

NATIONAL BIBLIOGRAPHY

with apologies to Scott Adams

slide-3
SLIDE 3

Why?

  • 1. Making our data more visible, also internationally
  • 2. Improving the quality and interoperability of our metadata
  • 3. Building competency for the future
  • 4. Why not? :)
slide-4
SLIDE 4

[Diagram: bib records and auth records shown alongside the entity types Work, Instance, Person, Subject, Place and Organization. Source data: 1M bib records, 125k person names, 40k corporate names, 35k subjects (YSA).]

slide-5
SLIDE 5

[Diagram: bib records and auth records (1M bib records, 125k person names, 40k corporate names, 35k subjects (YSA)) shown alongside the entity types Work, Instance, Person and Subject. Image credit: MaryMaking blog]

slide-6
SLIDE 6
slide-7
SLIDE 7

As seen in:

  • SWIB16 talk
  • DCMI webinar
  • o-bib journal article

“From MARC silos to Linked Data silos”

slide-8
SLIDE 8

Schema.org

  • with separate Works and Instances like BIBFRAME, as enabled by the bibliographic extensions
  • because it allows us to describe our resources from a common-sense, Web user perspective (and we get a metadata haircut for free!)

Special thanks to Richard Wallis for help with applying Schema.org!
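To make the Work/Instance split concrete, here is a minimal sketch of a Schema.org description in JSON-LD, built with plain Python. It assumes the two levels are connected with the `workExample` and `exampleOfWork` properties that Schema.org defines on CreativeWork; whether the Fennica data uses exactly these types and properties is an assumption, and the `example.org` URIs, the function name and the sample values are all hypothetical.

```python
import json

# Hypothetical base URI for illustration only; not the real Fennica namespace.
BASE = "http://example.org/bib/"


def work_and_instance(work_id, instance_id, title, author_name, isbn):
    """Build a Schema.org Work/Instance pair as a JSON-LD document."""
    work_uri = BASE + work_id
    instance_uri = BASE + instance_id
    return {
        "@context": "http://schema.org/",
        "@graph": [
            {
                # The abstract work: title and creator live here.
                "@id": work_uri,
                "@type": "CreativeWork",
                "name": title,
                "author": {"@type": "Person", "name": author_name},
                "workExample": {"@id": instance_uri},
            },
            {
                # A concrete published instance of that work.
                "@id": instance_uri,
                "@type": "Book",
                "name": title,
                "isbn": isbn,
                "exampleOfWork": {"@id": work_uri},
            },
        ],
    }


doc = work_and_instance("W1", "I1", "Example title", "Example Author", "9780000000000")
print(json.dumps(doc, indent=2))
```

The point of the sketch is only the shape: one node per work, one node per instance, and a pair of inverse links between them.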

slide-9
SLIDE 9

MARC → ? → Linked Data

slide-10
SLIDE 10

Conversion pipeline (data → step → data):

  • MARC / Aleph seq → convert & clean using Catmandu → MARCXML
  • MARCXML → convert using marc2bibframe2 → BIBFRAME RDF
  • BIBFRAME RDF → convert to Schema.org using SPARQL CONSTRUCT → Schema.org RDF
  • Schema.org RDF → link against controlled vocabularies using SPARQL → linked to external URIs
  • Generate work keys for merging using SPARQL → work keys
  • Work keys → merge works using SPARQL → with deduplicated works
  • Agent keys → merge agents (person, org) using SPARQL → with deduplicated agents
  • Result → RDF store

Controlled vocabularies used for linking: YSA subjects, YSO subjects, Corporate names, RDA Media, Content, Carrier

https://github.com/NatLibFi/bib-rdf-pipeline
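The key-based merging steps in the pipeline are done with SPARQL. The following is only a rough Python illustration of the underlying idea, not the pipeline's actual implementation: derive a normalized key per record and group records that share it. The key format (normalized title and author joined by "/"), the function names and the sample records are all hypothetical.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(cleaned.split())


def work_key(title, author):
    """Hypothetical key format: normalized title and author joined by '/'."""
    return normalize(title) + "/" + normalize(author)


def merge_works(records):
    """Group record IDs that share the same work key."""
    groups = defaultdict(list)
    for rec in records:
        groups[work_key(rec["title"], rec["author"])].append(rec["id"])
    return dict(groups)


# Made-up sample records; r1 and r2 differ only in trailing punctuation.
records = [
    {"id": "r1", "title": "Seitsemän veljestä", "author": "Kivi, Aleksis"},
    {"id": "r2", "title": "Seitsemän veljestä /", "author": "Kivi, Aleksis"},
    {"id": "r3", "title": "Kalevala", "author": "Lönnrot, Elias"},
]

for key, ids in merge_works(records).items():
    print(key, ids)
```

Here r1 and r2 end up under one key while r3 stays separate. Real records vary in far more ways than punctuation, so any key scheme of this kind will both miss some duplicates and merge some records it should not.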

slide-11
SLIDE 11

Publishing as Linked Open Data for human & machine access

  • RDF store
  • Jena Fuseki: SPARQL
  • Linked Data Fragments server: LDF
  • bib-lod-ui Flask app: HTML+JSON-LD, OpenSearch API, Linked Data RDF
  • Data dump downloads: RDF HDT, RDF N-Triples, MARC records

slide-12
SLIDE 12

Demo

http://data.nationallibrary.fi/bib/me/W00009584100

Spelunking UI...maybe?
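For machine access, the demo URI can in principle be looked up with an `Accept` header that asks for an RDF serialization instead of HTML. The sketch below only prepares such a request with the standard library and does not send it; that the server answers `application/ld+json` requests on this URI is an assumption based on the JSON-LD output mentioned in the publishing slide, and the helper name is hypothetical.

```python
import urllib.request


def linked_data_request(uri, media_type="application/ld+json"):
    """Prepare (but do not send) a GET request asking for an RDF serialization."""
    return urllib.request.Request(uri, headers={"Accept": media_type}, method="GET")


req = linked_data_request("http://data.nationallibrary.fi/bib/me/W00009584100")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# To actually fetch the data (requires network access):
# with urllib.request.urlopen(req) as response:
#     body = response.read().decode("utf-8")
```

Keeping the request construction separate from the network call makes it easy to try other media types, such as `text/turtle`, if the server supports them.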

slide-13
SLIDE 13

Our hairballs

slide-14
SLIDE 14
slide-15
SLIDE 15

Data model documentation

slide-16
SLIDE 16

Challenges

slide-17
SLIDE 17

Work extraction

  • 1. Extract works from MARC records
  • 2. Create a work authority
  • 3. Use and maintain it for cataloging
  • 4. ???
  • 5. Profit!

Not so easy in practice. Lots of problems in the metadata that cause inconsistencies in the output.

slide-18
SLIDE 18

Linking

Entities: Work, Instance, Person, Subject, Place, Organization

Link targets: LCSH, Finnish Place Name Registry, Wikidata

slide-19
SLIDE 19

Linking

Entities: Work, Instance, Person, Subject, Place, Organization

Link targets: LCSH, Finnish Place Name Registry, Wikidata, WorldCat, other national libraries, WorldCat Works, LIBRIS XL, ISNI, VIAF

slide-20
SLIDE 20

“Cool URIs don’t change” -- Tim Berners-Lee

...but we rely on conversion of MARC records that change all the time!

slide-21
SLIDE 21

It’s open data. Is it FAIR?

slide-22
SLIDE 22

  • 1. Findable: URIs as identifiers, with rich metadata
  • 2. Accessible: URI lookup, SPARQL and LDF endpoints, downloadable data dumps
  • 3. Interoperable: RDF representation using Schema.org and a little bit of RDAu
  • 4. Reusable: CC0 license. Entities that are also referenced from other metadata

slide-23
SLIDE 23

What next?

  • 1. Enriching and cleaning the RDF data, e.g. using subclasses like Map
  • 2. More links to other Linked Data sets
  • 3. Expanding to new data sets: Viola discography, Arto article database

slide-24
SLIDE 24

The Finnish Declaration of Independence was adopted by the Parliament of Finland on 6 December 1917

My birthday present

slide-25
SLIDE 25

Thank you! Questions?

osma.suominen@helsinki.fi - @OsmaSuominen

http://data.nationallibrary.fi - @NatLibFiData

Code:
https://github.com/NatLibFi/bib-rdf-pipeline
https://github.com/NatLibFi/bib-lod-ui

These slides: http://tinyurl.com/fennica-ld