Introduction

This document specifies the RDF data model used for the Finnish national bibliography Fennica Linked Data set, which consists of approximately 40 million RDF triples generated from 1 million MARC bibliographic records and auxiliary sources. The data model is heavily based on Schema.org, including the bibliographic extensions. The OCLC WorldCat Linked Data model has been used as a reference whenever possible. The separation between Works and Instances is modelled according to BIBFRAME 2.0.

This document specifies the available entity types, their relationships and properties.

Publishing Fennica as Linked Data is a work in progress. Some parts of this document are marked TODO to indicate that the modelling or implementation is not yet finished. For more detailed information about current issues, see the open issues on the bib-rdf-pipeline GitHub project, which implements the conversion of data from MARC records and auxiliary sources into the published RDF.

Accessing the data

The data set is currently available as:

URI patterns and stability

The data set currently uses URIs of the form http://urn.fi/URN:NBN:fi:bib:me:Tnnnnnnnnnxx where

  • T is a single capital letter representing the entity type (see below)
  • nnnnnnnnn is the numeric identifier of the MARC record where the entity originated
  • xx is a two-digit sequence number ensuring uniqueness of entities of the same type from the same record
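
The pattern above can be taken apart mechanically. The following is a minimal sketch, assuming that the placeholder nnnnnnnnn always stands for exactly nine digits and xx for exactly two, as the notation suggests. The names EntityUri and parse_entity_uri, and the example URI in the usage note, are hypothetical and not part of the data set or its tooling.

```python
import re
from typing import NamedTuple, Optional

# Assumption: nine-digit record identifier and two-digit sequence number,
# read off the placeholder "Tnnnnnnnnnxx" in the URI pattern.
URI_PATTERN = re.compile(
    r"^http://urn\.fi/URN:NBN:fi:bib:me:"
    r"(?P<type>[A-Z])(?P<record>\d{9})(?P<seq>\d{2})$"
)


class EntityUri(NamedTuple):
    entity_type: str  # single capital letter, e.g. "W" for Work
    record_id: str    # kept as a string to preserve leading zeros
    sequence: int     # sequence number within the record


def parse_entity_uri(uri: str) -> Optional[EntityUri]:
    """Split an entity URI into its parts, or return None if it does not match."""
    match = URI_PATTERN.match(uri)
    if match is None:
        return None
    return EntityUri(match.group("type"), match.group("record"), int(match.group("seq")))
```

For a made-up URI such as http://urn.fi/URN:NBN:fi:bib:me:W00000123400, this yields the entity type "W", the record identifier "000001234" and the sequence number 0. Because the URI patterns are still in draft status, any such parser would need to be revisited if they change.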

The URI patterns are in draft status and may still change. Some entities are currently represented only as blank nodes in the RDF graph, but may later be given URIs.

TODO: The URIs in this data set are not yet resolvable. We plan to use the urn.fi resolver to manage identifiers, but it has not yet been set up to resolve this namespace.

Entity types

Overview

This diagram shows the main entity types and their relationships as a UML class diagram.

Work

URI pattern: http://urn.fi/URN:NBN:fi:bib:me:Wnnnnnnnnnxx

Number of entities: approx. 930 000 Works, of which 450 000 are Series and 270 000 are Periodicals.

The Work entity type represents an abstract creative work, very similar to the BIBFRAME 2.0 notion of Work. Derived works such as translations are modelled as separate Work entities. In FRBR terms, this Work entity is a combination of a FRBR Work and Expression.

| Field name | RDF property | Expected Value / Range | Definition | Cardinality | Data Quality Notes |
|---|---|---|---|---|---|
| Type | rdf:type | Class | Type of resource. Always both schema:CreativeWork and bf:Work. May also have the more specific types schema:CreativeWorkSeries and schema:Periodical (see below). | 2..* | |
| Title | schema:name | Literal | Title of work | 1..* | |
| Subject | schema:about | skos:Concept, Work, Person, Organization or Literal | Subject matter of the work | 0..* | YSO concepts are used whenever possible. Literal values are used in cases where no entity was found matching the label. |
| Has instance | schema:workExample | Instance | Example/instance/realization/derivation of the concept of this work, e.g. the paperback edition, first edition, or eBook. | 0..* | |
| Language | schema:inLanguage | Literal (language code) | Language of the work, expressed as a language code following BCP 47 rules (i.e. ISO 639-1 or 639-3 code) | 0..* | Needs cleanup. There are a few bad values, such as numeric values. |
| Author | schema:author | Person or Organization | The main author of this work | 0..1 | |
| Contributor | schema:contributor | Person or Organization | A secondary contributor to the work | 0..* | |
| Content type | rdau:P60049 | skos:Concept from RDA Content Type vocabulary | | 0..1 | Should generally be available for most Works, but in practice, missing for some of them. |
| Is part of (series) | schema:isPartOf | Series | The series which this work is a part of. | 0..* | |
| Is translation of | schema:translationOfWork | Work | The work that this work has been translated from. Inverse of "Has translation" | 0..* | |
| Has translation | schema:workTranslation | Work | A work that is a translation of the content of this work. Inverse of "Is translation of" | 0..* | |
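
The cardinalities in the Work table can be read as simple minimum and maximum counts per property. As an illustration only, the sketch below checks a Work description against those counts. It assumes the description has already been loaded into a plain dictionary that maps property names to lists of values. The names WORK_CARDINALITY and check_work are hypothetical and do not come from the conversion pipeline.

```python
from typing import Dict, List, Optional, Tuple

# (minimum, maximum) occurrences per property, copied from the Work table.
# A maximum of None means unbounded ("*").
WORK_CARDINALITY: Dict[str, Tuple[int, Optional[int]]] = {
    "rdf:type": (2, None),
    "schema:name": (1, None),
    "schema:about": (0, None),
    "schema:workExample": (0, None),
    "schema:inLanguage": (0, None),
    "schema:author": (0, 1),
    "schema:contributor": (0, None),
    "rdau:P60049": (0, 1),
    "schema:isPartOf": (0, None),
    "schema:translationOfWork": (0, None),
    "schema:workTranslation": (0, None),
}


def check_work(description: Dict[str, List[str]]) -> List[str]:
    """Return a list of human-readable cardinality violations (empty if none)."""
    problems = []
    for prop, (minimum, maximum) in WORK_CARDINALITY.items():
        count = len(description.get(prop, []))
        if count < minimum:
            problems.append(f"{prop}: expected at least {minimum}, found {count}")
        elif maximum is not None and count > maximum:
            problems.append(f"{prop}: expected at most {maximum}, found {count}")
    return problems
```

A description with both required types and at least one title passes with an empty list, while one that lacks bf:Work or a title is reported. The check covers counts only and does not verify value ranges such as language codes.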

Periodical

The Periodical entity type is a sub-type of Work and represents a publication series.

| Field name | RDF property | Expected Value / Range | Definition | Cardinality | Data Quality Notes |
|---|---|---|---|---|---|
| Type | rdf:type | Class | Always schema:Periodical | 1 | |
| Has part | schema:hasPart | Work | A work that is included in this series | 1..* | |
| ISSN | schema:issn | Literal (ISSN code) | The International Standard Serial Number (ISSN) that identifies this periodical | 0..1 | |

 

Instance

URI pattern: http://urn.fi/URN:NBN:fi:bib:me:Innnnnnnnnxx

Number of entities: approx. 1.1 million

The Instance entity type represents a specific edition (e.g. a hardcover book or a specific DVD release of a film) of a Work. It is similar to the BIBFRAME 2.0 notion of Instance. In FRBR terms, it is similar to a FRBR Manifestation.

| Field name | RDF property | Expected Value / Range | Definition | Cardinality | Data Quality Notes |
|---|---|---|---|---|---|
| Type | rdf:type | Class | Type of resource. Always both schema:CreativeWork and bf:Instance. May have an additional, more specific type (see below). | 2..* | The more specific types such as schema:Book are for the most part not implemented yet. |
| Title | schema:name | Literal | Title of instance | 1..* | |
| Description | schema:description | Literal | A textual description of the instance | 0..* | This field is used to represent many kinds of notes extracted from the bibliographic record. Some of these would probably deserve their own fields or a more structural way of expressing the information. |
| Is instance of work | schema:exampleOfWork | Work | A work that this work is an example/instance/realization/derivation of. | 1 | |
| Date published | schema:datePublished | Literal (date value) | Date of first publication | 1 | Needs cleanup. May contain brackets or other expressions indicating uncertainty. |
| Publication | schema:publication | PublicationEvent | A publication event of the instance | 0..1 | |
| Publisher | schema:publisher | Organization | The publisher of the instance | 0..1 | |
| Media type | rdau:P60050 | skos:Concept from RDA Media type vocabulary | Relates a resource to a categorization reflecting a general type of intermediation device required to view, play, run, etc., the content of a resource. | 0..1 | Should generally be available for all Instances, but in practice, missing from some. |
| Carrier type | rdau:P60048 | skos:Concept from RDA Carrier type vocabulary | Relates a resource to a categorization reflecting a format of a storage medium and housing of a carrier in combination with a type of intermediation device required to view, play, run, etc., the content of a resource. | 0..1 | Should generally be available for all Instances, but in practice, missing from some. |
| Number of pages | schema:numberOfPages | Literal (integer) | The number of pages in the book | 0..1 | Needs cleanup. Often the values are structured page counts (including Roman numerals), not plain integers. |
| URL | schema:url | URI | The URL where the electronic version is available | 0..1 | |
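
Consumers of the data may want to guard against the uncleaned "Date published" values noted in the table. The sketch below is one possible approach, not the cleanup planned for the data set: it pulls out a standalone four-digit year and flags the value as uncertain whenever anything else (brackets, question marks and so on) surrounds it. The function name clean_date_published and the sample inputs are hypothetical. The exact forms of bad values in the data are not specified here.

```python
import re
from typing import Optional, Tuple

# A run of exactly four digits that is not part of a longer number.
YEAR = re.compile(r"(?<!\d)\d{4}(?!\d)")


def clean_date_published(raw: str) -> Tuple[Optional[str], bool]:
    """Return (year, uncertain) for a raw schema:datePublished literal.

    year is the first four-digit year found, or None if there is none.
    uncertain is True when the value is anything other than a bare year.
    """
    match = YEAR.search(raw)
    year = match.group(0) if match else None
    uncertain = year is None or raw.strip() != year
    return year, uncertain
```

Under these assumptions a bare value such as "1995" is returned unchanged and marked as certain, a bracketed value such as "[1995]" yields the same year marked as uncertain, and a value with no recoverable year yields None.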

Book

The Instance sub-type Book represents a book edition, e.g. hardcover, paperback or electronic book.

| Field name | RDF property | Expected Value / Range | Definition | Cardinality | Data Quality Notes |
|---|---|---|---|---|---|
| Type | rdf:type | Class | Always schema:Book | 1 | |
| Book format | schema:bookFormat | schema:BookFormatType | The value schema:EBook is used for electronic books. Other values are currently not used. | 0..1 | |
| ISBN | schema:isbn | Literal (ISBN code) | The ISBN code of the book | 0..* | |

 

Person

URI pattern: http://urn.fi/URN:NBN:fi:bib:me:Pnnnnnnnnnxx

Number of entities: approx. 1.3 million. Note that there is a lot of duplication within these entities. TODO: reconcile the Person entities with the person authority file.

The Person entity type represents a human being (e.g. author, contributor or subject of a work). A Person entity may also represent a pseudonym or a fictitious person.

| Field name | RDF property | Expected Value / Range | Definition | Cardinality | Data Quality Notes |
|---|---|---|---|---|---|
| Type | rdf:type | Class | Always schema:Person | 1 | |
| Name | schema:name | Literal | The name of the person | 1 | May contain birth and death years. These should be moved to a separate field or removed. |
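
As a rough illustration of how birth and death years could be separated from the name, the sketch below assumes that the years appear as a trailing ", YYYY-YYYY" or ", YYYY-" suffix, which is a common form in authority-style headings but is not confirmed for this data set. The names PersonName and split_person_name are hypothetical, and the person in the usage note is made up.

```python
import re
from typing import NamedTuple, Optional

# Assumption: life years form a trailing suffix such as ", 1900-1980" or ", 1900-".
LIFE_YEARS = re.compile(r",\s*(?P<birth>\d{4})\s*-\s*(?P<death>\d{4})?\s*\.?\s*$")


class PersonName(NamedTuple):
    name: str
    birth_year: Optional[int]
    death_year: Optional[int]


def split_person_name(raw: str) -> PersonName:
    """Separate a trailing life-year suffix from a person name, if present."""
    match = LIFE_YEARS.search(raw)
    if match is None:
        # No recognizable suffix: keep the name as it is.
        return PersonName(raw.strip(), None, None)
    death = match.group("death")
    return PersonName(
        raw[: match.start()].strip(),
        int(match.group("birth")),
        int(death) if death else None,
    )
```

With this sketch, "Virtanen, Matti, 1900-1980" splits into the name "Virtanen, Matti" with birth year 1900 and death year 1980, and names without a year suffix pass through unchanged. Other date forms, such as approximate or single years, would need additional handling.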

Organization

URI pattern: http://urn.fi/URN:NBN:fi:bib:me:Onnnnnnnnnxx or blank node or CN identifier (TBD)

The Organization entity type represents an organization (e.g. publisher of a work).

| Field name | RDF property | Expected Value / Range | Definition | Cardinality | Data Quality Notes |
|---|---|---|---|---|---|
| Type | rdf:type | Class | Always schema:Organization | 1 | |
| Name | schema:name | Literal | The name of the organization | 1 | |

Place

URI pattern: blank nodes only

Number of entities: approx. 1.0 million. Note that there is a lot of duplication within these entities. TODO: reconcile the Place entities with YSO places.

The Place entity type represents a physical place (e.g. a country or a city).

| Field name | RDF property | Expected Value / Range | Definition | Cardinality | Data Quality Notes |
|---|---|---|---|---|---|
| Type | rdf:type | Class | Always schema:Place | 1 | |
| Name | schema:name | Literal | The name of the place | 1 | Needs cleanup. May contain abbreviations, inflected forms etc. |

PublicationEvent

URI pattern: blank nodes only

Number of entities: approx. 980 000

The PublicationEvent entity type represents the event when an instance of a work was published.

| Field name | RDF property | Expected Value / Range | Definition | Cardinality | Data Quality Notes |
|---|---|---|---|---|---|
| Type | rdf:type | Class | Always schema:PublicationEvent | 1 | |
| Organizer | schema:organizer | Organization | The publisher | 1 | |
| Location | schema:location | Place | The place of publication | | |
| Date | schema:startDate | Literal (date string) | The date of publication | | |

 

 
