Note: Please don’t rely on this document yet as it’s still under discussion and development.
1. Introduction
1.1. Goal
This document prescribes the generic data model to be used when publishing linked data in the heritage network. The model consists of a minimal set of classes and properties. It is based on:
- the current state of datasets in the heritage network, as observed in the Dataset Knowledge Graph, particularly its property partitions analysis;
- the needs of service platform builders for understanding, processing and presenting data.
By adhering to this model, dataset publishers ensure that their data is visible and can be consumed and combined with other datasets in the network.
1.2. Scope
These requirements are restricted in three ways:
- they apply only to the way published data is expressed, not how it is stored or managed internally;
- they prescribe a generic data model and leave the use of domain data models up to dataset publishers;
- they bear upon datasets, not their descriptions; for the latter see [NDE-DATASETS].
1.3. Examples
While RDF examples in this document are in the [JSON-LD] RDF serialization, publishers MAY use any RDF serialization format, such as [Turtle] or [N3].
2. Definitions
- Data model: Set of classes and their properties that defines how data is expressed.
- Generic data model: A simple, shared data model; the scope of this document. See also [NDE-ALIGNMENT]. Can be used alongside domain data models.
- Domain data model: A domain-specific data model, such as CIDOC-CRM, Linked Art, RiC-O or RDA. Can be used alongside a generic data model. Adds precision at the cost of complexity. Out of this document’s scope.
- Metadata record: An RDF resource that expresses one of the top-level classes in the § 4 Data model.
- Term: A word, name, acronym, phrase or other symbol with a formal definition, published in the Network of Terms.
3. General considerations
3.1. Generic and domain data models
The purpose of generic data models is to integrate data in the heritage network and make it more visible. Domain models are usually more richly populated and provide consumers with more possibilities for further processing, for example in service platforms.
This document is limited to a set of classes and properties that together form the generic data model. For most datasets, the generic data model expresses only a subset of data properties that are available. This document’s purpose, therefore, is not a complete and correct expression of the source data, but an easily understandable and usable one.
If done well, the generic data invites consumers to explore the data in more depth using the domain data models. So to facilitate further exploration, publishers MAY use domain data models of their choosing alongside the generic data model. Examples are:
- CIDOC-CRM and its derivative Linked Art for museum collections and catalogues;
- RiC-O for archives;
- PiCo for biographical data;
- RDA for libraries.
3.2. Vocabulary
The generic data model presented in this document is designed as a [SCHEMA-ORG] application profile. The choice for Schema.org is substantiated in Implementation guidelines for NDE alignment § generic-data-model.
While the Schema.org website considers “both 'https://schema.org' and 'http://schema.org' (...) fine”, mixing the namespaces makes it harder to consume datasets.
Therefore, publishers MUST use the https://schema.org/ (HTTPS) namespace for Schema.org, not http://schema.org/ (HTTP).
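The namespace rule lends itself to an automated check. The following is a minimal sketch, assuming the data is available as parsed JSON-LD; the function name `find_http_schema_iris` and the path notation are illustrative and not part of this specification.

```python
import json

# The HTTP form of the Schema.org namespace that publishers must avoid.
HTTP_NS = "http://schema.org"

def find_http_schema_iris(node, path="$"):
    """Return (path, value) pairs for every key or string using the HTTP namespace."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(key, str) and key.startswith(HTTP_NS):
                hits.append((path, key))
            hits.extend(find_http_schema_iris(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_http_schema_iris(item, f"{path}[{index}]"))
    elif isinstance(node, str) and node.startswith(HTTP_NS):
        hits.append((path, node))
    return hits

doc = json.loads('{"@context": "http://schema.org/", "@type": "CreativeWork"}')
print(find_http_schema_iris(doc))
```

An empty result suggests that the document does not mention the HTTP namespace directly; it does not inspect remote contexts.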
3.3. Language
For each property with a literal value, the value’s language MUST be specified. The language MUST be expressed as a language code from [BCP47], such as ‘nl’ or ‘nl-NL’.
Example of a name property in two languages:
{ "@context" : "https://schema.org/" , "@id" : "https://n2t.net/ark:/123456/1" , "@type" : "CreativeWork" , "name" : [ { "@language" : "nl" , "@value" : "De Sterrennacht" }, { "@language" : "en" , "@value" : "The Starry Night" } ] }
Even if only one language is available, the language MUST be specified.
{ "@context" : "https://schema.org/" , "name" : { "@language" : "nl" , "@value" : "De Sterrennacht" } }
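A consumer or publisher could check this requirement roughly as follows. This is a sketch under two assumptions: values are written as explicit JSON-LD value objects (no default `@language` in the context), and a simplified regular expression stands in for full [BCP47] validation. The name `is_language_tagged` is hypothetical.

```python
import re

# Simplified shape check: a 2-3 letter primary subtag followed by optional
# subtags of 1-8 alphanumerics. This is NOT a full BCP 47 validator.
LANGUAGE_TAG = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*")

def is_language_tagged(value):
    """True if value is a JSON-LD value object (or a list of them) with a language tag."""
    items = value if isinstance(value, list) else [value]
    if not items:
        return False
    return all(
        isinstance(item, dict)
        and isinstance(item.get("@value"), str)
        and isinstance(item.get("@language"), str)
        and LANGUAGE_TAG.fullmatch(item["@language"]) is not None
        for item in items
    )

print(is_language_tagged({"@language": "nl", "@value": "De Sterrennacht"}))
print(is_language_tagged("De Sterrennacht"))
```

The first call describes a tagged value and the second a plain string, which this profile does not allow for literal values.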
3.4. Publication method
3.4.1. Combined
With RDF, it’s perfectly fine to express the same data in multiple ways. Therefore, the generic and domain data models MAY coexist in the same information resource.
{ "@context" : { "schema" : "https://schema.org/" , "edm" : "http://www.europeana.eu/schemas/edm/" , "rdfs" : "http://www.w3.org/2000/01/rdf-schema#" , "dcterms" : "http://purl.org/dc/terms/" }, "@id" : "https://literatuurmuseum.nl/id/123456789" , "@type" : [ "schema:CreativeWork" , "schema:VisualArtwork" ], "schema:name" : "Het fluitketeltje en andere versjes" , "rdfs:label" : "Het fluitketeltje en andere versjes" , "schema:creator" : { "@type" : "schema:Person" , "@id" : "http://data.rkd.nl/artists/8342" }, "dcterms:creator" : { "@type" : "dcterms:Agent" , "@id" : "http://data.rkd.nl/artists/8342" } }
3.4.2. Separate profiles
Alternatively, publishers MAY separate the generic data model by using profile-based content negotiation (see [DX-PROF-CONNEG]).
To do so, publish a profile with URI https://netwerk-digitaal-erfgoed.github.io/schema-profile/.
# Get the list of profiles.
GET /resource/a?profile=alt HTTP/1.1

# Server responds with a list of profiles that includes the NDE generic data model.
HTTP/1.1 200 OK
Content-Type: application/json

{
  "resource": "http://example.org/resource/a",
  "profiles": [
    {
      "token": "nde",
      "uri": "https://netwerk-digitaal-erfgoed.github.io/schema-profile/",
      "media_types": ["application/ld+json", "text/turtle"]
    },
    ...
  ]
}
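On the client side, picking the generic data model out of such a listing could look like the sketch below. It assumes the response body has the shape shown in the example above (a `profiles` array with `token`, `uri` and `media_types`); the function name `find_profile` is illustrative.

```python
import json

NDE_PROFILE_URI = "https://netwerk-digitaal-erfgoed.github.io/schema-profile/"

def find_profile(listing, profile_uri=NDE_PROFILE_URI):
    """Return the profile entry with the given URI, or None if it is not offered."""
    for profile in listing.get("profiles", []):
        if profile.get("uri") == profile_uri:
            return profile
    return None

# Response body as in the example above, without the elided entries.
body = json.loads("""
{
  "resource": "http://example.org/resource/a",
  "profiles": [
    {
      "token": "nde",
      "uri": "https://netwerk-digitaal-erfgoed.github.io/schema-profile/",
      "media_types": ["application/ld+json", "text/turtle"]
    }
  ]
}
""")

profile = find_profile(body)
print(profile["token"], profile["media_types"])
```

Matching on the profile URI rather than the token keeps the lookup independent of whatever token a server happens to choose.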
4. Data model
This section describes the classes and properties that MUST be used to publish metadata records in the heritage network.
Each record MUST be typed as one of the following classes:
For each of these classes, the sections below list the REQUIRED and OPTIONAL properties.
4.1. CreativeWork
The most generic kind of item created by humans, i.e. heritage objects.
Candidate properties [Issue #3]
4.1.1. Subclasses
Publishers SHOULD use more fine-grained classes alongside the top-level class CreativeWork.
Examples include:
- Article for stories;
- ArchiveComponent for archival items and collections;
- Book;
- Message for letters;
- MusicComposition, MusicRecording and MusicAlbum for musical items and collections.
{ "@context" : "https://schema.org/" , "@id" : "https://n2t.net/ark:/123456/1" , "@type" : [ "CreativeWork" , "Painting" ] }
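Because subclasses are used alongside the top-level class rather than instead of it, a record's `@type` may be a single string or a list. A minimal sketch of a check that handles both forms, assuming compacted JSON-LD with the Schema.org context (`has_top_level_class` is a hypothetical helper):

```python
def types_of(record):
    """Return the record's @type values as a list, whether one or many were given."""
    value = record.get("@type", [])
    return value if isinstance(value, list) else [value]

def has_top_level_class(record, top_level="CreativeWork"):
    """True if the top-level class is present among the record's types."""
    return top_level in types_of(record)

print(has_top_level_class({"@type": ["CreativeWork", "Painting"]}))
print(has_top_level_class({"@type": "Painting"}))
```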
4.1.2. URI (required)
Each CreativeWork MUST be identified by a persistent URI. Blank nodes MUST NOT be used for CreativeWorks.
Example of the @id property:
{ "@context" : "https://schema.org/" , "@id" : "https://example.com/dataset1/resource1" , "@type" : "CreativeWork" }
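The blank-node rule can be checked mechanically; persistence cannot. The sketch below therefore only verifies that `@id` is present, is not a blank node identifier, and is an absolute HTTP(S) URI. Restricting the scheme to HTTP(S) is an assumption of this sketch, not a requirement stated above.

```python
from urllib.parse import urlparse

def has_valid_uri(record):
    """True if the record's @id is an absolute HTTP(S) URI and not a blank node."""
    identifier = record.get("@id")
    if not isinstance(identifier, str) or identifier.startswith("_:"):
        return False
    parsed = urlparse(identifier)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

print(has_valid_uri({"@id": "https://example.com/dataset1/resource1"}))
print(has_valid_uri({"@id": "_:b0"}))
```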
Do we need identifier alongside URI? Not from a web perspective (where we care only about URIs) but perhaps identifier is useful to reference physical objects, e.g. in a museum.
4.1.3. name (required)
A REQUIRED property to indicate the CreativeWork’s name, assigned either by its creator or by others. The name MUST be a language-tagged string:
{ "@context" : "https://schema.org/" , "@id" : "https://example.com/dataset1/resource1" , "@type" : "CreativeWork" , "name" : [ { "@language" : "nl" , "@value" : "De Sterrennacht" }, { "@language" : "en" , "@value" : "The Starry Night" } ] }
4.1.4. creator (required)
A REQUIRED property that identifies the person(s) or organization that created the CreativeWork. If a term is available, that MUST be referenced. If not, a Person or Organization resource MUST be used instead.
{ "@context" : "https://schema.org" , "@type" : [ "CreativeWork" , "Painting" ], "@id" : "http://www.wikidata.org/entity/Q45585" , "creator" : { "@id" : "https://data.rkd.nl/artists/32439" , "@type" : "Person" , "name" : "Rembrandt" } }
Even where more specific properties, applicable to CreativeWork’s subtypes, are available in Schema.org, such as artist, composer and director, the creator property MUST be used for consistency.
4.1.5. isPartOf (required)
A REQUIRED property that points to the dataset(s) that the CreativeWork is part of.
Note that a CreativeWork may be part of multiple datasets.
The dataset MUST be typed as a Dataset.
{ "@context" : "https://schema.org/" , "@id" : "https://n2t.net/ark:/123456/1" , "@type" : "CreativeWork" , "isPartOf" : { "@id" : "https://organization.com/dataset1" , "@type" : "Dataset" } }
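The unconditionally required properties described so far (name, creator and isPartOf) and the Dataset typing rule could be checked along these lines. This is a sketch that assumes compacted JSON-LD with the Schema.org context, so that property keys appear as plain names; `missing_required` and `datasets_are_typed` are hypothetical helper names.

```python
def as_list(value):
    """Wrap a single JSON-LD value in a list; treat a missing value as empty."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

REQUIRED_PROPERTIES = ("name", "creator", "isPartOf")

def missing_required(record):
    """List the required CreativeWork properties that are absent from the record."""
    return [prop for prop in REQUIRED_PROPERTIES if not as_list(record.get(prop))]

def datasets_are_typed(record):
    """True if every isPartOf value is a node object typed as Dataset."""
    datasets = as_list(record.get("isPartOf"))
    return bool(datasets) and all(
        isinstance(dataset, dict) and "Dataset" in as_list(dataset.get("@type"))
        for dataset in datasets
    )

record = {
    "@context": "https://schema.org/",
    "@id": "https://n2t.net/ark:/123456/1",
    "@type": "CreativeWork",
    "isPartOf": {"@id": "https://organization.com/dataset1", "@type": "Dataset"},
}
print(missing_required(record))    # this abbreviated record lacks name and creator
print(datasets_are_typed(record))
```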
4.1.6. associatedMedia (required)
Or use specialized properties schema:image, schema:video, schema:audio alongside or without schema:associatedMedia?
A media object that represents the CreativeWork. This property is REQUIRED if applicable, i.e. if at least one media object is available for the metadata record.
{ "@context" : "https://schema.org/" , "@id" : "http://www.wikidata.org/entity/Q45585" , "@type" : "CreativeWork" , "associatedMedia" : { "@id" : "https://demo.limb-gallery.com/idurl/1/25290" , "@type" : "ImageObject" , "contentUrl" : "https://demo.limb-gallery.com/iiif/25290/manifest" , "encodingFormat" : "application/ld+json" } }
See MediaObject for this property’s allowed values.
How to refer to media that is not part of the dataset, such as external images that are used not as unique representations but as illustrations of the CreativeWork?
4.1.7. description
An OPTIONAL property that describes the CreativeWork in one sentence. The description MUST be free of jargon and abbreviations so it can be understood by others. The value MUST be a language-tagged string.
{ "@context" : "https://schema.org/" , "@id" : "https://example.com/dataset1/resource1" , "description" : [ { "@language" : "nl" , "@value" : "Olieverfschilderij van het uitzicht uit Van Goghs ziekenhuiskamer in Saint-Rémy-de-Provence, vlak voor zonsopkomst." }, { "@language" : "en" , "@value" : "Oil-on-canvas painting depicting the view from Van Gogh's asylum room at Saint-Rémy-de-Provence, just before sunrise." } ] }
4.1.8. abstract
An OPTIONAL property that provides a longer summarizing description of the CreativeWork.
{ "@context" : "https://schema.org/" , "@id" : "https://example.com/dataset1/resource1" , "abstract" : [ { "@language" : "nl" , "@value" : "Het schilderij is een nachttafereel met gele sterren boven een kleine stad met heuvels. Het is een uitzicht vanuit een denkbeeldig punt over een dorp met kerktoren, met links een vlammende cipres en rechts olijfbomen tegen de heuvels op." } ] }
4.1.9. license
Does license make sense on the level of individual resources or should we delegate to the level of the dataset? Or perhaps only on certain types of resources, such as media?
4.1.10. contentLocation
An OPTIONAL property that indicates the location depicted or described in the CreativeWork. For example, the location in a photograph or painting.
If available, a term MUST be referenced. If not, a Place resource MUST be used.
{ "@context" : "https://schema.org" , "@id" : "http://www.wikidata.org/entity/Q45585" , "contentLocation" : { "@id" : "http://www.wikidata.org/entity/Q221507" , "@type" : "Place" , "name" : "Saint-Rémy-de-Provence" } }
4.1.11. locationCreated
An OPTIONAL property that indicates the location where the CreativeWork was created (which may be different from its contentLocation).
If available, a term MUST be referenced. If not, a Place resource MUST be used.
{ "@context" : "https://schema.org" , "@id" : "http://www.wikidata.org/entity/Q45585" , "locationCreated" : { "@id" : "http://www.wikidata.org/entity/Q221507" , "@type" : "Place" , "name" : "Saint-Rémy-de-Provence" } }
4.1.12. dateCreated
An OPTIONAL property that indicates the date the CreativeWork was created.
The value MUST be in [ISO8601] format. Partial dates MAY be used if the exact date is unknown.
{ "@context" : "https://schema.org" , "@id" : "http://www.wikidata.org/entity/Q45585" , "dateCreated" : "1889-06" }
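Full and partial dates can be told apart from malformed ones with a small check. The sketch below covers only the calendar-date subset YYYY, YYYY-MM and YYYY-MM-DD for years 0001 to 9999; [ISO8601] allows more forms (times, week dates, expanded years) that are out of scope here. The name `is_iso_date` is illustrative.

```python
import re
from datetime import date

# Year, optionally followed by month, optionally followed by day.
PARTIAL_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")

def is_iso_date(value):
    """True for YYYY, YYYY-MM or YYYY-MM-DD values that denote a real calendar date."""
    match = PARTIAL_DATE.fullmatch(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # Missing parts default to 1 so that partial dates can still be range-checked.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True

for candidate in ("1889-06", "1889", "1889-06-31", "06-1889"):
    print(candidate, is_iso_date(candidate))
```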
4.1.13. about
An OPTIONAL property to indicate the subject-matter of the CreativeWork. For example, which subjects are depicted in a painting or photograph? Or which subjects is a story about?
The value MUST reference terms.
If the subject is a location, it MUST be listed under contentLocation instead.
{ "@context" : "https://schema.org" , "@id" : "http://www.wikidata.org/entity/Q45585" , "about" : [ { "@id" : "http://www.wikidata.org/entity/Q149908" , "@type" : "DefinedTerm" }, { "@id" : "http://www.wikidata.org/entity/Q405" , "@type" : "DefinedTerm" } ] }
Whereas schema:about has range schema:Thing, schema:material and other properties do not. This means we can use schema:DefinedTerm for schema:about but not for schema:material. Should we drop schema:DefinedTerm completely?
4.1.14. material
An OPTIONAL property that indicates the material(s) that the CreativeWork is made from, e.g. leather, wool, cotton, paper. The value MUST reference terms.
{ "@context" : "https://schema.org" , "@id" : "http://www.wikidata.org/entity/Q45585" , "material" : [ { "@id" : "http://vocab.getty.edu/aat/300015050" }, { "@id" : "http://vocab.getty.edu/aat/300014078" } ] }
4.1.15. genre
An OPTIONAL property that indicates the genre(s) of the CreativeWork, for example art movements or periods.
The value MUST reference a term.
{ "@context" : "https://schema.org" , "@id" : "http://www.wikidata.org/entity/Q45585" , "genre" : { "@id" : "http://vocab.getty.edu/aat/300021508" } }
4.2. Person
If a metadata record is a person, it MUST be typed as Person.
If a term is available for the person, that MUST be referenced.
If not, the person MUST be defined by the required properties listed below.
The objective for the Person model is not to fully describe all aspects of a person, but to easily identify and distinguish between similar persons.
Consider candidate properties nationality, description, familyName, givenName.
4.2.1. name (required)
A REQUIRED property that indicates the Person’s full name in its preferred display form:
{ "@context" : "https://schema.org" , "@type" : "Person" , "@id" : "https://n2t.net/ark:/123456/2" , "name" : { "@language" : "nl-NL" , "@value" : "Pluk van de Petteflat" } }
Does it make sense to require person names to be language-tagged? Think about languages that show names in a different format, such as ZH.
4.2.2. birthDate
An OPTIONAL property that indicates the person’s date of birth in [ISO8601] format.
4.2.3. birthPlace
An OPTIONAL property that references the person’s place of birth. The value MUST reference a term.
{ "@context" : "https://schema.org" , "@id" : "https://n2t.net/ark:/123456/2" , "birthPlace" : { "@id" : "https://sws.geonames.org/2745912/" } }
4.2.4. deathDate
An OPTIONAL property that indicates the person’s date of death in [ISO8601] format.
4.2.5. deathPlace
An OPTIONAL property that references the person’s place of death. The value MUST reference a term.
{ "@context" : "https://schema.org" , "@id" : "https://n2t.net/ark:/123456/2" , "deathPlace" : { "@id" : "http://www.wikidata.org/entity/Q131153786" } }
4.2.6. hasOccupation
An OPTIONAL property that indicates the person’s occupation. The value MUST reference a term.
{ "@context" : "https://schema.org" , "@id" : "https://n2t.net/ark:/123456/2" , "hasOccupation" : { "@id" : "http://vocab.getty.edu/aat/300025008" } }
4.3. Organization
4.3.1. name (required)
A REQUIRED property that indicates the Organization’s full name in its preferred display form.
Do we need more properties for Organization?
4.4. MediaObject
In case of image, video or audio objects, the relevant subclass (ImageObject, VideoObject or AudioObject) MUST be used.
In case of other types of media, the generic class MediaObject MUST be used.
4.4.1. ImageObject
An ImageObject MUST have a contentUrl property that points to a IIIF Presentation API manifest.
See [Issue #2]
Should we support non-IIIF clients/users?
{ "@context" : "https://schema.org/" , "@id" : "https://example.com/image1" , "@type" : "ImageObject" , "contentUrl" : "https://example.com/image1/manifest.json" , "encodingFormat" : "application/ld+json" }
4.4.2. AudioObject
4.4.3. VideoObject
4.5. Place
For properties that reference locations, if no term is available, a custom Place resource MUST be used.
4.5.1. address (required)
A property that indicates the Place’s address, REQUIRED if known.
REQUIRED address properties are:
- streetAddress
- postalCode
- addressLocality (city)
- addressRegion (province)
- addressCountry (MUST be in [ISO3166-1] format).
{ "@context" : "https://schema.org/" , "@id" : "https://example.com/dataset/place" , "@type" : "Place" , "address" : { "@type" : "PostalAddress" , "streetAddress" : "Street 123" , "postalCode" : "1234 AB" , "addressLocality" : "City" , "addressRegion" : "Noord-Holland" , "addressCountry" : "NL" } }
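A check for the address requirements might look like the following sketch. It assumes that addressCountry uses the two-letter (alpha-2) form of [ISO3166-1], as in the example above, and it tests only the shape of the code, not whether the code is actually assigned. `address_problems` is a hypothetical helper.

```python
import re

REQUIRED_ADDRESS_PROPERTIES = (
    "streetAddress",
    "postalCode",
    "addressLocality",
    "addressRegion",
    "addressCountry",
)

# Shape of an ISO 3166-1 alpha-2 code: exactly two uppercase ASCII letters.
ALPHA2 = re.compile(r"[A-Z]{2}")

def address_problems(address):
    """Return a list of human-readable problems found in a PostalAddress object."""
    problems = [
        f"missing {prop}"
        for prop in REQUIRED_ADDRESS_PROPERTIES
        if not address.get(prop)
    ]
    country = address.get("addressCountry")
    if isinstance(country, str) and not ALPHA2.fullmatch(country):
        problems.append("addressCountry is not a two-letter country code")
    return problems

address = {
    "@type": "PostalAddress",
    "streetAddress": "Street 123",
    "postalCode": "1234 AB",
    "addressLocality": "City",
    "addressRegion": "Noord-Holland",
    "addressCountry": "NL",
}
print(address_problems(address))
```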
4.5.2. geo (required)
A property that indicates the Place’s geo coordinates, REQUIRED if known.
{ "@context" : "https://schema.org/" , "@id" : "https://example.com/dataset/place" , "@type" : "Place" , "geo" : { "@type" : "GeoCoordinates" , "latitude" : "37.42242" , "longitude" : "-122.08585" } }
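Coordinates can be sanity-checked by range. The sketch below accepts both numbers and numeric strings, since the example above writes the values as strings; it assumes decimal degrees with latitude between -90 and 90 and longitude between -180 and 180. The name `valid_coordinates` is illustrative.

```python
def valid_coordinates(geo):
    """True if latitude and longitude are present, numeric and within range."""
    try:
        latitude = float(geo["latitude"])
        longitude = float(geo["longitude"])
    except (KeyError, TypeError, ValueError):
        return False
    # Comparisons with NaN are always False, so NaN values are rejected here too.
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0

geo = {"@type": "GeoCoordinates", "latitude": "37.42242", "longitude": "-122.08585"}
print(valid_coordinates(geo))
```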
5. Example
{ "@context" : "https://schema.org/" , "@id" : "https://literatuurmuseum.nl/id/123456789" , "@type" : "CreativeWork" , "name" : { "@language" : "nl" , "@value" : "Het fluitketeltje en andere versjes" } , "creator" : { "@type" : "Person" , "@id" : "http://data.rkd.nl/artists/8342" }, "material" : { "@id" : "https://data.cultureelerfgoed.nl/term/id/cht/2d28d9aa-77e8-40ab-b0fe-f04d99f57955" }, "dateCreated" : "1950" }
6. Formal definition
This SHACL file does not yet reflect all changes in the text above.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <https://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

_:CreativeWorkShape a sh:NodeShape ;
    sh:targetClass schema:CreativeWork ;
    sh:property _:NameProperty, _:DescriptionProperty, _:CreatorProperty .

_:NameProperty a sh:PropertyShape ;
    sh:path schema:name ;
    sh:datatype rdf:langString ;
    sh:minCount 1 .

_:DescriptionProperty a sh:PropertyShape ;
    sh:path schema:description ;
    sh:datatype rdf:langString ;
    sh:minCount 1 .

_:ImageProperty a sh:PropertyShape ;
    sh:path schema:image ;
    sh:class schema:ImageObject ;
    sh:minCount 0 .

# Creators are resources rather than literals, hence sh:class.
_:CreatorProperty a sh:PropertyShape ;
    sh:path schema:creator ;
    sh:or ( [ sh:class schema:Person ] [ sh:class schema:Organization ] ) ;
    sh:minCount 1 .

_:GeoCoordinatesShape a sh:NodeShape ;
    sh:targetClass schema:GeoCoordinates ;
    sh:property [
        sh:path schema:latitude ;
        sh:datatype xsd:float ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] , [
        sh:path schema:longitude ;
        sh:datatype xsd:float ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] .

_:PlaceShape a sh:NodeShape ;
    sh:targetClass schema:Place ;
    sh:property [
        sh:path schema:geo ;
        sh:or ( [ sh:class schema:GeoCoordinates ] [ sh:class schema:GeoShape ] ) ;
        sh:minCount 0 ;
        sh:maxCount 1 ;
    ] .