Linked Data Technologies for Cultural Heritage and Digital Humanities:
Introducing the Semantic Web in Video Lectures

Eero Hyvönen, Aalto University and University of Helsinki
Semantic Computing Research Group (SeCo), Helsinki Centre for Digital Humanities (HELDIG)

Introduction: Paradigm Change in Publishing and Using Cultural Heritage Data

A paradigm shift is taking place in publishing and using Cultural Heritage (CH) content on the Web: in addition to publishing materials for humans to read as usual, the underlying data is also published as Linked Open Data (LOD) following the FAIR principles. This facilitates

  1. publishing ever larger aggregated datasets,
  2. enriching the Big Data with other publishers' data and by reasoning,
  3. development of more intelligent interfaces and applications for the end users, and
  4. using the data in Digital Humanities (DH) research for data analyses.

As a result, CH and DH have grown into an important application field of Semantic Web (SW) technologies.

Instead of a textual tutorial, this site provides an introduction based on videos, which hopefully lowers the barrier to getting to grips with the opportunities and practical technical ideas underlying the paradigm change. The learning materials published in this online course are targeted at researchers, collection managers and curators in memory organizations, students, and application developers interested in the opportunities and challenges related to creating, aggregating, publishing, and using CH LOD on the SW.

The course first presents the basic ideas underlying the Semantic Web as a megatrend on the World Wide Web, which is evolving into a Web of Data (Lecture 1). After this, the fundamental idea of representing data as semantic networks using the RDF model and language is presented (Lecture 2). Publishing LOD on the Semantic Web and using it by querying is the topic of Lecture 3. At the heart of the Semantic Web lie ontologies, the "silver bullet" of the Semantic Web. They define the concepts and data models on which the Web of Data is built (Lectures 4 and 5). Another key idea on the SW is using logical rules and reasoning for enriching the data (Lecture 6). From a practical point of view, the SW needs an ontology and data infrastructure, so that data can be created in interoperable and reusable forms in a cost-efficient way (Lecture 7). Finally, practical applications of the SW in creating and using LOD services and semantic portals are discussed using work in Finland since 2001 as a case study. This includes issues on building a national Semantic Web infrastructure that is interoperable with infrastructure developments on the global WWW based on the Semantic Web standards of the World Wide Web Consortium (W3C) (Lecture 8).

The learning materials are published under the open Creative Commons CC BY 4.0 license; you can refer to the course as Eero Hyvönen: Linked Data Technologies for Cultural Heritage and Digital Humanities: Introducing the Semantic Web in Video Lectures [Online course materials] https://seco.cs.aalto.fi//teaching/sw-introduction/


Lecture 1: Introduction: From WWW of today to the Semantic Web

This lecture first gives an overview of the "traditional" view of the WWW and its technologies in use today:

PDF Slides

Based on the challenges identified in the traditional WWW, a motivating introduction to why the SW is needed is then presented:

PDF Slides


Lecture 2: Resource Description Framework: the Foundation of SW

The Resource Description Framework, including the RDF and RDF Schema specifications standardized by the W3C, forms the backbone for representing data and knowledge on the Semantic Web, the Web of Data.

RDF Resource Description Framework

The video below describes the RDF data model and language:

PDF Slides
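
As a rough illustration of the data model, an RDF graph can be thought of as a set of (subject, predicate, object) triples whose shared URIs link the statements into a semantic network. The sketch below uses only the Python standard library; the example.org URIs, the sample data, and the helper names literal and match are assumptions made for illustration and are not taken from the lecture.

```python
# A tiny in-memory model of RDF triples (subject, predicate, object).
# All example.org URIs and the sample data below are hypothetical.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"


def literal(value):
    """Wrap a literal value so it can be told apart from a URI string."""
    return ("literal", value)


triples = {
    (EX + "painting1", RDF_TYPE, EX + "Painting"),
    (EX + "painting1", EX + "creator", EX + "artist1"),
    (EX + "painting1", EX + "title", literal("Landscape")),
    (EX + "artist1", RDF_TYPE, EX + "Person"),
    (EX + "artist1", EX + "name", literal("Example Artist")),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Follow two edges of the network: painting -> creator -> name.
creators = [o for (_, _, o) in match(triples, s=EX + "painting1", p=EX + "creator")]
names = [o[1] for c in creators for (_, _, o) in match(triples, s=c, p=EX + "name")]
print(names)
```

Because the object of one triple can be the subject of another, following such chains is what turns individual statements into a graph that can be traversed and queried.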

RDF Schema: Extending RDF with Hierarchies and Constraints

The RDF Schema recommendation (standard) extends RDF with class and property hierarchies and property constraints for data modeling:

PDF Slides
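
The effect of class hierarchies and property constraints can be sketched with a few lines of Python. The example below applies three well-known RDFS entailment patterns (subclass transitivity, type inheritance, and typing via rdfs:domain) until nothing new can be derived. It is a simplified sketch, not a complete RDFS reasoner; the ex: names and the sample data are hypothetical, and prefixed names are written as plain strings for readability.

```python
# Simplified RDFS-style inference over triples stored as Python tuples.
# The ex: vocabulary and the sample data are hypothetical.

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"


def rdfs_closure(graph):
    """Apply three RDFS entailment patterns until no new triples appear."""
    inferred = set(graph)
    while True:
        new = set()
        for (s, p, o) in inferred:
            if p == SUBCLASS:
                # Transitivity: A subClassOf B, B subClassOf C => A subClassOf C
                new |= {(s, SUBCLASS, c) for (b, q, c) in inferred
                        if q == SUBCLASS and b == o}
                # Inheritance: x type A, A subClassOf B => x type B
                new |= {(x, RDF_TYPE, o) for (x, q, a) in inferred
                        if q == RDF_TYPE and a == s}
            elif p == DOMAIN:
                # Domain: P domain C, x P y => x type C
                new |= {(x, RDF_TYPE, o) for (x, q, _) in inferred if q == s}
        if new <= inferred:
            return inferred
        inferred |= new


schema = {
    ("ex:Painting", SUBCLASS, "ex:Artwork"),
    ("ex:Artwork", SUBCLASS, "ex:CulturalObject"),
    ("ex:paintedBy", DOMAIN, "ex:Painting"),
}
data = {("ex:item1", "ex:paintedBy", "ex:artist1")}

closure = rdfs_closure(schema | data)
item_types = sorted(o for (s, p, o) in closure if s == "ex:item1" and p == RDF_TYPE)
print(item_types)  # ['ex:Artwork', 'ex:CulturalObject', 'ex:Painting']
```

Note that in RDFS semantics a domain statement is used to infer the type of a resource rather than to reject data, which is why ex:item1 becomes a painting although its type was never stated.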

Lecture 3: Linked Data Publishing and SPARQL Query Language

This lecture explains how RDF-based Linked Data can be published on the Web and used via APIs, especially by using the SPARQL query language on top of SPARQL endpoints.

Linked Data Publishing Principles

The first video introduces the principles of the W3C that are used when publishing linked data:

PDF Slides

After this, it is shown how linked data can be published by embedding data in web pages:

Linked Data Publishing in HTML pages

PDF Slides

Linked Data Publishing as Datasets

Linked data is typically harvested and published using online linked data services that can be used for accessing the data:

PDF Slides

SPARQL Query Language

The most important way of using linked data services is querying with the SPARQL query language, standardized by the W3C, which is the topic of the next lecture:

PDF Slides
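
To give a flavor of how a SPARQL endpoint is used from a program, the sketch below builds an HTTP GET request in the style of the SPARQL 1.1 Protocol, where the query text is passed in the query parameter. The endpoint address https://example.org/sparql is a placeholder and the query itself is only an illustrative assumption; the function run_query is defined but not called here, since it needs network access and a real endpoint.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request, urlopen

# Hypothetical endpoint address; replace it with a real SPARQL endpoint.
ENDPOINT = "https://example.org/sparql"

# Illustrative query: list ten concepts with their preferred labels.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label
WHERE {
  ?concept skos:prefLabel ?label .
}
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Encode a query as an HTTP GET URL with a 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


def run_query(endpoint, query):
    """Send the query and return the JSON result bindings (needs network access)."""
    request = Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

With a real endpoint, run_query(ENDPOINT, QUERY) would return a list of bindings, one dictionary per result row, keyed by the variable names used in the SELECT clause.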


Lecture 4: Ontologies and Knowledge Organization Systems

Introduction to Knowledge Organization Systems (KOS)

Knowledge Organization Systems (KOS) have different names in different disciplines; they are called classifications, thesauri, controlled vocabularies, and, in the context of the Semantic Web, "ontologies". Ontologies are used for defining and representing concepts on the Semantic Web for both human users and machines. Ontologies are a key element in linked data for representing data and knowledge and in making the data interoperable. This lecture first introduces the underlying ideas of Knowledge Organization Systems (KOS) developed in different disciplines:

PDF Slides

Simple Knowledge Organization System SKOS

After this, the Semantic Web standard SKOS (Simple Knowledge Organization System) is presented. It is used for representing KOSs in RDF form. SKOS is often applied when transforming existing KOSs, such as classifications and thesauri, into a form that is suitable for linked data applications.

PDF Slides

An example of a massive international ontology library service is BioPortal, which focuses on biomedical ontologies. It has been created at Stanford University in collaboration with various other organizations. As another example, in Finland many SKOS-based ontologies are provided openly by the National Library in the Finto.fi service, whose prototype ONKI.fi was developed at Aalto University. An ontology service can be used via APIs (e.g., a SPARQL endpoint) in legacy systems for content creation, for example in museums for cataloging objects in collection databases. In this way, the URIs of the concepts used in metadata descriptions can be found and imported easily, and the metadata can be created in an interoperable way and be interlinked easily with other datasets that use the same ontology infrastructure.
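
The idea of finding concept URIs for cataloging can be sketched as a label lookup over SKOS data. The example below keeps a small vocabulary in memory and matches a search term against skos:prefLabel and skos:altLabel values; a real ontology service would offer this through its API instead. The concept URIs, the labels, and the function name find_concepts are hypothetical.

```python
# Label lookup over a tiny SKOS-like vocabulary held in memory.
# Concept URIs and labels are hypothetical examples.

SKOS = "http://www.w3.org/2004/02/skos/core#"
PREF_LABEL = SKOS + "prefLabel"
ALT_LABEL = SKOS + "altLabel"
BROADER = SKOS + "broader"

# Labels are stored as (text, language tag) pairs.
vocabulary = {
    ("http://example.org/kos/c1", PREF_LABEL, ("objects", "en")),
    ("http://example.org/kos/c2", PREF_LABEL, ("ceramics", "en")),
    ("http://example.org/kos/c2", ALT_LABEL, ("pottery", "en")),
    ("http://example.org/kos/c2", BROADER, "http://example.org/kos/c1"),
}


def find_concepts(graph, label, lang="en"):
    """Return the URIs of concepts whose preferred or alternative label matches."""
    wanted = (label.lower(), lang)
    return sorted({
        s for (s, p, o) in graph
        if p in (PREF_LABEL, ALT_LABEL)
        and isinstance(o, tuple)
        and (o[0].lower(), o[1]) == wanted
    })


# A catalog record stores the concept URI instead of a free-text keyword.
record = {"title": "Glazed bowl", "subject": find_concepts(vocabulary, "Pottery")}
print(record["subject"])  # ['http://example.org/kos/c2']
```

Storing the URI rather than the typed keyword is what makes records from different collections comparable: two catalogers who type "pottery" and "ceramics" end up pointing at the same concept.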


Lecture 5: Web Ontology Language OWL

This lecture presents the logic-based Web Ontology Language OWL of W3C for developing Semantic Web ontologies.

OWL Web Ontology Language: Background and Context

PDF Slides

OWL Web Ontology Language: Introduction

PDF Slides

Ontology Engineering

PDF Slides

A series of hands-on videos by Noureddin Sadawi on using the Protégé editor for developing ontologies can be watched on YouTube:

https://www.youtube.com/watch?v=R9ERlUgvgwM


Lecture 6: Logical Rules and Inference

The semantics of the Semantic Web is based on logic that is agnostic to the application domain. The idea of logic as a basis for human reasoning and Artificial Intelligence is not bound to particular natural languages either. Logic is therefore a suitable semantic basis for web contents that cover all aspects of human knowledge expressed using a multitude of natural languages.

From a practical point of view, logic makes it possible to enrich knowledge graphs on the Semantic Web by rule-based reasoning. This is useful, e.g., in creating intelligent search systems and recommender systems for exploring and browsing data, and in finding and solving problems. Logic can also be used for explaining the solutions.
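
Enriching a knowledge graph with rules can be illustrated by a minimal forward-chaining sketch: each rule has a list of premise patterns and one conclusion pattern, and the rules are applied repeatedly until no new triples are produced. This is a naive illustration under simple assumptions (all terms are strings, variables start with "?"), not an implementation of any particular Semantic Web rule language; the ex: names, the rule, and the facts are hypothetical sample data.

```python
# Naive forward chaining over triples; variables are strings starting with "?".
# The rule and the facts are hypothetical sample data.


def unify(pattern, triple, bindings):
    """Extend bindings so that the pattern equals the triple, or return None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def match_all(premises, facts, bindings):
    """Yield every variable binding that satisfies all premises."""
    if not premises:
        yield bindings
        return
    for fact in facts:
        extended = unify(premises[0], fact, bindings)
        if extended is not None:
            yield from match_all(premises[1:], facts, extended)


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new facts can be derived."""
    facts = set(facts)
    while True:
        derived = set()
        for premises, conclusion in rules:
            for b in match_all(premises, facts, {}):
                derived.add(tuple(b.get(term, term) for term in conclusion))
        if derived <= facts:
            return facts
        facts |= derived


# Rule: a person born in a place is also born in any region containing it.
rules = [
    ([("?p", "ex:bornIn", "?place"), ("?place", "ex:partOf", "?region")],
     ("?p", "ex:bornIn", "?region")),
]
facts = {
    ("ex:person1", "ex:bornIn", "ex:Helsinki"),
    ("ex:Helsinki", "ex:partOf", "ex:Uusimaa"),
    ("ex:Uusimaa", "ex:partOf", "ex:Finland"),
}

enriched = forward_chain(facts, rules)
places = sorted(o for (s, p, o) in enriched if s == "ex:person1" and p == "ex:bornIn")
print(places)  # ['ex:Finland', 'ex:Helsinki', 'ex:Uusimaa']
```

A search for people born in ex:Finland now finds ex:person1 although that fact was never entered, which is the kind of enrichment that intelligent search and recommendation can build on.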

Logic and Inference Rules on the Semantic Web

The following video gives an introduction to logic and inference rules on the Semantic Web:

PDF Slides


Lecture 7: Infrastructures for Developing Applications

This lecture discusses infrastructures available for developing Semantic Web applications. The lecture consists of two introductory video presentations:

Semantic Web Infrastructures

PDF Slides

Applications of Semantic Web

PDF Slides

How to Build a National Semantic Web Infrastructure

The keynote presentation video of the DCMI 2021 conference below gives an overview of work in Finland on developing a national Semantic Web infrastructure and its applications:

PDF Slides, paper

Lecture 8: Case Studies: Sampo Model and Portal Series

This lecture discusses applications by presenting case studies. The lecture starts with an introductory video presentation of the Sampo systems:

After this, have a look at the videos below presenting some Sampo systems.

WarSampo - World War II on the Semantic Web

BiographySampo - Artificial Intelligence Reading Biographies for the Semantic Web

AcademySampo - Finnish Academic People 1640-1899

LetterSampo - Historical Letters on the Semantic Web

ParliamentSampo - Parliament of Finland on the Semantic Web

More Information

More videos and material related to semantic portals in use, created at Aalto University and the University of Helsinki together with several Finnish and international collaborators, can be found here: Sampo Model and Series of Semantic Portals