Eero Hyvönen, Aalto University and University of Helsinki Semantic Computing Research Group (SeCo), Helsinki Centre for Digital Humanities (HELDIG)
A paradigm shift [1] is taking place in publishing and using Cultural Heritage (CH) content on the Web: in addition to publishing materials for humans to read as usual, the underlying data is also published as Linked Open Data (LOD) following the FAIR principles. This facilitates
As a result, CH and Digital Humanities (DH) have grown into an important application field of Semantic Web (SW) technologies.
Instead of a textual tutorial, this site provides an introduction based on videos, which hopefully lowers the barrier to getting to grips with the opportunities and practical technical ideas underlying the paradigm change. The learning materials published in this online course are targeted at researchers, collection managers and curators in memory organizations, students, and application developers interested in the opportunities and challenges related to creating, aggregating, publishing, and using CH LOD on the SW.
The course first presents the basic ideas underlying the Semantic Web as a megatrend on the World Wide Web evolving into a Web of Data (Lecture 1). After this, the fundamental idea of representing data as semantic networks using the RDF model and language is presented (Lecture 2). Publishing LOD on the Semantic Web and using it by querying is the topic of the next lecture (Lecture 3). At the heart of the Semantic Web lie ontologies, the "silver bullet" of the Semantic Web. They define the concepts and data models on which the Web of Data is built (Lecture 4). Lecture 5 gives an introduction to the most developed W3C language recommendation for developing ontologies: the OWL Web Ontology Language. Another key idea on the SW is using logical rules and reasoning for enriching the data (Lecture 6). From a practical point of view, the SW needs an ontology and data infrastructure, so that data can be created in interoperable and reusable forms in a cost-efficient way (Lecture 7). Finally, practical applications of the SW in creating and using LOD services and semantic portals are discussed using work in Finland since 2001 as a case study. This includes issues on building a national Semantic Web infrastructure that is interoperable with infrastructure developments on the global WWW based on the Semantic Web standards of the World Wide Web Consortium (W3C) (Lecture 7). As use cases, a series of some twenty LOD services and semantic portals for Cultural Heritage and Digital Humanities are introduced (Lecture 8).
More detailed information about this line of research can be found in the overview papers [2] and [3] listed at the end of this web page. Some 500 more focused publications related to this line of research are available on the Semantic Computing Research Group's publications page.
The learning materials of this video lecture course are published under the open Creative Commons CC BY 4.0 license. You can refer to the course as: Eero Hyvönen: Linked Data Technologies for Cultural Heritage and Digital Humanities: Introducing the Semantic Web in Video Lectures [Online course materials] https://seco.cs.aalto.fi//teaching/sw-introduction/
This lecture first gives an overview of the "traditional" view of the WWW and its technologies in use today:
PDF Slides
Based on the challenges identified in the traditional WWW, a motivating introduction to why the SW is needed is then presented:
The Resource Description Framework (RDF), including the RDF and RDF Schema specifications standardized by the W3C, forms the backbone for representing data and knowledge on the Semantic Web, the Web of Data.
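To make the data model concrete, here is a minimal Python sketch of an RDF graph as a set of subject-predicate-object triples. It deliberately uses only the standard library instead of an RDF toolkit. The `http://example.org/` URIs and the helper function `objects` are made up for this illustration; only the `rdf:type` URI is the one defined by the W3C.

```python
# A minimal sketch: an RDF graph modeled as a set of (subject, predicate, object) triples.
# All http://example.org/ URIs are hypothetical; only RDF_TYPE is a real W3C URI.
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

graph = {
    (EX + "MonaLisa", RDF_TYPE, EX + "Painting"),
    (EX + "MonaLisa", EX + "creator", EX + "LeonardoDaVinci"),
    (EX + "MonaLisa", EX + "title", '"Mona Lisa"@en'),  # a literal value, not a URI
    (EX + "LeonardoDaVinci", RDF_TYPE, EX + "Person"),
}


def objects(graph, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


creators = objects(graph, EX + "MonaLisa", EX + "creator")
print(creators)
```

Because the object of one triple can be the subject of another, the triples link up into a graph, which is what makes the semantic network view of the data possible.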
The video below describes the RDF data model and language:
PDF Slides
The RDF Schema recommendation (standard) extends RDF with means for representing class and property hierarchies and property constraints for data modeling:
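As a rough illustration of what a class hierarchy adds, the following Python sketch derives the types of a resource that follow from `rdfs:subClassOf` statements. It is a simplified stand-in for an RDFS reasoner, not a full implementation of the RDFS entailment rules. The `ex:` names and the functions `superclasses` and `types_of` are hypothetical; `rdfs:subClassOf` and `rdf:type` are real W3C terms, written here as prefixed strings for brevity.

```python
# Sketch of RDFS-style class hierarchy reasoning in plain Python.
# The ex: names are hypothetical; rdfs:subClassOf and rdf:type are real W3C terms.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    ("ex:Painting", SUBCLASS_OF, "ex:Artwork"),
    ("ex:Artwork", SUBCLASS_OF, "ex:CulturalObject"),
    ("ex:MonaLisa", TYPE, "ex:Painting"),
}


def superclasses(triples, cls):
    """Collect every direct or indirect superclass of cls."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def types_of(triples, resource):
    """Return asserted types plus the types entailed by rdfs:subClassOf."""
    asserted = {o for s, p, o in triples if s == resource and p == TYPE}
    inferred = set(asserted)
    for cls in asserted:
        inferred |= superclasses(triples, cls)
    return inferred


mona_lisa_types = types_of(triples, "ex:MonaLisa")
print(sorted(mona_lisa_types))
```

Only one type is asserted for `ex:MonaLisa`, yet a search for instances of `ex:Artwork` or `ex:CulturalObject` would also find it once the hierarchy is taken into account.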
PDF Slides
This lecture explains how RDF-based Linked Data can be published on the Web and used via APIs, especially by using the SPARQL query language on top of SPARQL endpoints.
The first video introduces the W3C principles that are used when publishing linked data:
After this, it is shown how linked data can be published by embedding data in web pages:
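One common way of embedding data in a web page is a JSON-LD block inside a `script` element; the video may also cover other syntaxes, such as RDFa. The Python sketch below generates such a block with the standard `json` module. The description of the painting is made-up example data using schema.org terms, and the function `embed_json_ld` is a hypothetical helper written for this illustration.

```python
import json

# Sketch: serialize a metadata description as JSON-LD and wrap it in the
# <script> element used for embedding structured data in an HTML page.
# The painting description is made-up example data using schema.org terms.
description = {
    "@context": "https://schema.org",
    "@type": "Painting",
    "name": "Mona Lisa",
    "creator": {"@type": "Person", "name": "Leonardo da Vinci"},
}


def embed_json_ld(data):
    """Return an HTML script element carrying the data as JSON-LD."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


html_snippet = embed_json_ld(description)
print(html_snippet)
```

The resulting element can be placed in the page markup, where human readers do not see it but crawlers and other programs can extract the data.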
Linked data is typically harvested and published using online linked data services that can be used for accessing the data:
The most important way of using linked data services is querying with the SPARQL query language, standardized by the W3C, the topic of the next lecture:
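As a small illustration of how a SPARQL endpoint is used over HTTP, the Python sketch below builds the URL of a GET request in which the query text is passed in the `query` parameter, as specified in the W3C SPARQL Protocol. The endpoint address `http://example.org/sparql` and the function `build_query_url` are assumptions made for this example; a real service publishes its own endpoint URL. The request is only constructed, not sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint URL, used only for illustration.
ENDPOINT = "http://example.org/sparql"

# A SPARQL SELECT query listing up to ten resources with their labels.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource ?label
WHERE { ?resource rdfs:label ?label . }
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Build a SPARQL Protocol GET request URL with the query as a parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

To actually run the query, the URL could be fetched with, for example, `urllib.request.urlopen`, typically with an `Accept: application/sparql-results+json` header so that the result bindings come back as JSON.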
Knowledge Organization Systems (KOS) have different names in different disciplines; they are called classifications, thesauri, controlled vocabularies, and "ontologies" in the context of the Semantic Web. Ontologies are used for defining and representing concepts on the Semantic Web for both human users and machines. Ontologies are a key element in linked data for representing data and knowledge and for making the data interoperable. This lecture first introduces the underlying ideas of Knowledge Organization Systems developed in different disciplines:
After this, the Semantic Web standard SKOS (Simple Knowledge Organization System) is presented. It is used for representing KOSs in RDF form. SKOS is often applied when transforming existing KOSs, such as classifications and thesauri, into a form that is suitable for linked data applications.
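The following Python sketch suggests what a small thesaurus looks like once its terms and hierarchy are expressed with SKOS properties. The concept identifiers (`ex:...`), the labels, and the function `broader_chain` are hypothetical; `skos:prefLabel` and `skos:broader` are real SKOS properties, written as prefixed strings for brevity. As a simplification, the function follows only one broader concept at each step, although SKOS allows a concept to have several.

```python
# Sketch: a tiny thesaurus expressed with SKOS properties, as plain triples.
# Concept identifiers (ex:...) and labels are hypothetical; skos:prefLabel and
# skos:broader are real SKOS properties.
thesaurus = {
    ("ex:vehicles", "skos:prefLabel", "vehicles"),
    ("ex:boats", "skos:prefLabel", "boats"),
    ("ex:rowingBoats", "skos:prefLabel", "rowing boats"),
    ("ex:boats", "skos:broader", "ex:vehicles"),
    ("ex:rowingBoats", "skos:broader", "ex:boats"),
}


def broader_chain(triples, concept):
    """Follow skos:broader links upward from a concept, nearest first."""
    chain = []
    current = concept
    while True:
        parents = sorted(
            o for s, p, o in triples if s == current and p == "skos:broader"
        )
        # Stop at a top concept, or if the links loop back to a visited concept.
        if not parents or parents[0] in chain:
            return chain
        current = parents[0]
        chain.append(current)


chain = broader_chain(thesaurus, "ex:rowingBoats")
print(chain)
```

A search application could use such a chain for query expansion, so that an object cataloged with the concept "rowing boats" is also found when searching for "boats" or "vehicles".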
An example of a massive international ontology library service is BioPortal, which focuses on biomedical ontologies. It has been created at Stanford University in collaboration with various other organizations. As another example, in Finland many SKOS-based ontologies are provided openly by the National Library at the Finto.fi service, whose prototype ONKI.fi was developed at Aalto University. An ontology service can be used via APIs (e.g., a SPARQL endpoint) in legacy systems for content creation, for example in museums for cataloging objects in collection databases. In this way, the URIs of the concepts used in metadata descriptions can be found and imported easily, and the metadata can be created in an interoperable way and be interlinked easily with other datasets that use the same ontology infrastructure.
This lecture presents the logic-based Web Ontology Language OWL of W3C for developing Semantic Web ontologies.
A series of hands-on videos by Noureddin Sadawi on using the Protégé editor for developing ontologies can be watched on YouTube:
https://www.youtube.com/watch?v=R9ERlUgvgwM
The semantics of the Semantic Web is based on application domain-agnostic logic. The idea of logic as a basis for human reasoning and Artificial Intelligence is not bound to particular natural languages either. Logic is therefore a suitable semantic basis for web contents that cover all aspects of human knowledge expressed using a multitude of natural languages.
From a practical point of view, logic makes it possible to enrich knowledge graphs on the Semantic Web by rule-based reasoning. This is useful in, e.g., creating intelligent search systems, recommender systems for exploring and browsing data, and in finding and solving problems. Logic can also be used for explaining the solutions.
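The idea of enriching a knowledge graph with rules can be sketched in a few lines of Python. The example applies two rules repeatedly until no new triples appear (forward chaining to a fixpoint). The `ex:` property names, the two rules, and the function `apply_rules` are illustrative assumptions, not rules taken from the lecture or from any particular rule language.

```python
# Sketch of forward-chaining rule application over a small knowledge graph.
# All ex: names and the rules themselves are hypothetical illustrations.
facts = {
    ("ex:Helsinki", "ex:partOf", "ex:Uusimaa"),
    ("ex:Uusimaa", "ex:partOf", "ex:Finland"),
    ("ex:Ateneum", "ex:locatedIn", "ex:Helsinki"),
}


def apply_rules(facts):
    """Apply two rules until no new facts appear (a fixpoint)."""
    inferred = set(facts)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if o != s2 or p2 != "ex:partOf":
                    continue
                # Rule 1: partOf is transitive.
                if p == "ex:partOf":
                    new.add((s, "ex:partOf", o2))
                # Rule 2: located in X, and X part of Y, implies located in Y.
                if p == "ex:locatedIn":
                    new.add((s, "ex:locatedIn", o2))
        if new <= inferred:
            return inferred
        inferred |= new


enriched = apply_rules(facts)
print(sorted(enriched - facts))
```

The original data never states that `ex:Ateneum` is located in `ex:Finland`, but the enriched graph does, so a search by country can find the museum. Recording which rule and which premises produced each new triple would also give a basis for explaining the result.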
The following video gives an introduction to logic and inference rules on the Semantic Web:
This lecture discusses infrastructures available for developing Semantic Web applications. The lecture consists of two introductory video presentations:
The keynote presentation video of the DCMI 2021 conference below gives an overview of work in Finland on developing a national Semantic Web infrastructure and its applications:
PDF Slides, paper
This lecture discusses applications by presenting case studies. The lecture starts with an introductory video presentation of the Sampo systems:
After this, you can have a look at the videos below, which present some Sampo systems.
More videos and material related to the semantic portals in use, created at Aalto University and the University of Helsinki together with several Finnish and international collaborators, can be found here: Sampo Model and Series of Semantic Portals