An instance of the VideoSampo Framework
WarMemoirSampo is a Linked Open Data (LOD) resource of Finnish Second World War (WW2) veteran interview videos, as well as a semantic portal for easy access to them. The system is being realized using the Sampo model and by enriching the videos with related information from the WarSampo knowledge graph. WarMemoirSampo hosts a collection of video interviews in which Finnish WW2 veterans mostly reminisce about their lives during and after the war. Rough transcriptions of the interviews have been provided; they form the basis of the textual information presented in the portal and of the metadata extracted from it.
A key technical challenge addressed in this work is how to search and access different temporal points in long videos, based on their time-stamped transcriptions. To achieve this, we created an RDF graph featuring data about the interviews as well as the interviewees. The main building blocks of the graph are coarse summary notes written by the interviewers, together with their corresponding timestamps. However, the timestamps lack precision, since the same timestamp may be repeated over multiple notes. This led us to our basic unit of data for the interviews: a group of notes that share the same timestamp, corresponding to a given stretch of the interview. These stretches vary in length but are typically several minutes long. The textual contents are enriched semantically using NLP techniques and knowledge extraction, resulting in new metadata about mentioned named entities (e.g., people, places, and events) and keywords generated via a pre-trained subject indexing tool.
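The grouping of notes by shared timestamp can be illustrated with a minimal sketch. It is not the project's actual pipeline: the `Segment` class, the `HH:MM:SS` timestamp format, and the sample notes are all assumptions made for illustration. The idea is that notes carrying the same timestamp are merged into one unit, and each unit ends where the next one starts.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import groupby
from typing import Optional


@dataclass
class Segment:
    """Hypothetical basic unit: all notes that share one timestamp."""
    start: int            # seconds from the beginning of the video
    end: Optional[int]    # start of the next segment; None for the last one
    notes: list[str]


def parse_timestamp(ts: str) -> int:
    """Convert 'HH:MM:SS' or 'MM:SS' into seconds (assumed format)."""
    seconds = 0
    for part in ts.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def group_notes(notes: list[tuple[str, str]]) -> list[Segment]:
    """Merge notes with the same timestamp into one segment each."""
    # sorted() is stable, so notes keep their original order within a group.
    ordered = sorted(notes, key=lambda n: parse_timestamp(n[0]))
    segments = [
        Segment(start, None, [text for _, text in group])
        for start, group in groupby(ordered, key=lambda n: parse_timestamp(n[0]))
    ]
    # A segment lasts until the next distinct timestamp.
    for current, following in zip(segments, segments[1:]):
        current.end = following.start
    return segments


# Invented sample notes as (timestamp, note text) pairs.
sample_notes = [
    ("00:00:00", "Childhood on the family farm"),
    ("00:00:00", "Schooling"),
    ("00:04:30", "Call-up and training"),
    ("00:04:30", "First posting"),
    ("00:11:15", "Life after the war"),
]
segments = group_notes(sample_notes)
for seg in segments:
    print(seg.start, seg.end, seg.notes)
```

Because the end of a segment is only inferred from the next timestamp, the last segment has no known end; a real system would presumably fall back to the video duration.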
The WarMemoirSampo portal is implemented using the Sampo-UI framework, which enables faceted search, exploration and analysis of the interviews. It is possible to identify interviews from specific interviewees and/or based on mentions of places, persons or subject matters (keywords) of interest. The results of the search take the user to the relevant parts of the video interviews, so that the veterans can be heard in their own voices. A semantic recommender system provides the user with links to related interview snippets present in the database, as well as additional information in WarSampo.
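The way a faceted query can lead directly to the relevant part of a video can be sketched in a few lines. This is an in-memory illustration only and does not reflect how Sampo-UI is implemented; the `Snippet` class, the facet names, the sample data, and the `example.org` URLs are hypothetical. The playback link uses the temporal `#t=` syntax of W3C Media Fragments URIs, which is an assumption about how a player might be addressed.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class Snippet:
    """Hypothetical interview segment with extracted metadata."""
    interviewee: str
    video_url: str
    start: int                                   # offset in seconds
    places: set[str] = field(default_factory=set)
    keywords: set[str] = field(default_factory=set)


def facet_search(
    snippets: Iterable[Snippet],
    interviewee: Optional[str] = None,
    places: Iterable[str] = (),
    keywords: Iterable[str] = (),
) -> list[Snippet]:
    """Return the snippets that match every selected facet value."""
    wanted_places = set(places)
    wanted_keywords = set(keywords)
    hits = []
    for snippet in snippets:
        if interviewee is not None and snippet.interviewee != interviewee:
            continue
        if not wanted_places <= snippet.places:      # all places must be mentioned
            continue
        if not wanted_keywords <= snippet.keywords:  # all keywords must be present
            continue
        hits.append(snippet)
    return hits


def playback_link(snippet: Snippet) -> str:
    """Build a link that starts playback at the snippet's offset."""
    return f"{snippet.video_url}#t={snippet.start}"


# Invented sample data.
snippets = [
    Snippet("Veteran A", "https://example.org/videos/a.mp4", 0,
            {"Helsinki"}, {"childhood"}),
    Snippet("Veteran A", "https://example.org/videos/a.mp4", 270,
            {"Viipuri"}, {"training", "evacuation"}),
    Snippet("Veteran B", "https://example.org/videos/b.mp4", 675,
            {"Viipuri", "Helsinki"}, {"evacuation"}),
]

for hit in facet_search(snippets, places=["Viipuri"], keywords=["evacuation"]):
    print(hit.interviewee, playback_link(hit))
```

Selecting more facet values narrows the result set, since a snippet must satisfy every selection; a portal backed by a knowledge graph would typically express the same constraints as a query against the graph rather than as a loop.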
More features are planned for the future: named entities will be linked to the relevant resources in WarSampo and other Sampo portals, contributing to the growing web of Finnish linked data. An event detection tool is to be developed which extracts event information using times and places mentioned in the interviews. Moreover, when Finnish speech-to-text technology advances to the point that everyday dialectal speech can be automatically and reliably transcribed, the same tools could be used on the transcriptions, resulting in richer and more accurate metadata.
Video about WarMemoirSampo
The WarMemoirSampo portal was published on December 3, 2021, at the National Archives of Finland, and is in use at:
More information is available at the Finnish homepage.