An instance of the VideoSampo Framework
WarMemoirSampo is a Linked Open Data (LOD) resource of Finnish Second World War (WW2) veteran interview videos, as well as a semantic portal for easy access to them. The system is being realized using the Sampo model and by enriching the videos with related information from the WarSampo knowledge graph. WarMemoirSampo hosts a collection of video interviews of Finnish WW2 veterans, mostly reminiscing about their lives during and after wartime. Rough transcriptions of the interviews have been provided, which form the basis of the textual information presented in the portal and the metadata extracted from them.

A key technical challenge addressed in this work is how to search and access different temporal points in long videos, based on their time-stamped transcriptions. In order to achieve this, we created an RDF graph featuring data for the interviews as well as the interviewees. The main building blocks of the graph are coarse summary notes written by the interviewers, alongside their corresponding timestamps. However, the timestamps lack precision, since they may be repeated over multiple notes. These circumstances led us to our basic unit of data for the interviews: a group of notes that share the same timestamp, corresponding to a given stretch of the interview. They are of varying length, but typically several minutes long.

The textual contents are enriched semantically using NLP techniques and knowledge extraction, resulting in new metadata about mentioned named entities (e.g., people, places, and events) and keywords generated via a pre-trained subject indexing tool.
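The grouping step described above, where notes sharing a timestamp are merged into one unit, can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the `Note` and `Segment` classes, their field names, and the `HH:MM:SS` timestamp format are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Note:
    """One interviewer note (hypothetical structure, assumed for this sketch)."""
    interview_id: str
    timestamp: str  # assumed to be formatted as "HH:MM:SS"
    text: str


@dataclass(frozen=True)
class Segment:
    """All notes of one interview that share the same timestamp."""
    interview_id: str
    start_seconds: int
    texts: tuple


def to_seconds(timestamp: str) -> int:
    """Convert an "HH:MM:SS" string into an offset in seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def group_notes(notes):
    """Merge notes with identical (interview, timestamp) into segments."""
    def key(note):
        return (note.interview_id, to_seconds(note.timestamp))

    # sorted() is stable, so notes keep their original order within a timestamp.
    ordered = sorted(notes, key=key)
    return [
        Segment(interview_id, start, tuple(note.text for note in group))
        for (interview_id, start), group in groupby(ordered, key=key)
    ]
```

Each resulting segment carries a start offset in seconds, which is the kind of information a portal would need in order to jump to the matching point in a long video.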
The WarMemoirSampo portal is implemented using the Sampo-UI framework, which enables faceted search, exploration and analysis of the interviews. It is possible to identify interviews from specific interviewees and/or based on mentions of places, persons or subject matters (keywords) of interest. The results of the search take the user to the relevant parts of the video interviews, so that the veterans can be heard in their own voices. A semantic recommender system provides the user with links to related interview snippets present in the database, as well as additional information in WarSampo.
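The idea behind faceted search over interview segments can be shown with a small in-memory Python sketch. It does not reproduce how Sampo-UI works internally; the `IndexedSegment` class, the facet names, and the matching rule (every selected facet must share at least one value with the segment) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedSegment:
    """A video segment with extracted metadata (hypothetical structure)."""
    interview_id: str
    start_seconds: int
    # Maps a facet name (e.g. "place", "person", "keyword") to a set of values.
    facets: dict = field(default_factory=dict)


def matches(segment, selection):
    """True if the segment has at least one selected value in every selected facet."""
    return all(
        segment.facets.get(name, set()) & wanted
        for name, wanted in selection.items()
    )


def faceted_search(segments, selection):
    """Return matching segments, ordered by interview and position in the video."""
    hits = [segment for segment in segments if matches(segment, selection)]
    return sorted(hits, key=lambda s: (s.interview_id, s.start_seconds))


def facet_counts(segments, facet_name):
    """Count how many segments mention each value of one facet."""
    counts = {}
    for segment in segments:
        for value in segment.facets.get(facet_name, set()):
            counts[value] = counts.get(value, 0) + 1
    return counts
```

A selection is a dictionary of sets, for example `{"place": {"X"}, "keyword": {"Y"}}` with placeholder values. Each hit retains its `start_seconds`, so a result list can point directly at the relevant part of the video, and `facet_counts` gives the per-value hit counts that faceted interfaces typically display.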
More features are planned for the future: named entities will be linked to the relevant resources in WarSampo and other Sampo portals, contributing to the growing web of Finnish linked data. An event detection tool is to be developed which extracts event information using times and places mentioned in the interviews. Moreover, when Finnish speech-to-text technology advances to the point that everyday dialectal speech can be automatically and reliably transcribed, the same tools could be used on the transcriptions, resulting in richer and more accurate metadata.
Video about WarMemoirSampo
The WarMemoirSampo portal was published on December 3, 2021, at the National Archives of Finland, and is in use at:
More information is available at the Finnish homepage.