Finnish Prisoners of War in the Soviet Union 1939-1945
Background and goals
In 2016, the Association for Cherishing the Memory of the Dead of the War (Sotavainajien muiston vaalimisyhdistys ry, SMVY) launched a project in cooperation with the National Archives of Finland (Kansallisarkisto, KA).
The project aims to supplement and publish as open data the information on Finnish prisoners of war held outside Finland's present borders during the wars of 1939-1945.
For prisoners who died in captivity, the material partly overlaps with the Menehtyneet 1939-1945 (Casualties 1939-1945) data publication, developed earlier jointly by the National Archives and Aalto University,
which is part of the larger WarSampo (Sotasampo) system.
The project was carried out with the Semantic Computing Research Group (SeCo), because
the most sensible approach is to produce the new prisoner-of-war data directly as linked data, using the WarSampo ontology infrastructure.
This way the new data is automatically enriched 'for free' by the other WarSampo datasets, and in turn it enriches the datasets already published in WarSampo,
in line with the idea of the 'Sampo' model.
Research and development goals
The project designed and implemented metadata models and ontologies for publishing the prisoner-of-war material using linked data methods.
The models were populated with tabular data already held by the association and by producing new data and material on prisoners of war.
Key challenges addressed in the work included linking literal data (e.g., names of places and persons) to the WarSampo ontologies and
managing data fusion between the new prisoner dataset and the WarSampo knowledge graph.
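The literal-linking challenge mentioned above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the example.org URIs, the tiny label table, and the function names are all hypothetical, and real linking would also need context (dates, municipalities, units) to resolve ambiguous names. The sketch only shows the basic idea of mapping a free-text place name to an ontology URI through normalized label lookup, and of leaving ambiguous or unknown names unlinked.

```python
import unicodedata

# Hypothetical mini-gazetteer: made-up URIs mapped to known labels.
# These are NOT real WarSampo identifiers.
PLACE_ONTOLOGY = {
    "http://example.org/places/viipuri": ["Viipuri", "Viborg", "Vyborg"],
    "http://example.org/places/sortavala": ["Sortavala"],
}


def normalize(label: str) -> str:
    """Unicode-normalize, casefold, and collapse whitespace."""
    label = unicodedata.normalize("NFC", label)
    return " ".join(label.casefold().split())


def build_index(ontology: dict) -> dict:
    """Map each normalized label to the set of URIs that carry it."""
    index = {}
    for uri, labels in ontology.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(uri)
    return index


def link_literal(literal: str, index: dict):
    """Return a URI only when exactly one candidate matches, else None."""
    candidates = index.get(normalize(literal), set())
    if len(candidates) == 1:
        return next(iter(candidates))
    return None  # unknown or ambiguous: leave the literal unlinked


if __name__ == "__main__":
    index = build_index(PLACE_ONTOLOGY)
    for name in ["  VIIPURI ", "Vyborg", "Helsinki"]:
        print(repr(name), "->", link_literal(name, index))
```

Returning None for ambiguous labels is a deliberately conservative choice in this sketch: a missing link can be added later, whereas a wrong link silently corrupts the merged data.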
Pilot application on the web
The use of the prisoner-of-war data for the general public and for humanities research has been demonstrated
by developing a new application perspective, 'Sotavangit' (Prisoners of War), for WarSampo.
The software developed in the project is published as open source under the MIT license, and the data as open data under the CC-BY-4.0 license
to the extent that data protection permits.
The pilot application will be released for researchers and the wider public in autumn 2019, in honor of the 80th anniversary of the Winter War.
Launch event
The Sotavangit application will be released in honor of the 80th anniversary of the Winter War at a launch event
held at the National Archives on 29 November 2019.
Project and steering group
The project was guided by a steering group whose members have been:
Pertti Suominen (SMVY), Reijo Nikkilä (SMVY),
Jussi Nuorteva (KA), Dmitri Frolov (KA), Pekka Pitkänen, Ohto Manninen
and Eero Hyvönen (Aalto University and University of Helsinki (HELDIG)).
Contact persons and project group at SeCo
Prof. Eero Hyvönen,
Aalto University and University of Helsinki, HELDIG - Helsinki Centre for Digital Humanities
FM Mikko Koho,
Aalto University, Department of Computer Science
DI Esko Ikkala,
Aalto University, Department of Computer Science
Publications
2025
Eero Hyvönen, Petri Leskinen, Henna Poikkimäki, Heikki Rantala, Jouni Tuominen, Senka Drobac, Ossi Koho, Ilona Pikkanen and Hanna-Leena Paloposki:
Searching, exploring, and analyzing historical letters and the underlying networks: LetterSampo Finland (1809–1917) data service and semantic portal. 2025. Abstract, submitted for peer review.
bib pdf
2022
Mikko Koho, Esko Ikkala and Eero Hyvönen:
Reassembling the Lives of Finnish Prisoners of the Second World War on the Semantic Web.
Proceedings of the Third Conference on Biographical Data in a Digital World (BD 2019), pp. 31-39, CEUR Workshop Proceedings, June, 2022.
bib pdf link This paper presents first results of a new, ninth application perspective for the semantic portal WarSampo - Finnish WW2 on the Semantic Web, based on a database of ca. 4450 Finnish prisoners of war in the Soviet Union. Our key idea is to reassemble the life of each prisoner of war by using Linked Data, based on information about the person in different data sources. Using the enriched aggregated data, a biographical global home page for each prisoner of war can be created, that is more complete than information in individual data sources. The application perspective is targeted to researchers of military history, to study and analyze the data in order to form new research questions or hypotheses, as well as to public in the large looking for information e.g., about their relatives that were captured as prisoners of war. Employing the faceted search of the application perspective, prosopographical research on subgroups of prisoners is possible.
2021
Mikko Koho, Esko Ikkala, Petri Leskinen, Minna Tamper, Jouni Tuominen and Eero Hyvönen:
WarSampo Knowledge Graph: Finland in the Second World War as Linked Open Data. Semantic Web – Interoperability, Usability, Applicability, vol. 12, no. 2, pp. 265-278, January, 2021.
bib pdf link The Second World War (WW2) is arguably the most devastating catastrophe of human history, a topic of great interest to not only researchers but the general public. However, data about the Second World War is heterogeneous and distributed in various organizations and countries making it hard to utilize. In order to create aggregated global views of the war, a shared ontology and data infrastructure is needed to harmonize information in various data silos. This makes it possible to share data between publishers and application developers, to support data analysis in Digital Humanities research, and to develop data-driven intelligent applications. As a first step towards these goals, this article presents the WarSampo knowledge graph (KG), a shared semantic infrastructure, and a Linked Open Data (LOD) service for publishing data about WW2, with a focus on Finnish military history. The shared semantic infrastructure is based on the idea of representing war as a spatio-temporal sequence of events that soldiers, military units, and other actors participate in. The used metadata schema is an extension of CIDOC CRM, supplemented by various military historical domain ontologies. With an infrastructure containing shared ontologies, maintaining the interlinked data brings upon new challenges, as one change in an ontology can propagate across several datasets that use it. To support sustainability, a repeatable automatic data transformation and linking pipeline has been created for rebuilding the whole WarSampo KG from the individual source datasets. The WarSampo KG is hosted on a data service based on W3C Semantic Web standards and best practices, including content negotiation, SPARQL API, download, automatic documentation, and other services supporting the reuse of the data. The WarSampo KG, a part of the international LOD Cloud and totalling ca. 14 million triples, is in use in nine end-user application views of the WarSampo portal, which has had over 400 000 end users since its opening in 2015.
2020
Mikko Koho, Petri Leskinen and Eero Hyvönen:
Integrating Historical Person Registers as Linked Open Data in the WarSampo Knowledge Graph.
Semantic Systems. In the Era of Knowledge Graphs. SEMANTiCS 2020 (Eva Blomqvist, Paul Groth, Victor de Boer, Tassilo Pellegrini, Mehwish Alam, Tobias Käfer, Peter Kieseberg, Sabrina Kirrane, Albert Meroño-Peñuela and Harshvardhan J. Pandit (eds.)), Lecture Notes in Computer Science, vol. 12378, pp. 118-126, Springer, Cham, Amsterdam, The Netherlands, October, 2020.
bib pdf link Semantic data integration from heterogeneous, distributed data silos enables Digital Humanities research and application development employing a larger, mutually enriched and interlinked knowledge graph. However, data integration is challenging, involving aligning the data models and reconciling the concepts and named entities, such as persons and places. This paper presents a record linkage process to reconcile person references in different military historical person registers with structured metadata. The information about persons is aggregated into a single knowledge graph. The process was applied to reconcile three person registers of the popular semantic portal WarSampo - Finnish World War 2 on the Semantic Web. The registers contain detailed information about some 100,000 people and are individually maintained by domain experts. Thus, the integration process needs to be automatic and adaptable to changes in the registers. An evaluation of the record linkage results is promising and provides some insight into military person register reconciliation in general.
2019
Mikko Koho, Erkki Heino, Petri Leskinen, Esko Ikkala, Minna Tamper, Kasper Apajalahti, Jouni Tuominen, Eetu Mäkelä and Eero Hyvönen:
WarSampo Knowledge Graph. Zenodo, October, 2019. Dataset.
bib link WarSampo Knowledge Graph includes harmonized data of different kinds concerning the Second World War in Finland, separated in different subgraphs representing events, actors, places, photographs, and other aspects and documentation of the war. The data covers the Winter War 1939-1940 against the Soviet attack, the Continuation War 1941-1944 where the occupied areas of the Winter War were temporarily regained, and the Lapland War 1944-1945, where the Finns pushed the German troops away from Lapland.
Lia Gasbarra, Mikko Koho, Ilkka Jokipii, Heikki Rantala and Eero Hyvönen:
An Ontology of Finnish Historical Occupations.
The Semantic Web: ESWC 2019 Satellite Events (Hitzler, Pascal, Kirrane, Sabrina, Hartig, Olaf, de Boer, Victor, Vidal, Maria-Esther, Maleshkova, Maria, Schlobach, Stefan, Hammar, Karl, Lasierra, Nelia, Stadtmüller, Steffen, Hose, Katja and Verborgh, Ruben (eds.)), Lecture Notes in Computer Science, pp. 64-68, Springer, Cham, Portoroz, Slovenia, June, 2019.
bib pdf link Historical datasets often impose the need to study groups of people based on occupation or social status. This paper presents first results in creating an ontology of historical Finnish occupations, AMMO, that enables selection of groups of people based on their occupation, occupational groups, or socioeconomic class. AMMO is linked to the international historical occupation classification HISCO and to a modern Finnish occupational classification for interoperability. AMMO will be used as a component in two semantic portals for Finnish war history.
Mikko Koho, Lia Gasbarra, Jouni Tuominen, Heikki Rantala, Ilkka Jokipii and Eero Hyvönen:
AMMO Ontology of Finnish Historical Occupations.
Proceedings of the First International Workshop on Open Data and Ontologies for Cultural Heritage (ODOCH 19) (Antonella Poggi (ed.)), vol. 2375, pp. 91-96, CEUR Workshop Proceedings, Rome, Italy, June, 2019.
bib pdf link This paper introduces AMMO Ontology of Finnish Historical Occupations. AMMO is based on thousands of occupation labels extracted from three Finnish military historical datasets of the early 20th century: the first consists of the ca. 40 000 war-related death records around the time of the Finnish Civil War (1914–1922); the second consists of the ca. 95 000 death records of Finnish soldiers in the Winter War and Continuation War (1939–1944); the third contains the ca. 4500 records of Finnish prisoners of war in the Soviet Union during the WW2. Our goal from a Digital Humanities perspective is to use AMMO to study military history and these datasets based on the occupation and social status of the soldiers. AMMO will also be used as a component for faceted search and semantic recommendation in two semantic portals for Finnish military history. AMMO is aligned with the international historical occupation classification HISCO and with a modern Finnish occupational classification for international and national interoperability. The ontology is published as Linked Open Data in an ontology service.
2018
Mikko Koho, Esko Ikkala, Erkki Heino and Eero Hyvönen:
Maintaining a Linked Data Cloud and Data Service for Second World War History.
Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. 7th International Conference, EuroMed 2018, Nicosia, Cyprus, vol. 11196, Springer-Verlag, October-November, 2018.
bib pdf link
Mikko Koho, Erkki Heino, Esko Ikkala, Eero Hyvönen, Reijo Nikkilä, Tiia Moilanen, Katri Miettinen and Pertti Suominen:
Integrating Prisoners of War Dataset into the WarSampo Linked Data Infrastructure.
Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN 2018), CEUR Workshop Proceedings, Helsinki, Finland, March, 2018. Vol 2084.
bib pdf link One of the great promises of Linked Data and the Semantic Web standards is to provide a shared data infrastructure into which more and more data can be imported and aligned, forming a sustainable, ever growing knowledge graph or linked data cloud, Web of Data. This paper studies and evaluates this idea in the context of the WarSampo Linked Data cloud, providing an infrastructure for data related to the Second World War in Finland. As a case study, a new database of prisoners of war with related contents is transformed into linked data and integrated into WarSampo. Lessons learned are discussed in relation to using traditional data publishing approaches.