WarSampo:
Finnish World War II on the Semantic Web

Goal: Understanding History and Promoting Peace

According to Georg Wilhelm Friedrich Hegel, we learn from history that we learn nothing from history. Hopefully this is not the case for the Second World War (WW2), now that fighting has started again even within the borders of Europe, in Ukraine. One way to promote peace is to make reliable data about the war openly available for everybody to learn from.

WarSampo is the next step in our series of "Sampo" portals based on Linked Data, including CultureSampo, BookSampo, and TravelSampo, and continues our earlier work on modeling the First World War as Linked Data. The project started in autumn 2014 and is scheduled to finish in 2017, by the centennial of Finland's independence on December 6th.

Figure: Mäkiluoto artillery fires at the Battle of Hanko in 1942. Finnish Wartime Photograph Archive, Defence Forces.

WarSampo is a project and semantic portal that pursues this goal by publishing large, heterogeneous datasets about WW2 in Finland as Linked Open Data (LOD). Application demonstrators are built on the data to provide different perspectives on war history for both historians and the public. The data covers the Winter War of 1939-1940 against the Soviet attack, the Continuation War of 1941-1944, in which the areas occupied in the Winter War were temporarily regained, and the Lapland War of 1944-1945, in which the Finns pushed the German troops out of Lapland.

Prototype Demonstrator Online since Nov 9th, 2015

Try WarSampo online at http://www.sotasampo.fi/en/.

The videos below show how the different perspectives of WarSampo are used:

Video Presentation about the WarSampo Initiative

"WarSampo Data Service and Semantic Portal for Publishing Linked Open Data about the Second World War History". Recorded presentation at the Extended Semantic Web Conference 2016 (ESWC 2016), courtesy of Videolectures.net.

Modeling War History as Linked Data

Data

The project works with datasets obtained from a network of collaborating organizations and data publishers. For example:

  1. Casualties data (93,000 death incidents) includes data about the deaths in action during the wars.
  2. War Diaries (23,000 diaries) are digitized authentic records of troop actions at the front.
  3. Photos and films (160,000 pieces) were taken during the war by the troops of the Defence Forces.
  4. The Kansa taisteli magazine (3,360 articles) was published in 1957-1986; its articles mostly contain memories of the men who fought on the fronts.
  5. Karelian places (some 30,000 georeferenced locations on historical maps) and related historical maps cover the war zone in Finland that was ultimately annexed by the Soviet Union.
  6. YLE's audio and film material (Living Archive) was recorded during the war, or is related to it.

Metadata models

CIDOC CRM is used as the harmonizing basis for modeling the data, with events providing the semantic glue for data linking. Our earlier data model for WW1 serves as the starting point and is extended into the metadata model.
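
To illustrate how events can act as the glue between datasets, here is a minimal sketch in Python. It is not the WarSampo data model itself: the property names follow CIDOC CRM naming (P7 took place at, P11 had participant, P67 refers to, P138 represents), but every URI under the `EX` namespace and all the resources are invented placeholders.

```python
# Illustrative only: every URI below is a made-up placeholder, not WarSampo data.
EX = "http://example.org/war/"

# A tiny graph of (subject, predicate, object) triples. A photo and a diary
# from two different collections both point at the same event.
TRIPLES = {
    (EX + "event/battle_1", "rdf:type", "crm:E5_Event"),
    (EX + "event/battle_1", "crm:P7_took_place_at", EX + "place/village_a"),
    (EX + "event/battle_1", "crm:P11_had_participant", EX + "unit/regiment_x"),
    (EX + "photo/123", "crm:P138_represents", EX + "event/battle_1"),
    (EX + "diary/9", "crm:P67_refers_to", EX + "event/battle_1"),
}


def events_of(resource, triples):
    """Return the events that a resource is linked to, in either direction."""
    events = {s for s, p, o in triples if p == "rdf:type" and o == "crm:E5_Event"}
    linked = set()
    for s, p, o in triples:
        if s in events and o == resource:
            linked.add(s)
        elif o in events and s == resource:
            linked.add(o)
    return linked


def related_via_events(resource, triples):
    """Return every other resource that shares an event with the given one."""
    related = set()
    for event in events_of(resource, triples):
        for s, p, o in triples:
            if p == "rdf:type":
                continue  # skip the class assertion, it is not a link to data
            if s == event and o != resource:
                related.add(o)
            elif o == event and s != resource:
                related.add(s)
    return related


if __name__ == "__main__":
    for uri in sorted(related_via_events(EX + "photo/123", TRIPLES)):
        print(uri)
```

Starting from the photo, the lookup reaches the diary, the unit, and the place without any direct link between the collections; the shared event is the only connection.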

Domain Ontologies

The data is annotated using a set of domain ontologies, including: 1) an ontology of the troops and their hierarchies, 2) persons with their ranks and roles, 3) a place ontology of historical places, 4) an event ontology of battles, politics, and other wartime incidents, 5) an ontology of time periods, 6) an ontology of weapons, 7) an ontology of vessels, and 8) a subject matter ontology. For 1-7 we have harvested named entities from the datasets and given them URIs, labels, and some initial structure, as needed in our initial demos (discussed below). However, ontology modeling and development is still underway. One challenge with the actor ontologies, for example, is modeling change: the names and positions of the troops and the roles of army personnel change frequently (e.g., promotions of persons and changes in troop leadership) and have to be conditioned on time. For 8, the KOKO ontology, a centerpiece of the Finnish ontology infrastructure, is used.
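
The need to condition ranks and roles on time can be sketched as follows. This is only an assumed illustration in Python, not the project's actual actor ontology: the `RankPeriod` class, the `rank_on` function, and the person's ranks and dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class RankPeriod:
    """One period during which a (hypothetical) person held a rank."""
    rank: str
    start: date            # first day the rank is held
    end: Optional[date]    # last day the rank is held, or None if open-ended


def rank_on(periods: List[RankPeriod], day: date) -> Optional[str]:
    """Return the rank valid on the given day, or None if no period covers it."""
    for period in periods:
        if period.start <= day and (period.end is None or day <= period.end):
            return period.rank
    return None


# Invented example data: a promotion splits the timeline into two periods.
PERIODS = [
    RankPeriod("Lieutenant", date(1939, 11, 30), date(1941, 6, 24)),
    RankPeriod("Captain", date(1941, 6, 25), None),
]

if __name__ == "__main__":
    print(rank_on(PERIODS, date(1940, 3, 1)))
    print(rank_on(PERIODS, date(1942, 1, 1)))
```

The same pattern applies to troop names and leadership: a statement such as "rank" or "commander" is attached to a period (or to the event that started it) rather than directly to the person or unit.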

The data and ontologies are published using SPARQL endpoints that form the basis of the WarSampo semantic portal and its applications. The idea of the portal is to provide a variety of perspectives on war data, presented on separate tabs. Most datasets will have their own perspective, where the user can first search for data of interest and then retrieve linked data related to the resources found. The perspectives enrich each other via Linked Data.
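
As a rough sketch of how an application might talk to such an endpoint, the Python example below sends a query using the standard SPARQL protocol (an HTTP GET with a `query` parameter) and flattens the standard SPARQL JSON results. The endpoint address is a placeholder, not the real service URL, and the use of `skos:prefLabel` in the query is an assumption about how labels are stored.

```python
import json
import urllib.parse
import urllib.request

# Placeholder address: replace with the real endpoint of the data service.
ENDPOINT = "https://example.org/sparql"

# Generic query; the label property is an assumption about the data.
QUERY = """
SELECT ?s ?label WHERE {
  ?s <http://www.w3.org/2004/02/skos/core#prefLabel> ?label .
} LIMIT 5
"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows(result):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in result["results"]["bindings"]
    ]


def run(endpoint, query):
    """Send the query over HTTP and return the flattened result rows."""
    with urllib.request.urlopen(build_request(endpoint, query), timeout=30) as resp:
        return rows(json.load(resp))
```

Calling `run(ENDPOINT, QUERY)` against a live endpoint returns one dictionary per result row; a portal perspective would typically run a search query first and then follow the returned URIs with further queries to fetch linked data.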

Implementation

WarSampo is implemented using the "7-star" Linked Data Finland platform, based on Fuseki with a Varnish Cache front end for serving LOD.

Collaboration Network and Funding

Our collaborator and data provider network consists of various organizations and data publishers on the web, including:

  • National Archives of Finland
  • War Museum
  • Finnish Defence Forces
  • The Association for Military History in Finland
  • Bonnier Publications
  • National Land Survey of Finland
  • National Broadcasting Company YLE
  • Finnish Literature Society
  • Svenska litteratursällskapet i Finland

The WarSampo project is funded in 2015-2016 as part of the national Open Science and Research programme of the Ministry of Education and Culture. Our work on historical places and maps is supported by the Finnish Cultural Foundation (see our project Ontology Service of Historical Places). The LOD publication of War Casualties in 2015 was funded by the National Archives of Finland.

WarSampo is also part of the Centenary of Finland's Independence 2017 programme coordinated by the Prime Minister's Office.

WarSampo is one of the Finnish proposals for the EU Prize for Cultural Heritage / Europa Nostra Awards.

More Information

More information is available in the publications below, at the Finnish homepage, and in the project description (PDF, in Finnish).

Contact Person

Prof. Eero Hyvönen, Aalto University


Publications

2017

Eero Hyvönen: Cultural Heritage Linked Data on the Semantic Web: Three Case Studies Using the Sampo Model. 2017. Submitted for publication.
A major challenge in publishing linked Cultural Heritage (CH) collections on the web is interoperability. This is due to the heterogeneity of CH contents and the distributed content creation model where publishers focus on their own data with little consideration of others’ data. As a solution approach, the “Sampo” model is presented based on using domain independent modeling standards, on a model for aligning metadata models, and on sharing domain ontologies for populating the metadata models. The harmonized data is published for machines as a linked data service, to be used by applications for human users. To illustrate and evaluate the model, three online systems on the Web, CultureSampo, BookSampo, and WarSampo, are presented.
Erkki Heino, Minna Tamper, Eetu Mäkelä, Petri Leskinen, Esko Ikkala, Jouni Tuominen, Mikko Koho and Eero Hyvönen: Named Entity Linking in a Complex Domain: Case Second World War History. February, 2017. Submitted for evaluation.
This paper discusses the challenges of applying named entity linking in a rich, complex domain – specifically, the linking of 1) military units, 2) places and 3) people in the context of rich Second World War data. Multiple sub-scenarios are discussed in detail through concrete evaluations, analyzing the problems faced, and the solutions developed. A key contribution of this work is to highlight the heterogeneity of problems and approaches needed even inside a single domain, depending on both the source data as well as the target authority.
Minna Tamper, Petri Leskinen, Esko Ikkala, Arttu Oksanen, Eetu Mäkelä, Erkki Heino, Jouni Tuominen, Mikko Koho and Eero Hyvönen: AATOS – a Configurable Tool for Automatic Annotation. February, 2017. Submitted for evaluation.
This paper presents an automatic annotation tool AATOS for providing documents with semantic annotations. The tool links entities found from the texts to ontologies defined by the user. The application is highly configurable and can be used with different natural language Finnish texts. The application was developed as a part of WarSampo and Semantic Finlex projects and tested using Kansa Taisteli magazine articles and consolidated Finnish legislation of Semantic Finlex. The quality of the automatic annotation was evaluated by measuring precision and recall against existing manual annotations. The results showed that the quality of the input text, as well as the selection and configuration of the ontologies impacted the results.

2016

Minna Tamper: Extraction of Entities and Concepts from Finnish Texts. MSc Thesis (in English), Aalto University, School of Science, Degree Programme in Computer Science and Engineering, Dec, 2016.
Keywords are used in many document databases to improve search. The process of assigning keywords from controlled vocabularies to a document is called subject indexing. If the controlled vocabulary used for indexing is an ontology, with semantic relations and descriptions of concepts, the process is also called semantic annotation. In this thesis an automatic annotation tool was created to provide the documents with semantic annotations. The application links entities found from the texts to ontologies defined by the user. The application is highly configurable and can be used with different Finnish texts. The application was developed as a part of WarSampo and Semantic Finlex projects and tested using Kansa Taisteli magazine articles and the consolidated Finnish legislation of Semantic Finlex. The quality of the automatic annotation was evaluated by measuring precision and recall against existing manual annotations. The results showed that the quality of the input text, as well as the selection and configuration of the ontologies, impacted the results.
Petri Leskinen: Sotilashenkilöiden ja joukko-osastojen mallintaminen ja käyttö toimijaontologiana [Modeling and Using Military Persons and Units as an Actor Ontology]. MSc Thesis (in Finnish), Aalto University, School of Science, Degree Programme in Computer Science and Engineering, Dec, 2016.
An actor ontology models persons and groups of persons in linked open data. The purpose of the actor ontology model is to make it possible to bring together material from different sources and publish it in a uniform format, so that the information can be used both in digital humanities research and in user interfaces for browsing the material in visual form. The ontology developed follows an actor-event model, in which an actor is modeled as the sum of the biographical events related to him or her. The solutions are based on the CIDOC CRM standard, which was chosen to ensure that the model is easy to extend and to follow a uniform publishing practice for cultural heritage data. The work was done as part of the larger WarSampo (Sotasampo) project, in which a comprehensive database of Second World War material concerning Finland was collected. My own contribution in this work was designing the actor ontology model and populating it with military persons and units. The data has been published as open data (http://www.ldf.fi/dataset/warsa) and can be browsed in the WarSampo portal (http://www.sotasampo.fi).
Esko Ikkala: Suomalainen historiallisten paikkojen ja karttojen ontologiapalvelu [A Finnish Ontology Service for Historical Places and Maps]. MSc Thesis (in Finnish), Aalto University, School of Electrical Engineering, Degree Programme of Automation and Systems Technology, August, 2016.
Historical geographic information plays a central role in managing and using the collections of memory organizations and in digital humanities research. Handling geographic information in systems other than specialized geographic information systems, together with the temporal dimension of geographic information, brings numerous challenges, for which linked data technologies have offered promising solutions. This thesis presents HIPLA, a new service model for historical places and maps that is based on linked data technologies and was developed for the needs of cultural heritage organizations. The goal of the HIPLA service model is to offer a shared view of geographic information managed by different organizations and to enable collaborative enrichment, search, and browsing of distributed geographic datasets on both contemporary and historical maps. In addition, a prototype application, Hipla.fi, was implemented to illustrate the benefits of the HIPLA service model, and it was piloted as part of the WarSampo (Sotasampo) project, which publishes material on the Winter War and the Continuation War as linked open data. The pilot resulted in a place ontology of the Winter War and the Continuation War, which provides a tool for automatically linking material related to the wars and for visualizing it geographically.
Eero Hyvönen, Erkki Heino, Petri Leskinen, Esko Ikkala, Mikko Koho, Minna Tamper, Jouni Tuominen and Eetu Mäkelä: Publishing Second World War History as Linked Data Events on the Semantic Web. Proceedings of Digital Humanities 2016, short papers, pp. 571-573, Kraków, Poland, July, 2016.
Data about wars is typically heterogeneous, distributed in the data silos of the fighting parties, multilingual, and often controversial depending on the political point of view. It is therefore hard for historians to get a global picture of what has actually happened, to whom, where, when, and how. We argue that Semantic Web and Linked Data technologies are a very promising approach for modeling, harmonizing, and aggregating data about war history. Our goal is to make it possible, for both historians and laymen, to study history in a contextualized way where linked datasets enrich each other. The paper presents the in-use WarSampo system, where massive collections of heterogeneous data about the (Finnish) history of the Second World War are harmonized using an event-based approach, and provided as a Linked Open Data service for applications to use. As a use case, a semantic portal WarSampo providing six different perspectives to the war based on events is presented.
Mikko Koho, Eero Hyvönen, Erkki Heino, Jouni Tuominen, Petri Leskinen and Eetu Mäkelä: Linked Death - Representing, Publishing, and Using Second World War Death Records as Linked Open Data. The Semantic Web: ESWC 2016 Satellite Events (Harald Sack, Giuseppe Rizzo, Nadine Steinmetz, Dunja Mladenić, Sören Auer and Christoph Lange (eds.)), Springer-Verlag, June, 2016.
War history of the Second World War (WW2), humankind’s largest disaster, is of great interest to both laymen and researchers. Most of us have ancestors and relatives who participated in the war, and in the worst case got killed. Researchers are eager to find out what actually happened then, and even more importantly why, so that future wars could perhaps be prevented. The darkest data of war history are casualty records—from such data we could perhaps learn most about the war. This paper presents a model and system for representing death records as linked data, so that 1) citizens could find out more easily what happened to their relatives during WW2 and 2) digital humanities (DH) researchers could (re)use the data easily for research.
Eero Hyvönen, Erkki Heino, Petri Leskinen, Esko Ikkala, Mikko Koho, Minna Tamper, Jouni Tuominen and Eetu Mäkelä: WarSampo Data Service and Semantic Portal for Publishing Linked Open Data about the Second World War History. The Semantic Web – Latest Advances and New Domains (ESWC 2016) (Harald Sack, Eva Blomqvist, Mathieu d'Aquin, Chiara Ghidini, Simone Paolo Ponzetto and Christoph Lange (eds.)), Springer-Verlag, May, 2016.
This paper presents the WarSampo system for publishing collections of heterogeneous, distributed data about the Second World War on the Semantic Web. WarSampo is based on harmonizing massive datasets using event-based modeling, which makes it possible to enrich datasets semantically with each other's contents. WarSampo has two components: First, a Linked Open Data (LOD) service WarSampo Data for Digital Humanities (DH) research and for creating applications related to war history. Second, a semantic WarSampo Portal has been created to test and demonstrate the usability of the data service. The WarSampo Portal allows both historians and laymen to study war history and destinies of their family members in the war from different interlinked perspectives. Published in November 2015, the WarSampo Portal had some 20,000 distinct visitors during the first three days, showing that the public has a great interest in this kind of application.
Mikko Koho, Eero Hyvönen, Erkki Heino, Jouni Tuominen, Petri Leskinen and Eetu Mäkelä: Linked Death - Representing, Publishing, and Using Second World War Death Records as Linked Open Data. Proceedings of the 1st Workshop on Humanities in the Semantic Web (WHiSe), CEUR Workshop Proceedings, Heraklion, Crete, Greece, May, 2016. Vol 1608.
War history of the Second World War (WW2), humankind's largest disaster, is of great interest to both laymen and researchers. Most of us have ancestors and relatives who participated in the war, and in the worst case got killed. Researchers are eager to find out what actually happened then, and even more importantly why, so that future wars could perhaps be prevented. The darkest data of war history are casualty records; from such data we could perhaps learn most about the war. This paper presents a model and system for representing death records as linked data, so that 1) citizens could find out more easily what happened to their relatives during WW2 and 2) digital humanities (DH) researchers could (re)use the data easily for research.

2015

Eero Hyvönen, Jouni Tuominen, Eetu Mäkelä, Jérémie Dutruit, Kasper Apajalahti, Erkki Heino, Petri Leskinen and Esko Ikkala: Second World War on the Semantic Web: The WarSampo Project and Semantic Portal. Proceedings of the ISWC 2015 Posters & Demonstrations Track, CEUR-WS Proceedings, Bethlehem, PA, USA, October, 2015. Vol 1486.
This paper initiates and fosters work on publishing Linked Open Data about the Second World War. It is argued that the heterogeneous, distributed data about the international world war history makes a promising use case for semantic technologies. We hope that by making war data openly available we can learn from the past and promote peace.