University of Helsinki
A133 @ Helsinki Centre for Digital Humanities (HELDIG), Metsätalo, Unioninkatu 40, Helsinki
HELDIG Centre, Faculty of Arts, P.O. Box 24, FI-00014 University of Helsinki, Finland
phone: +358 50 431 6071 (office)
room: 3171 @ Department of Computer Science, Maarintie 8, Espoo
postal address: Department of Computer Science, P.O. Box 15400, FI-00076 Aalto, Finland
Currently working in the Anoppi project.
Also worked in the following projects:
See see citation indices in Google Scholar.
Minna Tamper is a doctoral candidate at Aalto University and University of Helsinki. She received her M.Sc. (Tech.) at Aalto University in 2016. Her research interests include linked data, natural language processing and data analysis. She has published over 20 research articles since 2016, and has received national and international awards. She is currently a substitute member of COST NexusLinguarum management committee, an organiser of Aalto Digi Platform Digital Humanities Pizza Seminar, and as a reviewer on scientific conferences and journals. She has collaborated with libraries, archives, and Finnish ministries on cultural heritage data analysis since 2015.
Mikko Koho, Esko Ikkala, Petri Leskinen, Minna Tamper, Jouni Tuominen and Eero Hyvönen: WarSampo Knowledge Graph: Finland in the Second World War as Linked Open Data
. Semantic Web – Interoperability, Usability, Applicability, 2020. Accepted. bib pdf link
The Second World War (WW2) is arguably the most devastating catastrophe of human history, a topic of great interest to not only researchers but the general public. However, data about the Second World War is heterogeneous and distributed in various organizations and countries making it hard to utilize. In order to create aggregated global views of the war, a shared ontology and data infrastructure is needed to harmonize information in various data silos. This makes it possible to share data between publishers and application developers, to support data analysis in Digital Humanities research, and to develop data-driven intelligent applications. As a first step towards these goals, this article presents the WarSampo knowledge graph (KG), a shared semantic infrastructure, and a Linked Open Data (LOD) service for publishing data about WW2, with a focus on Finnish military history. The shared semantic infrastructure is based on the idea of representing war as a spatio-temporal sequence of events that soldiers, military units, and other actors participate in. The used metadata schema is an extension of CIDOC CRM, supplemented by various military historical domain ontologies. With an infrastructure containing shared ontologies, maintaining the interlinked data brings upon new challenges, as one change in an ontology can propagate across several datasets that use it. To support sustainability, a repeatable automatic data transformation and linking pipeline has been created for rebuilding the whole WarSampo KG from the individual source datasets. The WarSampo KG is hosted on a data service based on W3C Semantic Web standards and best practices, including content negotiation, SPARQL API, download, automatic documentation, and other services supporting the reuse of the data. The WarSampo KG, a part of the international LOD Cloud and totalling ca. 14 million triples, is in use in nine end-user application views of the WarSampo portal, which has had over 400 000 end users since its opening in 2015.
Arttu Oksanen, Jouni Tuominen, Eetu Mäkelä, Minna Tamper, Aki Hietanen and Eero Hyvönen: Semantic Finlex: Transforming, Publishing, and Using Finnish Legislation and Case Law As Linked Open Data on the Web
. Knowledge of the Law in the Big Data Age
(G. Peruginelli and S. Faro (eds.)), Frontiers in Artificial Intelligence and Applications, vol. 317, pp. 212-228, IOS Press, 2019. ISBN 978-1-61499-984-3 (print); ISBN 978-1-61499-985-0 (online). bib pdf link
Governments publish legislation and case law widely in print and on the Web. Such legal information is provided for human consumption, but the information is usually not available as data for algorithmic analysis and applications to use. However, this would be beneficial in many use cases, such as building more intelligent juridical online services and conducting research into legislation and legal practice. To address these needs, this Chapter presents Semantic Finlex, a national in-use data resource and service for publishing Finnish legislation and related case law as Linked Open Data for legal applications to use. The system transforms and interlinks on a regular basis data from the legacy legal database Finlex of the Ministry of Justice into Linked Open Data, based on the European standards ECLI and ELI. The published data is hosted on the 7-star Linked Data Finland service and SPARQL endpoint with a variety of related services available that ease data re-use. Rich Internet Applications using SPARQL for data access are presented as application demonstrators of the data service. In addition, this Chapter presents methods and tools under development to automatically annotate legal texts and to anonymize case law documents prior to their publication on the Web. Anonymization is necessary due to issues of data protection and privacy, and annotation is needed for semantic search and interlinking the documents. The automated approaches could significantly speed up the process and minimize costs of publishing legal documents as Linked Open Data.
Arttu Oksanen, Minna Tamper, Jouni Tuominen, Aki Hietanen and Eero Hyvönen: Anoppi: A Pseudonymization Service for Finnish Court Documents
. Legal Knowledge and Information Systems. JURIX 2019: The Thirty-second Annual Conference
(Araszkiewicz, M., Rodríguez-Doncel, V. (ed.)), pp. 251-254, IOS Press, December, 2019. bib pdf
Mikko Koho, Erkki Heino, Petri Leskinen, Esko Ikkala, Minna Tamper, Kasper Apajalahti, Jouni Tuominen, Eetu Mäkelä and Eero Hyvönen: WarSampo Knowledge Graph
. Zenodo, October, 2019. Dataset. bib link
WarSampo Knowledge Graph includes harmonized data of different kinds concerning the Second World War in Finland, separated in different subgraphs representing events, actors, places, photographs, and other aspects and documentation of the war. The data covers the Winter War 1939-1940 against the Soviet attack, the Continuation War 1941-1944 where the occupied areas of the Winter War were temporarily regained, and the Lapland War 1944-1945, where the Finns pushed the German troops away from Lapland.
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen and Kirsi Keravuori: BiographySampo - Publishing and Enriching Biographies on the Semantic Web for Digital Humanities Research
. The Semantic Web. ESWC 2019
(Pascal Hitzler, Miriam Fernández, Krzysztof Janowicz, Amrapali Zaveri, Alasdair J.G. Gray, Vanessa Lopez, Armin Haller and Karl Hammar (eds.)), pp. 574-589, Springer-Verlag, June, 2019. bib pdf link
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen and Kirsi Keravuori: Demonstrating BiographySampo in Solving Digital Humanities Research Problems in Biography and Prosopography
. The Fourth Digital Humanities in the Nordic Countries 2019 (DHN2019), Book of Abstracts
, University of Copenhagen, Copenhagen, Denmark, March, 2019. bib pdf link
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen and Kirsi Keravuori: Biografiasammon tekoäly yhdistää ja rikastaa suomalaiset elämäkerrat semanttisessa webissä
. Aalto-yliopisto, Semanttisen laskennan tutkimusryhmä (SeCo), Nov, 2018. bib pdf
Biografiasampo-järjestelmä käynnistää uuden aikakauden elämäkertakokoelmien julkaisemisessa ja käyttämisessä verkossa. Järjestelmän ydinaineistona on Kansallisbiografia ja muut Suomalaisen Kirjallisuuden Seuran (SKS) ja tieteellisten seurojen toimittamat pienoiselämäkerrat, yhteensä 13 100 elämäntarinaa, joita on kirjoittanut 900 suomalaista tutkijaa. Biografiasammon innovaationa on luoda kieliteknologian, tekoälyn ja semanttisen webin teknologioiden avulla elämäkertojen teksteistä ja niihin eri lähteissä liittyvistä tiedoista tietämysverkko (knowledge graph) ja kansallinen tietoinfrastruktuuri, joka koostuu miljoonista tietojen välisistä yhteyksistä. Tietämysverkko on julkaistu linkitetyn datan palvelussa, jonka varaan on toteutettu seitsemästä sovellusnäkymästä koostuva älykäs, kaikille avoin ja maksuton verkkopalvelu biografiasampo.fi kansalaisten ja digitaalisten ihmistieteiden tutkijoiden käytettäväksi.
Minna Tamper, Petri Leskinen, Kasper Apajalahti and Eero Hyvönen: Using Biographical Texts as Linked Data for Prosopographical Research and Applications
. Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. 7th International Conference, EuroMed 2018, Nicosia, Cyprus
, Springer-Verlag, November, 2018. bib pdf
Arttu Oksanen, Jouni Tuominen, Eetu Mäkelä, Minna Tamper, Aki Hietanen, and Eero Hyvönen: Semantic Finlex: Finnish Legislation and Case Law as a Linked Open Data Service
. Proceedings of Law via the Internet 2018 (LVI 2018), Knowledge of the Law in the Big Data Age, abstracts
, Florence, Italy, October, 2018. bib pdf
Eero Hyvönen, Petri Leskinen, Minna Tamper, Jouni Tuominen and Kirsi Keravuori: Semantic National Biography of Finland
. Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN 2018)
, pp. 372-385, CEUR Workshop Proceedings, Vol-2084, Helsinki, Finland, March, 2018. bib pdf link
This paper presents the vision of publishing and utilizing textual biographies as Linked (Open) Data on the Semantic Web. As a case study, we publish the live stories of the National Biography of Finland, created by the Finnish Literature Society, as semantic, i.e., machine “understandable” metadata in a SPARQL endpoint using the Linked Data Finland (LDF.fi) service. On top of the data service various Digital Humanities applications are built. The applications include searching and studying individual personal histories as well as historical research of groups of persons using methods of prosopography. The biographical data is enriched by extracting events from unstructured and semi-structured texts, and by linking entities internally and to external data sources. A faceted semantic search engine is provided for filtering groups of people from the data for prosopographical research. An extension of the event-based CIDOC CRM ontology is used as the underlying data model, where lives are seen as chains of interlinked events populated from the data of the biographies and additional data sources, such as museum collections, library databases, and archives.
Petri Leskinen, Mikko Koho, Erkki Heino, Minna Tamper, Esko Ikkala, Jouni Tuominen, Eetu Mäkelä and Eero Hyvönen: Modeling and Using an Actor Ontology of Second World War Military Units and Personnel
. Proceedings of the 16th International Semantic Web Conference (ISWC 2017)
(Claudia d Amato, Miriam Fernandez, Valentina Tamma, Freddy Lecue, Philippe Cudré-Mauroux, Juan Sequeda, Christoph Lange and Jeff Heflin (eds.)), pp. 280-296, Springer-Verlag, Vienna, Austria, October, 2017. bib pdf link
This paper presents a model for representing historical military personnel and army units, based on large datasets about World War II in Finland. The model is in use in WarSampo data service and semantic portal, which has had tens of thousands of distinct visitors. A key challenge is how to represent ontological changes, since the ranks and units of military personnel, as well as the names and structures of army units change rapidly in wars. This leads to serious problems in both search as well as data linking due to ambiguity and homonymy of names. In our solution, actors are represented in terms of the events they participated in, which facilitates disambiguation of personnel and units in different spatio-temporal contexts. The linked data in the WarSampo Linked Open Data cloud and service has ca. 9 million triples, including actor datasets of ca. 100 000 soldiers and ca. 16 100 army units. To test the model in practice, an application for semantic search and recommending based on data linking was created, where the spatio-temporal life stories of individual soldiers can be reassembled dynamically by linking data from different datasets. An evaluation is presented showing promising results in terms of linking precision.
Eero Hyvönen, Erkki Heino, Petri Leskinen, Esko Ikkala, Mikko Koho, Minna Tamper, Jouni Tuominen and Eetu Mäkelä: WarSampo: Publishing and Using Linked Open Data about the Second World War
. EuropeanaTech Insight, no. 7, Europeana, September, 2017. bib pdf link
The article overviews the system WarSampo – Finnish World War 2 on the Semantic Web, the winner of the LODLAM Challenge 2017 Open Data Prize on June 29 in Venice, Italy.
Erkki Heino, Minna Tamper, Eetu Mäkelä, Petri Leskinen, Esko Ikkala, Jouni Tuominen, Mikko Koho and Eero Hyvönen: Named Entity Linking in a Complex Domain: Case Second World War History
. Proceedings, Language, Technology and Knowledge (LDK 2017)
, pp. 120-133, Springer-Verlag, Galway, Ireland, June, 2017. bib pdf link
This paper discusses the challenges of applying named entity linking in a rich, complex domain – specifically, the linking of 1) military units, 2) places and 3) people in the context of rich Second World War data. Multiple sub-scenarios are discussed in detail through concrete evaluations, analyzing the problems faced, and the solutions developed. A key contribution of this work is to highlight the heterogeneity of problems and approaches needed even inside a single domain, depending on both the source data as well as the target authority.
Minna Tamper, Petri Leskinen, Esko Ikkala, Arttu Oksanen, Eetu Mäkelä, Erkki Heino, Jouni Tuominen, Mikko Koho and Eero Hyvönen: AATOS – a Configurable Tool for Automatic Annotation
. Proceedings, Language, Technology and Knowledge (LDK 2017)
, pp. 276-289, Springer-Verlag, Galway, Ireland, June, 2017. bib pdf link
This paper presents an automatic annotation tool AATOS for providing documents with semantic annotations. The tool links entities found from the texts to ontologies defined by the user. The application is highly configurable and can be used with different natural language Finnish texts. The application was developed as a part of WarSampo and Semantic Finlex projects and tested using Kansa Taisteli magazine articles and consolidated Finnish legislation of Semantic Finlex. The quality of the automatic annotation was evaluated by measuring precision and recall against existing manual annotations. The results showed that the quality of the input text, as well as the selection and configuration of the ontologies impacted the results.
Minna Tamper: Extraction of Entities and Concepts from Finnish Texts
. MSc Thesis (in English), Aalto University, School of Science, Degree Programme in Computer Science and Engineering, Dec, 2016. bib pdf
Keywords are used in many document databases to improve search. The process of assigning keywords from controlled vocabularies to a document is called subject indexing. If the controlled vocabulary used for indexing is an ontology, with semantic relations and descriptions of concepts, the process is also called semantic annotation. In this thesis an automatic annotation tool was created to provide the documents with semantic annotations. The application links entities found from the texts to ontologies defined by the user. The application is highly configurable and can be used with different Finnish texts. The application was developed as a part of WarSampo and Semantic Finlex projects and tested using Kansa Taisteli magazine articles and consolidated legislation of Finnish legislation. The quality of the automatic annotation was evaluated by measuring precision and recall against existing manual annotations. The results showed that the quality of the input text, as well as the selection and configuration of the ontologies impacted the results.
Eero Hyvönen, Erkki Heino, Petri Leskinen, Esko Ikkala, Mikko Koho, Minna Tamper, Jouni Tuominen and Eetu Mäkelä: Publishing Second World War History as Linked Data Events on the Semantic Web
. Proceedings of Digital Humanities 2016, short papers
, pp. 571-573, Kraków, Poland, July, 2016. bib pdf link
Data about wars is typically heterogeneous, distributed in the data silos of the fighting parties, multilingual, and often controversial depending on the political point of view. It is therefore hard for the historians to get a global picture of what has actually happened, to whom, where, when, and how. We argue that Semantic Web and Linked Data technologies are a very promising approach for modeling, harmonizing, and aggregating data about war history. Our goal is to make it possible, for both historians and laymen, to study history in a contextualized way where linked datasets enrich each other. The paper presents the in-use WarSampo 1 system, where massive collections of heterogeneous data about the (Finnish) history of the Second World War are harmonized using an event-based approach, and provided as a Linked Open Data service for applications to use. As a use case, a semantic portal WarSampo providing six different perspectives to the war based on events is presented.
Eero Hyvönen, Erkki Heino, Petri Leskinen, Esko Ikkala, Mikko Koho, Minna Tamper, Jouni Tuominen and Eetu Mäkelä: WarSampo Data Service and Semantic Portal for Publishing Linked Open Data about the Second World War History
. The Semantic Web – Latest Advances and New Domains (ESWC 2016)
(Harald Sack, Eva Blomqvist, Mathieu d Aquin, Chiara Ghidini, Simone Paolo Ponzetto and Christoph Lange (eds.)), pp. 758-773, Springer-Verlag, May, 2016. bib pdf link
This paper presents the WarSampo system for publishing collections of heterogeneous, distributed data about the Second World War on the Semantic Web. WarSampo is based on harmonizing massive datasets using event-based modeling, which makes it possible to enrich datasets semantically with each others’ contents. WarSampo has two components: First, a Linked Open Data (LOD) service WarSampo Data for Digital Humanities (DH) research and for creating applications related to war history. Second, a semanticWarSampo Portal has been created to test and demonstrate the usability of the data service. The WarSampo Portal allows both historians and laymen to study war history and destinies of their family members in the war from different interlinked perspectives. Published in November 2015, theWarSampo Portal had some 20,000 distinct visitors during the first three days, showing that the public has a great interest in this kind of applications.
(total: 28 publications)