Petri Leskinen
Aalto University
email: firstnames.lastname@aalto.fi
room: 3171 @ Department of Computer Science, Maarintie 8, Espoo
postal address: Department of Computer Science, P.O. Box 15400, FI-00076 Aalto, Finland
Currently working in the AcademySampo project.
Also worked in the following projects:
See citation indices in Google Scholar.
Petri Leskinen is a doctoral candidate at Aalto University. He received his M.Sc. (Tech.) at Aalto University in 2016. He is completing his dissertation on actor ontologies. His interests include data mining, data science, machine learning, and network analysis. He is currently a course assistant on the Bachelor Seminar. He has contributed to over 30 research articles since 2016. He was a course assistant on eight different bachelor-level courses at the departments of Computer Science and Mathematics and System Analysis at Aalto University in 2011-2016.
Publications
2024
Eero Hyvönen, Patrik Boman, Heikki Rantala, Annastiina Ahola and Petri Leskinen: ConfermentSampo - A Knowledge Graph, Data Service, and Semantic Portal for Intangible Academic Cultural Heritage 1643-2023 in Finland. Proceedings of the 6th International Knowledge Graph and Semantic Web Conference, Dec 11-13, 2024, Paris, France, Springer-Verlag, October, 2024. Accepted. bib pdf
Petri Leskinen: Modeling and Using Biographical Linked Data for Prosopographical Data Analysis. Dissertation, Aalto University, School of Science, Department of Computer Science, October, 2024. bib pdf
Eero Hyvönen, Laura Sinikallio, Petri Leskinen, Senka Drobac, Rafael Leal, Matti La Mela, Jouni Tuominen, Henna Poikkimäki and Heikki Rantala: Publishing and Using Parliamentary Linked Data on the Semantic Web: ParliamentSampo System for Parliament of Finland. Semantic Web, October, 2024. In print. bib pdf
Heikki Rantala, Petri Leskinen, Lilli Peura and Eero Hyvönen: Representing and searching associations in cultural heritage knowledge graphs using faceted search. Knowledge Graphs in the Age of Language Models and Neuro-Symbolic AI. Proceedings of the 20th International Conference on Semantic Systems, 17–19 September 2024, Amsterdam, The Netherlands, pp. 420-435, IOS Press, September, 2024. bib pdf link
Henna Poikkimäki, Petri Leskinen, Eero Hyvönen: Using Network Analysis for Studying Cultural Heritage Knowledge Graphs – Case Correspondence Networks in Grand Duchy of Finland 1809–1917. August, 2024. Under review. bib pdf
Henna Poikkimäki, Kati Katajisto, Petri Leskinen and Eero Hyvönen: Applying Network and Bibliometric Analyses to Mentions of Politicians in Plenary Speeches: Case ParliamentSampo - Parliament of Finland on the Semantic Web. July, 2024. Submitted for evaluation. bib pdf
Petri Leskinen and Eero Hyvönen: Biographical and Prosopographical Analyses of Finnish Academic People 1640–1899 Based on Linked Open Data. Proceedings of the Biographical Data in a Digital World 2022 (BD 2022), Tokyo, Institute of Cultural History, ZRC SAZU, Ljubljana, Slovenia, January, 2024. bib pdf link
Petri Leskinen, Javier Ureña-Carrion, Jouni Tuominen, Mikko Kivelä, Eero Hyvönen: Knowledge Graphs and Data Services for Studying Historical Epistolary Data in Network Science on the Semantic Web. Semantic Web, IOS Press, 2024. Under open review. bib pdf link
2023
Senka Drobac, Johanna Enqvist, Petri Leskinen, Muhammad Faiz Wahjoe, Heikki Rantala, Mikko Koho, Ilona Pikkanen, Iida Jauhiainen, Jouni Tuominen, Hanna-Leena Paloposki, Matti La Mela and Eero Hyvönen: The Laborious Cleaning: Acquiring and Transforming 19th-Century Epistolary Metadata. Digital Humanities in the Nordic and Baltic Countries Publication, DHNB2023 Conference Proceeding, vol. 5, no. 1, pp. 248-262, University of Oslo Library, Norway, 2023. bib pdf link
Eero Hyvönen, Petri Leskinen and Jouni Tuominen: LetterSampo – Historical Letters on the Semantic Web: A Framework and Its Application to Publishing and Using Epistolary Data of the Republic of Letters. Journal on Computing and Cultural Heritage, vol. 16, no. 1, 2023. bib pdf link
Heikki Rantala, Eero Hyvönen and Petri Leskinen: Finding relations between entities in a knowledge graph: Case artists of the Getty Union List of Artist Names (ULAN). 2023. Submitted for review. bib pdf
Heikki Rantala, Eero Hyvönen and Petri Leskinen: Finding and explaining relations in a biographical knowledge graph based on life events: Case BiographySampo. ESWC 2023 Workshops and tutorials joint proceedings, CEUR Workshop Proceedings, 2023. Forthcoming. bib pdf
Senka Drobac, Petri Leskinen and Muhammad Faiz Wahjoe: Navigating the Challenges of Deduplicating Actors in Historical Letter Exchanges. Proceedings of the 24th European Conference on Knowledge Management, vol. 24, no. 2, pp. 1694-1697, Academic Conferences International Limited, 2023. bib link
Matthias Schlögl, Jouni Tuominen, Joonas Kesäniemi, Petri Leskinen, Victor de Boer, Go Sugimoto and Joh Dokler: The InTaVia Knowledge Graph - Publishing European National Biographical and Cultural Heritage Object Data. December, 2023. Submitted for review. bib pdf
Eero Hyvönen, Petri Leskinen and Jouni Tuominen: A Data-driven Approach to Create an Ontology of Parliamentary Work: Case Parliament of Finland on the Semantic Web. Proceedings of SWODCH 2023. Semantic Web and Ontology Design for Cultural Heritage. Co-located with the 22nd International Semantic Web Conference (ISWC 2023) in Athens, Greece, CEUR Workshop Proceedings, Vol-3540, November, 2023. bib pdf link
Eero Hyvönen, Patrik Boman, Heikki Rantala, Annastiina Ahola and Petri Leskinen: Promootiosampo - Helsingin yliopiston filosofisen tiedekunnan 100 promootiota 1643-2023 semanttisessa webissä (ConfermentSampo - 100 Conferment Ceremonies of the Faculty of Philosophy of the University of Helsinki 1643-2023 on the Semantic Web). Aalto University and University of Helsinki, Semantic Computing Research Group (SeCo), October, 2023. Article manuscript, under review. bib pdf
Petri Leskinen and Eero Hyvönen: Biographical and Prosopographical Analyses of Finnish Academic People 1640–1899 Based on Linked Open Data. Biographical Data in a Digital World 2022 (BD 2022), Tokyo, Proceedings, accepted, August, 2023. Forthcoming. bib pdf
Eero Hyvönen, Laura Sinikallio, Petri Leskinen, Senka Drobac, Rafael Leal, Matti La Mela, Jouni Tuominen, Henna Poikkimäki and Heikki Rantala: Plenary Speeches of the Parliament of Finland as Linked Open Data and Data Services. Joint Proceedings of the Second International Workshop on Knowledge Graph Generation From Text and the First International BiKE Challenge co-located with 20th Extended Semantic Web Conference (ESWC 2023), pp. 1-20, CEUR Workshop Proceedings, Vol. 3447, August, 2023. bib pdf link
Eero Hyvönen, Petri Leskinen and Heikki Rantala: Integrating Faceted Search with Data Analytic Tools in the User Interface of ParliamentSampo - Parliament of Finland on the Semantic Web. The Semantic Web: ESWC 2023 Satellite Events, pp. 16-21, Springer-Verlag, June, 2023. bib pdf
Eero Hyvönen, Petri Leskinen, Laura Sinikallio, Senka Drobac, Rafael Leal, Matti La Mela, Jouni Tuominen, Henna Poikkimäki and Heikki Rantala: ParliamentSampo Infrastructure for Publishing the Plenary Speeches and Networks of Politicians of the Parliament of Finland as Open Data Services. Aalto University, Dept. of Computer Science, February, 2023. Paper published at the publication event of the ParliamentSampo data service and portal. bib pdf
Minna Tamper, Petri Leskinen, Eero Hyvönen, Risto Valjus and Kirsi Keravuori: Analyzing Biography Collection Historiographically as Linked Data: Case National Biography of Finland. Semantic Web – Interoperability, Usability, Applicability, vol. 14, no. 2, pp. 385-419, IOS Press, 2023. bib pdf link
2022
Telma Peura, Petri Leskinen and Eero Hyvönen: What Linked Data Can Tell about Geographical Trends in Finnish Fiction Literature - Using the BookSampo Knowledge Graph in Digital Humanities. 2022. Abstract under peer review. bib
Henna Poikkimäki, Petri Leskinen, Minna Tamper and Eero Hyvönen: Analyses of Networks of Politicians Based on Linked Data: Case ParliamentSampo - Parliament of Finland on the Semantic Web. New Trends in Database and Information Systems, pp. 585-592, Springer International Publishing, August, 2022. bib pdf link
Eero Hyvönen, Laura Sinikallio, Petri Leskinen, Matti La Mela, Jouni Tuominen, Kimmo Elo, Senka Drobac, Mikko Koho, Esko Ikkala, Minna Tamper, Rafael Leal and Joonas Kesäniemi: Linked Data Approach for Studying Parliamentary Speeches and Networks of Politicians in Finland 1907-2021 (long paper). Digital Humanities 2022, Conference Abstracts, July 25-29, 2022, Online, Tokyo, Japan, University of Tokyo, pp. 254-257, ADHO, July, 2022. bib link
Minna Tamper, Rafael Leal, Laura Sinikallio, Petri Leskinen, Jouni Tuominen and Eero Hyvönen: Extracting Knowledge from Parliamentary Debates for Studying Political Culture and Language. Proceedings of the 1st International Workshop on Knowledge Graph Generation From Text and the 1st International Workshop on Modular Knowledge co-located with 19th Extended Semantic Web Conference (ESWC 2022) (Sanju Tiwari, Nandana Mihindukulasooriya, Francesco Osborne, Dimitris Kontokostas, Jennifer D’Souza and Mayank Kejriwal (eds.)), vol. 3184, pp. 70-79, CEUR WS, May, 2022. International Workshop on Knowledge Graph Generation from Text (TEXT2KG 2022). bib pdf link
Petri Leskinen, Javier Ureña-Carrion, Jouni Tuominen, Mikko Kivelä and Eero Hyvönen: Knowledge Graphs and Data Services for Studying Historical Epistolary Data in Network Science on the Semantic Web. May, 2022. Submitted for review. bib pdf
Javier Ureña-Carrion, Petri Leskinen, Jouni Tuominen, Charles van den Heuvel, Eero Hyvönen and Mikko Kivelä: Communication Now and Then: Analyzing the Republic of Letters as a Communication Network. Applied Network Science, vol. 7, May, 2022. bib pdf link
Eero Hyvönen, Laura Sinikallio, Petri Leskinen, Matti La Mela, Jouni Tuominen, Kimmo Elo, Senka Drobac, Mikko Koho, Esko Ikkala, Minna Tamper, Rafael Leal and Joonas Kesäniemi: Finnish Parliament on the Semantic Web: Using ParliamentSampo Data Service and Semantic Portal for Studying Political Culture and Language. Digital Parliamentary data in Action (DiPaDA 2022), Workshop at the 6th Digital Humanities in Nordic and Baltic Countries Conference, long paper, pp. 69-85, CEUR Workshop Proceedings, Vol. 3133, May, 2022. bib pdf link
Jouni Tuominen, Mikko Koho, Ilona Pikkanen, Senka Drobac, Johanna Enqvist, Eero Hyvönen, Matti La Mela, Petri Leskinen, Hanna-Leena Paloposki and Heikki Rantala: Constellations of Correspondence: a Linked Data Service and Portal for Studying Large and Small Networks of Epistolary Exchange in the Grand Duchy of Finland. DHNB 2022 The 6th Digital Humanities in Nordic and Baltic Countries Conference, pp. 415-423, CEUR Workshop Proceedings, Vol. 3232, March, 2022. bib pdf link
Petri Leskinen, Heikki Rantala and Eero Hyvönen: Analyzing the Lives of Finnish Academic People 1640–1899 in Nordic and Baltic Countries: AcademySampo Data Service and Portal. DHNB 2022 The 6th Digital Humanities in Nordic and Baltic Countries Conference, CEUR Workshop Proceedings, long papers, Vol. 3232, March, 2022. bib pdf link
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen and Kirsi Keravuori: Linked Data – A Paradigm Shift for Publishing and Using Biography Collections on the Semantic Web. Proceedings of the Third Conference on Biographical Data in a Digital World (BD 2019), pp. 16-23, CEUR-WS Proceedings, vol. 3152, 2022. bib pdf link
2021
Eero Hyvönen, Laura Sinikallio, Petri Leskinen, Senka Drobac, Jouni Tuominen, Kimmo Elo, Matti La Mela, Mikko Koho, Esko Ikkala, Minna Tamper, Rafael Leal and Joonas Kesäniemi: Parlamenttisampo: eduskunnan aineistojen linkitetyn avoimen datan palvelu ja sen käyttömahdollisuudet (ParliamentSampo: A Linked Open Data Service for the Materials of the Parliament of Finland and Its Potential Uses). Informaatiotutkimus, vol. 40, no. 3, pp. 216-244, November, 2021. bib pdf link
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen and Kirsi Keravuori: Biografiasampo yhdistää ja rikastaa suomalaiset elämäkerrat linkitettynä datana semanttisessa webissä (BiographySampo links and enriches Finnish biographies as linked data on the Semantic Web). Informaatiotutkimus, vol. 40, no. 3, pp. 346-368, November, 2021. bib pdf link
Petri Leskinen, Eero Hyvönen and Jouni Tuominen: Sparql2GraphServer: a Server-side Tool for Extracting Networks from Linked Data for Data Analysis. ISWC-Posters-Demos-Industry 2021 International Semantic Web Conference (ISWC) 2021: Posters, Demos, and Industry Tracks, CEUR Workshop Proceedings, Oct, 2021. bib pdf link
Petri Leskinen and Eero Hyvönen: Using the AcademySampo Portal and Data Service for Biographical and Prosopographical Research in Digital Humanities. ISWC-Posters-Demos-Industry 2021 International Semantic Web Conference (ISWC) 2021: Posters, Demos, and Industry Tracks, CEUR Workshop Proceedings, Oct, 2021. bib pdf link
Petri Leskinen and Eero Hyvönen: Reconciling and Using Historical Person Registers as Linked Open Data in the AcademySampo Knowledge Graph. Proceedings of the 20th International Semantic Web Conference (ISWC 2021), Springer, October, 2021. bib pdf link
Minna Tamper, Eero Hyvönen and Petri Leskinen: Visualizing and Analyzing Networks of Named Entities in Biographical Dictionaries for Digital Humanities Research. Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2019), Springer-Verlag, October, 2021. Forthcoming. bib pdf
This paper shows how named entity extraction and network analysis can be used to examine biographies individually and in groups to aid historians in biographical and prosopographical research. For this purpose a reference network of 13 100 biographies in the collections of the Biographical Centre of the Finnish Literature Society was created, based on links between the biographies as well as automatically extracted named entities found in the texts. The data was published in a SPARQL endpoint as a Linked Data knowledge graph, on top of which network analytic tools were created and analyses were carried out, showing the usefulness of the approach in Digital Humanities. The reference graph has been utilized for network analysis to examine egocentric networks of individual persons as well as networks among groups of people in prosopography. The data and tools presented have been in use since autumn 2018 in the semantic portal BiographySampo, which has had tens of thousands of users.
Eero Hyvönen, Petri Leskinen, Heikki Rantala, Esko Ikkala and Jouni Tuominen: Akatemiasampo-portaali ja -datapalvelu henkilöiden ja henkilöryhmien historialliseen tutkimukseen (AcademySampo Portal and Data Service for Biographical and Prosopographical Research). Informaatiotutkimus, vol. 40, no. 2, pp. 28-56, May, 2021. bib pdf link
Mikko Koho, Esko Ikkala, Petri Leskinen, Minna Tamper, Jouni Tuominen and Eero Hyvönen: WarSampo Knowledge Graph: Finland in the Second World War as Linked Open Data. Semantic Web – Interoperability, Usability, Applicability, vol. 12, no. 2, pp. 265-278, January, 2021. bib pdf link
The Second World War (WW2) is arguably the most devastating catastrophe of human history, a topic of great interest to not only researchers but the general public. However, data about the Second World War is heterogeneous and distributed in various organizations and countries making it hard to utilize. In order to create aggregated global views of the war, a shared ontology and data infrastructure is needed to harmonize information in various data silos. This makes it possible to share data between publishers and application developers, to support data analysis in Digital Humanities research, and to develop data-driven intelligent applications. As a first step towards these goals, this article presents the WarSampo knowledge graph (KG), a shared semantic infrastructure, and a Linked Open Data (LOD) service for publishing data about WW2, with a focus on Finnish military history. The shared semantic infrastructure is based on the idea of representing war as a spatio-temporal sequence of events that soldiers, military units, and other actors participate in. The used metadata schema is an extension of CIDOC CRM, supplemented by various military historical domain ontologies. With an infrastructure containing shared ontologies, maintaining the interlinked data brings upon new challenges, as one change in an ontology can propagate across several datasets that use it. To support sustainability, a repeatable automatic data transformation and linking pipeline has been created for rebuilding the whole WarSampo KG from the individual source datasets. The WarSampo KG is hosted on a data service based on W3C Semantic Web standards and best practices, including content negotiation, SPARQL API, download, automatic documentation, and other services supporting the reuse of the data. The WarSampo KG, a part of the international LOD Cloud and totalling ca. 14 million triples, is in use in nine end-user application views of the WarSampo portal, which has had over 400 000 end users since its opening in 2015.
Petri Leskinen, Eero Hyvönen and Jouni Tuominen: Members of Parliament in Finland Knowledge Graph and Its Linked Open Data Service. Further with Knowledge Graphs. Proceedings of the 17th International Conference on Semantic Systems, 6-9 September 2021, Amsterdam, The Netherlands, pp. 255-269, IOS Press, 2021. bib pdf link
2020
Petri Leskinen and Eero Hyvönen: Linked Open Data Service about Historical Finnish Academic People in 1640–1899. DHN 2020 Digital Humanities in the Nordic Countries. Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, pp. 284-292, CEUR Workshop Proceedings, vol. 2612, Riga, Latvia, October, 2020. bib pdf link
Mikko Koho, Petri Leskinen and Eero Hyvönen: Integrating Historical Person Registers as Linked Open Data in the WarSampo Knowledge Graph. Semantic Systems. In the Era of Knowledge Graphs. SEMANTiCS 2020 (Eva Blomqvist, Paul Groth, Victor de Boer, Tassilo Pellegrini, Mehwish Alam, Tobias Käfer, Peter Kieseberg, Sabrina Kirrane, Albert Meroño-Peñuela and Harshvardhan J. Pandit (eds.)), Lecture Notes in Computer Science, vol. 12378, pp. 118-126, Springer, Cham, Amsterdam, The Netherlands, October, 2020. bib pdf link
Semantic data integration from heterogeneous, distributed data silos enables Digital Humanities research and application development employing a larger, mutually enriched and interlinked knowledge graph. However, data integration is challenging, involving aligning the data models and reconciling the concepts and named entities, such as persons and places. This paper presents a record linkage process to reconcile person references in different military historical person registers with structured metadata. The information about persons is aggregated into a single knowledge graph. The process was applied to reconcile three person registers of the popular semantic portal WarSampo -- Finnish World War 2 on the Semantic Web. The registers contain detailed information about some 100,000 people and are individually maintained by domain experts. Thus, the integration process needs to be automatic and adaptable to changes in the registers. An evaluation of the record linkage results is promising and provides some insight into military person register reconciliation in general.
Minna Tamper, Petri Leskinen, Jouni Tuominen and Eero Hyvönen: Modeling and Publishing Finnish Person Names as a Linked Open Data Ontology. 3rd Workshop on Humanities in the Semantic Web (WHiSe 2020), pp. 3-14, CEUR Workshop Proceedings, vol. 2695, June, 2020. bib pdf link
2019
Mikko Koho, Erkki Heino, Petri Leskinen, Esko Ikkala, Minna Tamper, Kasper Apajalahti, Jouni Tuominen, Eetu Mäkelä and Eero Hyvönen: WarSampo Knowledge Graph. Zenodo, October, 2019. Dataset. bib link
WarSampo Knowledge Graph includes harmonized data of different kinds concerning the Second World War in Finland, separated in different subgraphs representing events, actors, places, photographs, and other aspects and documentation of the war. The data covers the Winter War 1939-1940 against the Soviet attack, the Continuation War 1941-1944 where the occupied areas of the Winter War were temporarily regained, and the Lapland War 1944-1945, where the Finns pushed the German troops away from Lapland.
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen and Kirsi Keravuori: BiographySampo - Publishing and Enriching Biographies on the Semantic Web for Digital Humanities Research. The Semantic Web. ESWC 2019 (Pascal Hitzler, Miriam Fernández, Krzysztof Janowicz, Amrapali Zaveri, Alasdair J.G. Gray, Vanessa Lopez, Armin Haller and Karl Hammar (eds.)), pp. 574-589, Springer-Verlag, June, 2019. bib pdf link
Petri Leskinen and Eero Hyvönen: Extracting Genealogical Networks of Linked Data from Biographical Texts. The Semantic Web: ESWC 2019 Satellite Events (Hitzler, P., Kirrane, S., Hartig, O., de Boer, V., Vidal, M.-E., Maleshkova, M., Schlobach, S., Hammar, K., Lasierra, N., Stadtmüller, S., Hose, K., Verborgh, R. (eds.)), pp. 121-125, Springer, June, 2019. bib pdf
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen and Kirsi Keravuori: Demonstrating BiographySampo in Solving Digital Humanities Research Problems in Biography and Prosopography. The Fourth Digital Humanities in the Nordic Countries 2019 (DHN2019), Book of Abstracts, University of Copenhagen, Copenhagen, Denmark, March, 2019. bib pdf link
2018
Petri Leskinen, Eero Hyvönen and Jouni Tuominen: Analyzing and Visualizing Prosopographical Linked Data Based on Biographies. Proceedings of the Second Conference on Biographical Data in a Digital World 2017 (BD2017), vol. 2119, pp. 39-44, CEUR Workshop Proceedings, Linz, Austria, 2018. bib pdf link
This paper shows how faceted search on biographical data can be utilized as a flexible basis for filtering target groups of people and, in particular, how generic data analysis and visualization tools can then be applied for solving prosopographical research questions based on the filtered data. This idea is demonstrated and evaluated in practice by presenting two application case studies: 1) linked data extracted from a printed registry of over 10 000 alumni (1867–1992) of the prominent Finnish high school Norssi, and 2) a knowledge graph extracted from 13 000 short biographies of significant Finnish people (from 3rd century to present times) in the National Biography of Finland. In both cases, the data is enriched by linking their entities with several other external datasets.
Jouni Tuominen, Eero Hyvönen and Petri Leskinen: Bio CRM: A Data Model for Representing Biographical Data for Prosopographical Research. Proceedings of the Second Conference on Biographical Data in a Digital World 2017 (BD2017), vol. 2119, pp. 59-66, CEUR Workshop Proceedings, Linz, Austria, 2018. bib pdf link
Biographies make a promising application case of Linked Data: they can be used, e.g., as a basis for Digital Humanities research in prosopography and as a key data and linking resource in semantic Cultural Heritage (CH) portals. In both use cases, a semantic data model for harmonizing and interlinking heterogeneous data from different sources is needed. This paper presents such a data model, Bio CRM, with the following key ideas: 1) The model is a domain specific extension of CIDOC CRM, making it applicable to not only biographical data but to other CH data, too. 2) The model makes a distinction between enduring unary roles of actors, their enduring binary relationships, and perduring events, where the participants can take different roles modeled as a role concept hierarchy. 3) The model can be used as a basis for semantic data validation and enrichment by reasoning. 4) The enriched data conforming to Bio CRM is targeted to be used by SPARQL queries in flexible ways using a hierarchy of roles in which participants can be involved in events.
Goki Miyakita, Petri Leskinen and Eero Hyvönen: Using Linked Data for Prosopographical Research of Historical Persons: Case U.S. Congress Legislators. Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. 7th International Conference, EuroMed 2018, Nicosia, Cyprus, Springer-Verlag, November, 2018. bib pdf
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen and Kirsi Keravuori: Biografiasammon tekoäly yhdistää ja rikastaa suomalaiset elämäkerrat semanttisessa webissä (The Artificial Intelligence of BiographySampo Links and Enriches Finnish Biographies on the Semantic Web). Aalto University, Semantic Computing Research Group (SeCo), Nov, 2018. bib pdf
The BiographySampo system opens a new era in publishing and using biography collections on the web. The core material of the system is the National Biography of Finland together with other short biographies edited by the Finnish Literature Society (SKS) and scholarly societies, a total of 13 100 life stories written by 900 Finnish researchers. The innovation of BiographySampo is to use language technology, artificial intelligence, and Semantic Web technologies to create, from the biography texts and related information in various sources, a knowledge graph and a national data infrastructure consisting of millions of connections between pieces of data. The knowledge graph has been published in a linked data service, on top of which an intelligent web service, biografiasampo.fi, has been implemented. The service consists of seven application views and is open to all and free of charge for citizens and Digital Humanities researchers.
Minna Tamper, Petri Leskinen, Kasper Apajalahti and Eero Hyvönen: Using Biographical Texts as Linked Data for Prosopographical Research and Applications. Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. 7th International Conference, EuroMed 2018, Nicosia, Cyprus (Marinos Ioannides, Eleanor Fink, Raffaella Brumana, Petros Patias, Anastasios Doulamis, João Martins and Manolis Wallace (eds.)), pp. 125-137, Springer-Verlag, November, 2018. bib pdf link
Goki Miyakita, Petri Leskinen and Eero Hyvönen: U.S. Congress Prosopographer - A Tool for Prosopographical Research of Legislators. Proceedings of the ISWC 2018 Posters & Demonstrations, Industry and Blue Sky Ideas Tracks, CEUR Workshop Proceedings, Monterey, California, USA, October, 2018. Vol 2180. bib pdf link
Petri Leskinen, Goki Miyakita, Mikko Koho and Eero Hyvönen: Combining Faceted Search with Data-analytic Visualizations on Top of a SPARQL Endpoint. Proceedings of VOILA 2018, Monterey, California. CEUR Workshop Proceedings, Vol. 2187, October, 2018. bib pdf
Eero Hyvönen, Petri Leskinen, Minna Tamper, Jouni Tuominen and Kirsi Keravuori: Semantic National Biography of Finland. Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN 2018), pp. 372-385, CEUR Workshop Proceedings, Vol-2084, Helsinki, Finland, March, 2018. bib pdf link
This paper presents the vision of publishing and utilizing textual biographies as Linked (Open) Data on the Semantic Web. As a case study, we publish the life stories of the National Biography of Finland, created by the Finnish Literature Society, as semantic, i.e., machine “understandable” metadata in a SPARQL endpoint using the Linked Data Finland (LDF.fi) service. On top of the data service various Digital Humanities applications are built. The applications include searching and studying individual personal histories as well as historical research of groups of persons using methods of prosopography. The biographical data is enriched by extracting events from unstructured and semi-structured texts, and by linking entities internally and to external data sources. A faceted semantic search engine is provided for filtering groups of people from the data for prosopographical research. An extension of the event-based CIDOC CRM ontology is used as the underlying data model, where lives are seen as chains of interlinked events populated from the data of the biographies and additional data sources, such as museum collections, library databases, and archives.
2017
Mikko Koho, Eero Hyvönen, Erkki Heino, Jouni Tuominen, Petri Leskinen and Eetu Mäkelä: Linked Death - Representing, Publishing, and Using Second World War Death Records as Linked Open Data. The Semantic Web: ESWC 2017 Satellite Events (Eva Blomqvist, Katja Hose, Heiko Paulheim, Agnieszka Ławrynowicz, Fabio Ciravegna and Olaf Hartig (eds.)), pp. 369-383, Springer, Cham, 2017. bib pdf link
War history of the Second World War (WW2), humankind’s largest disaster, is of great interest to both laymen and researchers. Most of us have ancestors and relatives who participated in the war, and in the worst case got killed. Researchers are eager to find out what actually happened then, and even more importantly why, so that future wars could perhaps be prevented. The darkest data of war history are casualty records—from such data we could perhaps learn most about the war. This paper presents a model and system for representing death records as linked data, so that 1) citizens could find out more easily what happened to their relatives during WW2 and 2) digital humanities (DH) researchers could (re)use the data easily for research.
Jouni Tuominen, Eero Hyvönen and Petri Leskinen: Bio CRM: A Data Model for Representing Biographical Data for Prosopographical Research. Biographical Data in a Digital World 2017 (BD2017), Linz, Austria, November, 2017. bib pdf link
Petri Leskinen, Eero Hyvönen and Jouni Tuominen: Analyzing and Visualizing Prosopographical Linked Data Based on Short Biographies. Biographical Data in a Digital World 2017 (BD2017), Linz, Austria, November, 2017. bib pdf link
Petri Leskinen, Mikko Koho, Erkki Heino, Minna Tamper, Esko Ikkala, Jouni Tuominen, Eetu Mäkelä and Eero Hyvönen: Modeling and Using an Actor Ontology of Second World War Military Units and Personnel. Proceedings of the 16th International Semantic Web Conference (ISWC 2017) (Claudia d'Amato, Miriam Fernandez, Valentina Tamma, Freddy Lecue, Philippe Cudré-Mauroux, Juan Sequeda, Christoph Lange and Jeff Heflin (eds.)), pp. 280-296, Springer-Verlag, Vienna, Austria, October, 2017. bib pdf link
This paper presents a model for representing historical military personnel and army units, based on large datasets about World War II in Finland. The model is in use in WarSampo data service and semantic portal, which has had tens of thousands of distinct visitors. A key challenge is how to represent ontological changes, since the ranks and units of military personnel, as well as the names and structures of army units change rapidly in wars. This leads to serious problems in both search as well as data linking due to ambiguity and homonymy of names. In our solution, actors are represented in terms of the events they participated in, which facilitates disambiguation of personnel and units in different spatio-temporal contexts. The linked data in the WarSampo Linked Open Data cloud and service has ca. 9 million triples, including actor datasets of ca. 100 000 soldiers and ca. 16 100 army units. To test the model in practice, an application for semantic search and recommending based on data linking was created, where the spatio-temporal life stories of individual soldiers can be reassembled dynamically by linking data from different datasets. An evaluation is presented showing promising results in terms of linking precision.
Esko Ikkala, Mikko Koho, Erkki Heino, Petri Leskinen, Eero Hyvönen and Tomi Ahoranta: Prosopographical Views to Finnish WW2 Casualties Through Cemeteries and Linked Open Data. Proceedings of the Workshop on Humanities in the Semantic Web (WHiSe II), CEUR Workshop Proceedings, Vienna, Austria, October, 2017. bib pdf link
This paper presents an application for studying the death records of WW2 casualties from a prosopographical perspective, provided by the various local military cemeteries where the dead were buried. The idea is to provide the end user with a global visual map view of the places in which the casualties were buried, as well as with a local historical perspective on what happened to the casualties that lie within a particular cemetery of a village or town. Plenty of data exists about the Second World War (WW2), but the data is typically archived in unconnected, isolated silos in different organizations. This makes it difficult to track down, visualize, and study information that is contained within multiple distinct datasets. In our work, this problem is solved using aggregated Linked Open Data provided by the WarSampo Data Service and SPARQL endpoint.
Petri Leskinen, Jouni Tuominen, Erkki Heino and Eero Hyvönen: An Ontology and Data Infrastructure for Publishing and Using Biographical Linked Data. Proceedings of the Workshop on Humanities in the Semantic Web (WHiSe II), pp. 15-26, CEUR Workshop Proceedings, Vol. 2014, Vienna, Austria, October, 2017. bib pdf link
This paper describes the ontology model and published datasets of a digitized biographical person register. The applied ontology model is designed to represent people via their enduring roles and perduring lifetime events. The model is designed to support 1) prosopographical Digital Humanities research, 2) linking to resources in semantic Cultural Heritage portals, and 3) semantic data validation and enrichment by using SPARQL queries. The linked data approach makes it possible to enrich a person's biography by interlinking it with space- and time-related biographical events, persons related through social contacts or family relations, historical events, and personal achievements.
Eero Hyvönen, Erkki Heino, Petri Leskinen, Esko Ikkala, Mikko Koho, Minna Tamper, Jouni Tuominen and Eetu Mäkelä: WarSampo: Publishing and Using Linked Open Data about the Second World War. EuropeanaTech Insight, no. 7, Europeana, September, 2017. bib pdf link
The article gives an overview of the system WarSampo – Finnish World War 2 on the Semantic Web, the winner of the LODLAM Challenge 2017 Open Data Prize on June 29 in Venice, Italy.
Minna Tamper, Petri Leskinen, Esko Ikkala, Arttu Oksanen, Eetu Mäkelä, Erkki Heino, Jouni Tuominen, Mikko Koho and Eero Hyvönen: AATOS – a Configurable Tool for Automatic Annotation. Proceedings, Language, Data and Knowledge (LDK 2017), pp. 276-289, Springer-Verlag, Galway, Ireland, June, 2017. bib pdf link
This paper presents AATOS, an automatic annotation tool for providing documents with semantic annotations. The tool links entities found in the texts to ontologies defined by the user. The application is highly configurable and can be used with different natural language Finnish texts. The application was developed as a part of the WarSampo and Semantic Finlex projects and tested using Kansa Taisteli magazine articles and the consolidated Finnish legislation of Semantic Finlex. The quality of the automatic annotation was evaluated by measuring precision and recall against existing manual annotations. The results showed that the quality of the input text, as well as the selection and configuration of the ontologies, impacted the results.
Erkki Heino, Minna Tamper, Eetu Mäkelä, Petri Leskinen, Esko Ikkala, Jouni Tuominen, Mikko Koho and Eero Hyvönen: Named Entity Linking in a Complex Domain: Case Second World War History. Proceedings, Language, Data and Knowledge (LDK 2017), pp. 120-133, Springer-Verlag, Galway, Ireland, June, 2017. bib pdf link
This paper discusses the challenges of applying named entity linking in a rich, complex domain – specifically, the linking of 1) military units, 2) places and 3) people in the context of rich Second World War data. Multiple sub-scenarios are discussed in detail through concrete evaluations, analyzing the problems faced and the solutions developed. A key contribution of this work is to highlight the heterogeneity of problems and approaches needed even inside a single domain, depending on both the source data and the target authority.
Eero Hyvönen, Petri Leskinen, Erkki Heino, Jouni Tuominen and Laura Sirola: Reassembling and Enriching the Life Stories in Printed Biographical Registers: Norssi High School Alumni on the Semantic Web. Proceedings, Language, Data and Knowledge (LDK 2017), pp. 113-119, Springer-Verlag, Galway, Ireland, June, 2017. bib pdf link
This paper presents the idea of enriching printed biographical person registers with linked data related to events that took place after the register was published. By transforming printed historical documents into structured data, semantic search over written texts can be provided for the reader. Even more importantly, life stories of historical persons can be extended based on data linking by extracting semantic structures from printed texts, and by combining this data with external datasets and data services. Such linking provides an enriched context for prosopographical research on people in the register, as well as an enhanced reading experience for anyone interested in reading the biographies. As a concrete case study, a register 1867–1992 of over 10 000 alumni of the prominent Finnish high school “Norssi” was transformed into RDF, enriched by data linking, and published as a linked data service, and it is provided to end users via a faceted search engine and browser for studying the lives of historical persons and for prosopographical research.
2016
Petri Leskinen: Sotilashenkilöiden ja joukko-osastojen mallintaminen ja käyttö toimijaontologiana. MSc Thesis (in Finnish), Aalto University, School of Science, Degree Programme in Computer Science and Engineering, Dec, 2016. bib pdf
An actor ontology models persons and groups of persons in linked open data. The purpose of the actor ontology model is to make it possible to bring together material from different sources and to publish it in a uniform format, so that the information can be used both in digital humanities research and by offering user interfaces for browsing the material in visual form. The ontology follows the actor-event model, in which an actor is modeled as the sum of the biographical events related to them. The solutions are based on the CIDOC CRM standard, which was chosen to ensure that the model is easy to extend and to follow a uniform publishing practice for cultural-historical data. The work was done as part of the larger WarSampo (Sotasampo) project, in which a comprehensive database of Second World War material concerning Finland was collected. My own contribution in this work was creating the actor ontology model and populating it with military personnel and units. The data has been published as open data (http://www.ldf.fi/dataset/warsa) and can be browsed in the WarSampo portal (http://www.sotasampo.fi).
Eero Hyvönen, Erkki Heino, Petri Leskinen, Esko Ikkala, Mikko Koho, Minna Tamper, Jouni Tuominen and Eetu Mäkelä: Publishing Second World War History as Linked Data Events on the Semantic Web. Proceedings of Digital Humanities 2016, short papers, pp. 571-573, Kraków, Poland, July, 2016. bib pdf link
Data about wars is typically heterogeneous, distributed in the data silos of the fighting parties, multilingual, and often controversial depending on the political point of view. It is therefore hard for historians to get a global picture of what has actually happened, to whom, where, when, and how. We argue that Semantic Web and Linked Data technologies are a very promising approach for modeling, harmonizing, and aggregating data about war history. Our goal is to make it possible, for both historians and laymen, to study history in a contextualized way where linked datasets enrich each other. The paper presents the in-use WarSampo system, where massive collections of heterogeneous data about the (Finnish) history of the Second World War are harmonized using an event-based approach, and provided as a Linked Open Data service for applications to use. As a use case, a semantic portal WarSampo providing six different perspectives to the war based on events is presented.
Mikko Koho, Eero Hyvönen, Erkki Heino, Jouni Tuominen, Petri Leskinen and Eetu Mäkelä: Linked Death - Representing, Publishing, and Using Second World War Death Records as Linked Open Data. Proceedings of the 1st Workshop on Humanities in the Semantic Web (WHiSe), CEUR Workshop Proceedings, Heraklion, Crete, Greece, May, 2016. Vol 1608. bib pdf link
War history of the Second World War (WW2), humankind's largest disaster, is of great interest to both laymen and researchers. Most of us have ancestors and relatives who participated in the war, and in the worst case got killed. Researchers are eager to find out what actually happened then, and even more importantly why, so that future wars could perhaps be prevented. The darkest data of war history are casualty records; from such data we could perhaps learn most about the war. This paper presents a model and system for representing death records as linked data, so that 1) citizens could find out more easily what happened to their relatives during WW2 and 2) digital humanities (DH) researchers could (re)use the data easily for research.
Eero Hyvönen, Erkki Heino, Petri Leskinen, Esko Ikkala, Mikko Koho, Minna Tamper, Jouni Tuominen and Eetu Mäkelä: WarSampo Data Service and Semantic Portal for Publishing Linked Open Data about the Second World War History. The Semantic Web – Latest Advances and New Domains (ESWC 2016) (Harald Sack, Eva Blomqvist, Mathieu d'Aquin, Chiara Ghidini, Simone Paolo Ponzetto and Christoph Lange (eds.)), pp. 758-773, Springer-Verlag, May, 2016. bib pdf link
This paper presents the WarSampo system for publishing collections of heterogeneous, distributed data about the Second World War on the Semantic Web. WarSampo is based on harmonizing massive datasets using event-based modeling, which makes it possible to enrich datasets semantically with each other's contents. WarSampo has two components: First, a Linked Open Data (LOD) service WarSampo Data for Digital Humanities (DH) research and for creating applications related to war history. Second, a semantic WarSampo Portal has been created to test and demonstrate the usability of the data service. The WarSampo Portal allows both historians and laymen to study war history and the destinies of their family members in the war from different interlinked perspectives. Published in November 2015, the WarSampo Portal had some 20,000 distinct visitors during the first three days, showing that the public has a great interest in this kind of application.
2015
Eero Hyvönen, Jouni Tuominen, Eetu Mäkelä, Jérémie Dutruit, Kasper Apajalahti, Erkki Heino, Petri Leskinen and Esko Ikkala: Second World War on the Semantic Web: The WarSampo Project and Semantic Portal. Proceedings of the ISWC 2015 Posters & Demonstrations Track, CEUR-WS Proceedings, Bethlehem, PA, USA, October, 2015. Vol 1486. bib pdf link
This paper initiates and fosters work on publishing Linked Open Data about the Second World War. It is argued that the heterogeneous, distributed data about the international world war history makes a promising use case for semantic technologies. We hope that by making war data openly available we can learn from the past and promote peace.
(total: 71 publications)