Semantic Biographies Based on Linked Data
We aim to develop methods and infrastructure for representing biographical narratives as Linked Data,
as well as practical demonstrators in which the usefulness of the new technology is tested and shown.
The general idea is to give the reader a richer reading experience by enriching biographical texts with linked contextual
information, such as the place and time of biographical events, and links to related persons, historical events, publications, paintings, etc.
Biographical data can also be used as a basis for Digital Humanities research, where e.g. prosopographies and networks of people are studied.
National Semantic Biography of Finland
As a first experiment, the Semantic National Biography was created, based on 6300 short biographies of the National Biography
of Finland by the Finnish Literature Society. The results include an online
demo and a research paper (cf. the Publications section below).
Reassembling the Republic of Letters
Research on biographical data is also of concern in the Reassembling the Republic of Letters project,
where SeCo leads the People and Networks Work Package, and in the Cultures of Knowledge project, where we collaborate with the University of Oxford and Stanford University on research into epistolary data of people active during the early modern period.
These projects were/are funded by Tekes and the Linked Data Finland consortium, and by the EU COST Office and Mellon Foundation.
Bio CRM: A Data Model for Representing Biographical Information for Prosopography
One of the results of the project is Bio CRM, a conceptual reference model for representing biographical information, with a focus on prosopographical research. Bio CRM provides a semantic data model for harmonizing and interlinking heterogeneous data from different biographical data sources. The model includes structures for basic data of people, personal relations, professions, and events with participants in different roles.
The core design principles of the data model are:
- The model is a domain-specific extension of CIDOC CRM, making it applicable not only to biographical data but also to other Cultural Heritage (CH) data.
- The model makes a distinction between enduring unary roles of actors, their enduring binary relationships, and perduring events, where the participants can take different roles modeled as a role concept hierarchy.
- The model can be used as a basis for semantic data validation and enrichment by reasoning.
- Enriched data conforming to Bio CRM is intended to be queried flexibly with SPARQL, using the hierarchy of roles that participants can take in events.
See more information about Bio CRM in the working paper.
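The role hierarchy idea in the design principles above can be illustrated with a minimal Python sketch. It is not the Bio CRM vocabulary or a SPARQL query: the role names, event identifiers, and person identifiers are all hypothetical, and a plain dictionary stands in for the role concept hierarchy. The point is only that a query for a general role also finds participants recorded with a more specific role.

```python
from dataclasses import dataclass

# Hypothetical role concept hierarchy: child role -> parent role.
# These names are illustrative only and are not taken from Bio CRM.
ROLE_PARENT = {
    "Teacher": "Educator",
    "Professor": "Educator",
    "Educator": "Role",
    "Student": "Role",
}


@dataclass(frozen=True)
class Participation:
    """One participant taking part in one event in a given role."""
    event: str
    person: str
    role: str


def is_subrole(role, ancestor):
    """Return True if role equals ancestor or lies below it in the hierarchy."""
    current = role
    while current is not None:
        if current == ancestor:
            return True
        current = ROLE_PARENT.get(current)  # None once the top is passed
    return False


def participants_in_role(participations, role):
    """People who took part in any event in the given role or one of its subroles."""
    return sorted({p.person for p in participations if is_subrole(p.role, role)})


# Invented example data.
PARTICIPATIONS = [
    Participation("lecture_1", "person_a", "Professor"),
    Participation("lecture_1", "person_b", "Student"),
    Participation("school_year_1", "person_c", "Teacher"),
]

print(participants_in_role(PARTICIPATIONS, "Educator"))  # ['person_a', 'person_c']
```

In an RDF setting the same subsumption step would typically be handled by reasoning or by a property path in the query, rather than by a hand-written loop.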
Reassembling the War History of Military Personnel and Units of WW2
A key goal of the project WarSampo - Finnish WW2 on the Semantic Web
is to reassemble the life stories and narratives of WW2 soldiers and army units based on Linked Open Data from different data sources.
For this, event-based modeling and an extension of CIDOC CRM are used.
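As a rough illustration of event-based reassembly, the following Python sketch merges events from two sources and orders the events of one actor in time. It is a simplification under stated assumptions and not the WarSampo data model: the `Event` fields, source names, identifiers, and dates are all invented placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """A hypothetical event record; the fields are illustrative, not CIDOC CRM terms."""
    kind: str
    when: date
    actors: frozenset
    source: str


def life_story(events, actor):
    """Collect every event involving the actor, whatever its source, in time order."""
    return sorted((e for e in events if actor in e.actors), key=lambda e: e.when)


# Invented placeholder data from two imaginary sources.
SERVICE_RECORDS = [
    Event("Promotion", date(1941, 6, 1), frozenset({"soldier_x"}), "service_records"),
    Event("JoinedUnit", date(1939, 11, 30), frozenset({"soldier_x", "unit_y"}), "service_records"),
]
WAR_DIARIES = [
    Event("Battle", date(1940, 2, 1), frozenset({"unit_y"}), "war_diaries"),
]

ALL_EVENTS = SERVICE_RECORDS + WAR_DIARIES

for event in life_story(ALL_EVENTS, "soldier_x"):
    print(event.when.isoformat(), event.kind, event.source)
```

Because persons and units are both treated as actors of events, the same function yields the story of `unit_y` as well, which is one reason an event-centric model suits data drawn from heterogeneous sources.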
Publishing Printed Person Registries on the Semantic Web
In this project, a printed person registry (short biographies) of 10 000 alumni
of the prominent Finnish high school Norssi in 1867-1992 was digitized and transformed into a
semantic portal. The data was enriched from external data sources, and the portal provides the end user with
faceted search and visualization tools for prosopographical research.
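The basic mechanics of faceted search can be sketched in a few lines of Python. This is only an illustration under assumptions, not the portal's implementation: the records, the facet names `profession` and `enrolment_decade`, and the values are invented.

```python
from collections import Counter

# Invented example records; the facet names are hypothetical.
PEOPLE = [
    {"id": "p1", "profession": "teacher", "enrolment_decade": 1880},
    {"id": "p2", "profession": "engineer", "enrolment_decade": 1880},
    {"id": "p3", "profession": "teacher", "enrolment_decade": 1920},
]


def apply_facets(records, selections):
    """Keep the records that match every selected facet value."""
    return [
        r for r in records
        if all(r.get(facet) == value for facet, value in selections.items())
    ]


def facet_counts(records, facet):
    """Count how many of the given records fall under each value of a facet."""
    return Counter(r[facet] for r in records)


# Select a group of people, then see how the group is distributed on another facet.
teachers = apply_facets(PEOPLE, {"profession": "teacher"})
print([r["id"] for r in teachers])  # ['p1', 'p3']
print(dict(facet_counts(teachers, "enrolment_decade")))
```

Selecting values narrows the result set to a group of people, and the counts on the remaining facets describe that group, which is the step that makes faceted search useful for prosopographical questions.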
Prof. Eero Hyvönen, Aalto University and University of Helsinki
Dr. Eetu Mäkelä, Aalto University
Dr. Jouni Tuominen, University of Helsinki and Aalto University
Publications
Eero Hyvönen, Petri Leskinen, Minna Tamper, Jouni Tuominen and Kirsi Keravuori: Semantic National Biography of Finland. Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN 2018), pp. 372-385, CEUR Workshop Proceedings, Helsinki, Finland, March, 2018.
This paper presents the vision of publishing and utilizing textual biographies as Linked (Open) Data on the Semantic Web. As a case study, we publish the life stories of the National Biography of Finland, created by the Finnish Literature Society, as semantic, i.e., machine “understandable” metadata in a SPARQL endpoint using the Linked Data Finland (LDF.fi) service. On top of the data service, various Digital Humanities applications are built. The applications include searching and studying individual personal histories as well as historical research of groups of persons using methods of prosopography. The biographical data is enriched by extracting events from unstructured and semi-structured texts, and by linking entities internally and to external data sources. A faceted semantic search engine is provided for filtering groups of people from the data for prosopographical research. An extension of the event-based CIDOC CRM ontology is used as the underlying data model, where lives are seen as chains of interlinked events populated from the data of the biographies and additional data sources, such as museum collections, library databases, and archives.
Petri Leskinen, Jouni Tuominen, Erkki Heino and Eero Hyvönen: An Ontology and Data Infrastructure for Publishing and Using Biographical Linked Data. Proceedings of the Workshop on Humanities in the Semantic Web (WHiSe II), CEUR Workshop Proceedings, Vienna, Austria, October, 2017.
This paper describes the ontology model and published datasets of a digitized biographical person register. The applied ontology model is designed to represent people via their enduring roles and perduring lifetime events. The model is designed to support 1) prosopographical Digital Humanities research, 2) linking to resources in semantic Cultural Heritage portals, and 3) semantic data validation and enrichment by using SPARQL queries. The linked data approach makes it possible to enrich a person's biography by interlinking it with space and time related biographical events, persons related by social contacts or family relations, historical events, and personal achievements.
Eero Hyvönen, Petri Leskinen, Erkki Heino, Jouni Tuominen and Laura Sirola: Reassembling and Enriching the Life Stories in Printed Biographical Registers: Norssi High School Alumni on the Semantic Web. Proceedings, Language, Data and Knowledge (LDK 2017), pp. 113-119, Springer-Verlag, Galway, Ireland, June, 2017.
This paper presents the idea of enriching printed biographical person registers with linked data related to events that took place after the register was published. By transforming printed historical documents into structured data, semantic search over written texts can be provided for the reader. Even more importantly, life stories of historical persons can be extended based on data linking by extracting semantic structures from printed texts, and by combining this data with external datasets and data services. Such linking provides an enriched context for prosopographical research on people in the register, as well as an enhanced reading experience for anyone interested in reading the biographies. As a concrete case study, a register 1867–1992 of over 10 000 alumni of the prominent Finnish high school “Norssi” was transformed into RDF, enriched by data linking, and published as a linked data service; it is provided to end users via a faceted search engine and browser for studying lives of historical persons and for prosopographical research.
Eero Hyvönen, Miika Alonen, Esko Ikkala and Eetu Mäkelä: Life Stories as Event-based Linked Data: Case Semantic National Biography. Proceedings of ISWC 2014 Posters & Demonstrations Track, CEUR Workshop Proceedings, October, 2014.
This paper argues, by presenting a case study and a demonstration on the web, that biographies make a promising application case of Linked Data: the reading experience can be enhanced by enriching the biographies with additional lifetime events, by providing the user with a spatio-temporal context for reading, and by linking the text to additional contents in related datasets.