Reassembling the Republic of Letters
SeCo participates in the Digital Humanities EU COST action
Reassembling the Republic of Letters 1500-1800, 2014-2018,
led by the University of Oxford.
Eero Hyvönen leads Work Group 2, People and Networks,
in the initiative involving over 30 countries.
Related to this topic, there is also a research project
Cultures of Knowledge, phase III (2015-2017),
where SeCo collaborates with Oxford and Stanford University with the goal of designing a Linked Open Data
infrastructure, services, and tooling for the underlying humanist scholarly community.
Prof. Eero Hyvönen, Aalto University
Dr. Eetu Mäkelä, Aalto University and University of Helsinki
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen and Kirsi Keravuori: BiographySampo – A Paradigm Shift for Publishing and Using Biography Collections on the Semantic Web
. November, 2018. bib pdf
This paper argues for making a paradigm shift in publishing and using biographical dictionaries on the web, based on Linked Data. Firstly, a biographical dictionary on the web should provide the end user with an enhanced reading experience of biographies by enriching them with data linking and reasoning. Secondly, the web publication should include not only biographies for humans to read but also versatile tooling for 1) biographical research of individual persons as well as for 2) prosopographical research on groups of people. To support these arguments, we present the design principles and the implementation of the semantic portal “BiographySampo – Finnish Life Stories on the Semantic Web”, especially from the end user’s point of view. The system is based on a Linked Data service and knowledge graph extracted automatically from a collection of 13 100 textual biographies, written by 900 researchers. The texts are enriched with data linking to 16 external data sources and by harvesting external collection data from libraries, museums, and archives. The portal, consisting of seven different interlinked application perspectives, was released on September 27, 2018, for free public use for Digital Humanities researchers and the general public.
Jouni Tuominen, Eero Hyvönen and Petri Leskinen: Bio CRM: A Data Model for Representing Biographical Data for Prosopographical Research
. Proceedings of the Second Conference on Biographical Data in a Digital World 2017 (BD2017)
, vol. 2119, pp. 59-66, CEUR Workshop Proceedings, Linz, Austria, 2018. bib pdf link
Biographies make a promising application case of Linked Data: they can be used, e.g., as a basis for Digital Humanities research in prosopography and as a key data and linking resource in semantic Cultural Heritage (CH) portals. In both use cases, a semantic data model for harmonizing and interlinking heterogeneous data from different sources is needed. This paper presents such a data model, Bio CRM, with the following key ideas: 1) The model is a domain-specific extension of CIDOC CRM, making it applicable not only to biographical data but to other CH data, too. 2) The model makes a distinction between enduring unary roles of actors, their enduring binary relationships, and perduring events, where the participants can take different roles modeled as a role concept hierarchy. 3) The model can be used as a basis for semantic data validation and enrichment by reasoning. 4) The enriched data conforming to Bio CRM is targeted to be used by SPARQL queries in flexible ways, using a hierarchy of roles in which participants can be involved in events.
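The role concept hierarchy mentioned in the abstract above can be illustrated with a small sketch. It is not the Bio CRM model or its SPARQL interface: it uses plain Python data instead of RDF, and all role names, event names, and function names are hypothetical. It only shows the general idea that a query for a broad role also matches participants recorded with more specific subroles.

```python
# Hypothetical role hierarchy: child role -> parent role (None marks a root).
# These names are illustrative only and are not taken from Bio CRM.
ROLE_PARENT = {
    "participant": None,
    "author": "participant",
    "recipient": "participant",
    "co_author": "author",
}

# Hypothetical event participation records: (event, actor, role).
PARTICIPATIONS = [
    ("letter_1", "alice", "author"),
    ("letter_1", "bob", "recipient"),
    ("letter_2", "carol", "co_author"),
]


def is_subrole(role, ancestor):
    """Return True if role equals ancestor or lies below it in the hierarchy."""
    while role is not None:
        if role == ancestor:
            return True
        role = ROLE_PARENT.get(role)  # unknown roles end the walk
    return False


def actors_in_role(ancestor):
    """List (event, actor) pairs whose role is ancestor or one of its subroles."""
    return [(e, a) for e, a, r in PARTICIPATIONS if is_subrole(r, ancestor)]
```

Under these assumptions, `actors_in_role("author")` returns both the direct author and the co-author, while `actors_in_role("participant")` returns every record. In an RDF setting, a comparable effect would typically be obtained with a transitive subclass or subproperty path in the query.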
Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen and Kirsi Keravuori: Biografiasammon tekoäly yhdistää ja rikastaa suomalaiset elämäkerrat semanttisessa webissä
. Aalto University, Semantic Computing Research Group (SeCo), November, 2018. bib pdf
The BiographySampo system launches a new era in publishing and using biography collections on the web. The core material of the system consists of the National Biography of Finland and other short biographies edited by the Finnish Literature Society (SKS) and learned societies, 13 100 life stories in total, written by 900 Finnish researchers. The innovation of BiographySampo is to use language technology, artificial intelligence, and Semantic Web technologies to create, from the biography texts and related information in various sources, a knowledge graph and a national data infrastructure consisting of millions of links between pieces of data. The knowledge graph has been published in a Linked Data service, on top of which biografiasampo.fi has been implemented: an intelligent, free web service open to all, consisting of seven application perspectives, for use by citizens and Digital Humanities researchers.
Minna Tamper, Petri Leskinen, Kasper Apajalahti and Eero Hyvönen: Using Biographical Texts as Linked Data for Prosopographical Research and Applications
. Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. 7th International Conference, EuroMed 2018, Nicosia, Cyprus
, Springer-Verlag, November, 2018. bib pdf
Jouni Tuominen, Eetu Mäkelä, Eero Hyvönen, Arno Bosse, Miranda Lewis and Howard Hotson: Reassembling the Republic of Letters - A Linked Data Approach
. Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN 2018)
, pp. 76-88, CEUR Workshop Proceedings, Helsinki, Finland, March, 2018. bib pdf link
Between 1500 and 1800, a revolution in postal communication allowed ordinary men and women to scatter letters across and beyond Europe. This exchange helped knit together what contemporaries called the respublica litteraria, Republic of Letters, a knowledge-based civil society, crucial to that era’s intellectual breakthroughs, and formative of many modern European values and institutions. To enable effective Digital Humanities research on the epistolary data distributed in different countries and collections, metadata about the letters have been aggregated, harmonised, and provided for the research community through the Early Modern Letters Online (EMLO) service. This paper discusses the idea and benefits of using Linked Data as a basis for the next digital framework of EMLO, and presents experiences of a first demonstrational implementation of such a system.
Eero Hyvönen, Petri Leskinen, Minna Tamper, Jouni Tuominen and Kirsi Keravuori: Semantic National Biography of Finland
. Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN 2018)
, pp. 372-385, CEUR Workshop Proceedings, Vol-2084, Helsinki, Finland, March, 2018. bib pdf link
This paper presents the vision of publishing and utilizing textual biographies as Linked (Open) Data on the Semantic Web. As a case study, we publish the life stories of the National Biography of Finland, created by the Finnish Literature Society, as semantic, i.e., machine “understandable” metadata in a SPARQL endpoint using the Linked Data Finland (LDF.fi) service. On top of the data service, various Digital Humanities applications are built. The applications include searching and studying individual personal histories as well as historical research of groups of persons using methods of prosopography. The biographical data is enriched by extracting events from unstructured and semi-structured texts, and by linking entities internally and to external data sources. A faceted semantic search engine is provided for filtering groups of people from the data for prosopographical research. An extension of the event-based CIDOC CRM ontology is used as the underlying data model, where lives are seen as chains of interlinked events populated from the data of the biographies and additional data sources, such as museum collections, library databases, and archives.
Eetu Mäkelä, Juha Törnroos, Thea Lindquist and Eero Hyvönen: WW1LOD: An application of CIDOC-CRM to World War 1 linked data
. International Journal on Digital Libraries, vol. 18, no. 4, pp. 333-343, Springer, November, 2017. bib pdf link
The CIDOC-CRM standard indicates that common events, actors, places and timeframes are important in linking together cultural material, and provides a framework for describing them. However, merely describing entities in this way in two datasets does not yet interlink them. To do that, the identities of instances still need to be either reconciled, or be based on a shared vocabulary. The WW1LOD dataset presented in this paper was created to facilitate both of these approaches for collections dealing with the First World War. For this purpose, the dataset includes events, places, agents, times, keywords, and themes related to the war, based on over ten different authoritative data sources from providers such as the Imperial War Museum. The content is harmonized into RDF, and published as a Linked Open Data service. While generally based on CIDOC-CRM, some of the modeling choices deviate from it where our experience dictated such. In the article, these deviations are discussed in the hope that they may serve as examples where CIDOC-CRM itself may warrant further examination. As a demonstration of use, the dataset and online service have been used to create a contextual reader application that is able to link together and pull in information related to WW1 from e.g. 1914–1918 Online, Wikipedia, WW1 Discovery, Europeana and the Digital Public Library of America.
Esko Ikkala, Eetu Mäkelä and Eero Hyvönen: TourRDF: Representing, Enriching, and Publishing Curated Tours Based on Linked Data
. 19th International Conference of Knowledge Engineering and Management (EKAW 2014), Demo and Poster Papers
, November, 2014. bib pdf
Current mobile tourist guide systems are developed and used in separate data silos: each system and vendor tends to use its own proprietary, closed formats for representing tours and point of interest (POI) content. As a result, tour data cannot be enriched from other providers’ tour and POI repositories, or from other external data sources, even when such data are made publicly available by, e.g., cities willing to promote tourism. This paper argues that an open, shared RDF-based tour vocabulary is needed to address these problems, and introduces such a model, TourRDF, extending the earlier TourML schema into the era of Linked Data. As a test and an evaluation of the approach, a case study based on data about the Unesco World Heritage site Suomenlinna fortress is presented.
Eero Hyvönen, Miika Alonen, Esko Ikkala and Eetu Mäkelä: Life Stories as Event-based Linked Data: Case Semantic National Biography
. Proceedings of ISWC 2014 Posters & Demonstrations Track
, CEUR Workshop Proceedings, October, 2014. bib pdf link
This paper argues, by presenting a case study and a demonstration on the web, that biographies make a promising application case of Linked Data: the reading experience can be enhanced by enriching the biographies with additional lifetime events, by providing the user with a spatio-temporal context for reading, and by linking the text to additional contents in related datasets.
Thea Lindquist, Michael Dulock, Juha Törnroos, Eero Hyvönen and Eetu Mäkelä: Using Linked Open Data to Enhance Subject Access in Online Primary Sources
. Cataloging & Classification Quarterly, vol. 51, no. 8, Taylor & Francis, 2013. bib link
Using online primary sources is both rewarding and challenging for users. Improving subject access is essential as these sources become increasingly important in educational curricula. A user needs assessment with humanities users showed improving findability and context for historical subjects were major needs. Linked Data can help by linking related concepts in the sources using specialized vocabularies, enriching them with outside resources, and enabling semantic services that empower users. This article discusses a project to enhance subject access in an online World War I collection by deep linking historical data on the civilian experience in occupied Belgium and France.
Thea Lindquist, Eero Hyvönen, Juha Törnroos, Eetu Mäkelä: Leveraging linked data to enhance subject access - A case study of the University of Colorado Boulder's World War I collection online
. World Library and Information Congress: 78th IFLA General Conference and Assembly, Helsinki
, IFLA, http://conference.ifla.org/ifla78, August, 2012. bib link
Academic users often find work with online primary sources both rewarding and challenging. Improving subject access in these sources is essential as digital collections propagate and work with primary sources becomes increasingly important in humanities curricula. A user needs assessment was conducted with humanities users at the University of Colorado Boulder to facilitate engagement with these sources. Two of the major user needs identified were improving findability and context, particularly for historical subjects. Linked Data can help meet these needs by linking related concepts in the sources using a specialized vocabulary, enriching them with outside resources, and enabling semantically rich services that empower users. This paper discusses a project the authors undertook to enhance subject access in CU’s WWI Collection Online by deep linking historical data on the civilian experience in occupied Belgium. This work is intended to lead to a richer understanding of forces shaping the WWI period.
Eero Hyvönen, Thea Lindquist, Juha Törnroos and Eetu Mäkelä: History on the Semantic Web as Linked Data - An Event Gazetteer and Timeline for World War I
. Proceedings of CIDOC 2012 - Enriching Cultural Heritage, Helsinki, Finland
, CIDOC, http://www.cidoc2012.fi/en/cidoc2012/programme, June, 2012. bib pdf
Events are an essential component of cultural heritage (CH) Linked Data (LD): they link actors, places, times, objects, and other events into larger narrative structures, providing a rich basis for semantic searching, recommending, analysis, and visualization of CH data. This paper argues that shared vocabularies (gazetteers, ontologies) of events, such as the “Battle of Normandy” or “Crucifixion of Jesus”, are necessary to facilitate the aggregation and linking of heterogeneous content from various collections. For example, biographies, histories, photos, and paintings often reference or depict events. A set of general requirements for an event gazetteer is presented, based on the needs of publishing, aggregating, and reusing cultural heritage content as Linked Data. After this, a metadata model addressing the presented requirements for representing historical events is outlined. The model is being applied in a case study aimed at developing an event ontology for World War I (WWI). Our goals from an end-user perspective are twofold: 1) Facilitate event-based cataloging for curators in memory organizations; 2) Utilize semantic event descriptions and narrative event structures in end-user applications for searching and linking documents and other content about WWI, and for structuring and visualizing them.
Eeva Ahonen and Eero Hyvönen: Publishing Historical Texts on the Semantic Web - A Case Study
. Proceedings of the Third IEEE International Conference on Semantic Computing (ICSC2009)
, Berkeley, CA, USA, September, 2009. bib pdf
Historical texts are an important component of cultural heritage, and are being digitized and published on the web in various portals for researchers and the public. However, searching and linking them with related contents is challenging due to the non-structured text form, digitization errors, and the differences and variations between old and modern language, including historical names (e.g. places), used for querying. This paper addresses these issues by presenting an approach and a system for publishing old texts on the semantic web. As a case study, an existing historical newspaper archive on the web is considered. In our model, semantic metadata is added to the text using automated concept extraction methods. Search is implemented with semantic techniques, by creating a multi-faceted search interface for the text materials. Problems due to OCR errors and spelling variants are addressed with a fuzzy string matching algorithm that tries to guess corresponding words in a lexicon and gives suggestions for corrected word forms. References between texts in the library as well as links between the library and external knowledge sources are formed by using shared ontologies for semantic annotations.
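The lexicon lookup described in the abstract above can be sketched as follows. The abstract does not specify the paper's matching algorithm, so this sketch substitutes the similarity ratio of Python's standard `difflib` module as a stand-in. The lexicon contents, the function name `suggest_corrections`, and the cutoff value are assumptions made for illustration.

```python
from difflib import get_close_matches

# Hypothetical lexicon of known word forms; a real one would be far larger.
LEXICON = ["helsinki", "newspaper", "archive", "history"]


def suggest_corrections(word, lexicon=LEXICON, max_suggestions=3, cutoff=0.75):
    """Suggest lexicon entries that resemble a possibly misrecognized word.

    Exact (case-insensitive) hits are returned as-is. Otherwise the closest
    lexicon entries by difflib's similarity ratio are returned, best first.
    The cutoff is an assumed threshold, not a value from the paper.
    """
    lowered = word.lower()
    if lowered in lexicon:
        return [lowered]
    return get_close_matches(lowered, lexicon, n=max_suggestions, cutoff=cutoff)
```

For instance, an OCR output such as `hels1nki` would be matched to `helsinki` under these settings, while a string with no similar lexicon entry yields an empty list. A production system would likely also need handling for historical spelling variants, which a purely character-based ratio does not capture.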
Eero Hyvönen, Olli Alm and Heini Kuittinen: Using an Ontology of Historical Events in Semantic Portals for Cultural Heritage
. Proceedings of the Cultural Heritage on the Semantic Web Workshop at the 6th International Semantic Web Conference (ISWC 2007)
, Busan, Korea, November 12, 2007. bib pdf
We argue that an ontology of historical events is needed in semantic portals for cultural heritage for three reasons. First, ontological identifiers (URIs) of events, such as World War II or the coronation of Napoleon, are needed in order to make collection metadata mutually interoperable in terms of related events, in the same vein as identifiers are needed for identifying artifact types, persons, and geolocations when annotating collection items. Second, events are of central importance in creating semantic links between cultural contents in applications such as recommendation systems. Third, historical events are important as content items of their own, forming the backbone of chronological histories.