BookSampo has been deployed by the Finnish public libraries
BookSampo - Finnish Fiction Literature on the Semantic Web
What is BookSampo?
BookSampo is a Linked Open Data (LOD) service and a semantic portal on top of it, in use in Finland.
It is a member of the Sampo series of LOD services and semantic portals
based on the Finnish Semantic Web infrastructure FIN-CLARIAH.
The BookSampo Knowledge Graph (KG) covers metadata about
practically all Finnish fiction literature available at Finnish public libraries on
a work level.
Since its initial publication in autumn 2011, the BookSampo KG has been extended with additional content, including non-fiction literature.
The system introduces a variety of semantic web novelties
deployed into practice: The underlying data model is based on the functional, content-centered
metadata indexing paradigm using RDF. Linked Data (LD) principles are used for mapping the
metadata with tens of interlinked ontologies in the national FinnONTO ontology infrastructure.
The contents are also linked with the large LD metadata repository of related
cultural heritage content of CultureSampo.
BookSampo was originally based on using the CultureSampo - Finnish Culture on the Semantic Web 2.0 system
as its data service, demonstrating the idea of re-using semantic content from multiple
perspectives without a need for modifications.
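As a rough illustration of the RDF-based, content-centered indexing described above, the following Python sketch stores a few statements as subject-predicate-object triples and looks up works that share a theme. All URIs, property names, and titles are hypothetical placeholders, not actual BookSampo or FinnONTO identifiers, and the plain list of tuples merely stands in for a real triplestore.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespaces; the real BookSampo and FinnONTO URIs differ.
EX = "http://example.org/booksampo/"
ONTO = "http://example.org/ontology/"

# A few invented statements: literal titles plus links to shared concepts.
TRIPLES: List[Triple] = [
    (EX + "work/1", ONTO + "title", "An Example Novel"),
    (EX + "work/1", ONTO + "theme", ONTO + "concept/friendship"),
    (EX + "work/1", ONTO + "placeOfEvents", ONTO + "place/helsinki"),
    (EX + "work/2", ONTO + "title", "Another Example Novel"),
    (EX + "work/2", ONTO + "theme", ONTO + "concept/friendship"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
            yield t


def works_with_theme(triples: Iterable[Triple], concept: str) -> List[str]:
    """Return the sorted subjects linked to the given theme concept."""
    return sorted(t[0] for t in match(triples, p=ONTO + "theme", o=concept))


if __name__ == "__main__":
    for work in works_with_theme(TRIPLES, ONTO + "concept/friendship"):
        print(work)
```

Because both works point to the same concept URI rather than to a free-text keyword, they can be found together, which is the basic effect of mapping metadata to shared ontologies.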
Public Portal on the Semantic Web
The portal has been online since autumn 2011 at:
http://www.kirjasampo.fi/.
BookSampo was developed as part of the national FinnONTO research project series (2003-2012) of Aalto University and the University of Helsinki, together with tens of partnering Finnish organizations in the research consortium, including the Finnish public libraries.
The original portal interface was implemented at the Finnish public library consortium Kirjastot.fi.
The system is based on the fiction literature ontology KAUNO (developed from the Kaunokki thesaurus) and other ontologies developed in the FinnONTO project, together with metadata from library databases, biographies, review articles, and other sources. The content, search, and recommendation services of the portal were provided through the APIs of the
CultureSampo system, which contained all semantic content of the system interlinked in RDF with other cultural materials.
Most of the BookSampo content was transformed automatically from existing databases,
with the help of ontologies derived from thesauri in use in Finland,
but in addition tens of volunteer librarians have participated in a Web 2.0
fashion in annotating and correcting the metadata, especially regarding older literature, using the SAHA metadata editor connected to the ONKI ontology services of FinnONTO.
BookSampo Deployment by the Finnish Public Libraries
The BookSampo research prototype was deployed by the public libraries and has been maintained independently by them since 2011.
At the same time, the BookSampo KG was separated from the larger CultureSampo KG
and is today hosted by a Fuseki triplestore SPARQL endpoint.
The BookSampo portal has lately had some 1.6 million annual users and has become an important part of the web services provided by the national public library consortium
Kirjastot.fi.
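Since the KG is served from a SPARQL endpoint, a client could query it along the following lines. This is a minimal sketch assuming the standard SPARQL 1.1 Protocol (a POST request with a URL-encoded query parameter) and the standard SPARQL JSON results format. The endpoint URL is a hypothetical placeholder, as the actual address is not stated here, and the generic query makes no assumption about the BookSampo schema.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholder; replace with the real endpoint address.
ENDPOINT = "http://example.org/booksampo/sparql"

# A schema-agnostic query that simply samples ten triples.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
""".strip()


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL 1.1 Protocol POST request."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def parse_bindings(payload: str) -> List[Dict[str, str]]:
    """Flatten a SPARQL JSON result document into rows of variable -> value."""
    data = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in data["results"]["bindings"]
    ]


if __name__ == "__main__":
    req = build_sparql_request(ENDPOINT, QUERY)
    print(req.get_method(), req.full_url)
```

To actually run the query, the request object can be passed to urllib.request.urlopen and the decoded response body handed to parse_bindings. Network access and the real endpoint address are needed for that step.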
BookSampo 2.0: Semantic Search, Browsing, and Data Analysis
In 2022 we started a new project pertaining to BookSampo with two major aims:
- Develop a prototype user interface for the BookSampo knowledge graph based on the Sampo Model
and Sampo-UI tools that facilitate semantic search and browsing integrated seamlessly with tools for data analysis.
- Study Finnish fiction literature using Digital Humanities methods based on the BookSampo knowledge graph, LOD services, and the portal.
BookSampo 2.0 landing page with five application perspectives for searching, browsing, and studying literature
In the 2000s, research on semantic portal development focused on data harmonization, aggregation, search, and browsing ("first generation systems").
The current BookSampo portal (Kirjasampo.fi) is an example of a first generation system.
The rise of Digital Humanities research then started to shift the focus to providing the user with integrated tools for solving research problems in interactive ways ("second generation systems"). BookSampo 2.0 demonstrates the idea of second generation systems. The next step, "third generation systems", is based on Artificial Intelligence: future portals will not only provide tools for humans to solve problems but will also be used for finding research problems in the first place, for addressing them, and even for solving them automatically under the constraints set by the human researcher (Hyvönen, 2022).
BookSampo 2.0 was published at the BookSampo 2.0 publication event on October 10, 2023, and is available at:
https://analyysi.kirjasampo.fi
More Information
Here is a short three-minute video about the BookSampo 2022 project, presented at the Theory and Practice of Digital Libraries conference (TPDL 2022) in Padua, Italy.
BookSampo Fiction Literature Knowledge Graph Revised: A New User Interface from SeCo Research Group on Vimeo.
In this video the new BookSampo user interface integrated with data-analytic tools, based on the Sampo model and Sampo-UI framework, is demonstrated online.
New BookSampo User Interface Demonstration from SeCo Research Group on Vimeo.
Contact
Professor Eero Hyvönen (project leader)
Aalto University, Department of Computer Science, and University of Helsinki (HELDIG)
first.last [ at ] aalto.fi
Annastiina Ahola
Aalto University, Department of Computer Science
first.last [ at ] aalto.fi
Telma Peura
Aalto University, Department of Computer Science, and University of Helsinki (HELDIG)
first.last [ at ] helsinki.fi
Heikki Rantala
Aalto University, Department of Computer Science
first.last [ at ] aalto.fi
Publications Related to BookSampo
2025
Eero Hyvönen, Petri Leskinen, Henna Poikkimäki, Heikki Rantala, Jouni Tuominen, Senka Drobac, Ossi Koho, Ilona Pikkanen and Hanna-Leena Paloposki:
Searching, exploring, and analyzing historical letters and the underlying networks: LetterSampo Finland (1809–1917) data service and semantic portal. 2025. Accepted, long papers.
bib pdf
2023
Annastiina Ahola, Eero Hyvönen and Heikki Rantala:
A User Interface Model for Digital Humanities Research: Case BookSampo – Finnish Fiction Literature on the Semantic Web.
Proceedings of ESWC 2023, poster and demo papers, Springer-Verlag, June, 2023.
bib
2022
Telma Peura, Petri Leskinen and Eero Hyvönen:
What Linked Data Can Tell about Geographical Trends in Finnish Fiction Literature - Using the BookSampo Knowledge Graph in Digital Humanities. 2022. Abstract under peer review.
bib
2013
Eetu Mäkelä, Kaisa Hypén and Eero Hyvönen:
Fiction Literature as Linked Open Data - the BookSampo Dataset. Semantic Web – Interoperability, Usability, Applicability, vol. 4, no. 3, pp. 299-306, 2013.
bib pdf link
The BookSampo dataset provides information as linked data on fiction literature published in Finland going back to the 15th century, along with rich descriptions of both their content and context. The dataset contains data on nearly 400,000 subjects, including literary works, authors, book covers, reviews, awards, images, and movies, over 3 million triples in total. The data has been applied as the basis of the BookSampo portal in public use in Finland, and is aligned with the cross-domain cultural heritage contents and ontologies of CultureSampo, another in-use semantic portal. The data has been used to answer complex questions, such as what topics one should write about if one wants to get a literary award (based on statistics). The metadata was transformed into RDF from legacy library databases, then enriched manually by dozens of librarians in a Web 2.0 fashion in Finnish public libraries, and is constantly updated at a rate of some 90,000 new triples monthly.
2012
Eetu Mäkelä, Kaisa Hypén and Eero Hyvönen:
Improving Fiction Literature Access by Linked Open Data -Based Collaborative Knowledge Storage - the BookSampo Project.
World Library and Information Congress: 78th IFLA General Conference and Assembly, Helsinki, IFLA, http://conference.ifla.org/ifla78, August, 2012.
bib pdf
BookSampo is a joint project between the Finnish public libraries and semantic web researchers, to improve fiction literature search and recommendation. In the project, dozens of librarians around Finland have used a collaborative web-based metadata editor to input diverse knowledge about fiction literature into a shared database. Particularly, the project has sought to improve access by indexing not only bibliographical information about the books, but focusing on the content and context of the works. In order to do this, the database employs advanced techniques such as functional, content-centered indexing, ontological vocabularies and the networked data model of linked open data. To demonstrate the functionality this makes possible, the fiction literature portal http://www.kirjasampo.fi/ was created. This portal uses the knowledge created in the project to offer advanced semantic search and recommendation based on the database created. In addition, web services exposing direct access to the data have been used for example in culture hack events to answer more complex questions, such as where in Finland are the most crimes committed in fiction literature.
2011
Eetu Mäkelä, Kaisa Hypén and Eero Hyvönen:
BookSampo--Lessons Learned in Creating a Semantic Portal for Fiction Literature.
The Semantic Web - ISWC 2011 - 10th International Semantic Web Conference, Bonn, Germany, pp. 173-188, Springer-Verlag, 2011.
bib pdf link
BookSampo is a semantic portal in use, covering metadata about practically all Finnish fiction literature of Finnish public libraries on a work level. The system introduces a variety of semantic web novelties deployed into practice: The underlying data model is based on the emerging functional, content-centered metadata indexing paradigm using RDF. Linked Data (LD) principles are used for mapping the metadata with tens of interlinked ontologies in the national FinnONTO ontology infrastructure. The contents are also linked with the large LD metadata repository of related cultural heritage content of CultureSampo. BookSampo is actually based on using CultureSampo as a semantic web service, demonstrating the idea of re-using semantic content from multiple perspectives without the need for modifications. Most of the content has been transformed automatically from existing databases, with the help of ontologies derived from thesauri in use in Finland, but in addition tens of volunteer librarians have participated in a Web 2.0 fashion in annotating and correcting the metadata, especially regarding older literature. For this purpose, semantic web editing tools and public ONKI ontology services were created and used. The paper focuses on lessons learned in the process of creating the semantic web basis of BookSampo.
Kaisa Hypén and Eetu Mäkelä:
An ideal model for an information system for fiction and its application: Kirjasampo and Semantic Web. Library Review, vol. 60, no. 4, April, 2011.
bib link
Purpose – Library Director Jarmo Saarti introduced a wide or ideal model for fiction in literature in his dissertation, published in 1999. It introduces those aspects that should be included in an information system for fiction. Such aspects include literary prose and its intertextual references to other works, the writer, readers' and critics' receptions of the work as well as a researcher's view. It is also important to note how libraries approach a literary work by means of inventory, classification and content description. The most ambiguous of the aspects relates to that context in cultural history, which the work reflects and is a part of. The paper aims to discuss these issues. Design/methodology/approach – Since the model consists of several components which are not found in present library information systems and cannot be implemented by them, a new way had to be found to produce, save, process and present fiction-related metadata. The Semantic Computing Research Group of Aalto University has developed several Semantic Web services for use in the field of culture, so cooperation with it and the use of Semantic Web tools were a natural starting point for the construction of the new service. Kirjasampo will be based on the Semantic Web RDF data model. The model enables a flexible linking of metadata derived from different sources, and it can be used to build a Semantic Web that can be approached contextually from different angles. Findings – The "semantically enriched" ideal model for fiction has hence been realised, at least to some extent: Kirjasampo supports literature-related metadata that is more varied than earlier and aims to account for different contexts within literature and connections with regard to other cultural phenomena. It also includes contemporary reviews of works and, as such, readers' receptions as well. Modern readers can share their views on works, once the user interface of the server is completed. It will include several features from the Kirjasto 2.0 application, which enables the evaluation, description and recommendations of works. The service should be online by the end of Spring 2011. Research limitations/implications – The project involves novel collaboration between a public library and a computer science research unit, and utilises a novel approach to the description of fiction. Practical implications – The system encourages user participation in the description of fiction and is of practical benefit to librarians in understanding both how fiction is organised and how users interpret the same. Originality/value – Upon completion, the service will be the first Finnish information system for libraries built with the tools of the Semantic Web which offers a completely new user environment and application for data produced by libraries. It also strives to create a new model for saving and producing data, available to both library professionals and readers. The aim is to save, accumulate and distribute literary knowledge, experiences and silent information.