Knowledge Extraction from Natural Language
Case Finnish (and Swedish) Texts
Knowledge Extraction from Finnish Texts
Goals of Research
In Digital Humanities (DH), data often comes in textual form (e.g., news, biographies, articles, novels).
When such content is published as Linked Data for DH analysis, the meaning of literal mentions of entities (such as names of persons and places), concepts, relations, events, topics, etc. has to be extracted from the unstructured text and represented as structured semantic data that a computer can process.
In our own work, for example, such knowledge extraction has been needed when developing the Sampo series of semantic portals.
During this research, various natural language processing (NLP) tools and linguistic datasets have been developed. In our view, such tools and resources should be made openly available on the Web as web services. NLP tools and resources would be an important part of the Linked Open Data Infrastructure for Digital Humanities in Finland.
Services for NLP
More information will appear here later.
Contact Persons
Dr. Cand. Rafael Leal, Aalto University
Prof. Eero Hyvönen, University of Helsinki (HELDIG) and Aalto
Publications
2025
2024
2021
Rafael Leal, Joonas Kesäniemi, Mikko Koho and Eero Hyvönen:
Relevance Feedback Search Based on Automatic Annotation and Classification of Texts.
3rd Conference on Language, Data and Knowledge (LDK 2021) (Dagmar Gromann, Gilles Sérasset, Thierry Declerck, John P. McCrae, Jorge Gracia, Julia Bosque-Gil, Fernando Bobillo and Barbara Heinisch (eds.)), Open Access Series in Informatics (OASIcs), vol. 93, pp. 18:1-18:15, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
2020
Minna Tamper, Arttu Oksanen, Jouni Tuominen, Aki Hietanen and Eero Hyvönen:
Automatic Annotation Service APPI: Named Entity Linking in Legal Domain.
The Semantic Web: ESWC 2020 Satellite Events (Harth, Andreas, Presutti, Valentina, Troncy, Raphaël, Acosta, Maribel, Polleres, Axel, Fernández, Javier D., Xavier Parreira, Josiane, Hartig, Olaf, Hose, Katja and Cochez, Michael (eds.)), Lecture Notes in Computer Science, vol. 12124, pp. 208-213, Springer-Verlag, 2020.
2019
Petri Leskinen and Eero Hyvönen:
Extracting Genealogical Networks of Linked Data from Biographical Texts.
The Semantic Web: ESWC 2019 Satellite Events (Hitzler, P., Kirrane, S., Hartig, O., de Boer, V., Vidal, M.-E., Maleshkova, M., Schlobach, S., Hammar, K., Lasierra, N., Stadtmüller, S., Hose, K., Verborgh, R. (eds.)), pp. 121-125, Springer, June, 2019.
2018
Minna Tamper, Petri Leskinen, Kasper Apajalahti and Eero Hyvönen:
Using Biographical Texts as Linked Data for Prosopographical Research and Applications.
Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. 7th International Conference, EuroMed 2018, Nicosia, Cyprus (Marinos Ioannides, Eleanor Fink, Raffaella Brumana, Petros Patias, Anastasios Doulamis, João Martins and Manolis Wallace (eds.)), pp. 125-137, Springer-Verlag, November, 2018.