Knowledge Extraction from Natural Language
Case Finnish (and Swedish) Texts

Knowledge Extraction from Finnish Texts

Goals of Research

In Digital Humanities (DH), data often comes in textual form (e.g., news, biographies, articles, novels). When such content is published as Linked Data for DH analysis, the meaning of literal mentions of entities (such as names of persons and places), concepts, relations, events, topics, etc. has to be extracted from unstructured text and represented as structured semantic data for the computer. In our own work, for example, such knowledge extraction has been needed when developing the Sampo series of semantic portals.

During this research, various natural language processing (NLP) tools and linguistic datasets have been developed. In our view, such tools and resources should be made openly available on the Web as web services. NLP tools and resources would be an important part of the Linked Open Data Infrastructure for Digital Humanities in Finland.

Services for NLP

More information will appear here later.

Contact Persons

Dr. Cand. Minna Tamper, Aalto University

Prof. Eero Hyvönen, University of Helsinki (HELDIG) and Aalto


Publications

2021

Rafael Leal, Joonas Kesäniemi, Mikko Koho and Eero Hyvönen: Relevance Feedback Search Based on Automatic Annotation and Classification of Texts. 3rd Conference on Language, Data and Knowledge (LDK 2021) (Dagmar Gromann, Gilles Sérasset, Thierry Declerck, John P. McCrae, Jorge Gracia, Julia Bosque-Gil, Fernando Bobillo and Barbara Heinisch (eds.)), Open Access Series in Informatics (OASIcs), vol. 93, pp. 18:1-18:15, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.

2020

Minna Tamper, Arttu Oksanen, Jouni Tuominen, Aki Hietanen and Eero Hyvönen: Automatic Annotation Service APPI: Named Entity Linking in Legal Domain. The Semantic Web: ESWC 2020 Satellite Events (Harth, Andreas, Presutti, Valentina, Troncy, Raphaël, Acosta, Maribel, Polleres, Axel, Fernández, Javier D., Xavier Parreira, Josiane, Hartig, Olaf, Hose, Katja and Cochez, Michael (eds.)), Lecture Notes in Computer Science, vol. 12124, pp. 208-213, Springer-Verlag, 2020.
Sami Sarsa and Eero Hyvönen: Searching Case Law Judgements by Using Other Judgements as a Query. Artificial Intelligence and Natural Language. 9th Conference, AINL 2020, Helsinki, Finland, October 7–9, 2020 (Filchenkov A., Kauttonen J. and Pivovarova L. (eds.)), pp. 145-157, Springer-Verlag, 2020.
Minna Tamper, Petri Leskinen, Jouni Tuominen and Eero Hyvönen: Modeling and Publishing Finnish Person Names as a Linked Open Data Ontology. 3rd Workshop on Humanities in the Semantic Web (WHiSe 2020), pp. 3-14, CEUR Workshop Proceedings, vol. 2695, June, 2020.

2019

Arttu Oksanen, Minna Tamper, Jouni Tuominen, Aki Hietanen and Eero Hyvönen: Anoppi: A Pseudonymization Service for Finnish Court Documents. Legal Knowledge and Information Systems. JURIX 2019: The Thirty-second Annual Conference (Araszkiewicz, M. and Rodríguez-Doncel, V. (eds.)), pp. 251-254, IOS Press, December, 2019.
Petri Leskinen and Eero Hyvönen: Extracting Genealogical Networks of Linked Data from Biographical Texts. The Semantic Web: ESWC 2019 Satellite Events (Hitzler, P., Kirrane, S., Hartig, O., de Boer, V., Vidal, M.-E., Maleshkova, M., Schlobach, S., Hammar, K., Lasierra, N., Stadtmüller, S., Hose, K., Verborgh, R. (eds.)), pp. 121-125, Springer, June, 2019.
Matti La Mela, Minna Tamper and Kimmo Kettunen: Finding Nineteenth-century Berry Spots: Recognizing and Linking Place Names in a Historical Newspaper Berry-picking Corpus. The Fourth Digital Humanities in the Nordic Countries 2019 (DHN2019), CEUR Workshop Proceedings, Copenhagen, Denmark, March, 2019.

2018

Minna Tamper, Petri Leskinen, Kasper Apajalahti and Eero Hyvönen: Using Biographical Texts as Linked Data for Prosopographical Research and Applications. Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. 7th International Conference, EuroMed 2018, Nicosia, Cyprus (Marinos Ioannides, Eleanor Fink, Raffaella Brumana, Petros Patias, Anastasios Doulamis, João Martins and Manolis Wallace (eds.)), pp. 125-137, Springer-Verlag, November, 2018.
Eero Hyvönen: Semanttinen web. Linkitetyn avoimen datan käsikirja (Semantic Web. Handbook of Linked Open Data). 271 pp., Gaudeamus, Helsinki, Finland, March, 2018.