Semantic Web Publications - Texts as Data Services (Severi)

Project Goals

The project develops automatic annotation technology and tools for transforming texts into Linked Data services. The methods are tested and evaluated in practice by developing application demonstrators on top of the data services in four case study areas:

  1. Legal texts in the context of the Semantic Finlex project
  2. Norms in use in the construction industry
  3. Business news about law and technology innovations
  4. E-books related to biographical materials.

Research Plan

The project runs from September 1, 2016 to May 31, 2018. More detailed materials about the project and its results will be published on this home page later. The project abstract, translated from Finnish, follows:

The WWW is changing from a traditional platform for publishing documents (Web of Documents) into a platform for publishing data (Web of Data). The idea is to publish media content on the web not only as human-readable text but also as machine-readable data, which makes it possible to develop applications and create added value in the form of new kinds of service concepts and business models. The technological challenge, however, is structuring textual materials into data, which requires a multidisciplinary combination of language technology and semantic web technology.

The Severi project creates an open technological foundation and a collaboration network for publishing text-based web content as semantic data services. The research is carried out through case studies of the company consortium participating in the project, with legal materials, construction industry norms, news, and e-books as the application areas. The project results will be published as web services and under an open license to maximize their use in Finland. The project also involves a broad international collaboration network of top universities.

Consortium

The project consortium includes the following organizations:

  1. Aalto University, Department of Computer Science
  2. Edita Publishing Ltd
  3. CSC Ltd
  4. Heldig - Helsinki Centre for Digital Humanities
  5. Lingsoft Ltd
  6. Ministry of Justice
  7. Building Information Group Ltd
  8. Finnish Literature Society (SKS)
  9. Svenska litteratursällskapet i Finland (SLS)
  10. Tekniikan akateemiset TEK
  11. YLE Ltd

Thanks to Tekes for making the project financially possible.

The project Steering Group includes the following representatives: Sari Korhonen (Edita), Pirjo-Leena Forsström (CSC), Tiina Lindh-Knuutila and Juhani Reiman (Lingsoft), Aki Hietanen (Ministry of Justice), Jouko Kanerva (Building Information Group), Kirsi Keravuori (SKS), Karola Söderman (SLS), Pekka Pellinen (TEK), Pia Virtanen (YLE), and Eero Hyvönen (Aalto). Aki Parviainen is the project representative at Tekes.

Contact Person

Prof. Eero Hyvönen, Director, Aalto University and University of Helsinki, Heldig


Publications

2017

Eero Hyvönen, Arttu Oksanen, Jouni Tuominen, Eetu Mäkelä and Minna Tamper: Semanttinen Finlex. Laki ja oikeus avoimena linkitettynä datana. (Semantic Finlex. Law and Justice as Linked Open Data.). March, 2017. Submitted for evaluation.
Erkki Heino, Minna Tamper, Eetu Mäkelä, Petri Leskinen, Esko Ikkala, Jouni Tuominen, Mikko Koho and Eero Hyvönen: Named Entity Linking in a Complex Domain: Case Second World War History. February, 2017. Submitted for evaluation.
This paper discusses the challenges of applying named entity linking in a rich, complex domain – specifically, the linking of 1) military units, 2) places and 3) people in the context of rich Second World War data. Multiple sub-scenarios are discussed in detail through concrete evaluations, analyzing the problems faced, and the solutions developed. A key contribution of this work is to highlight the heterogeneity of problems and approaches needed even inside a single domain, depending on both the source data as well as the target authority.
Minna Tamper, Petri Leskinen, Esko Ikkala, Arttu Oksanen, Eetu Mäkelä, Erkki Heino, Jouni Tuominen, Mikko Koho and Eero Hyvönen: AATOS – a Configurable Tool for Automatic Annotation. February, 2017. Submitted for evaluation.
This paper presents an automatic annotation tool AATOS for providing documents with semantic annotations. The tool links entities found from the texts to ontologies defined by the user. The application is highly configurable and can be used with different natural language Finnish texts. The application was developed as a part of WarSampo and Semantic Finlex projects and tested using Kansa Taisteli magazine articles and consolidated Finnish legislation of Semantic Finlex. The quality of the automatic annotation was evaluated by measuring precision and recall against existing manual annotations. The results showed that the quality of the input text, as well as the selection and configuration of the ontologies impacted the results.
Eero Hyvönen, Petri Leskinen, Erkki Heino, Jouni Tuominen and Laura Sirola: Reassembling and Enriching the Life Stories in Printed Biographical Registers: High School Alumni on the Semantic Web. February, 2017. Submitted for evaluation.
This paper presents the idea to enrich printed biographical person registers with linked data related to events that took place after the register was published. By transforming printed historical documents into structured data, semantic search to written texts can be provided for the reader. Even more importantly, extracting semantic structures from printed texts, and combining this data with external datasets and data services, life stories of historical persons can be extended based on data linking. Such linking provides an enriched context for prosopographical research on people in the register, as well as an enhanced reading experience for anyone interested in reading the biographies. As a concrete case study, a register 1867–1992 of over 10 000 alumni of a prominent Finnish high school “Norssi” is transformed into RDF, is enriched by data linking, is published as a linked data service, and is provided to end users via a faceted search engine.