Poka - A framework for automatic annotation

Poka provides a basis for the automatic extraction of ontological concepts from documents. The main features of the Poka framework are:

  • Efficient adaptation of user-defined ontologies as vocabularies.
  • Building indices for the concepts found in the documents.
  • Tagging the concept locations inside the document.
  • Pipelining the vocabularies and rule-based extraction tools for complex tasks.
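
The indexing and tagging features listed above can be illustrated with a minimal sketch. It is not Poka's implementation: the vocabulary, the `ex:` concept identifiers, the function names, and the bracket-style tag format are all assumptions made for illustration, and matching is limited to single-word labels.

```python
import re
from collections import defaultdict

# Hypothetical vocabulary: surface label -> concept identifier.
VOCABULARY = {"museum": "ex:Museum", "painting": "ex:Painting"}

def find_concepts(text, vocabulary):
    """Yield (start, end, concept_id) for each single-word label found in text."""
    for match in re.finditer(r"\w+", text):
        concept = vocabulary.get(match.group().lower())
        if concept is not None:
            yield match.start(), match.end(), concept

def build_index(documents, vocabulary):
    """Map each concept to the (document id, character offset) pairs where it occurs."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for start, _end, concept in find_concepts(text, vocabulary):
            index[concept].append((doc_id, start))
    return dict(index)

def tag(text, vocabulary):
    """Return the text with each concept occurrence wrapped as [label|concept]."""
    parts, pos = [], 0
    for start, end, concept in find_concepts(text, vocabulary):
        parts.append(text[pos:start])
        parts.append(f"[{text[start:end]}|{concept}]")
        pos = end
    parts.append(text[pos:])
    return "".join(parts)

DOCS = {"d1": "The museum bought a painting.", "d2": "A painting was lost."}
index = build_index(DOCS, VOCABULARY)
tagged = tag(DOCS["d1"], VOCABULARY)
```

Here `index` records where each concept occurs across the documents, while `tagged` marks the concept locations inside a single document.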

Applications

The characteristics of Poka are introduced briefly below through its applications. Each bullet describes one way in which the application uses Poka.

Opas

Opas is an ontology-based annotation and authoring tool for a library help desk service. It uses Poka to extract ontological concepts and person names from the input text. The main characteristics of the system from Poka's viewpoint are:

  • Extraction of common nouns from the YSO ontology.
  • Extraction of places from a place ontology.
  • Extraction of person names using Poka's person name recognition tool.
  • Lemmatized extraction of resources: both the ontological terms and question-answer texts are lemmatized to achieve better recall.
  • Ordered pipelining of different resource types with exclusion: first persons are extracted, then places, and finally common nouns. Tokens recognized as person names are excluded from the document and cannot be recognized as a place or common noun resource. Similarly, tokens recognized as places are excluded from the common noun candidates. This exclusion achieves simple disambiguation between the resource types.
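
The ordered pipeline with exclusion described in the last bullet can be sketched as follows. This is a simplified illustration, not Poka's code: the three vocabularies are hypothetical stand-ins for the person name recognizer, the place ontology, and YSO, and lemmatization is replaced by plain lowercasing.

```python
import re

# Hypothetical vocabularies, one per pipeline stage.
PERSONS = {"paris hilton"}
PLACES = {"paris", "helsinki"}
NOUNS = {"hotel", "hilton"}

def tokenize(text):
    """Lowercase word tokens (a stand-in for real lemmatization)."""
    return re.findall(r"\w+", text.lower())

def extract(tokens, vocabulary, taken, max_len=3):
    """Longest-match lookup that skips any token already claimed by an earlier stage."""
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):
            j = i + n
            if j > len(tokens) or any(k in taken for k in range(i, j)):
                continue
            term = " ".join(tokens[i:j])
            if term in vocabulary:
                matches.append((i, j, term))
                taken.update(range(i, j))  # exclude these tokens from later stages
                i = j
                break
        else:
            i += 1  # no match starts at this token
    return matches

def run_pipeline(text, stages):
    """Run the stages in order, sharing one set of excluded token positions."""
    tokens = tokenize(text)
    taken = set()
    return {name: extract(tokens, vocab, taken) for name, vocab in stages}

STAGES = [("person", PERSONS), ("place", PLACES), ("noun", NOUNS)]
result = run_pipeline("Paris Hilton visited a hotel in Paris", STAGES)
```

Because the person stage runs first, the tokens of "Paris Hilton" are never offered to the place or common noun stages, while the final standalone "Paris" is still recognized as a place.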

DynaPoka

DynaPoka is a dynamic user interface for integrating ontologies for the extraction task.

  • Handles user-defined (RDFS, OWL) ontologies.
  • A tool to examine the literal values of an ontology (that is, which literal values are reasonable to use in a certain extraction task).
  • The concept strings to be extracted may be chosen from the ontology by
    • defining class range(s)
    • choosing language(s)
    • choosing literal properties
  • Supports (Finnish) lemmatization of the concepts (by Connexor).
  • Selected concept strings act as a terminology and can be tested with an in-frame browser that tags the concepts on the current web page.
  • Supports RDF serialization of terminologies. The serialized version (a term file) acts as a preprocessing stage in complex extraction tasks.
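
The selection of concept strings by class range, language, and literal property can be sketched as a simple filter. This is an illustration under stated assumptions rather than DynaPoka's implementation: the ontology is reduced to a hand-written list of statements, the `ex:` identifiers and the `select_terms` function are hypothetical, and class membership is a direct type lookup without subclass reasoning.

```python
from typing import NamedTuple, Optional

class Literal(NamedTuple):
    value: str
    lang: Optional[str]

# Hypothetical ontology contents: resource types and literal-valued statements.
TYPES = {"ex:helsinki": "ex:Place", "ex:dog": "ex:Animal"}
STATEMENTS = [
    ("ex:helsinki", "rdfs:label", Literal("Helsinki", "fi")),
    ("ex:helsinki", "rdfs:label", Literal("Helsingfors", "sv")),
    ("ex:helsinki", "rdfs:comment", Literal("Suomen pääkaupunki", "fi")),
    ("ex:dog", "rdfs:label", Literal("koira", "fi")),
]

def select_terms(statements, types, classes, languages, properties):
    """Return {concept string: resource} for literals passing all three filters."""
    terms = {}
    for subject, prop, literal in statements:
        if types.get(subject) not in classes:
            continue  # outside the chosen class range
        if literal.lang not in languages:
            continue  # language not selected
        if prop not in properties:
            continue  # literal property not selected
        terms.setdefault(literal.value.lower(), subject)
    return terms

terms = select_terms(
    STATEMENTS, TYPES,
    classes={"ex:Place"}, languages={"fi"}, properties={"rdfs:label"},
)
```

The resulting dictionary plays the role of a terminology: widening `languages` to include `"sv"` would add "helsingfors" as a second concept string for the same resource.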

Saha

Saha is a general, browser-based annotation tool. Poka is used to search for ontological concepts on the web page being annotated. Each extraction component is mapped to an object property of Saha's annotation schema. With DynaPoka, new ontologies may be harnessed for Saha's extraction task.

Airo

Airo is an automatic annotation and search system developed for Sanoma Data. It uses Poka to find basic occurrences of ontological concepts. The concepts found are postprocessed to achieve automatic disambiguation between the occurrences.


Publications

2009

Eero Hyvönen, Eetu Mäkelä, Tomi Kauppinen, Olli Alm, Jussi Kurki, Tuukka Ruotsalo, Katri Seppälä, Joeli Takala, Kimmo Puputti, Heini Kuittinen, Kim Viljanen, Jouni Tuominen, Tuomas Palonen, Matias Frosterus, Reetta Sinkkilä, Panu Paakkarinen, Joonas Laitio, Katariina Nyberg: CultureSampo - A National Publication System of Cultural Heritage on the Semantic Web 2.0. Proceedings of the 6th European Semantic Web Conference (ESWC2009), Heraklion, Greece, May 31 - June 4, 2009. Springer-Verlag. bib pdf
CULTURESAMPO is an application demonstration of a national level publication system of cultural heritage contents on the Web, based on ideas and technologies of the Semantic (Web and) Web 2.0. On the semantic side, the system presents new solutions to interoperability problems of dealing with multiple ontologies of different domains, and to problems of integrating multiple metadata schemas and cross-domain content into a homogeneous semantic portal. A novelty of the system is to use semantic models based on events and narrative process descriptions for modeling and visualizing cultural phenomena, and for semantic recommendations. On the Web 2.0 side, CULTURESAMPO proposes and demonstrates a content creation process for collaborative, distributed ontology and content development including different memory organizations and citizens. The system provides the cultural heritage contents to end-users in a new way through multiple (nine) thematic perspectives, based on semantic visualizations. Furthermore, CULTURESAMPO services are available for external web-applications to use through semantic AJAX widgets.
Eero Hyvönen, Eetu Mäkelä, Tomi Kauppinen, Olli Alm, Jussi Kurki, Tuukka Ruotsalo, Katri Seppälä, Joeli Takala, Kimmo Puputti, Heini Kuittinen, Kim Viljanen, Jouni Tuominen, Tuomas Palonen, Matias Frosterus, Reetta Sinkkilä, Panu Paakkarinen, Joonas Laitio, Katariina Nyberg: CultureSampo - Finnish Culture on the Semantic Web 2.0. Thematic Perspectives for the End-user. Proceedings, Museums and the Web 2009, Indianapolis, USA, April 15-18, 2009. bib pdf
We present an overview of CultureSampo, an ambitious system for creating a collective semantic memory of the cultural heritage of a nation on the Semantic Web 2.0, combining ideas underlying the Semantic Web and the Web 2.0. The system addresses the semantic web challenge of aggregating highly heterogeneous, cross-domain cultural heritage collections and other contents into a semantically rich intelligent system for human and machine users. At the same time, CultureSampo is an approach to solve the social and practical Web 2.0 challenge of organizing the underlying collaborative ontology development and content creation work of memory organizations and citizens. This paper focuses on CultureSampo’s search, recommendation, and visualization services for the end-users. The key idea here is to access cultural heritage on the Semantic Web through nine “thematic perspectives”, such as places on the maps, the social network of cultural persons, timelines, and narrative texts, e.g. biographies and literary works.
Eero Hyvönen, Eetu Mäkelä, Tomi Kauppinen, Olli Alm, Jussi Kurki, Tuukka Ruotsalo, Katri Seppälä, Joeli Takala, Kimmo Puputti, Heini Kuittinen, Kim Viljanen, Jouni Tuominen, Tuomas Palonen, Matias Frosterus, Reetta Sinkkilä, Panu Paakkarinen, Joonas Laitio, Katariina Nyberg: CultureSampo - Finnish Cultural Heritage Collections on the Semantic Web 2.0. Proceedings of the 1st International Symposium on Digital Humanities for Japanese Arts and Cultures (DH-JAC-2009), Ritsumeikan University, Kyoto, Japan, March, 2009. bib pdf
This paper presents an overview of the Semantic Web 2.0 application CultureSampo, an ambitious system for creating a collective semantic memory of the cultural heritage of a nation on the Semantic Web 2.0, combining ideas underlying the Semantic Web and the Web 2.0. The system addresses the semantic web challenge of aggregating highly heterogeneous, cross-domain cultural heritage content into a semantically rich intelligent system for human and machine users. At the same time, CultureSampo is an approach to solve the social and practical Web 2.0 challenge of organizing the underlying collaborative ontology development and content creation work of memory organizations and citizens.

2008

Tuukka Ruotsalo, Katri Seppälä, Kim Viljanen, Eetu Mäkelä, Jussi Kurki, Olli Alm, Tomi Kauppinen, Jouni Tuominen, Matias Frosterus, Reetta Sinkkilä and Eero Hyvönen: Ontology-based Approach for Interoperability of Digital Collections. Signum, no. 5, 2008. bib pdf
This paper presents solutions and lessons learned in FinnONTO project carried out in Finland in 2003–2007. The paper focuses on three aspects of interoperability of digital collections. First, transforming thesauri to ontologies. Second, publishing ontologies for the use of indexers and content providers. Third, ontology based methods for improving end user access to digital collections. The first aspect is analysed through case studies done with Finnish thesauri. The second is discussed by presenting the ONKI ontology server. The last aspect is demonstrated in the scope of the semantic portal CultureSampo for publishing cultural heritage on the Semantic Web.
Eero Hyvönen, Kim Viljanen, Jouni Tuominen, Katri Seppälä, Tomi Kauppinen, Matias Frosterus, Reetta Sinkkilä, Jussi Kurki, Olli Alm, Eetu Mäkelä and Joonas Laitio: National Ontology Infrastructure Service ONKI. Oct 1, 2008. bib pdf
This paper presents the national level cross-domain ontology and ontology service infrastructure ONKI used in Finland. The novelty of ONKI is based on two ideas. First, the core ontologies are developed collaboratively by experts transforming thesauri into mutually aligned lightweight ontologies, based on a large top ontology that is extended by various domain specific ontologies. Second, the National Ontology Service ONKI has been implemented for publishing ontologies cost-efficiently as ready to use services. ONKI provides legacy and other applications with ready to use functionalities for using ontologies on the HTML level by Ajax and semantic widgets. ONKI has been used in various applications for creating mash-up applications in a way analogous to using Google Maps, but in our case external applications are mashed-up with ontology support for indexing and information retrieval.
Eero Hyvönen, Eetu Mäkelä, Tomi Kauppinen, Olli Alm, Jussi Kurki, Tuukka Ruotsalo, Katri Seppälä, Kim Viljanen, Jouni Tuominen, Tuomas Palonen, Matias Frosterus, Reetta Sinkkilä, Panu Paakkarinen, Joonas Laitio, Katariina Nyberg: CultureSampo - A Collective Memory of Finnish Cultural Heritage on the Semantic Web 2.0. Semantic Computing Research Group, Helsinki University of Technology and University of Helsinki, Sept 29, 2008. bib pdf
This paper presents the Semantic Web 2.0 application CULTURESAMPO, an ambitious system of creating a collective semantic memory of the cultural heritage of a nation on the Semantic Web 2.0, combining ideas underlying the Semantic Web and the Web 2.0. The system addresses the semantic challenge of aggregating highly heterogeneous, cross-domain cultural heritage into a semantically rich intelligent system for human and machine users. At the same time, CULTURESAMPO is an approach to solve the social and practical Web 2.0 challenge of organizing the underlying collaborative ontology development and content creation work of memory organizations and citizens.
Eero Hyvönen, Eetu Mäkelä, Tuukka Ruotsalo, Tomi Kauppinen, Olli Alm, Jussi Kurki, Joeli Takala, Kimmo Puputti and Heini Kuittinen: CultureSampo - Finnish Culture on the Semantic Web. Posters of the 5th European Semantic Web Conference 2008 (ESWC 2008), Tenerife, Spain, June 1-5, 2008. bib pdf
This paper presents the semantic portal CULTURESAMPO - Finnish Culture on the Semantic Web. The portal provides memory organizations and other cultural content publishers with a national, shared semantic publication channel for heterogeneous cultural contents. The content comes from over ten organizations and is annotated using various ontologies of the FinnONTO infrastructure. For the end-user, intelligent semantic search, recommendation, and visualization services for accessing and learning about cultural heritage are provided.
Antti Vehviläinen, Eero Hyvönen and Olli Alm: A Semi-automatic Semantic Annotation and Authoring Tool for a Library Help Desk Service. Emerging Technologies for Semantic Work Environments: Techniques, Methods, and Applications, IGI Group, Hershey, USA, 2008. bib pdf

2007

Eero Hyvönen, Olli Alm and Heini Kuittinen: Using an Ontology of Historical Events in Semantic Portals for Cultural Heritage. Proceedings of the Cultural Heritage on the Semantic Web Workshop at the 6th International Semantic Web Conference (ISWC 2007), Busan, Korea, November 12, 2007. bib pdf
We argue that an ontology of historical events is needed in semantic portals for cultural heritage due to three reasons. First, ontological identifiers (URIs) of events, such as the World War II or coronation of Napoleon, are needed in order to make collection metadata mutually interoperable in terms of related events---in the vein as identifiers are needed for identifying artifact types, persons, and geolocations when annotating collection items. Second, events are of central importance in creating semantic links between cultural contents in applications such as recommendation systems. Third, historical events are important as content items of their own, forming the backbone of chronological histories.
Eetu Mäkelä, Kim Viljanen, Olli Alm, Jouni Tuominen, Onni Valkeapää, Tomi Kauppinen, Jussi Kurki, Reetta Sinkkilä, Teppo Känsälä, Robin Lindroos, Osma Suominen, Tuukka Ruotsalo and Eero Hyvönen: Enabling the Semantic Web with Ready-to-Use Web Widgets. Proceedings of the First Industrial Results of Semantic Technologies Workshop, ISWC2007, November 11, 2007. bib pdf
A lot of functionality is needed when an application, such as a museum cataloguing system, is extended with semantic capabilities, for example ontological indexing functionality or multi-facet search. To avoid duplicate work and to enable easy and cost-efficient integration of information systems with the Semantic Web, we propose a web widget approach. Here, data sources are combined with functionality into ready-to-use software components that allow adding semantic functionality to systems with just a few lines of code. As a proof of the concept, we present a collection of general semantic web widgets and case applications that use them, such as the ontology server ONKI, the annotation editor SAHA and the culture portal CultureSampo.
Eero Hyvönen, Joeli Takala, Olli Alm, Tuukka Ruotsalo and Eetu Mäkelä: Semantic Kalevala - Accessing Cultural Contents Through Semantically Annotated Stories. Proceedings of the Cultural Heritage on the Semantic Web Workshop at the 6th International Semantic Web Conference (ISWC 2007), Busan, Korea, Nov, 2007. bib pdf
An event-based approach is presented for annotating events and narrative structures underlying texts and stories semantically. The idea is applied to using the Finnish national epic Kalevala for accessing related cultural contents, such as artifacts, paintings etc. in a semantic portal.
Olli Alm: Tekstidokumenttien automaattinen ontologiaperustainen annotointi [Automatic ontology-based annotation of text documents]. MSc Thesis, University of Helsinki, Department of Computer Science, September, 2007. bib pdf
The fundamental idea of the Semantic Web is to bring order to the Internet, or in a narrower sense to hyperlinked material, by defining explicit, machine-readable vocabularies and by describing the content of the Internet with these vocabularies. These two work phases are central to the core areas of the Semantic Web. This thesis defines the characteristics, and also the limits, of the description of material related to the Semantic Web, that is, of ontology-based annotation. Ontology-based annotation is the description of material whose defining feature is a data model. Automating annotation is a central challenge in producing ontology-based systems, because manual annotation is usually slow and time-consuming. The set of systems representing automatic annotation is diverse, and no precise definition of the problem field of automatic annotation appears in the literature. The thesis defines a model for automatic annotation systems that can be used to compare systems with one another and to model new ones. The model is applied in the thesis to the comparison of ontology-based systems and to the implementation of the automatic annotation system Poka.
Eero Hyvönen, Kim Viljanen, Eetu Mäkelä, Tomi Kauppinen, Tuukka Ruotsalo, Onni Valkeapää, Katri Seppälä, Osma Suominen, Olli Alm, Robin Lindroos, Teppo Känsälä, Riikka Henriksson, Matias Frosterus, Jouni Tuominen, Reetta Sinkkilä and Jussi Kurki: Elements of a National Semantic Web Infrastructure - Case Study Finland on the Semantic Web (Invited paper). Proceedings of the First International Semantic Computing Conference (IEEE ICSC 2007), Irvine, California, September, 2007. IEEE Press. bib pdf
This article presents the vision and results of creating the basis for a national semantic web content infrastructure in Finland in 2003-2007. The main elements of the infrastructure are shared and open metadata schemas, core ontologies, and public ontology services. Several practical applications testing and demonstrating the usefulness of the infrastructure are overviewed in the fields of eCulture, eHealth, eGovernment, eLearning, and eCommerce.
Olli Alm, Eero Hyvönen and Antti Vehviläinen: Opas: An ontology-based library help desk service. Demo track at the European Semantic Web Conference ESWC 2007, Innsbruck, Austria, June 4-5, 2007. bib pdf
Onni Valkeapää, Olli Alm and Eero Hyvönen: Efficient Content Creation on the Semantic Web Using Metadata Schemas with Domain Ontology Services (System Description). Proceedings of the European Semantic Web Conference ESWC 2007, Innsbruck, Austria, Springer, June 4-5, 2007. bib pdf
Onni Valkeapää, Olli Alm and Eero Hyvönen: A Framework for Ontology-based Adaptable Content Creation on the Semantic Web. Journal of Universal Computer Science, 2007. bib pdf
Creation of rich, ontology-based metadata is one of the major challenges in developing the Semantic Web. Emerging applications utilizing semantic web techniques, such as semantic portals, cannot be realized if there are no proper tools to provide metadata for them. This paper discusses how to make provision of metadata easier and cost-effective by an annotation framework comprising of annotation editor combined with shared ontology services. We have developed an annotation system supporting distributed collaboration in creating annotations, and hiding the complexity of the annotation schema and the domain ontologies from the annotators. Our system adapts flexibly to different metadata schemas, which makes it suitable for different applications. Support for using ontologies is based on ontology services, such as concept searching and browsing, concept URI fetching, semantic autocompletion and linguistic concept extraction. The system is being tested in various practical semantic portal projects.

2006

Antti Vehviläinen, Eero Hyvönen and Olli Alm: A Semi-Automatic Semantic Annotation and Authoring Tool for a Library Help Desk Service. Proceedings of the first Semantic Authoring and Annotation Workshop, November, 2006. bib pdf
Antti Vehviläinen, Olli Alm and Eero Hyvönen: Combining Case-Based Reasoning and Semantic Indexing in a Question-Answer Service. June 20, 2006. Poster paper, 1st Asian Semantic Web Conference (ASWC2006). bib pdf

Contact:

Olli Alm
Helsinki University of Technology, Laboratory of Media Technology and University of Helsinki
Olli Alm [at] tkk fi

Prof. Eero Hyvönen
Helsinki University of Technology, Laboratory of Media Technology and University of Helsinki
eero.hyvonen [at] tkk.fi
