DataFinland - Semantic Search and Annotation Tool for Open Datasets
DataFinland provides a community-based channel for annotating and publishing Open Datasets in a searchable, semantic portal.
DataFinland uses the browser-based annotation tool Saha to annotate Open Datasets with a modified version of the voiD schema. Once the datasets are annotated, Hako publishes them as a faceted search portal.
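To give a rough idea of what a voiD-style annotation looks like, the sketch below serialises one dataset description as Turtle. It is an illustration only and not the output of Saha: the function names, the example.org URIs and the dataset title are hypothetical, and only standard voiD and Dublin Core terms (void:Dataset, void:dataDump, dcterms:title, dcterms:subject) are used. The modifications DataFinland makes to the schema are not shown.

```python
VOID = "http://rdfs.org/ns/void#"
DCTERMS = "http://purl.org/dc/terms/"


def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def describe_dataset(uri, title, subject_uris, data_dump=None):
    """Return a Turtle description of one dataset using voiD and Dublin Core terms."""
    # Each entry is one predicate-object pair attached to the dataset resource.
    pairs = [("a", "void:Dataset"), ("dcterms:title", turtle_literal(title))]
    pairs += [("dcterms:subject", f"<{s}>") for s in subject_uris]
    if data_dump is not None:
        pairs.append(("void:dataDump", f"<{data_dump}>"))

    body = " ;\n    ".join(f"{p} {o}" for p, o in pairs)
    return (
        f"@prefix void: <{VOID}> .\n"
        f"@prefix dcterms: <{DCTERMS}> .\n\n"
        f"<{uri}>\n    {body} .\n"
    )


# All values below are placeholders invented for this example.
record = describe_dataset(
    uri="http://example.org/dataset/population",
    title="Population by municipality",
    subject_uris=["http://example.org/concept/population"],
    data_dump="http://example.org/files/population.ttl",
)
print(record)
```

In the portal itself the subject values would come from a controlled vocabulary rather than being typed in by hand, which is what makes faceted search over the annotations possible.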
For a demonstration of DataFinland, see here:
Matias Frosterus, Eero Hyvönen and Joonas Laitio: Creating and Publishing Semantic Metadata about Linked and Open Datasets. AAAI Fall Symposium 2011, Open Government Knowledge: AI Opportunities and Challenges, Arlington, USA, November, 2011.
We present a comprehensive system for producing interoperable metadata for Linked Open datasets and governmental datasets published in various formats.
Matias Frosterus, Eero Hyvönen and Joonas Laitio: DataFinland - A Semantic Portal for Open and Linked Datasets. Proceedings of the 8th Extended Semantic Web Conference (ESWC 2011), pp. 243-254, Springer-Verlag, Heraklion, Greece, June, 2011.
The number of open datasets available on the web is increasing rapidly with the rise of the Linked Open Data (LOD) cloud and various governmental efforts for releasing public data in different formats, not only in RDF. The aim in releasing open datasets is for developers to use them in innovative applications, but the datasets need to be found first and metadata available is often minimal, heterogeneous, and distributed making the search for the right dataset often problematic. To address the problem, we present DataFinland, a semantic portal featuring a distributed content creation model and tools for annotating and publishing metadata about LOD and non-RDF datasets on the web. The metadata schema for DataFinland is based on a modified version of the voiD vocabulary for describing linked RDF datasets, and annotations are done using an online metadata editor SAHA connected to ONKI ontology services providing a controlled set of annotation concepts. The content is published instantly on an integrated faceted search and browsing engine HAKO for human users, and as a SPARQL endpoint and a source file for machines. As a proof of concept, the system has been applied to LOD and Finnish governmental datasets.
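For machine access through a SPARQL endpoint, a client might look up datasets by subject roughly as sketched below. This is a minimal sketch under stated assumptions: the endpoint address and the concept URI are example.org placeholders (the portal's actual endpoint address is not given here), the helper names are hypothetical, and the query assumes the metadata uses the standard void:Dataset, dcterms:title and dcterms:subject terms. The request URL follows the SPARQL Protocol convention of passing the query text in a `query` parameter of a GET request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query(subject_uri):
    """Build a SPARQL query listing datasets annotated with the given subject."""
    return (
        "PREFIX void: <http://rdfs.org/ns/void#>\n"
        "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
        "SELECT ?dataset ?title WHERE {\n"
        "  ?dataset a void:Dataset ;\n"
        "           dcterms:title ?title ;\n"
        f"           dcterms:subject <{subject_uri}> .\n"
        "}\n"
        "ORDER BY ?title"
    )


def request_url(endpoint, query):
    """Return the GET URL that carries the query in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


# Placeholder endpoint and concept URI, invented for this example.
query = build_query("http://example.org/concept/population")
url = request_url("http://example.org/sparql", query)
print(url)
```

The sketch stops at building the URL, so it runs without network access. Sending the request and parsing the result format is left to whichever HTTP client the application already uses.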
Helsinki University of Technology, Laboratory of Media Technology
Prof. Eero Hyvönen
Helsinki University of Technology, Laboratory of Media Technology and University of Helsinki
eero.hyvonen [at] tkk.fi