Doctoral Candidate, M.Sc.
phone: +358 (0)40 8250356
room: 2528 @ Department of Media Technology
postal address: Department of Media Technology,
P.O. Box 15500, FI-00076 Aalto, Finland
Currently working in the Linked Data Finland project.
Osma Suominen, Sini Pessala, Jouni Tuominen, Mikko Lappalainen, Susanna Nykyri, Henri Ylikotila, Matias Frosterus and Eero Hyvönen: Deploying National Ontology Services: From ONKI to Finto
. Proceedings of the Industry Track at the International Semantic Web Conference 2014
, CEUR Workshop Proceedings, Riva del Garda, Italy, October, 2014. Vol 1383. bib pdf link
The Finnish Ontology Library Service ONKI was published as a living laboratory prototype for public use in 2008. Its idea is to support content indexers and ontology developers via a browser interface and machine APIs. ONKI has been well-accepted, but being a prototype maintained by the ending research project FinnONTO (2003–2012), a more sustainable service was needed, supported by permanent governmental funding. To achieve this, ONKI was deployed and is being further developed by the National Library of Finland into a new national vocabulary service Finto. We discuss challenges in the deployment of ONKI into Finto and lessons learned during the transition process.
Tuukka Ruotsalo and Matias Frosterus: Diversifying Semantic Entity Search: Independent Component Analysis Approach
. International Journal of Semantic Computing, vol. 7, no. 4, pp. 407-426, June, 2014. bib
Matias Frosterus: ONKI-projekti luo kansallista ontologiapalvelua
. Terminfo, no. 4, Finnish Terminology Centre TSK, Helsinki, Finland, 2013. bib
Tuukka Ruotsalo and Matias Frosterus: Semantic Entity Search Diversification
. Semantic Computing (ICSC), 2013 IEEE Seventh International Conference on
, pp. 32-39, Irvine, CA, Sept, 2013. bib pdf
We present an approach to diversify entity search by utilizing semantics present and inferred from the initial entity search results. Our approach makes use of ontologies and independent component analysis of the entity descriptions to reveal direct and latent semantic connections between the entities present in the initial search results. The semantic connections are then used to sample a set of diverse entities. We empirically demonstrate the performance of our approach through retrieval experiments that use a real-world dataset composed from four entity databases. The results indicate that our approach significantly improves both diversity and effectiveness of entity search.
Matias Frosterus, Jouni Tuominen, Sini Pessala, Katri Seppälä and Eero Hyvönen: Linked Open Ontology Cloud KOKO--Managing a System of Cross-domain Lightweight Ontologies
. The Semantic Web: ESWC 2013 Satellite Events
, pp. 296-297, Springer-Verlag, Berlin Heidelberg, Montpellier, France, May 26-30, 2013. bib pdf
Matias Frosterus, Jouni Tuominen, Mika Wahlroos and Eero Hyvönen: The Finnish Law as a Linked Data Service
. The Semantic Web: ESWC 2013 Satellite Events
, pp. 289-290, Springer-Verlag, Berlin Heidelberg, Montpellier, France, May 26-30, 2013. bib pdf
Juridical information is important to organizations and individuals alike and is linked to from all walks of life. The Finnish government has published the Finlex Data Bank for searching and browsing legislation documents. However, the data there is not yet open, is based on a traditional XML schema, and does not conform to new semantic metadata standards. There are many difficulties in maintaining and using the site in, e.g., data harvesting, interoperability, querying, and linking that could be mitigated by the Semantic Web technologies. This paper presents an approach and a project—including first results—for publishing and using Finnish legislation as a 5-star Linked Open Data service.
Katariina Nyberg, Matias Frosterus and Eero Hyvönen: Linking Data for Industrial Knowledge Management – A Case Study
. ESWC 2011, OSEMA Workshop paper, 2011. bib pdf
Manufacturing companies face the challenge of maintaining documentation and knowledge about their projects and products, scattered in heterogeneous, distributed databases, represented in different formats and languages, and hosted in mutually incompatible systems. At the same time, the knowledge needs to be accessed on a global level from different perspectives and user groups, such as project planners, designers, and maintenance personnel. This paper presents a case study, based on real datasets of a major international diesel engine and power plant manufacturer, where these problems are addressed simultaneously by harmonizing the datasets from different sources using RDF, and by linking them together into a global repository using shared resources. Based on the global RDF store, services for both human and machine users, such as a faceted search engine and a SPARQL end-point, can be provided to support access from different perspectives to the company knowledge base.
Matias Frosterus, Eero Hyvönen and Joonas Laitio: Creating and Publishing Semantic Metadata about Linked and Open Datasets
. AAAI Fall Symposium 2011, Open Government Knowledge: AI Opportunities and Challenges
, Arlington, USA, November, 2011. bib pdf
We present a comprehensive system for producing interoperable metadata for Linked Open datasets and governmental datasets published in various formats.
Sini Pessala, Katri Seppälä, Osma Suominen, Matias Frosterus, Jouni Tuominen and Eero Hyvönen: MUTU: An Analysis Tool for Maintaining a System of Hierarchically Linked Ontologies
. ISWC 2011 - Ontologies come of Age Workshop (OCAS)
, Bonn, Germany, October, 2011. bib pdf
We consider ontology evolution in a system of light-weight Linked Data ontologies, aligned with each other to form a larger ontology system. When one ontology changes, the human editor must keep track of the actual changes and of the modifications needed in the related ontologies in order to keep the system consistent. This paper presents an analysis tool MUTU, by which such changes and their potential effects on other ontologies can be found. Such an analysis is useful for the ontology editors for understanding the differences between ontology versions, and for updating linked ontologies when changes occurred in other components of an ontology system.
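The kind of analysis described above can be illustrated with a minimal sketch. It is not the MUTU implementation: the data layout (each ontology version as a mapping from a concept to its set of broader concepts), the function names, and the example concepts are all assumptions made for illustration.

```python
# Illustrative sketch only; not the MUTU tool itself.
# Assumption: an ontology version is a dict mapping concept -> set of broader concepts.

def diff_versions(old, new):
    """Report concepts that were added, removed, or whose broader concepts changed."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return {"added": added, "removed": removed, "changed": changed}

def affected_links(diff, links):
    """List concepts of a linked ontology whose target concept was removed or changed.

    Assumption: `links` maps a concept in the linked ontology to the concept
    it is aligned with in the ontology that changed.
    """
    touched = set(diff["removed"]) | set(diff["changed"])
    return sorted(src for src, target in links.items() if target in touched)

# Hypothetical example data.
old_version = {"animal": set(), "dog": {"animal"}, "wolf": {"animal"}}
new_version = {"animal": set(), "mammal": {"animal"}, "dog": {"mammal"}}
links = {
    "domain:guide dog": "dog",
    "domain:zoo": "animal",
    "domain:wolf pack": "wolf",
}

diff = diff_versions(old_version, new_version)
to_review = affected_links(diff, links)
```

Here `to_review` lists the linked concepts an editor would need to check after the update; concepts aligned with untouched parts of the ontology are left out.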
Matias Frosterus, Eero Hyvönen and Mika Wahlroos: Extending Ontologies with Free Keywords in a Collaborative Annotation Environment
. Proceedings of the ISWC 2011 Workshop Ontologies Come of Age in the Semantic Web (OCAS)
, CEUR Workshop Proceedings, Vol 809, http://ceur-ws.org, ISSN 1613-0073, Bonn, Germany, October, 2011. bib pdf
Semantic web technologies have introduced the idea of annotating content in terms of concepts taken from ontologies. Since concepts are defined in terms of properties and relations to other concepts, descriptions grow up into larger RDF graphs that can be used as a basis for data integration and intelligent information retrieval. Since ontologies do not typically contain all the possible concepts needed for annotation, it is usually necessary to offer the annotator the possibility to introduce new free keywords or tags in addition to the predefined ontology concepts. The problem then is that free keywords/tags do not have ontological connections to the rest of the RDF graph, unless such relations are defined by the annotator. We present a process for integrating free keywords into the ontological framework, and a practical tool implementation of it, discussing the challenges and possibilities introduced by the system. We also describe a case study performed for the Finnish Defence Forces, where the tool is used for creating a faceted semantic search portal featuring the free keywords and the ontological concepts at the same time.
Matias Frosterus, Eero Hyvönen and Joonas Laitio: DataFinland - A Semantic Portal for Open and Linked Dataset
. Proceedings of the 8th Extended Semantic Web Conference (ESWC 2011)
, Heraklion, Greece, June, 2011. bib pdf
The number of open datasets available on the web is increasing rapidly with the rise of the Linked Open Data (LOD) cloud and various governmental efforts for releasing public data in different formats, not only in RDF. The aim in releasing open datasets is for developers to use them in innovative applications, but the datasets need to be found first and metadata available is often minimal, heterogeneous, and distributed making the search for the right dataset often problematic. To address the problem, we present DataFinland, a semantic portal featuring a distributed content creation model and tools for annotating and publishing metadata about LOD and non-RDF datasets on the web. The metadata schema for DataFinland is based on a modified version of the voiD vocabulary for describing linked RDF datasets, and annotations are done using an online metadata editor SAHA connected to ONKI ontology services providing a controlled set of annotation concepts. The content is published instantly on an integrated faceted search and browsing engine HAKO for human users, and as a SPARQL endpoint and a source file for machines. As a proof of concept, the system has been applied to LOD and Finnish governmental datasets.
Matias Frosterus and Eero Hyvönen: Bridging the Search Gap between the Web of Pages and Web of Data by Combining Ontological Document Expansion with Text Search
. Proceedings of the International Conferences on Digital Libraries and the Semantic Web 2009 (ICSD2009)
, Trento, Italy, September, 2009. bib pdf
The Semantic Web extends traditional web documents, i.e. the Web of Pages, with conceptual structures based on ontologies and metadata, i.e. the Web of Data. This paper presents a hybrid document search approach combining the benefits of the traditional text search of literal documents and the semantic search based on their underlying conceptual structures. The approach is based on document expansion, where documents are automatically annotated with not only the concepts explicitly present in a given document, but also with the ontologically related concepts using smaller weights. Our test results using the CLEF Test Suite suggest that document expansion alone achieves better recall than text search at the expense of precision. As a solution, a method of combining document expansion with text search is presented in which better recall was obtained without sacrificing precision. This approach seems promising when integrating unstructured, textual content with the Semantic Web of Data.
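The document expansion idea in the abstract above can be sketched as follows. This is a minimal illustration, not the method as implemented in the paper: the toy ontology, the decay factor of 0.5, the depth limit, and the function name are all assumptions.

```python
from collections import defaultdict

# Hypothetical toy ontology: concept -> directly related (here: broader) concepts.
ONTOLOGY = {
    "violin": ["string instrument"],
    "string instrument": ["musical instrument"],
    "musical instrument": [],
}

def expand_document(concepts, ontology, decay=0.5, max_depth=2):
    """Weight explicit concepts 1.0 and ontologically related concepts less.

    Assumption: each step away from an explicit concept multiplies the
    weight by `decay`; the paper only states that smaller weights are used.
    """
    weights = defaultdict(float)
    for concept in concepts:
        frontier = [(concept, 1.0, 0)]
        while frontier:
            node, weight, depth = frontier.pop()
            # Keep the strongest weight seen for each concept.
            weights[node] = max(weights[node], weight)
            if depth < max_depth:
                for related in ontology.get(node, []):
                    frontier.append((related, weight * decay, depth + 1))
    return dict(weights)

# A document that explicitly mentions only "violin".
index_entry = expand_document(["violin"], ONTOLOGY)
```

With these assumed settings, a search for "musical instrument" can match the document through its expanded concepts, at a lower weight than a search for "violin" itself.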
Tuukka Ruotsalo, Eetu Mäkelä, Tomi Kauppinen, Eero Hyvönen, Krister Haav, Ville Rantala, Matias Frosterus, Nima Dokoohaki and Mihhail Matskin: Smartmuseum: Personalized Context-aware Access to Digital Cultural Heritage
. Proceedings of the International Conferences on Digital Libraries and the Semantic Web 2009 (ICSD2009)
, September, 2009. Trento, Italy. bib pdf
This paper presents a semantic recommender method and a system for a personalized access to digital cultural heritage through context-aware user profiling. Given annotation knowledge-bases, explicit background knowledge in the form of ontologies, a user model capturing the user’s behavior and context, the system produces recommendations. Ontology-based user profiling can be used to reduce cold-start, sparsity and over-specialization problems. In addition, we present a recommendation retrieval method that is based on the vector space model and uses indices that enable fast and scalable implementation of the system.
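A vector space model backed by an index, as mentioned in the abstract above, might look like the sketch below. It is a generic cosine-similarity ranking over an inverted index, not the Smartmuseum implementation; the item names, concept weights, and function names are hypothetical.

```python
import math
from collections import defaultdict

def build_index(items):
    """Build an inverted index: concept -> {item_id: weight}."""
    index = defaultdict(dict)
    for item_id, vector in items.items():
        for concept, weight in vector.items():
            index[concept][item_id] = weight
    return index

def norm(vector):
    """Euclidean length of a sparse concept vector."""
    return math.sqrt(sum(w * w for w in vector.values()))

def recommend(profile, items, index, top_k=3):
    """Rank items by cosine similarity to the user profile.

    Only items sharing at least one concept with the profile are scored,
    which is what the inverted index makes cheap.
    """
    dot = defaultdict(float)
    for concept, p_weight in profile.items():
        for item_id, i_weight in index.get(concept, {}).items():
            dot[item_id] += p_weight * i_weight
    p_norm = norm(profile)
    scored = []
    for item_id, product in dot.items():
        denominator = p_norm * norm(items[item_id])
        if denominator > 0:
            scored.append((product / denominator, item_id))
    return [item_id for _, item_id in sorted(scored, reverse=True)[:top_k]]

# Hypothetical annotated items and user profile.
items = {
    "painting_a": {"impressionism": 1.0, "landscape": 0.5},
    "sculpture_b": {"bronze": 1.0},
    "painting_c": {"landscape": 1.0},
}
profile = {"impressionism": 0.8, "landscape": 0.6}

ranking = recommend(profile, items, build_index(items))
```

Items with no concept in common with the profile never enter the scoring loop, so the cost depends on the size of the matching postings rather than on the whole collection.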
Jouni Tuominen, Matias Frosterus, Kim Viljanen and Eero Hyvönen: ONKI SKOS Server for Publishing and Utilizing SKOS Vocabularies and Ontologies as Services
. Proceedings of the 6th European Semantic Web Conference (ESWC 2009)
, Heraklion, Greece, May 31 - June 4, 2009. Springer-Verlag. bib pdf
Vocabularies are the building blocks of the Semantic Web providing shared terminological resources for content indexing, information retrieval, data exchange, and content integration. Most semantic web applications in practical use are based on lightweight ontologies and, more recently, on the Simple Knowledge Organization System (SKOS) data model being standardized by W3C. Easy and cost-efficient publication, integration, and utilization methods of vocabulary services are therefore highly important for the proliferation of the Semantic Web. This paper presents the ONKI SKOS Server for these tasks. Using ONKI SKOS, a SKOS vocabulary or a lightweight ontology can be published on the web as ready-to-use services in a matter of minutes. The services include not only a browser for human usage, but also Web Service and AJAX interfaces for concept finding, selecting and transporting resources from the ONKI SKOS Server to connected systems. Code generation services for AJAX and Web Service APIs are provided automatically, too. ONKI SKOS services are also used for semantic query expansion in information retrieval tasks. The idea of publishing ontologies as services is analogous to Google Maps. In our case, however, vocabulary services are provided and mashed-up in applications. ONKI SKOS was published in the beginning of 2008 and is to our knowledge the first generic SKOS server of its kind. The system has been used to publish and utilize some 60 vocabularies and ontologies in the National Finnish Ontology Service ONKI www.yso.fi.
Eero Hyvönen, Eetu Mäkelä, Tomi Kauppinen, Olli Alm, Jussi Kurki, Tuukka Ruotsalo, Katri Seppälä, Joeli Takala, Kimmo Puputti, Heini Kuittinen, Kim Viljanen, Jouni Tuominen, Tuomas Palonen, Matias Frosterus, Reetta Sinkkilä, Panu Paakkarinen, Joonas Laitio, Katariina Nyberg: CultureSampo - A National Publication System of Cultural Heritage on the Semantic Web 2.0
. Proceedings of the 6th European Semantic Web Conference (ESWC2009), Heraklion, Greece
, May 31 - June 4, 2009. Springer-Verlag. bib pdf
CULTURESAMPO is an application demonstration of a national level publication system of cultural heritage contents on the Web, based on ideas and technologies of the Semantic (Web and) Web 2.0. On the semantic side, the system presents new solutions to interoperability problems of dealing with multiple ontologies of different domains, and to problems of integrating multiple metadata schemas and cross-domain content into a homogeneous semantic portal. A novelty of the system is to use semantic models based on events and narrative process descriptions for modeling and visualizing cultural phenomena, and for semantic recommendations. On the Web 2.0 side, CULTURESAMPO proposes and demonstrates a content creation process for collaborative, distributed ontology and content development including different memory organizations and citizens. The system provides the cultural heritage contents to end-users in a new way through multiple (nine) thematic perspectives, based on semantic visualizations. Furthermore, CULTURESAMPO services are available for external web-applications to use through semantic AJAX widgets.
Eero Hyvönen, Eetu Mäkelä, Tomi Kauppinen, Olli Alm, Jussi Kurki, Tuukka Ruotsalo, Katri Seppälä, Joeli Takala, Kimmo Puputti, Heini Kuittinen, Kim Viljanen, Jouni Tuominen, Tuomas Palonen, Matias Frosterus, Reetta Sinkkilä, Panu Paakkarinen, Joonas Laitio, Katariina Nyberg: CultureSampo - Finnish Culture on the Semantic Web 2.0. Thematic Perspectives for the End-user
. Proceedings, Museums and the Web 2009, Indianapolis, USA
, April 15-18, 2009. bib pdf
We present an overview of CultureSampo, an ambitious system for creating a collective semantic memory of the cultural heritage of a nation on the Semantic Web 2.0, combining ideas underlying the Semantic Web and the Web 2.0. The system addresses the semantic web challenge of aggregating highly heterogeneous, cross-domain cultural heritage collections and other contents into a semantically rich intelligent system for human and machine users. At the same time, CultureSampo is an approach to solve the social and practical Web 2.0 challenge of organizing the underlying collaborative ontology development and content creation work of memory organizations and citizens. This paper focuses on CultureSampo’s search, recommendation, and visualization services for the end-users. The key idea here is to access cultural heritage on the Semantic Web through nine “thematic perspectives”, such as places on the maps, the social network of cultural persons, timelines, and narrative texts, e.g. biographies and literary works.
Eero Hyvönen, Eetu Mäkelä, Tomi Kauppinen, Olli Alm, Jussi Kurki, Tuukka Ruotsalo, Katri Seppälä, Joeli Takala, Kimmo Puputti, Heini Kuittinen, Kim Viljanen, Jouni Tuominen, Tuomas Palonen, Matias Frosterus, Reetta Sinkkilä, Panu Paakkarinen, Joonas Laitio, Katariina Nyberg: CultureSampo - Finnish Cultural Heritage Collections on the Semantic Web 2.0
. Proceedings of the 1st International Symposium on Digital Humanities for Japanese Arts and Cultures (DH-JAC-2009), Ritsumeikan University, Kyoto, Japan
, March, 2009. bib pdf
This paper presents an overview of the Semantic Web 2.0 application CultureSampo, an ambitious system for creating a collective semantic memory of the cultural heritage of a nation on the Semantic Web 2.0, combining ideas underlying the Semantic Web and the Web 2.0. The system addresses the semantic web challenge of aggregating highly heterogeneous, cross-domain cultural heritage content into a semantically rich intelligent system for human and machine users. At the same time, CultureSampo is an approach to solve the social and practical Web 2.0 challenge of organizing the underlying collaborative ontology development and content creation work of memory organizations and citizens.
Tuukka Ruotsalo, Katri Seppälä, Kim Viljanen, Eetu Mäkelä, Jussi Kurki, Olli Alm, Tomi Kauppinen, Jouni Tuominen, Matias Frosterus, Reetta Sinkkilä and Eero Hyvönen: Ontology-based Approach for Interoperability of Digital Collections
. Signum, no. 5, 2008. bib pdf
This paper presents solutions and lessons learned in the FinnONTO project carried out in Finland in 2003–2007. The paper focuses on three aspects of interoperability of digital collections. First, transforming thesauri to ontologies. Second, publishing ontologies for the use of indexers and content providers. Third, ontology-based methods for improving end user access to digital collections. The first aspect is analysed through case studies done with Finnish thesauri. The second is discussed by presenting the ONKI ontology server. The last aspect is demonstrated in the scope of the semantic portal CultureSampo for publishing cultural heritage on the Semantic Web.
Eero Hyvönen, Kim Viljanen, Jouni Tuominen, Katri Seppälä, Tomi Kauppinen, Matias Frosterus, Reetta Sinkkilä, Jussi Kurki, Olli Alm, Eetu Mäkelä and Joonas Laitio: National Ontology Infrastructure Service ONKI
. Oct 1, 2008. bib pdf
This paper presents the national level cross-domain ontology and ontology service infrastructure ONKI used in Finland. The novelty of ONKI is based on two ideas. First, the core ontologies are developed collaboratively by experts transforming thesauri into mutually aligned lightweight ontologies, based on a large top ontology that is extended by various domain specific ontologies. Second, the National Ontology Service ONKI has been implemented for publishing ontologies cost-efficiently as ready to use services. ONKI provides legacy and other applications with ready to use functionalities for using ontologies on the HTML level by Ajax and semantic widgets. ONKI has been used in various applications for creating mash-up applications in a way analogous to using Google Maps, but in our case external applications are mashed-up with ontology support for indexing and information retrieval.
Eero Hyvönen, Eetu Mäkelä, Tomi Kauppinen, Olli Alm, Jussi Kurki, Tuukka Ruotsalo, Katri Seppälä, Kim Viljanen, Jouni Tuominen, Tuomas Palonen, Matias Frosterus, Reetta Sinkkilä, Panu Paakkarinen, Joonas Laitio, Katariina Nyberg: CultureSampo - A Collective Memory of Finnish Cultural Heritage on the Semantic Web 2.0
. Semantic Computing Research Group, Helsinki University of Technology and University of Helsinki
, Sept 29, 2008. bib pdf
This paper presents the Semantic Web 2.0 application CULTURESAMPO, an ambitious system of creating a collective semantic memory of the cultural heritage of a nation on the Semantic Web 2.0, combining ideas underlying the Semantic Web and the Web 2.0. The system addresses the semantic challenge of aggregating highly heterogeneous, cross-domain cultural heritage into a semantically rich intelligent system for human and machine users. At the same time, CULTURESAMPO is an approach to solve the social and practical Web 2.0 challenge of organizing the underlying collaborative ontology development and content creation work of memory organizations and citizens.
Jouni Tuominen, Matias Frosterus, Kim Viljanen and Eero Hyvönen: ONKI-SKOS - Publishing and Utilizing Thesauri in the Semantic Web
. AI and Machine Consciousness - Proceedings of the 13th Finnish Artificial Intelligence Conference STeP 2008
, Espoo, Finland, August 20-22, 2008. bib pdf
Thesauri and other controlled vocabularies act as building blocks of the Semantic Web by providing shared terminology for facilitating information retrieval, data exchange and integration. Representation and publishing methods are needed for utilizing thesauri efficiently, e.g., in content indexing and searching. W3C has provided the Simple Knowledge Organization System (SKOS) data model for expressing concept schemes, such as thesauri. A standard representation format for thesauri eliminates the need for implementing thesaurus specific rules or applications for processing them. However, there do not exist general tools which provide out of the box support for publishing and utilizing SKOS vocabularies in applications, without needing to implement application specific user interfaces for end users. For solving this problem the ONKI-SKOS server is presented.
Matias Frosterus: Tekstiaineiston ontologiaperustainen indeksointi ja haku
. MSc Thesis, Helsinki University of Technology, Department of Automation and Systems Technology, March, 2008. bib pdf
As the amount of information in society grows, handling it efficiently is increasingly required not only of professionals but also of ordinary users. A natural aim is then to simplify and automate the information retrieval process as much as possible, and Semantic Web techniques offer new possibilities for this. This Master's thesis studied ways of improving information retrieval from text-based material, such as a newspaper archive, through document expansion and the use of ontological concepts. For this purpose, an automatic annotation and search application, Airo, was created, which performs document expansion on the basis of a given ontology. This is done with ontological concept clustering, in which the occurrence of a concept in the text also raises the weight of concepts close to it in the ontology's hierarchy when the document is indexed and searched. Tests of the system showed that concept search combined with word search lowers the precision of the search but raises recall. In contrast, a hybrid method of document and query expansion, in which an expanding search is performed with the concepts of the documents returned by a traditional word search, raised recall without harming precision. The system is ontology-independent, and the conceptualizations produced by each ontology are stored in their own index, so that they can be searched separately.
Eero Hyvönen, Kim Viljanen, Eetu Mäkelä, Tomi Kauppinen, Tuukka Ruotsalo, Onni Valkeapää, Katri Seppälä, Osma Suominen, Olli Alm, Robin Lindroos, Teppo Känsälä, Riikka Henriksson, Matias Frosterus, Jouni Tuominen, Reetta Sinkkilä and Jussi Kurki: Elements of a National Semantic Web Infrastructure - Case Study Finland on the Semantic Web (Invited paper)
. Proceedings of the First International Semantic Computing Conference (IEEE ICSC 2007), Irvine, California
, September, 2007. IEEE Press. bib pdf
This article presents the vision and results of creating the basis for a national semantic web content infrastructure in Finland in 2003-2007. The main elements of the infrastructure are shared and open metadata schemas, core ontologies, and public ontology services. Several practical applications testing and demonstrating the usefulness of the infrastructure are overviewed in the fields of eCulture, eHealth, eGovernment, eLearning, and eCommerce.