A Large Scale Dataset for the Evaluation of Matching Systems

From semanticweb.org

A poster presentation written by Paolo Avesani, Mikalai Yatskevich and Fausto Giunchiglia. It was presented at ESWC2007. It is about evaluation methodology and ontology matching.

Abstract

In recent years the number of ontology matching techniques and systems has increased significantly, and this, in turn, has raised the issue of their evaluation and comparison. One of the key challenges is how to build large-scale datasets. The number of possible mappings between two ontologies grows quadratically with respect to the number of nodes in the graphs, which makes the manual construction of reference mappings too demanding for large-scale real-world matching tasks. In this paper we present a new mapping dataset, TaxME 2, extracted from the Google, Yahoo and Looksmart web directories. TaxME 2 is computed in a semiautomatic way and is an order of magnitude larger than the state-of-the-art datasets. Moreover, to our knowledge, it is the only large-scale dataset that can be used to compute both Precision and Recall. We have evaluated TaxME 2 using the results of twelve state-of-the-art matching systems. The evaluation results have shown that the dataset has the desired key properties, namely that it is discriminative, error-free and hard to solve for state-of-the-art matching systems.
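The two quantitative ideas in the abstract, the quadratic growth of candidate mappings and the use of a reference mapping to compute Precision and Recall, can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy directory paths below are hypothetical, and a mapping is assumed to be a simple pair of node labels.

```python
def candidate_pairs(n_source: int, n_target: int) -> int:
    """Number of node pairs a human would have to inspect to build
    a complete reference mapping by hand (one check per pair)."""
    return n_source * n_target


def precision_recall(found, reference):
    """Standard Precision and Recall of a matcher's output against
    a reference mapping, both given as collections of node pairs."""
    found, reference = set(found), set(reference)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data, not taken from TaxME 2.
reference = {
    ("Arts/Music", "Entertainment/Music"),
    ("Science/Physics", "Science/Physics"),
    ("Sports/Soccer", "Sports/Football"),
}
found = {
    ("Arts/Music", "Entertainment/Music"),
    ("Science/Physics", "Science/Physics"),
    ("Arts/Music", "Sports/Football"),
    ("Arts/Movies", "Entertainment/Film"),
}

precision, recall = precision_recall(found, reference)
print(candidate_pairs(1000, 1000))  # pairs to check for two 1000-node trees
print(precision, recall)
```

In this toy case two of the four mappings returned are correct and two of the three reference mappings are found. Recall is only meaningful when the reference mapping is complete for the matching task, which is the property the abstract highlights for a large-scale dataset.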

This data has been imported from the ESWC2007 RDF
