ALLRIGHT: Automatic Ontology Instantiation from Tabular Web Documents
A paper by Gerhard Friedrich, Olga Kozeruk, Dietmar Jannach, and Kostyantyn Shchekotykhin, presented at ISWC2007+ASWC2007.
Abstract
Manually instantiating an ontology with high-quality, up-to-date instance information is both time-consuming and error-prone. Automatic ontology instantiation from Web sources is one possible solution to this problem; it aims at the computer-supported population of an ontology by exploiting the (redundant) information available on the Web. In this paper we present AllRight, a comprehensive ontology instantiation system. In particular, the techniques implemented in AllRight are designed for application scenarios in which the desired instance information is given in the form of tables and for which existing Information Extraction approaches based on statistical or natural language processing methods are not directly applicable. Within AllRight, we have therefore developed new techniques for dealing with tabular instance data and combined them with existing methods. The system supports all steps necessary for ontology instantiation, i.e. web crawling, name extraction, document clustering, and fact extraction and validation. AllRight has been successfully evaluated in the popular domains of digital cameras and notebooks, achieving an accuracy of about eighty percent for the extracted facts given only a very limited amount of seed knowledge.
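To make the idea of fact extraction from tables and validation through redundancy more concrete, here is a minimal Python sketch. It is not the AllRight implementation, whose actual techniques are described in the paper. It assumes, purely for illustration, that each Web document lists attribute-value pairs in two-cell table rows, and that a value counts as validated when at least `min_support` documents agree on it. The names `TwoColumnTableParser`, `extract_pairs`, `validate_by_redundancy`, and `min_support` are hypothetical.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser


class TwoColumnTableParser(HTMLParser):
    """Collects (attribute, value) pairs from table rows with exactly two cells.

    Assumption for this sketch: product data sits in simple two-cell rows.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and normalise whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if len(self._row) == 2 and all(self._row):
                # Attribute names are lower-cased so that spellings match.
                self.pairs.append((self._row[0].lower(), self._row[1]))
            self._row = None
            self._cell = None


def extract_pairs(html):
    """Return the attribute-value pairs found in one HTML document."""
    parser = TwoColumnTableParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


def validate_by_redundancy(documents, min_support=2):
    """Keep the most frequent value per attribute if enough documents agree."""
    votes = defaultdict(Counter)
    for html in documents:
        # set(): each document votes at most once per (attribute, value).
        for attribute, value in set(extract_pairs(html)):
            votes[attribute][value] += 1
    facts = {}
    for attribute, counter in votes.items():
        value, support = counter.most_common(1)[0]
        if support >= min_support:
            facts[attribute] = value
    return facts
```

Calling `validate_by_redundancy` on a list of HTML strings returns a dictionary of the attribute values that enough documents agree on. Simple majority voting is only one way to exploit redundancy; it was chosen here because it is short, not because the paper prescribes it.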
A linked list of all papers is provided in the article on ISWC2007+ASWC2007 papers. This article was originally created from the ISWC 2007/ASWC 2007 metadata.
