Adding Data Mining Support to SPARQL via Statistical Relational Learning Methods


A paper written by Christoph Kiefer, André Locher and Abraham Bernstein. It was presented at ESWC2008. It is about data mining, evaluation, SPARQL and statistical relational learning.


In machine learning/data mining, people have been exploring how to learn models of relational data for a long time. The rationale behind this is that exploiting the rich and complex structure of relational data makes it possible to build better models by taking into account the additional information provided by the links between objects. These links are usually hard to model with traditional propositional learning techniques. We extend this idea to the Semantic Web. In this paper, we introduce a novel approach, which we call SPARQL-ML, to perform data mining on Semantic Web data. Our approach is based on traditional SPARQL and statistical relational learning methods, such as Relational Probability Trees and Relational Bayesian Classifiers. We analyze our approach thoroughly by conducting three sets of experiments on synthetic as well as real-world datasets. Our analytical results show that our approach can be used for any Semantic Web dataset to perform instance-based learning and classification. A comparison to kernel methods used in Support Vector Machines shows that our approach is superior in terms of classification accuracy. Moreover, we show how our approach can be used for Semantic Web service classification and automatic semantic annotation.

This data has been imported from the ESWC2008 data.