VoID

From semanticweb.org
Vocabulary of Interlinked Datasets
Homepage: rdfs.org/ns/void/
Language: RDF Schema
Last release: March 6 2011
Last revision: March 6 2011
Namespace: http://rdfs.org/ns/void#

VoID (from "Vocabulary of Interlinked Datasets") is an RDF-based schema for describing linked datasets. VoID is intended to make the discovery and use of linked datasets both effective and efficient. A dataset is a collection of data, published and maintained by a single provider, available as RDF, and accessible, for example, through dereferenceable HTTP URIs or a SPARQL endpoint.

Overview

Two classes are at the heart of VoID:

  • A dataset (void:Dataset) is a collection of data, which is:
    • published and maintained by a single provider, and
    • available as RDF, and
    • accessible, for example, through dereferenceable HTTP URIs or a SPARQL endpoint.
  • The interlinking is modelled by a linkset (void:Linkset). A linkset in VoID is a subclass of a dataset that stores the triples expressing the interlinking relationship between datasets. In each interlinking triple, the subject is a resource hosted in one dataset and the object is a resource hosted in another dataset. This modelling makes it possible to describe the interlinking between two datasets in detail: how many links exist, which kinds of links (e.g. owl:sameAs or foaf:knows) are present, and who claims these statements.

The following figure depicts how VoID models the interlinking:

[Figure: VoID interlinking concept]
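
As a rough illustration of the two classes, a linkset between two datasets could be written in Turtle as follows. This is only a sketch: the names :DatasetA, :DatasetB and :A2B are hypothetical, and the triple count is a made-up number, not a figure from any real dataset.

```turtle
@prefix void: <http://rdfs.org/ns/void#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix :     <#> .

# Two datasets (hypothetical names).
:DatasetA a void:Dataset .
:DatasetB a void:Dataset .

# A linkset whose link triples have subjects in :DatasetA
# and objects in :DatasetB.
:A2B a void:Linkset ;
    void:subjectsTarget :DatasetA ;
    void:objectsTarget  :DatasetB ;
    void:linkPredicate  owl:sameAs ;   # which kind of links are present
    void:triples        1500 .         # how many links exist (made-up value)
```

Here void:linkPredicate and void:triples carry the "which kind" and "how many" information mentioned above, while void:subjectsTarget and void:objectsTarget record the direction of the links.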

The core resources of the VoID spec are as follows:

  1. VoID vocabulary (normative), defines the classes and properties (available in HTML and RDF)
  2. Describing Linked Datasets with the VoID Vocabulary, explains the usage of VoID for both data publishers and consumers (along with other vocabularies such as Dublin Core, FOAF, etc.)
  3. VoID code repository, hosting exemplary implementations (issues concerning the vocabulary are also accessible there)

Using VoID

A simple VoID example describing two well-known LOD datasets and their interlinking is shown below.

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
# Default prefix for the local names below; resolved against the document's own URI.
@prefix : <#> .

# The two datasets being described.
:DBpedia rdf:type void:Dataset ;
         foaf:homepage <http://dbpedia.org/> .

:DBLP rdf:type void:Dataset ;
      foaf:homepage <http://www4.wiwiss.fu-berlin.de/dblp/all> ;
      dcterms:subject <http://dbpedia.org/resource/Computer_science> ;
      dcterms:subject <http://dbpedia.org/resource/Journal> ;
      dcterms:subject <http://dbpedia.org/resource/Proceedings> .

# The linkset is published as a subset of DBpedia and connects the two datasets.
:DBpedia void:subset :DBpedia2DBLP .

:DBpedia2DBLP rdf:type void:Linkset ;
              void:target :DBpedia ;
              void:target :DBLP .

Let us assume that the above VoID description has, for example, been gathered by a semantic indexer from the VoID documents that data publishers provide along with their datasets. The following example query can then be executed:

PREFIX void: <http://rdfs.org/ns/void#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?dataset
WHERE {
  ?dataset a void:Dataset .
  ?dataset dcterms:subject <http://dbpedia.org/resource/Journal> .
}

This query asks for all datasets that have been categorised as containing data about journals; for the description above, it returns :DBLP. A slightly modified version of the query, applied to the RKB explorer, is shown below:

PREFIX void: <http://rdfs.org/ns/void#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?dataset
WHERE {
  ?dataset a void:Dataset .
  ?dataset dcterms:subject <http://dbpedia.org/resource/Category:Computer_scientists> .
}

The above SPARQL query will list all datasets about computer scientists.

Much more is possible with VoID, though. You can describe basic dataset metadata (such as publisher, license, etc.), vocabularies used, example resources, SPARQL endpoint availability, and of course the fine-grained interlinking between the datasets. An example that partly describes the interlinking between DBpedia and Geonames is:

@prefix owl: <http://www.w3.org/2002/07/owl#> .

:DBpedia void:subset :DBpedia2Geonames .

:DBpedia2Geonames a void:Linkset ;
                  void:linkPredicate owl:sameAs ;
                  void:target :DBpedia ;
                  void:target :Geonames .
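
The other kinds of metadata mentioned above (publisher, license, vocabularies used, example resources and SPARQL endpoint) could be sketched as follows. The property names come from the VoID and Dublin Core vocabularies, but the values are illustrative assumptions: :ExamplePublisher is a hypothetical resource, and the license, vocabulary and example resource shown are placeholders rather than statements taken from DBpedia's actual VoID description.

```turtle
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix void:    <http://rdfs.org/ns/void#> .
@prefix :        <#> .

:DBpedia a void:Dataset ;
    dcterms:title "DBpedia" ;
    # Basic metadata: who publishes the data and under which license (illustrative values).
    dcterms:publisher :ExamplePublisher ;
    dcterms:license <http://creativecommons.org/licenses/by-sa/3.0/> ;
    # A vocabulary used in the dataset (illustrative).
    void:vocabulary <http://xmlns.com/foaf/0.1/> ;
    # A representative resource a consumer can look up first (illustrative).
    void:exampleResource <http://dbpedia.org/resource/Berlin> ;
    # Where the dataset can be queried with SPARQL.
    void:sparqlEndpoint <http://dbpedia.org/sparql> .

# Hypothetical publisher resource.
:ExamplePublisher a foaf:Organization ;
    foaf:name "Example Publisher" .
```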

Generating and Consuming

There are already tools and applications available that consume or produce VoID.

Generators & Editors

  • voidGen (Christoph Böhm, HPI, Universität Potsdam)
  • VoID tools is a collection of Jena-based tools to support the generation of VoID descriptions including RDFStats statistics.
  • OpenLink's Virtuoso produces VoID; see DB.DBA.RDF_VOID_STORE
  • liftSSM is an XSLT that takes a semantic sitemap in XML and creates a stub voiD description in RDF/XML.
  • The current version v1.1 of the NxParser (Parser for NTriples, NQuads, and more) supports exporting VoID descriptions of statistics.
  • ve2, the VoID editor, lets you manually create a VoID file in RDF Turtle format, including the definition of the characteristics of your linked dataset, such as categories, interlinking, technical features, licensing, etc. Developed at DERI.
    • Open PHACTS VoID editor: Open PHACTS is a European initiative to create an "Open Pharmacological Space" where data is stored in RDF and datasets are described via VoID. The VoID editor developed for Open PHACTS is a direct descendant of the ve2 editor developed at DERI.
  • RDF::Generator::Void - A Perl module to generate VoID descriptions. Can also be found in Debian as librdf-generator-void-perl. The current release compiles some statistics and uses some heuristics to extract vocabularies. Further data can be provided manually and it can also read an RDF model from a different source. The module RDF::LinkedData can be configured to use this module. The current release is an early beta.

Exploration, Browser, Stores

Examples in the Wild

In the examples below, VoID is used for different purposes and use cases. The time it took people to implement it may be an indicator of how easy it is to generate and consume VoID.

Decimalised Database of Concepts

In the decimalised database of concepts (DDC) dataset, VoID is used extensively. DDC is a collection of topics suitable for use in linked data. It is inspired by the Dewey Decimal Classification, but no guarantees are made about the closeness of its resemblance as a whole. SKOS mapping links are provided from this database to the Dewey system, to Library of Congress Classification codes and to DBpedia resources where possible.

EPrints

EPrints repository software publishes RDF as of v3.2.1 and automatically describes the dataset using VoID. Suggestions for improvement can be sent to cjg@ecs.soton.ac.uk (it is worth getting right, as many repositories will end up with this code).

Italian National Research Council (CNR)

The Italian National Research Council (CNR) publishes organizational data at http://data.cnr.it/ and also provides a VoID description.

Lingvoj

As reported by Bernard Vatant, lingvoj has a VoID description as well; lingvoj is a linked dataset dedicated to the publication and use of multilingual RDF descriptions of human languages.

LODStats

LODStats attempts to compute comprehensive summaries and statistics for all LOD datasets on the Web. It uses VoID and the Data Cube Vocabulary to represent the statistics; an example is available for Europeana Linked Data.

Open Data Communities

Open Data Communities uses VoID, see for example Index of Multiple Deprivation Ranking, 2010.

OECD Glossary of Statistics

oecd.dataincubator.org is a dataset offering VoID about data extracted from the OECD Glossary of Statistics.

OpenLink Virtuoso

Since Virtuoso 5.0.10 (2009-02-13), OpenLink has included support for VoID. Further, Kingsley Idehen (CEO and founder of OpenLink) has announced a demo from their Virtuoso platform: the URIBurner service turns structured HTML into RDF, and uses VoID to represent the (on the fly) generated data description.

As announced on 2009-03-05, OpenLink has generated a VoID graph for DBpedia; use <http://dbpedia.org/void/> for the default graph field at http://dbpedia.org/sparql.

Virtuoso (both open-source and closed-source variants) now also includes scripts that use built-in functions for VoID generation and storage, e.g. DB.DBA.RDF_VOID_STORE.

Ordnance Survey

Ordnance Survey, Great Britain's national mapping agency, uses VoID to describe its data.

PSI Catalogues Aggregator

The PSI Catalogues Aggregator offers VoID descriptions for Public Sector Information (PSI) catalogues. For example, see their data.gov.uk VoID description.

RAMON, Eurostat's Metadata Server

The Eurostat Metadata Server RAMON uses VoID to describe their data, including countries and NUTS codes.

RDFohloh

Sergio announced that RDFohloh has VoID descriptions.

RKB explorer

As reported by Hugh Glaser, the RKB explorer activity has a VoID site that supports querying and browsing CRS datasets. For example:

PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX void:  <http://rdfs.org/ns/void#>
PREFIX scovo: <http://purl.org/NET/scovo#>

# For each CRS, list the datasets it links and the value of its statistics item.
SELECT ?subjects ?objects ?stats
WHERE {
  ?crs  void:subjectsTarget ?subjects .
  ?crs  void:objectsTarget  ?objects .
  ?crs  void:statItem       ?item .
  ?item rdf:value           ?stats .
}
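
To make the shape of the query above more concrete, data of roughly the following form would match it. This is a hypothetical sketch: the names :ExampleCRS, :SiteA and :SiteB, the typing of the CRS as a void:Linkset, and the statistic value are all assumptions made for illustration, not data taken from the RKB explorer.

```turtle
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix void:  <http://rdfs.org/ns/void#> .
@prefix scovo: <http://purl.org/NET/scovo#> .
@prefix :      <#> .

# A hypothetical CRS, modelled here as a linkset between two sites.
:ExampleCRS a void:Linkset ;
    void:subjectsTarget :SiteA ;
    void:objectsTarget  :SiteB ;
    # The statistics item whose rdf:value the query returns as ?stats.
    void:statItem [
        a scovo:Item ;
        rdf:value 4200      # made-up statistic
    ] .
```

Against this data, the query would bind ?subjects to :SiteA, ?objects to :SiteB and ?stats to 4200.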

The interlinkage of the RKB sites can be visualised from the VoID data.

The VoID data for a particular RKB site can be accessed as Linked Data at that site; for example, the dblp.rkbexplorer.com site has a void.ttl file, a VoID dataset URI, and information about its CRS.

RPI's Linked Data from data.gov

http://data-gov.tw.rpi.edu uses VoID to describe its datasets; see, for example, a URI design note.

SchemaCache

Talis' SchemaCache publishes a VoID description.

SPARQL Endpoints Status

Mondeca provides a list of the availability of public SPARQL endpoints fetched dynamically using CKAN, using VoID along with an extension to describe the status of an endpoint.

The Stationery Office (TSO)

The UK-based 'The Stationery Office' (TSO) provides information management and publishing solutions to the public and private sectors and uses VoID to describe different UK datasets; see, for example, http://gov.tso.co.uk/gazettes/void

Telegraphis

Linked Data about currencies is available on telegraphis.net; see http://telegraphis.net/data/void

World Bank

World Bank Linked Data has a VoID + SPARQL Service Description.

Feedback and Discussions

We have a VoID discussion group (void-discussion@googlegroups.com) if you would like to share your experience or have a question. If you have a feature request or want to file a bug report, please use the VoID Issue Tracker. Some of us hang out on the #swig IRC channel on Freenode.


Related Specifications

There are some specifications that use or extend VoID or are related to it:


See also

The VoID vocabulary is maintained by the voiD team:

by the VoID team

what others say about VoID

Aggregated references:


Listings:
