Semantic MediaWiki development activities

On this page, developers of Semantic MediaWiki and related efforts document their current activities. It helps to improve coordination among project members, and it also serves as a changelog. Finished tasks should be explained as features in the online help documentation for Semantic MediaWiki.

Features that are marked here as implemented may or may not be present in the version of the SMW code running on this wiki; the latest SMW code is available from the Subversion repository (but may not work!).

Tasks on this page should be edited only by the respective developers.

Note that you can easily link to bugs on SourceForge and MediaZilla by using the interwiki-prefixes smwbug: and mediazilla:, or Template:SMWbug.

New developers should observe the style guidelines for their code. Comments in code are extracted by Doxygen to create online documentation (click Files and scroll down to extensions/SemanticMediaWiki).

Version 0.7

The below list is outdated. See current activities above.


Support relation/attribute hierarchy Active developers: mak
Schedule: until 0.7? Status: 10%

Hierarchies of relations and attributes are obviously needed. The main challenge is supporting them in inline queries; a well-formed RDF export is also required.

Searching and querying

Inline queries Active developers: mak
Schedule: soon Status: 80%

Users write queries in article source, and results are shown in the article.

Enable template-based output format Implemented by: mak
Schedule: 0.7 Status: 90%

Templates can now be used for constructing queries, as well as for printing query results (format="template").

Rewrite low-performance parts of IQ Active developers: mak
Schedule: open Status: 10%

Two functions are currently far too slow: the materialisation of the category hierarchy and the identification of articles with mutual redirects. Both need to be simplified and reimplemented.
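
As a rough illustration of the two operations named above, the following Python sketch materialises a category hierarchy with a breadth-first traversal and finds pairs of pages that redirect to each other. It is not the SMW implementation (which is PHP); the function names and the dictionary-based data layout are assumptions made for this example.

```python
from collections import deque


def subcategory_closure(children, root):
    """Return root plus every category reachable from it via subcategory links.

    `children` maps a category name to a list of its direct subcategories
    (hypothetical in-memory layout, not the SMW database schema).
    """
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in seen:  # guards against cycles in the hierarchy
                seen.add(child)
                queue.append(child)
    return seen


def mutual_redirects(redirects):
    """Return sorted pairs of distinct pages that redirect to each other.

    `redirects` maps a page title to the title it redirects to.
    """
    pairs = set()
    for source, target in redirects.items():
        if source != target and redirects.get(target) == source:
            pairs.add(tuple(sorted((source, target))))
    return sorted(pairs)
```

Each category is visited once, so the traversal stays linear in the number of subcategory links even when the hierarchy contains cycles.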

Timeline format for many dates per article Implemented by: mak
Schedule: until 0.7 Status: 80%

In addition to the current Timeline format, which shows one (event) or two (duration) dates for each returned article, there will be a new format "eventline" that shows all dates selected in the query. Color-coding and pre/appended article names enable users to tell which date came from which query result. In this way, many dates from many articles can be displayed.

Query format that embeds articles Active developers: Fernando Correia, mak
Schedule: open Status: 90%

A new format enables embedding article texts of articles retrieved with an inline query.

Query format that counts results Implemented by: mak
Schedule: 0.7 Status: 90%

Print the number of results instead of the results themselves.

Clean up query code, stage I Active developers: mak
Schedule: open Status: 90%

Try to eliminate all FIXMEs.


Improve search Special Active developers: dvr
Schedule: until 0.7 Status: 0%

The current simple semantic search is outdated. Required features are: support for many results split over many result pages, an improved UI, a cleaned-up implementation (the current one is not efficient), and faceted browsing features (e.g. results could provide further quicklinks).

External services and reuse

Rewrite RDF export Active developers: mak
Schedule: open Status: 20%

Rewrite large portions of the RDF export to make it faster and easier to maintain.

Provide queryable export Active developers: mak
Schedule: soon Status: 0%

It should be possible to use a Special similar to Special:Ask for retrieving RDF selectively.

Ensure OWL compatibility of RDF export Active developers: mak
Schedule: soon Status: 30%

The wiki currently does not prevent "meta-modelling" in the sense that categories and other annotations can themselves be annotated. To enable compatibility with OWL-based tools, it should be possible to constrain output to be in OWL, even if the wiki is not. This is currently done by just dropping the offending annotations, but a nicer way would be to create new annotation URIs and to describe those as AnnotationProperties. This will be implemented as follows:

  • normal properties on normal articles are exported as usual
  • properties can be declared as annotation properties in the first place (see below), and in this case they are always exported literally (as everything is now)
  • non-annotation properties that involve TBox-elements are interpreted as annotation properties and a new URI is created for this purpose. The annotation property's description cannot directly be accessed in the wiki and contains only a label and a link to its original non-annotation property URI.

Export equality statements Active developers: mak
Schedule: until 0.7 Status: 0%

Redirects are not considered in the RDF export right now. They should be exported as owl:sameAs statements for articles, owl:equivalentProperty for attributes and relations, and owl:equivalentClass for classes.
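
The mapping from page kind to equality predicate could be sketched as follows. This is only an illustration in Python, not the exporter itself: the kind labels, the function name, and the N-Triples output format are assumptions, and "category" stands in for what the text calls classes.

```python
OWL = "http://www.w3.org/2002/07/owl#"

# Page kind (hypothetical labels) -> OWL predicate used for a redirect.
EQUALITY_PREDICATE = {
    "article": OWL + "sameAs",
    "attribute": OWL + "equivalentProperty",
    "relation": OWL + "equivalentProperty",
    "category": OWL + "equivalentClass",
}


def redirect_triple(kind, source_uri, target_uri):
    """Format one redirect as an N-Triples statement."""
    predicate = EQUALITY_PREDICATE[kind]
    return f"<{source_uri}> <{predicate}> <{target_uri}> ."
```

For example, `redirect_triple("relation", source, target)` yields a statement that uses owl:equivalentProperty, while an unknown kind raises a `KeyError` rather than silently exporting a wrong predicate.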


Import ontologies Active developers: dvr
Schedule: June 10th Status: 20%

The import of outside ontologies is possible. It adds missing statements from an ontology to the wiki.

Avoid multiple listing Active developers: dvr
Schedule: none Status: open

There is a bug in the upload that lists an entity several times for more complex ontologies.

Clean up code/fix bugs Active developers: dvr
Schedule: none Status: open

The OI code is messy and has various problems and limitations. It should either be cleaned up, or our current solution of using Python-based refactoring should be polished to be shippable as a maintenance application.


SMW APIs Possible developers: not assigned
Schedule: none Status: not started yet

Needed: APIs in a variety of languages to include data from an SMW in your applications. No knowledge of PHP required, just your own native language, be it Python, C/C++, .NET (C#, VB.NET, J#, IronPython, etc.), Java, or COBOL if you insist :) The language should have an RDF library though, or the task will grow rather big.

Datatype support

Date/time datatype Active developers: skierpage
Schedule: until 0.6 Status: first prototype implementation running

The date/time implementation is still insufficient since it handles dates only near present times. It cannot deal with anything before 1901-12-14 or after 2038-01-19.

Note: PHP 5.2 has a DateTime object which is a wrapper around a 64-bit integer. However, it is currently limited to 1 AD to 9999 AD.

New tentative plan: do our own date parsing. Instead of converting to a timestamp of seconds since 1970, just convert to a number that provides accurate ordering. The number would still be based around the same epoch, so that nearby times keep precision down to seconds for sorting.
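
One way such an order-preserving number could be computed is sketched below in Python. It uses the well-known civil-date-to-day-count arithmetic for the proleptic Gregorian calendar and assumes astronomical year numbering (year 0 is 1 BC). The function names are hypothetical, and this is not the parser SMW uses.

```python
def days_from_civil(year, month, day):
    """Days between 1970-01-01 and the given proleptic Gregorian date."""
    y = year - 1 if month <= 2 else year   # treat Jan/Feb as end of previous year
    era = y // 400                          # floor division also works for negative years
    year_of_era = y - era * 400             # 0..399
    shifted_month = (month + 9) % 12        # March = 0, ..., February = 11
    day_of_year = (153 * shifted_month + 2) // 5 + day - 1
    day_of_era = year_of_era * 365 + year_of_era // 4 - year_of_era // 100 + day_of_year
    return era * 146097 + day_of_era - 719468


def date_sort_key(year, month, day, hour=0, minute=0, second=0):
    """Number that orders dates correctly and equals Unix seconds near the epoch."""
    return days_from_civil(year, month, day) * 86400 + hour * 3600 + minute * 60 + second
```

Because the result is an ordinary integer rather than a 32-bit timestamp, dates such as `date_sort_key(-500, 3, 15)` and `date_sort_key(1066, 10, 14)` still compare correctly against modern ones.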

Open Issue: should Type:Date figure out whether to export as XSD type #date or #datetime based on whether there's a time component?

Note that in addition to historical dates, there are also geologic time scales, e.g. "Tyrannosaurus rex flourished approximately 65 million years ago". Although in theory you can represent this as XSD #date of -65000000, we will not attempt to handle this with Type:Date, instead just use an Attribute:Geologic time using the custom float unit Type:Time.



Timezone support for date/time Active developers: skierpage
Schedule: open Status: 0%

PHP understands timezone identifiers, but if no timezone is given, a default timezone of the wiki should be applied. Furthermore, it should be explicitly stated what timezone some time refers to in the infobox. (However, this doesn't make sense for historical dates. Maybe only support timezone for dates with times?)
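
The fallback rule described above (use the wiki's default only when the value carries no timezone) might look like this. The sketch is in Python rather than PHP, and the default offset and all names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical wiki-wide default; a real wiki would read this from its configuration.
WIKI_DEFAULT_TZ = timezone(timedelta(hours=1), "CET")


def with_wiki_timezone(value, default_tz=WIKI_DEFAULT_TZ):
    """Attach the wiki default timezone only if the value has none."""
    if value.tzinfo is None:
        return value.replace(tzinfo=default_tz)
    return value
```

A value that already states its timezone is returned unchanged, so an explicit annotation always wins over the default.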


Sorting for date/time Active developers: skierpage
Schedule: open Status: 0%

Sorting dates. Dates do not sort reliably and queries on them do not always work. See Type:Date#Date_Comparisons.


Date/time formats in inline queries Implemented by: skierpage
Schedule: 0.7 Status: 100%, in SVN

Permit specifying a format for date/time values in inline queries, e.g. to show just the year, or just the day and month for birthdays.



Boolean datatype Implemented by: skierpage
Schedule: 0.7 Status: 100%, in SVN

The Type:Boolean page should document that using a Category may be better.


Enumeration datatype Active developers: skierpage
Schedule: 0.7 Status: 50%, in SVN

See alternative proposed in "enum attribute type" thread on Semediawiki-user.


Annotation properties for all Active developers: mak
Schedule: open Status: 0%

The OWL ontology language distinguishes annotation properties (which could be compared to comments) from the more semantic datatype and object properties. To enable meta-modelling, SMW should allow users to declare some properties as annotation properties (see plans on RDF export for details on how this affects RDF). The plan is to allow properties to have additional statements of the form [[has Type::Type:Annotation]] which are then separated from normal type statements during parsing and saving. The advantage is that we can reuse "has type" instead of making something new, and that we have a link to a possibly enlightening page "Type:Annotation". The disadvantage is the additional mix-up of types and other properties. A similar scheme could be used for future features like symmetry or transitivity.

This feature would make the current Type:AnnoURL obsolete, but would require the vocabulary import features to have markers for annotation properties.


Support for long text data Active developers: ?
Schedule: open Status: 0%

We were repeatedly asked to support text attributes longer than 255 characters. Since the current SMW database tables have no space for such content, an additional table could be used. The attribute table could use the value and unit fields for a hash and a key of the original text. Computing the hash in a datatype handler before searching would then enable querying to quite some extent. Using the unit field for a key might be less clean ... Finally, one may need to take care that the long-text data does not become overly large either.
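
A minimal sketch of the hash-and-key idea follows, in Python. A dictionary stands in for the additional table, SHA-1 is chosen only for illustration, and the function names are assumptions that do not correspond to SMW code.

```python
import hashlib


def text_hash(text):
    """Fixed-length hash that fits a short value field."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def store_long_text(text, long_text_table):
    """Store the full text separately; return (value, unit) for the attribute table.

    `long_text_table` is a dict standing in for the additional table.
    """
    key = len(long_text_table) + 1      # simple sequential key for the sketch
    long_text_table[key] = text
    return text_hash(text), str(key)    # value = hash, unit = key of the original


def matches(query_text, stored_value):
    """Exact-match query: hash the search text and compare with the stored hash."""
    return text_hash(query_text) == stored_value
```

Hashing supports exact-match queries only; substring or range queries on the long text would still need the original from the separate table.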

Interface improvements

Special:Types page Active developers: skierpage
Schedule: until 0.7 Status: alpha

The Special:Types page became rather useless with the advent of custom types. It needs to be able to display custom and builtin types, and to show their supported units, if any.


Sorting tables by numbers Possible developers: ?
Schedule: unclear Status: unassigned

When a result table is sorted by a date or numeric column, most of the time it gets sorted by its lexicographic order, not by the numerical one. This is a weakness in SMWIP/skins/SMW_sorttable.js. It would be useful if the sorting script could refer to some (invisible) HTML-parameter to sort columns instead of using the value string (which is of unknown form and might be very complicated to parse).

Inline queries could return the value_num for any attribute that isNumeric and hide it somewhere in the HTML; sorttable.js could then be tweaked to check for this in its sort() function.
Just a heads up: it looks like base MediaWiki will soon have similar table sort code in it (bug 2001 is fixed). -- Skierpage 00:46, 16 December 2006 (CET)
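
The difference between sorting by the displayed string and sorting by a hidden numeric value can be seen in a small Python sketch. The actual script is JavaScript; the row layout here (display string plus numeric value) is an assumption made for the example.

```python
# Each row pairs the displayed cell text with its hidden numeric value.
rows = [("9 km", 9.0), ("10 km", 10.0), ("1.5 km", 1.5)]


def sort_by_display(rows):
    """Lexicographic order of the visible strings (the current behaviour)."""
    return sorted(rows, key=lambda row: row[0])


def sort_by_hidden_value(rows):
    """Numeric order using the hidden value instead of parsing the string."""
    return sorted(rows, key=lambda row: row[1])
```

Sorting by display text puts "10 km" before "9 km", while sorting by the hidden value gives the expected 1.5, 9, 10 order without any parsing of units.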


AJAX-like input aids Active developers: darkrichsmile
Schedule: none Status: 75%

Protocol:

  • AJAX library choice was made (www.script.aculo.us).
  • Implementation of one example (suggestions were made by array).
  • Searching for a solution for the interface between AJAX and MySQL.
  • New try with the SAJAX library (coded in PHP) and class.inputfilter.php5.
  • An example implemented in SMW_Special that makes DB queries.
  • Implementation for the current version, SMW 0.4.

The current aim is to make the autocomplete example simpler and more cleanly coded. I've found a problem with the IE browser, which I have to fix now!

Bugfixes/Cleanup

Ship SMW pages for Wiki functionality Active developers: skierpage, everybody
Schedule: Status: 0%

Much of the functionality of SMW is supported by wiki pages (e.g. help pages, standard user-defined types with units for area, length, time, etc., the coordinates infolink services, and so on). We should consider providing dumps of these pages as part of the release to help set up systems.


Use object oriented features of PHP5 Active developers: mak, everybody
Schedule: Status: 80%

PHP5 sufficiently supports keywords like protected, private, static, and interface. They should be used eagerly.


Implement style guidelines Active developers: mak/everybody
Schedule: Status: 95%


Fix tooltips Possible developers: ? (kai)
Schedule: soon Status: 0%
Improve tooltip implementation Possible developers: ?
Schedule: soon Status: 0%

The JavaScript and the process of inserting it into a page are not optimal at the moment. The replacement process needs a second parse that in fact breaks most of the things one could want to write within a tooltip. In particular, it breaks when errors are reported in tooltips (since this requires a span-tag).

Fix Tooltip JScript for Opera Possible developers: ?
Schedule: soon Status: 0%

Changes in older versions
