Special:Categories lists each category together with the number of members.

This page discusses some approaches towards modeling with categories. It is not part of the official documentation for SMW, and may or may not be accurate/correct/objective.

Singular or plural

It is not obvious whether a page like Amsterdam should be in Category:City (which describes a member of the category) or Category:Cities (which describes the category as a set of pages). A convention for one or the other helps avoid ending up with both, with some of the applicable pages in one and some in the other, as has happened.

Currently the convention on Ontoworld is to use the singular form, while the convention on the English Wikipedia is to use the plural form.

Comparison with relations and attributes

The whole category system is comparable with a single property: the subject is called "category member", the object "category". Let us call the direction from member to category "up". In the selection part of a query, the category feature finds related pages on the downward side, with automatic "transitivity": if page P is in category Q and Q is a subcategory of category R, selecting category R also gives P. Given this automatic transitivity, the category feature is best suited to modeling a relation that is itself transitive.

In a query, we can use [[Category:*]] to show for each selected page the categories it is in (i.e., go up one step), similar to using e.g. [[Instance of::*]] to see for each selected page, of what it is an instance.

Thus selecting Category:Country in European Union also gives the countries in Category:Country in Eurozone. In a query that selects countries, however, [[Category:*]] shows for Netherlands the latter (and its other direct categories), but not Category:Country in European Union, of which it is only indirectly a member.
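The difference between selection (transitive) and display (direct categories only) can be illustrated with a small toy model. This is only a sketch in Python, not how SMW implements queries: the function names `ancestors`, `select` and `direct_categories` and the sample data (including the page Denmark) are assumptions made for the illustration.

```python
# Toy model of category selection versus category display.
# The data and all names below are illustrative assumptions, not SMW internals.

# Subcategory -> set of direct supercategories
SUBCATEGORY_OF = {
    "Country in Eurozone": {"Country in European Union"},
}

# Page -> set of categories tagged directly on the page
PAGE_CATEGORIES = {
    "Netherlands": {"Country in Eurozone"},
    "Denmark": {"Country in European Union"},
}


def ancestors(category):
    """Return the category itself plus all direct and indirect supercategories."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCATEGORY_OF.get(current, ()))
    return seen


def select(category):
    """Selection: pages in the category or in any of its subcategories."""
    return sorted(
        page
        for page, cats in PAGE_CATEGORIES.items()
        if any(category in ancestors(c) for c in cats)
    )


def direct_categories(page):
    """Display: only the categories tagged on the page itself (one step up)."""
    return sorted(PAGE_CATEGORIES[page])


print(select("Country in European Union"))
print(direct_categories("Netherlands"))
```

In this model, selecting "Country in European Union" returns both pages, while the displayed categories for Netherlands contain only "Country in Eurozone".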

Compare with a relation like Property:Subclass of: if we also have the inverse relation, in this case Property:Has subclass, we can list the contents of up to two sublevels.

Both for selection and for display, it is not obvious whether including instances of subclasses is an advantage or a disadvantage; it depends on the case and on one's wishes.

Regardless of the method, if one wants to include instances of subclasses while these are not provided automatically, of course one can annotate them directly.

Category as class

On Ontoworld a page representing a class is always put in the category namespace, even if instances are not notable enough for a page, e.g. electron. As a result there are empty categories such as Category:Electron.

Property:Instance of and Property:Subclass of are in principle deprecated; use a standard MediaWiki category tag instead. However, the inverses Property:Has instance and Property:Has subclass are still useful. On the category page itself, instances and subclasses are shown as articles and subcategories in the category, but on other pages the category feature does not provide this info. "Has instance" tags at the lowest class level may even be more useful than the corresponding category tags on the instance pages (the category is then shown as a query result).

Currently there are still pages in the main namespace which each represent a class, see e.g. the objects of Property:Instance of and both the subjects and objects of Property:Subclass of.

For several of these classes there is also a category, see Property:Category about.

Although an attribute can have many values, we can also use an attribute for the answer to a yes/no question, e.g. of type string with value "yes", or undefined for "no". There may be additional possible values such as "partly", in which case we can use "fully" instead of "yes". With this method, the column in a query result table focuses on the required property and does not also show other properties. This can be an advantage or a disadvantage. Examples (they use "v" as a simplification of a checkmark, but that is not recommended):

Whether an annotation for "no" is useful and practical varies.

An attribute that's strictly yes or no would be better supported by the proposed Boolean datatype. An attribute that can be, for example, undefined-no-partly-yes-fully would be better supported by the proposed Enumerated datatype. -- Skierpage 13:22, 30 November 2006 (CET)

Advantages of a category

A category page provides sections by first letter.

Multiple categories, as compared with one relation, have the advantage that Special:Categories shows the number of members of each while Special:Relations only gives the total for the relation. Of course we can also replace each category by a separate relation.

In a query we can use a selection like "Category:Located in Europe||Overlaps Europe". Compare this with the fact that with relations, although we can take a union of objects, as in located in::Europe||Asia, we cannot take a union of selections involving two relations, such as located in::Europe and overlaps::Europe.

Suppose there are n disjoint categories. Then for each of the 2^n - 1 non-empty subsets of the set of categories, the selection in a query can be the union of the pages in that subset of categories. Compare this with using a single attribute with n values: with := , equalities and inequalities, we have 3n - 3 possible selections (n>1), which is fewer for n>2.
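The two counts can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: the n attribute values are taken to be totally ordered, and the available selections are taken to be "any value", equality with one value, and the non-strict inequalities "at most v" and "at least v". The function names `category_unions` and `attribute_selections` are made up for this example.

```python
from itertools import combinations


def category_unions(n):
    """All non-empty subsets of n disjoint categories; each is a possible union."""
    items = range(n)
    return {
        frozenset(combo)
        for size in range(1, n + 1)
        for combo in combinations(items, size)
    }


def attribute_selections(n):
    """Distinct selections on one attribute with n ordered values.

    Assumed selection forms: any value, value == v, value <= v, value >= v.
    """
    values = range(n)
    result = {frozenset(values)}  # "any value"
    for v in values:
        result.add(frozenset([v]))                            # equality
        result.add(frozenset(x for x in values if x <= v))    # at most v
        result.add(frozenset(x for x in values if x >= v))    # at least v
    return result


for n in range(2, 6):
    print(n, len(category_unions(n)), len(attribute_selections(n)))
```

Under these assumptions the counts agree with 2^n - 1 and 3n - 3 for n = 2 to 5. For three values that is 6 of the 7 possible selections, and for four values 9 of the 15.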

Advantages of a relation

In the case of a relation, we can also relate to a page in the main namespace. This avoids the duplication of having e.g. pages Continent and Category:Continent, or Fictional person and Property:Fictional. The query table directly links to this page in the main namespace, while in the other cases it links to the category page or attribute page only.

Note: Directly relating content pages can also be achieved with the category system by not using the main namespace, but putting all content in the category namespace. However, this somewhat clutters the titles of the pages and makes linking more cumbersome.

All instances of e.g. Property:Instance of can be produced by one query, either alphabetically, or sorted by the object of the relation. Thus we can ask an open question "Of what is this an instance?" instead of a yes/no question "Is this an instance of ...?".

Advantages of an attribute with simple values

In a query result table the table column only needs a small width if we use a short header (if needed with a legend before or after the table).

In addition to info representing "yes" and "no" we can have "partly", etc. See the queries in the Europe page.

In the case of two values, e.g. "v" and "partly" (in addition to undefined) we can use the attribute for any selection, with := , :=v, and :=partly. In the case of more than two values we can choose the values such that the most desired selections can be done, if not with an equality, with an inequality. For example, if there are three values, 6 of the 7 selections are possible with :=*, an equality, or an inequality; if there are four values, 9 of the 15.

Compare this with the fact that we cannot take the union of e.g. the selections located in::Europe and overlaps::Europe.



  • <ask>[[Category:continent]]</ask> gives <ask></ask>
  • <ask>[[instance of::continent]]</ask> gives <ask>continent</ask>
  • <ask>[[Category:continent||country]]</ask> gives <ask></ask>
  • <ask>[[instance of::continent||country]]</ask> gives <ask></ask>
  • <ask sort="instance of">[[instance of::continent||country]]</ask> gives <ask sort="instance of"></ask>

A union of a category and a set of subjects relating to a given object is not possible. Therefore, if one wants to be able to take a union, one has to use two categories, or two objects with the "instance of" relation, not one of each. Note that in SMW 1.0RC2 the behavior of these queries seems to be different.

See also

All categories | properties | types

Advice on Annotation | Ask | Attribute name | Browsing and searching | Category | Chains of relations and attributes | Custom units | Namespace | Relation name | Selection | SearchTriple | Sorting | Templates in SMW