OntoClean is a methodology for analyzing ontologies based on formal, domain-independent properties of classes (the metaproperties and their modifiers). See below for a bibliography. OntoCleanPaper provides definitions.
We are trying to support more open discussion of OntoClean-related issues. The OntoClean Community Portal was created for this purpose.
OntoClean was the first attempt to formalize notions of ontological analysis for information systems. The idea was to justify the kinds of decision that experienced ontology builders make, and explain the common mistakes of the inexperienced. Alan Rector, during a debate at the KR-2002 conference in Toulouse, said, "What you have done is reduce the amount of time I spend arguing with medics."
The notions we focused on were drawn from philosophical ontology. We were not after the seemingly endless arguments about what the right ontology of the universe is, but rather the techniques these philosophers use to analyze, support, and criticize each other's arguments. These techniques make very little, if any, commitment to a particular ontology; instead they expose what are often very subtle distinctions.
The basis of OntoClean is the set of domain-independent properties of classes, the OntoClean metaproperties: identity, unity, rigidity, and dependence. Recent work has added two more metaproperties: permanence and actuality.
Note on terminology & examples
Unfortunately, the semantic web community totally destroyed a very useful word: property. In logic, a property is a unary predicate in intension; in other words, a property is what it means to be a member of a class. For example, we say that instances of the Person class have the property of "being a person." The distinction between property and class is subtle, and probably not critical to understanding OntoClean; however, we consistently use "property" according to its original meaning. For the most part, in understanding this dialog, one can treat "property" and "class" as synonymous. Thus a metaproperty is a property of a property or class. In the semantic web, by contrast, a property is a binary relation.
It can't be repeated often enough (because it is the most common confusion we encounter) that OntoClean is not an ontology. So it is difficult to give examples, such as "an example of a Rigid property is Person". Someone could come along and say, "I thought instances of rigid properties can't change their membership? When I die I cease to be an instance of Person! OntoClean is wrong!" Well, even though we think we know what we mean by such a commonly accepted term as "Person", the truth is we don't. There are many different definitions of Person that vary with culture, religion, philosophy. OntoClean, however, helps you to communicate these differences; so it is not OntoClean that dictates the property Person to be rigid, rather it is an ontology designer who uses OntoClean to communicate that their meaning for the "Person" class is rigid.
So when looking at the examples, try to keep this in mind. If your understanding of some class and some meta-property do not match, try thinking of different meanings for the class that would match the meta-property. A famous example of this phenomenon occurred during preparation of the taxonomy cleaning example presented in several of the papers and in the AAAI-2000 tutorial. Nicola had labeled the "Food" class as anti-rigid, and Chris had labeled it as rigid. Why? Because Nicola's definition was that anything that is eaten is food, thus nothing is essentially food, and Chris' definition was that anything edible to humans is food, thus being that kind of food is essential to it. In the end the example survived with rigid Food, because it made a better example, not because it is a better or worse definition. The point that OntoClean helped expose was that we meant two different things by the same class.
Identity is fundamental to ontology, and especially to information systems ontologies. Identity is well known in metaphysics and in database conceptual modeling. In the latter case, it is an accepted best practice to specify a primary key for rows in a table. If "two" rows have identical primary keys, they are considered the same row.
More important for ontology are questions of identity that expose the existence of, or at least the need to represent, other entities. Here the issue at stake is finding the conditions under which a proposed entity would be both the same and different. The classic example is an amount of clay that is shaped into a statue. If you use the same clay but reshape it into a different statue, is it the same entity? If so, how could it be different? If not, how could it be the same? In conceptual modeling, it is understood that when such an ambiguity arises, one should treat it as two different entities to account for a situation where one changes and the other stays the same.
In OntoClean, we consider identity criteria to be associated with, or carried by, some classes of entities, called sortals. A sortal is a class all of whose instances are identified in the same way. In information systems, these criteria are often extrinsic, like a social security number or universally unique id, which is not interesting from an ontological point of view. Identity criteria, for our purposes, should be informative: they should help us and others understand what a class means. A triangle, for example, can be identified by the length of its three sides, or by two sides and an interior angle, etc. This says a lot about what is intended by the triangle class here, e.g. the same triangle could be in many places at the same time. Someone else may have an ontology in which the triangle class has different identity criteria, such that different drawings are always different triangles, even if they are the same size. Identity criteria (and OntoClean, for that matter) do not tell you that one of these definitions of triangle is right or wrong, just that they are different and thus that the classes are different.
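The two readings of the triangle class can be sketched in code. This is only an illustration: the class names SizeTriangle and DrawnTriangle are hypothetical and do not come from the OntoClean papers, and using Python equality to stand in for an identity criterion is an assumption of this sketch.

```python
class SizeTriangle:
    """Hypothetical class. Identity criterion: the three side lengths."""

    def __init__(self, a, b, c):
        # Sort the sides so the order they are given in does not matter.
        self.sides = tuple(sorted((a, b, c)))

    def __eq__(self, other):
        return isinstance(other, SizeTriangle) and self.sides == other.sides

    def __hash__(self):
        return hash(self.sides)


class DrawnTriangle:
    """Hypothetical class. Identity criterion: the particular drawing."""

    def __init__(self, a, b, c):
        self.sides = tuple(sorted((a, b, c)))

    # No __eq__ override: Python's default equality compares object
    # identity, so two separate drawings are never the same triangle.


# Same size means same triangle under the first criterion...
print(SizeTriangle(3, 4, 5) == SizeTriangle(5, 3, 4))    # True
# ...but two drawings of that size are different triangles under the second.
print(DrawnTriangle(3, 4, 5) == DrawnTriangle(3, 4, 5))  # False
```

Neither class is the correct triangle; the differing equality tests simply make visible that the two classes mean different things.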
Identity criteria and sortals are intuitively meant to account for the linguistic habit of associating identity with certain classes. In the classical statue and clay example, we naturally say "the same clay" or "the same statue", indicating that there are identity criteria that are peculiar to each class.
Being a sortal is the first OntoClean metaproperty, indicated with the I superscript (-I for non-sortals) on a class in the original notation. I (but not -I) is inherited down the class hierarchy: if a class is a sortal then all its subclasses are as well.
There are certain properties that only hold of individuals that are wholes. In formal ontology, wholes are often distinguished from mere sums, which are individuals whose boundaries are, in a sense, arbitrary. For example, consider the class clay. An instance of this class might be some amount of the material (this is only one possible meaning, of course), such that any (in fact, every) arbitrary subsection of the amount would be a different instance of the same class. By contrast, instances of the class Person are, typically, not decomposable in this fashion.
In our definition, wholes are individuals all of whose parts are related to each other, and only to each other, by some distinguished relation. This relation can be viewed as a generalized connection relation. Mere sums have no such relation since any decomposition of a mere sum is connected to any larger sum, which is not one of its parts, by the same relation.
Unity is the metaproperty, indicated by U, of classes all of whose individuals are wholes under the same relation. As with identity, OntoClean does not require that the relation itself be specified; often it is enough to know that the relation exists. Intuitively, a class has unity if all its instances are the same type of whole; this is typically true of classes of natural objects. Non-unity, indicated by -U, is the meta-property of classes whose instances are not all wholes, or not all wholes by the same relation. A further and more useful refinement of non-unity is anti-unity, indicated by ~U, the meta-property of classes none of whose instances are wholes, such as classes of mere sums. U and ~U (but not -U) are inherited down the class hierarchy.
Leibnitz's law makes good sense when first considered; however, it doesn't take long to see how considerations of time cause problems between most ontologies (especially semantic web ontologies) and Leibnitz's law. For example, I might have a beard on one day and shave it off the next, yet I am the same entity at both times. How is it possible for me to be the same if I have changed?
There are many logical approaches to this classic dilemma (including simply ignoring it). The most common is to consider some properties to be essential; an essential property (and, q.v. terminology above, we are talking about properties as unary predicates) is a property of an entity that cannot change, and these are the properties for which Leibnitz's law holds. Other properties of an entity that can change are non-essential and cannot be involved in identity.
Some properties are essential to all their instances. Think of the property of being a person, usually represented by the class Person. For every entity that has this property, the property is essential. So at least one of the properties that has not changed about me when I shave my beard is that I am a person. These properties, that are essential to all their instances, are rigid properties.
Rigid properties are designated by R, and properties that are not rigid by -R. An important specialization of non-rigid properties is the anti-rigid properties (~R), which are properties that must be changeable. Think of being a student - all students must possibly not be students. ~R (but not -R or R) is inherited down the class hierarchy.
Note that these are just examples - it is certainly possible to have an ontology in which Person is anti-rigid. Imagine an ontology of mystical beliefs, for example, in which an entity changes from Person to Spirit upon death. In order for the individual to be the same across this change, being a person must not be essential and furthermore must be changeable (i.e. anti-rigid).
Rigidity should not be confused with Kripke's notion of Rigid Designators, which are particulars. The term rigid in OntoClean is meant to describe the instanceOf link between an individual and a rigid class - it cannot be broken.
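The three rigidity notions above can be sketched in modal notation. This is a reading aid under stated assumptions, not necessarily the exact formalization used in the papers: φ is taken to be a unary predicate (a property), and □ is read as "necessarily", so □φ(x) says that φ is essential to x.

```latex
\begin{align*}
\text{rigid } (R):\quad & \forall x\,\bigl(\phi(x) \rightarrow \Box\,\phi(x)\bigr) \\
\text{non-rigid } (-R):\quad & \exists x\,\bigl(\phi(x) \wedge \neg\Box\,\phi(x)\bigr) \\
\text{anti-rigid } (\sim R):\quad & \forall x\,\bigl(\phi(x) \rightarrow \neg\Box\,\phi(x)\bigr)
\end{align*}
```

Read this way, non-rigid is simply the negation of rigid, while anti-rigid is the stronger claim that the property is essential to none of its instances.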
Dependence is a varied notion. In the core OntoClean papers, we used a kind of dependence that captures a meta-property of certain relational roles. A property is dependent if each instance of it implies the existence of another entity. The property Student, for example, is dependent, since to be a student there must be a teacher; for every instance of student there is at least one instance of teacher. In later work for [Dolce] this was noted to subsume two kinds of property dependence: specific constant dependence and generic constant dependence. The former accounts for dependence on specific entities, e.g. each person is dependent on having a particular brain. The latter accounts for the Student/Teacher case, where any instance of Teacher will do.
There are many other kinds of dependence; see [Fine and Smith, 1983] and especially [Simons, 1987]. It is an open problem to adapt them into the OntoClean framework.
Being dependent is indicated with D, being independent with -D. D (but not -D) is inherited down the class hierarchy.
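The inheritance rules stated for the four metaproperties (I, U, ~U, ~R and D propagate to subclasses) can be turned into a simple taxonomy check. The following is a minimal sketch, not the Protégé implementation: the function names, the data layout, and the Apple class are assumptions made for illustration, and each class is assumed to carry at most one, most-specific tag per metaproperty. Food is tagged anti-rigid here, following the eaten-things reading described earlier.

```python
# Tags that the text says are inherited down the class hierarchy.
INHERITED = {"I", "U", "~U", "~R", "D"}


def dimension(tag):
    """Map a tag such as '~R' or '-U' to its metaproperty letter."""
    return tag.lstrip("-~")


def ancestors(cls, parents):
    """Yield every superclass reachable from cls via the parents mapping."""
    seen = set()
    stack = list(parents.get(cls, ()))
    while stack:
        sup = stack.pop()
        if sup not in seen:
            seen.add(sup)
            yield sup
            stack.extend(parents.get(sup, ()))


def violations(tags, parents):
    """Return (subclass, superclass, expected_tag, actual_tag) conflicts."""
    found = []
    for cls, cls_tags in tags.items():
        own = {dimension(t): t for t in cls_tags}
        for sup in ancestors(cls, parents):
            for t in tags.get(sup, ()):
                # An untagged dimension simply inherits; only an explicit,
                # different tag on the subclass counts as a conflict.
                if t in INHERITED and own.get(dimension(t), t) != t:
                    found.append((cls, sup, t, own[dimension(t)]))
    return sorted(found)


# Illustrative tagging only; a different ontology may tag these differently.
tags = {
    "Food": ["~R"],
    "Apple": ["R"],                      # hypothetical class
    "Person": ["I", "U", "R", "-D"],
    "Student": ["I", "U", "~R", "D"],
}
parents = {"Apple": ["Food"], "Student": ["Person"]}

print(violations(tags, parents))  # [('Apple', 'Food', '~R', 'R')]
```

Student under Person passes because R and -D are not inherited, whereas a rigid Apple under an anti-rigid Food is reported: either the subclass link or one of the two tags does not mean what the designer intended.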
Many people misunderstand the basic ideas of OntoClean. In this section, we'd like to promote some discussion of the ideas and how to overcome the misunderstandings.
Rigidity is the simplest idea in OntoClean, possibly the most widely used and thus the most widely misused. The most common question we get is, "Is something-in-my-ontology rigid or not?". Several criticisms of OntoClean were based on the idea that OntoClean dictates some particular ontology. In the core papers and examples, we often used Person and Student as examples of Rigid and Anti-Rigid properties. Does this mean that OntoClean requires Person to be rigid and Student anti-rigid?
OntoClean does not dictate whether any class is rigid or not. OntoClean is not a domain ontology. Part of OntoClean is the ontology of property kinds, but this just says that there are sortals and non-sortals, etc. It doesn't tell you what classes like Person are, it only tells you what it means if Person is rigid, or anti-rigid, or non-rigid. It's up to you, the ontology designer, to decide which.
This is an updated version of the annotated bibliography at http://www.ontoclean.org. The goal of putting it on the wiki is that others who have written OntoClean-related papers can help maintain it.
Protégé includes an implementation of OntoClean - see