Hi James,
First off, some background: it's important to note that much of what looks like a constraint in RDFS/OWL is not actually a constraint. The reason is that RDFS/OWL abide by two relevant principles:
The first is the Open World Assumption (OWA): this means that OWL never assumes that your data is complete. Even though in OWL you can say things like, e.g., that all Parents must have at least one value for hasChild, this cannot be interpreted as a constraint. If I say John type Parent, and don't mention anything about children, OWL figures that John must have a child, even if that child is not named in the data. OWL doesn't flag the data as invalid: it just figures that the data might be incomplete. If OWL had the Closed World Assumption (CWA) – which it doesn't – then it would expect the data to be complete, and expect the child to be named. With OWA, OWL is not suitable out-of-the-box for checking "completeness" of data.
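To make the difference concrete, here is a minimal Python sketch (not a real reasoner) that contrasts the two readings on the John example. The function names and the blank-node label `_:child_of_John` are made up for illustration.

```python
# Toy data: a set of (subject, predicate, object) triples.
triples = {("John", "type", "Parent")}

def closed_world_violations(triples):
    """CWA-style check: every Parent must have a named hasChild value."""
    parents = {s for s, p, o in triples if p == "type" and o == "Parent"}
    with_child = {s for s, p, o in triples if p == "hasChild"}
    return sorted(parents - with_child)

def open_world_inferences(triples):
    """OWA-style reading: assume an unnamed child exists instead of raising an error."""
    inferred = set()
    for parent in closed_world_violations(triples):
        # Hypothetical blank-node label standing in for "some child we don't know".
        inferred.add((parent, "hasChild", f"_:child_of_{parent}"))
    return inferred

print(closed_world_violations(triples))  # CWA: John is flagged
print(open_world_inferences(triples))    # OWA: John gets an anonymous child
```

The same data gives a violation under the closed-world reading and a new (anonymous) fact under the open-world reading, which is roughly what OWL does.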
The second principle is the lack of a Unique Name Assumption (UNA): this means that OWL is uncertain as to whether two things are the same or not until proven one way or the other (or stated explicitly) – i.e., names are not necessarily distinct. Why is this important? Well, say you've stated that all Persons should have two values for hasBiologicalParent. Now, you find out (maybe from a number of sources) that John type Person, John hasBiologicalParent William, John hasBiologicalParent Mary, John hasBiologicalParent Bill. Again, OWL won't sniff a problem: instead it will figure that some pair of names in { William, Mary, Bill } refer to the same real-world entity. Thus, OWL is not suitable for checking that you might have, e.g., "overloaded" some property.
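Again as a rough Python sketch (the helper names are hypothetical, and it only covers the simple case of one name too many): with UNA, three distinct names against a limit of two is an error; without UNA, it only tells you that some pair of names must co-refer.

```python
from itertools import combinations

# Values seen for John's hasBiologicalParent, and the intended limit.
parents_of_john = ["William", "Mary", "Bill"]
MAX_PARENTS = 2

def check_with_una(values, limit):
    """With UNA, distinct names are distinct individuals, so extra names break the limit."""
    return len(set(values)) <= limit

def candidate_merges_without_una(values, limit):
    """Without UNA, extra names only mean that at least one pair names the same thing.

    Returns the pairs that could be the co-referring one (sketch: one excess name only).
    """
    names = sorted(set(values))
    if len(names) <= limit:
        return []
    return list(combinations(names, 2))

print(check_with_una(parents_of_john, MAX_PARENTS))  # False: a violation under UNA
print(candidate_merges_without_una(parents_of_john, MAX_PARENTS))
```

Note that OWL won't pick a pair for you: it just concludes that at least one of those candidate pairs is the same individual.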
For similar reasons in RDFS, things like rdfs:domain and rdfs:range – which masquerade as constraints – are open to confusion. Saying that hasChild rdfs:range Person does not mean that any value for hasChild should be typed as Person: OWA means that any value for hasChild can be automatically typed as Person, even if the data is incomplete (if x type Person is not explicitly given).
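In other words, rdfs:range behaves like an inference rule rather than a check. A small Python sketch of that rule, where the individual Tom and the function name are invented for the example:

```python
# Toy data: one range declaration and one use of the property.
triples = {
    ("hasChild", "rdfs:range", "Person"),
    ("John", "hasChild", "Tom"),  # Tom is never explicitly typed
}

def apply_range_rule(triples):
    """If (p rdfs:range C) and (s p o), then infer (o type C)."""
    ranges = {(s, o) for s, p, o in triples if p == "rdfs:range"}
    inferred = set()
    for prop, cls in ranges:
        for s, p, o in triples:
            if p == prop:
                inferred.add((o, "type", cls))
    # Only report triples that were not already stated.
    return inferred - triples

print(apply_range_rule(triples))  # {('Tom', 'type', 'Person')}
```

Nothing is rejected here: the missing type is simply added.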
It should be noted that these are not omissions of OWL, but rather features which are sympathetic to the Web. On the Web, you cannot expect complete data, only pieces of jigsaws that coalesce into a bigger picture. Similarly, you cannot expect everyone to agree on using the same terms off the bat: instead, let them use different terms and try to sort it all out later.
Anyways, to take an example of what you want:
Constraint 1) The tuple (manufacturer, serial number) globally defines a unique instance of a tire. Expressed another way, serial number defines the uniqueness of a tire instance within the context of a given manufacturer.
You can model something like this in OWL 2 using owl:hasKey: define compound keys (manufacturer, serialNumber) which together identify an instance of a class (a particular Tire). However, this will not flag the case where two tires have the same values for (manufacturer, serial number): if OWL finds two tires which do, the lack of UNA means that it will figure that those two tires are just two names for the same tire.
It's worth noting perhaps that there is one way of emulating "constraints" in OWL using 'inconsistencies'. For example, if you know that Tire disjointWith Person – meaning that you can't have something that's both – and someone says John type Tire and John type Person, then OWL definitely knows that something's up: UNA and CWA don't play a role.
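A toy Python version of that disjointness check, just to show the shape of it (the data layout and the function name are my own assumptions, and Mary is added only as a non-clashing example):

```python
# Declared disjoint class pairs and the stated types of each individual.
disjoint = {frozenset({"Tire", "Person"})}
types = {
    "John": {"Tire", "Person"},
    "Mary": {"Person"},
}

def inconsistent_individuals(types, disjoint):
    """Return individuals typed with both classes of some disjoint pair."""
    bad = []
    for name, classes in types.items():
        # A pair clashes when both of its classes appear among the individual's types.
        if any(pair <= classes for pair in disjoint):
            bad.append(name)
    return sorted(bad)

print(inconsistent_individuals(types, disjoint))  # ['John']
```

Unlike the earlier cases, no amount of missing data or merged names can explain this away, which is why it surfaces as an inconsistency.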
From what you need, it sounds like a solution involving inconsistency would be cumbersome at best, if it is possible at all. Someone else might be able to sketch a solution, but I doubt it.
On the other hand, I'm aware of some work that looks at using OWL under UNA and CWA (interpreting what look like constraints as constraints for local data); see http://clarkparsia.com/weblog/2009/02/11/integrity-constraints-for-owl/ for a starting point. I'm not knowledgeable about Pellet or the tool, but someone might be able to give working examples for your constraints.
(It's also worth noting that yours is a commonly observed "problem/feature" of RDFS/OWL. E.g., see the start of an old panel discussion for some senior researchers in the field bickering like old women about the topic. Also, to justify the long answer, I need this text for elsewhere, so comments welcome ;))