Publishing Rulesets in the Wild

...regarding publishing RDF rulesets in the Wild (particularly as Linked Data)...

From here, it seems that you cannot encode the following form of inference using OWL 2 Full:

?x ex:youngerThan ?y . ?x ex:brotherOf ?y . ⇒ ?x ex:youngerBrotherOf ?y .

Similarly, various other forms of inference/constraints are more easily/intuitively expressed as rules.
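To make the intended inference concrete, here is a minimal sketch in plain Python of what a rule engine would do with the younger-brother rule. It is not tied to any rule language mentioned in this thread; the function name `infer_younger_brothers` and the sample individuals (`ex:bob`, `ex:alice`, `ex:carl`) are assumptions made up for illustration.

```python
# Minimal sketch: apply the rule
#   ?x ex:youngerThan ?y . ?x ex:brotherOf ?y  =>  ?x ex:youngerBrotherOf ?y
# to a set of (subject, predicate, object) triples.

def infer_younger_brothers(triples):
    """Return the ex:youngerBrotherOf triples entailed by the rule."""
    younger = {(s, o) for s, p, o in triples if p == "ex:youngerThan"}
    brothers = {(s, o) for s, p, o in triples if p == "ex:brotherOf"}
    # Both body patterns must hold for the same (?x, ?y) binding.
    return {(x, "ex:youngerBrotherOf", y) for x, y in younger & brothers}


# Hypothetical example data, not taken from any published dataset.
data = {
    ("ex:bob", "ex:youngerThan", "ex:alice"),
    ("ex:bob", "ex:brotherOf", "ex:alice"),
    ("ex:carl", "ex:youngerThan", "ex:alice"),  # younger, but not a brother
}

inferred = infer_younger_brothers(data)
print(sorted(inferred))
```

Only the pair that satisfies both body patterns yields a new triple; `ex:carl` is younger than `ex:alice` but is not stated to be her brother, so nothing is inferred for him.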

Despite the fact that N3 and SWRL have been around for many years (not to mention SPIN, Jena Rules, etc.), I've yet to encounter any rules published in the Wild. I know that the W3C are working on encoding RIF in RDF, but this necessarily requires fairly complex RDF structures.

It seems that:

  • we have the motivation to publish and share rules on the Web;
  • we have various declarative languages (not necessarily RDF) to encode rules.

...but, as far as I know, we have few (if any) documents publishing rules on the Web.

So, some concrete questions to choose from:

  • Should rules be published in the Wild, and if so, how?
  • What proposals or best-practices are out there for publishing rules in the Wild?
  • Are there any rules published in the Wild? Where/how?
  • Why is publishing rules so uncommon?
  • How can vocabularies and rules interplay?

Should rules be published in the Wild, and if so, how?

Yes, when it makes sense to do so. Many rule sets are quite specific to an application, data model and rule engine. Publishing those for documentation purposes is one thing, but I take the question to be about publishing rules in reusable, machine-processable form. The number of existing rule sets which are broadly reusable is actually very small. However, purposefully developing reusable rulesets, now that we have the technology for publishing them, is definitely a Good Thing to do.

What proposals or best-practices are out there for publishing rules in the Wild?

Well, you need a suitable platform independent rule representation; an interchange format for that; a way to associate metadata with the rules/rulesets to support discovery and reuse; maybe some infrastructure to help with it.

For the rule representation we have the OMG standards stack (e.g. PRR), the W3C standards stack (RIF), the arguable de facto standard (SWRL) and a very large number of in-use languages which could be ported more broadly (N3 etc). Depending on what you mean by "rules" there are also ISO standards for Prolog, Conceptual Graphs and Common Logic. I'm biased of course :) but if you are in the W3C/Semantic Web space then RIF Core is the answer. It is very much aimed at interchange between existing systems, not itself a world-changing new language. It is compatible with RDF and OWL.

For interchange, the prime format for RIF is an XML syntax, so publish the XML document and you are done. There is also a RIF in RDF encoding.

For metadata, RIF allows you to associate URIs and metadata annotations with rules and groups of rules (and indeed other bits of the language). The metadata is designed to be compatible with RDF, so you can think of each rule/group/etc. as having a URI and make RDF metadata assertions about it. You can reference an RDF document from a RIF document, and the SPARQL working group is defining how to reference a RIF rule set as an entailment regime for a query.

Are there any rules published in the Wild? Where/how?

There are platform-specific rule sets out there (N3, CLIPS etc), not to mention lots of reusable (ISO-compatible) Prolog code if you want to count that.

For RIF there is pretty much nothing. I was going to say we at least published the OWL RL ruleset but I can't locate the machine readable copies, just in-document copies :(

Why is publishing rules so uncommon?

Historically, a lack of standard representation of a simple enough rule language which pins down the semantics tightly enough to be truly reusable.

Writing reusable rule sets is hard, and many uses are application-specific.

In the case of RIF it is new, not yet widely implemented and has no adequate human readable format. Though that is all fixable.

The support for (RIF) rule-based entailment regimes in SPARQL 1.1 may also give a stimulus to people thinking about rule publication.

How can vocabularies and rules interplay?

That is a huge question. In terms of the semantics, this is well defined for at least SWRL and RIF. In terms of best practice for what you want to represent in each, some trade-offs can be articulated, but some of it is personal preference and style. In terms of mechanics, there needs to be much broader implementation and support before publishers can sensibly rely on publishing rules, let alone rule/vocabulary mixes. SWRL is the language with the broadest de facto support and is very much an OWL extension rather than an RDF rule language, so if you are working with OWL then SWRL remains a plausible choice.

Why is publishing rules so uncommon?

Low interoperability is likely the key factor here. My feeling is also that rule-based systems tend to target ad-hoc needs in niches not easily addressed by OWL reasoners: if that is indeed the case, there would be little request for interoperability in the first place.

Should rules be published in the Wild, and if so, how?

Yes, definitely. How? - Like any other resource.

What proposals or best-practices are out there for publishing rules in the Wild?

I guess you already know all the different formats for describing rules (OT: I prefer N3 and SPIN). So, the publishing guideline should be the same as for any other resource description: the principles of Linked Data. It is crucial that the consuming reasoning engine can still somehow decide whether it applies a rule or not. For that, it must be able to process such 'rule usage descriptions'.

Are there any rules published in the Wild? Where/how?

See e.g. here, not a canonical URI, anyway ;) (Edit: this rule is now also available via this namespace as SPIN rule)

Why is publishing rules so uncommon?

I guess we are currently reaching the level where publishing, using and reusing rules gets really interesting. I think it's important for information integration tasks to know, e.g., which rules were applied (provenance) or could be applied (reasoning) on the piece of information that is being retrieved.

How can vocabularies and rules interplay?

This issue is still open, cf. the thread with the title "How can I associate related rules to an ontology/RDF graph?". There are some investigations in this direction, e.g., spin:rule; however, they are not yet in a really satisfying state. The relations to the vocabularies and instance data are really important, especially their related descriptions, e.g., "why?".

PS: Not to forget, the thread with the title "How to discover rules on the Web of Data?". I think, especially moustaki's comment (and reference) is quite interesting.

Only on the "should" question: Here is what I believe would be a reasonable usage scenario for rules on the web:

For data providers, offering rule sets might be a flexible alternative to creating RDFS or OWL vocabularies, or they may at least use them in addition. The data providers could come up with specifically tailored rule sets that closely match their intention about what to infer from certain triple patterns occurring in the data. The data provider may even decide to provide different rule sets for different purposes ("profiles") for the same data; the user could then choose depending on their needs. Or there could be rule set providers who offer more generic rule sets that support certain data patterns. The data providers could then reuse one or more of these rule sets for their published data, potentially adding their own rules ("rule mash-ups").

In this way, one could support reasoning that is quite different from typical OWL reasoning or, at least, one would not need to wonder about complex OWL solutions if there are obvious rule'ish ones. For example, one could have rules for FOAF data matching certain foaf:knows-based triple patterns, e.g.:

  • mutual knowing: "[?x foaf:knows ?y] & [?y foaf:knows ?x] -> ..."; or
  • knowing and liking: "[?x foaf:knows ?y] & [?x fb:like ?y] -> ...".
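As a rough sketch of how such triple-pattern rules could be evaluated, the following plain-Python matcher binds `?variables` against a set of triples. The rule heads are left open ("...") in the examples above, so the predicates `ex:mutuallyKnows` and `ex:knowsAndLikes` used here are purely hypothetical placeholders, as are the helper names and the sample data.

```python
def is_var(term):
    """Terms starting with '?' are treated as variables."""
    return term.startswith("?")


def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against its existing value.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def apply_rule(body, head, triples):
    """Return the head triples for every binding that satisfies the body."""
    bindings = [{}]
    for pattern in body:
        extended = []
        for b in bindings:
            for t in triples:
                b2 = match(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return {tuple(b.get(term, term) for term in head) for b in bindings}


# Rule heads are hypothetical; the original examples leave them unspecified.
MUTUAL = ([("?x", "foaf:knows", "?y"), ("?y", "foaf:knows", "?x")],
          ("?x", "ex:mutuallyKnows", "?y"))
LIKES = ([("?x", "foaf:knows", "?y"), ("?x", "fb:like", "?y")],
         ("?x", "ex:knowsAndLikes", "?y"))

# Hypothetical FOAF-style sample data.
data = {
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:alice"),
    ("ex:alice", "foaf:knows", "ex:carol"),
    ("ex:alice", "fb:like", "ex:carol"),
}

mutual = apply_rule(*MUTUAL, data)
likes = apply_rule(*LIKES, data)
print(sorted(mutual))
print(sorted(likes))
```

Because each rule is just data (a body and a head), a provider could publish several such lists as different "profiles", and a consumer could concatenate them into a "rule mash-up" before applying them.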

Of course, FOAF is primarily defined by the FOAF vocabulary, so ideally one would apply both vocabulary-based reasoning in OWL and rule-based reasoning for the custom rule sets. However, I have promised that I would restrict my answer to the first question, so I am stopping here! :-)