What is ontology hijacking?

I recently stumbled over the term 'ontology hijacking'. So the interesting question here is: what do you think ontology hijacking is? I would use the term when an ontology developer, for example, superclasses a concept from another ontology in his/her own ontology without permission, i.e., he simply adds the superclass axiom without asking the developers of the reused ontology first. Furthermore, is making an ontology OWL DL compliant also ontology hijacking, since the ontology authors likewise redefine existing concepts in their own ontology? This issue could probably be solved with additional owl:imports statements. However, I think that might be another question, so I will ask it separately. Could the problem of ontology hijacking be solved if every ontology developer got into a dialogue with other developers to discuss his/her concerns?

I'm to credit/blame for the phrase. The notion of "ontology hijacking" is fairly nuanced, but the core intuition is quite straightforward.

Simple answer:

In some Web document, you provide some axioms – some RDFS or OWL statements – which mandate a change in inferencing for instance data using terms outside of your namespace. "Ontology hijacking" does not cover axioms which mandate the translation from terms in your namespace to external terms: we would call this "extension".
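To make that concrete, here's a rough sketch in Python (assuming rdflib and owlrl are installed; the my: namespace and the instance data are invented for illustration):

    from rdflib import Graph, URIRef
    from rdflib.namespace import RDF
    from owlrl import DeductiveClosure, RDFS_Semantics

    PREFIXES = """
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix my:   <http://example.org/my#> .
    """

    # Instance data published elsewhere, using only FOAF terms.
    instance_data = PREFIXES + "<http://example.org/alice> a foaf:Person ."

    # "Extension": my term is defined in terms of the external FOAF term;
    # inferences only change for data that already uses my: terms.
    extension = PREFIXES + "my:Employee rdfs:subClassOf foaf:Person ."

    # "Ontology hijacking": an axiom in my document that changes inferences
    # for instance data which only uses the external foaf: term.
    hijack = PREFIXES + "foaf:Person rdfs:subClassOf my:Employee ."

    def rdfs_closure(*chunks):
        g = Graph()
        for chunk in chunks:
            g.parse(data=chunk, format="turtle")
        DeductiveClosure(RDFS_Semantics).expand(g)
        return g

    alice = URIRef("http://example.org/alice")
    employee = URIRef("http://example.org/my#Employee")

    print((alice, RDF.type, employee) in rdfs_closure(instance_data, extension))  # False
    print((alice, RDF.type, employee) in rdfs_closure(instance_data, hijack))     # True

The first axiom leaves other people's FOAF data alone; the second silently pulls it into my namespace.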

Longer answer:

There are some obvious caveats to the intuition: for example, sioc:User extends foaf:OnlineAccount. If FOAF changes the definition of foaf:OnlineAccount, it might change inferences over sioc:User instances: this is not "ontology hijacking", since it is FOAF's own term that changed – FOAF does not directly reference sioc:User.

More technically speaking, "ontology hijacking" is defined with respect to a particular ruleset; e.g., foaf:Agent owl:equivalentClass dc:Agent in the FOAF spec states that all dc:Agent members are also foaf:Agent members, which is "ontology hijacking". However, if you're doing RDFS reasoning, this axiom means nothing – under OWL rules it's hijacking, under RDFS rules it's not.
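Again as a rough illustration (same assumed setup; I'm reading dc:Agent as the DC Terms class here, and using owlrl's RDFS and OWL RL closures as stand-ins for "an RDFS ruleset" and "an OWL ruleset"):

    from rdflib import Graph, URIRef
    from rdflib.namespace import RDF
    from owlrl import DeductiveClosure, RDFS_Semantics, OWLRL_Semantics

    data = """
    @prefix owl:  <http://www.w3.org/2002/07/owl#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix dc:   <http://purl.org/dc/terms/> .

    # The FOAF axiom under discussion:
    foaf:Agent owl:equivalentClass dc:Agent .

    # Instance data published elsewhere, using only the DC term:
    <http://example.org/bob> a dc:Agent .
    """

    bob = URIRef("http://example.org/bob")
    foaf_agent = URIRef("http://xmlns.com/foaf/0.1/Agent")

    for semantics in (RDFS_Semantics, OWLRL_Semantics):
        g = Graph()
        g.parse(data=data, format="turtle")
        DeductiveClosure(semantics).expand(g)
        # Under RDFS rules owl:equivalentClass is just another triple, so bob
        # does not become a foaf:Agent; under OWL RL rules he does.
        print(semantics.__name__, (bob, RDF.type, foaf_agent) in g)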

Context:

We apply this term in the context of applying reasoning (materialisation) over a crawl of ~1b triples of arbitrary Linked Data. We give absolutely no preferential treatment to any vocab or web document outside of core RDF(S)/OWL terms. We have no manual inclusion/exclusion list for vocabularies. Closing your eyes, crossing your fingers, clicking your heels three times and hoping that people haven't published the kind of stuff that messes up your inferencing doesn't help. They have. What helps is tracking provenance – relying on the fact that the most instantiated terms are well-modelled in their vocabularies – and doing reasoning, e.g., over FOAF data as FOAF intends it, SIOC data as SIOC intends it, DC as DC intends it.
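A very rough sketch of the provenance idea (this is not SAOR itself, just an illustration: is_authoritative and its crude namespace check are mine, and a real system would check what each term's URI actually dereferences to):

    from rdflib import Graph
    from rdflib.namespace import RDFS, OWL

    # Axiom patterns where the subject term is the one whose inferences change;
    # simplified: real authority checks are per-position and per-rule.
    CONSTRAINING_PREDICATES = {
        RDFS.subClassOf, RDFS.subPropertyOf,
        OWL.equivalentClass, OWL.equivalentProperty,
    }

    def is_authoritative(source_uri: str, term_uri) -> bool:
        # Crude stand-in: a document is authoritative for terms in its own
        # namespace; really you'd track what the term's URI dereferences to.
        return str(term_uri).startswith(source_uri)

    def authoritative_tbox(documents):
        """documents: iterable of (source_uri, rdflib.Graph) pairs from the crawl."""
        tbox = Graph()
        for source_uri, g in documents:
            for s, p, o in g:
                if p in CONSTRAINING_PREDICATES and not is_authoritative(source_uri, s):
                    continue  # ignore the non-authoritative ("hijacking") axiom
                tbox.add((s, p, o))
        return tbox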

Disclaimer:

The term is largely divisive, even amongst like-minded Linked Data types. Some axioms we consider "ontology hijacking" actually give good inferences (such as independent mappings, in which case we say that such axioms should be lobbied into the relevant vocabularies); some axioms are "well intentioned" (people try to do good, but nonetheless get it wrong); some axioms are "insular" (they work well for a given application, but not in general); some axioms are "malicious" (well, at least they could be). The main problem is that we currently cannot distinguish between these cases.

If I were to go back, perhaps I would not use such a pejorative term – though it at least stirs debate. I am also against restricting what people can or can't say. I personally believe that it is good practice to avoid such axioms in web vocabularies; I've had discussions with people involved with other prominent web vocabularies who agree. Others – such as the FOAF guys – disagree [VocabMappings]. Our reasoning approach for Linked Data (building one authoritative "T-Box"/model for the Web) is not the only one, but such an approach is currently only feasible by ignoring "ontology hijacking" cases. We do reasoning over arbitrary Linked Data that's out there now, and it works.

Reference: Scalable Authoritative OWL Reasoning for the Web

P.S. In your question you also mention DL – we focus on rule-based reasoning, but you should check out "conservative extensions".

I don't think there is such a thing as ontology hijacking: anybody is free to make statements about anything, and each statement can be considered or not, and further considered to be a Truth or a Falsehood.

Further, whether something is considered to be a truth or a falsehood comes down to belief states, and there is no single boolean true/false value that holds for everybody in every context.

If you consider that one class is a subClassOf another class (i.e., refines it), then say so.

The only reference I've seen is this, "Scalable Authoritative OWL Reasoning on a Billion Triples":

Ontology hijacking is the re-definition or extension of a definition of a legacy concept (class or property) in a non-authoritative source such that performing reasoning on legacy A-Box data results in a change in inferencing.

As such (and as has been said above), it seems to be concerned with provenance. However, there is a little more to it, which explains the pejorative name:

if one were to publish today a property in an ontology (in a non-authoritative location for FOAF), my:name, within which the following was stated: foaf:name rdfs:subPropertyOf my:name ., that person would be hijacking the foaf:name property and effecting the translation of all foaf:name statements in the web knowledge base into my:name statements as well.
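To see the effect described in that quote, a small sketch (again assuming rdflib and owlrl; the instance data is made up):

    from rdflib import Graph, URIRef
    from owlrl import DeductiveClosure, RDFS_Semantics

    data = """
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix my:   <http://example.org/my#> .

    # The non-authoritative axiom from the quote:
    foaf:name rdfs:subPropertyOf my:name .

    # Somebody else's ordinary FOAF data:
    <http://example.org/alice> foaf:name "Alice" .
    <http://example.org/bob>   foaf:name "Bob" .
    """

    g = Graph()
    g.parse(data=data, format="turtle")
    DeductiveClosure(RDFS_Semantics).expand(g)

    # Every foaf:name statement now has a my:name duplicate, so the
    # materialised output grows with every use of foaf:name on the Web.
    print(sorted(g.subject_objects(URIRef("http://example.org/my#name"))))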

The authors appear to consider this bad because of the associated complexity for their inferencing system. I've seen, as a result, certain subclassing condemned as 'hijacking' because of its scalability issues. Personally I'm not convinced here: it seems to be a problem for a particular kind of inferencer? Anyway, the conflation of the provenance and scalability issues isn't helpful.

So as far as I can discern, almost any interesting ontology mapping exercise is hijacking :-) The interesting questions around this concern the social interactions between ontology authors, and the mechanical and UI issues around provenance and explaining inferencing.

I wrote a post called Linked Data Spam Vectors, which came up with the following techniques for injecting misleading or unrequested information into RDF:

  • False Labelling
  • Misdirection
  • Schema Pollution
  • Identity Assumption
  • Bait and Switch
  • Data URI Embedding

Perhaps I need to expand that list to include Ontology Hijacking, i.e. the practice of redefining popular ontology terms so that the attacker's information will be conveyed to the consumer.