Semantic Web Challenges


Some papers cite the difficulty of ontology development and mapping, the lack of semantic search engines, and the inefficiency of reasoners as some of the main challenges in the Semantic Web.

Just wanted to know if anyone could point out other challenges.

Thank you, Manoj

Here's an opinionated opinion.

system evaluation: semantic systems make old bag-of-words systems obsolete when they're good enough that you can set them loose and not be afraid of being embarrassed by the results. Yet when you try to do that, you find that conventional evaluation methodologies break down on many fronts. For one thing there's the issue of cost, but also, it can often be difficult or impossible for the test data to be more than 95% accurate if there's the slightest subjective element. Conventional methodologies also don't recognize that some mistakes will go unnoticed while others will force you to shut the system down.
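To make the last point concrete, here is a minimal Python sketch that compares plain accuracy with a severity-weighted cost. The severity categories, the cost numbers, and the two example systems are all made up for illustration; they are assumptions, not measurements from any real system.

```python
# Hypothetical severity weights: how much each kind of mistake "costs".
# The categories and numbers are illustrative assumptions only.
SEVERITY_COST = {
    "unnoticed": 0.0,      # wrong, but nobody would ever see or care
    "minor": 1.0,          # visible but tolerable
    "embarrassing": 50.0,  # bad enough that you would shut the system down
}


def plain_accuracy(results):
    """Fraction of results judged correct, treating every mistake alike."""
    correct = sum(1 for r in results if r is None)
    return correct / len(results)


def weighted_cost(results, costs=SEVERITY_COST):
    """Average cost per result, with each mistake weighted by its severity."""
    return sum(costs[r] for r in results if r is not None) / len(results)


# Each entry is None for a correct answer, or a severity label for a mistake.
system_a = [None] * 95 + ["unnoticed"] * 5      # more errors, all harmless
system_b = [None] * 98 + ["embarrassing"] * 2   # fewer errors, all severe

for name, results in (("A", system_a), ("B", system_b)):
    print(name, plain_accuracy(results), weighted_cost(results))
```

On plain accuracy, system B looks better than system A; on weighted cost, A is the one you could actually set loose. The hard part in practice is assigning the severity labels, which is itself a subjective judgment.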

documentation: hyper-precise systems need to have a consistent point of view, and you can't enforce that if the knowledge engineers don't know what the p.o.v. is. On top of that, people who want to write SPARQL queries against Linked Data face the problem that missing documentation is the norm. One of the largest roles that standards like dcterms and foaf play, for better and for worse, is that documentation already exists for them, so they free publishers of the need to write it.

linked data quality: some organizations succeed and some fail. If you want to succeed, you've got to maintain that consistent p.o.v., even when you're sucking data in from Linked Data sources. In the current market, the consumer is the only agent with an economic need to address this problem. "Free" Linked Data is expensive to use precisely because publishers see no profit or loss whether consumers succeed or fail at using their data.

upper ontologies: what's exciting in 2012 is the availability of lots of instance data, yet people are having a hard time building applications because vocabulary terms aren't consistent everywhere. If we had real progress in the upper ontology space, we could see a lot of 'work reuse' across applications, much better ability to integrate data from different sources, and a big improvement in productivity (business apps might almost "write themselves"). Perhaps the large amount of instance data we have can be used to put proposed upper ontologies into a confrontation with reality, speeding up development.
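As a rough illustration of the vocabulary problem, here is a small Python sketch of the glue code consumers end up writing by hand today. The prefixed predicate names are real vocabulary terms, but the `ex:` terms, the sample triples, and the decision to treat these predicates as equivalent are assumptions for this example (dcterms:creator and schema:author, for instance, are not strictly the same thing). This is a hand-written mapping table, not an upper ontology.

```python
# Hypothetical mapping from source predicates to one shared term.
# Treating these predicates as interchangeable is an assumption.
CANONICAL_PREDICATE = {
    "foaf:name": "ex:name",
    "rdfs:label": "ex:name",
    "schema:name": "ex:name",
    "dcterms:creator": "ex:creator",
    "schema:author": "ex:creator",
}


def normalize(triples, mapping=CANONICAL_PREDICATE):
    """Rewrite (subject, predicate, object) triples to use the shared predicate.

    Predicates with no entry in the mapping are passed through unchanged.
    """
    return [(s, mapping.get(p, p), o) for (s, p, o) in triples]


def values_for(triples, subject, predicate):
    """Collect the distinct objects for one subject/predicate pair."""
    return sorted({o for (s, p, o) in triples if s == subject and p == predicate})


# Two made-up sources describing the same things with different vocabularies.
source_a = [
    ("ex:book1", "dcterms:creator", "ex:alice"),
    ("ex:alice", "foaf:name", "Alice"),
]
source_b = [
    ("ex:book1", "schema:author", "ex:alice"),
    ("ex:alice", "schema:name", "Alice"),
]

merged = normalize(source_a + source_b)
print(values_for(merged, "ex:book1", "ex:creator"))
print(values_for(merged, "ex:alice", "ex:name"))
```

Every application that integrates these sources has to build and maintain a table like this for itself; that is the work a credible upper ontology could let people reuse.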

performance: let's face it, we all wish the RDF stack were faster and cheaper to run. The fact that RDF and SPARQL are so flexible means that they'll be slower than systems that are more specialized and less flexible. This is true whether you have little or no inference or highly expressive inference. People in the 1980s realized that this kind of application was going to need large-scale parallelism, so they started the Japanese Fifth Generation project, which was way ahead of its time. If one thing could hold back a semantic future, it could be the failure of Moore's Law: mainstream applications aren't using parallelism, so hardware vendors don't provide it, and without parallelism they won't invest in smaller transistors. Something like the Cray XMT (in the guise of the recently announced YarcData uRiKA appliance) could be the answer, but they've got the mother of all marketing problems making it into something that can change the world.