Is there a web service that allows me to run SPARQL against an XHTML+RDFa website?

It would be cool if, like the existing RDFa extraction services, there were a service that let me run a SPARQL query against an arbitrary XHTML+RDFa website and returned the results in, for example, JSON. Is there such a public service?

There isn't a public service that I'm aware of, but there are SPARQL services and RDFa services, and they can be chained. Let's try to find the creator of a SlideShare slideshow:

First, we want rdf from html. Let's use the java rdfa service:

http://rdf-in-html.appspot.com/translate/?uri=http%3A%2F%2Fwww.slideshare.net%2Fmadrobby%2Fi-cant-believe-its-not-flash&parser=HTML

That returns rdf. Let's query that using sparql.org:

SELECT *
FROM <http://rdf-in-html.appspot.com/translate/?uri=http%3A%2F%2Fwww.slideshare.net%2Fmadrobby%2Fi-cant-believe-its-not-flash&parser=HTML>
{
    ?source <http://purl.org/dc/terms/creator> ?creator .
}

sparql.org offers a JSON result format, so we'll add that option:

http://sparql.org/sparql?query=SELECT+*%0D%0AFROM+%3Chttp%3A%2F%2Frdf-in-html.appspot.com%2Ftranslate%2F%3Furi%3Dhttp%253A%252F%252Fwww.slideshare.net%252Fmadrobby%252Fi-cant-believe-its-not-flash%26parser%3DHTML%3E%0D%0A%7B%0D%0A++++%3Fsource+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fcreator%3E+%3Fcreator+.%0D%0A%7D&output=json
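The nesting in that URL is easier to follow when it is built in two steps. A minimal Python sketch, assuming the two endpoints from this answer are still reachable (the function names are mine, not part of either service):

```python
from urllib.parse import quote, urlencode

# Endpoints taken from the answer above; both are third-party services
# and may no longer be online.
RDFA_SERVICE = "http://rdf-in-html.appspot.com/translate/"
SPARQL_ENDPOINT = "http://sparql.org/sparql"


def rdfa_graph_url(page_url: str) -> str:
    """URL that asks the RDFa service to return RDF for page_url."""
    return f"{RDFA_SERVICE}?uri={quote(page_url, safe='')}&parser=HTML"


def sparql_request_url(page_url: str) -> str:
    """URL that runs the dc:creator query against the extracted RDF."""
    query = (
        "SELECT *\n"
        f"FROM <{rdfa_graph_url(page_url)}>\n"
        "{\n"
        "    ?source <http://purl.org/dc/terms/creator> ?creator .\n"
        "}"
    )
    # urlencode escapes the whole query a second time, which is why the
    # page URL ends up double-encoded (%253A, %252F) in the final URL.
    return f"{SPARQL_ENDPOINT}?{urlencode({'query': query, 'output': 'json'})}"


page = "http://www.slideshare.net/madrobby/i-cant-believe-its-not-flash"
print(sparql_request_url(page))
```

The printed URL differs from the hand-built one only in cosmetic details (line endings and how `*` is escaped); it should decode to the same query.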

Not exactly readable, but it works:

{
  "head": {
    "vars": [ "source" , "creator" ]
  } ,
  "results": {
    "bindings": [
      {
        "source": { "type": "uri" , "value": "http://www.slideshare.net/madrobby/i-cant-believe-its-not-flash" } ,
        "creator": { "type": "literal" , "xml:lang": "en" , "value": "Thomas Fuchs" }
      }
    ]
  }
}
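If you want plain values rather than the nested bindings, the structure above flattens easily. A short sketch in Python, using the response exactly as shown (the `rows` helper is a hypothetical name of my own):

```python
import json

# The response shown above, in SPARQL JSON results format.
response_text = """
{
  "head": { "vars": [ "source" , "creator" ] } ,
  "results": {
    "bindings": [
      {
        "source": { "type": "uri" , "value": "http://www.slideshare.net/madrobby/i-cant-believe-its-not-flash" } ,
        "creator": { "type": "literal" , "xml:lang": "en" , "value": "Thomas Fuchs" }
      }
    ]
  }
}
"""


def rows(results: dict) -> list[dict]:
    """Flatten each binding to {variable: value}, dropping type and language."""
    names = results["head"]["vars"]
    return [
        # A variable can be unbound in a given row, so skip missing names.
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in results["results"]["bindings"]
    ]


for row in rows(json.loads(response_text)):
    print(row["creator"], "created", row["source"])
```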

We have an answer.

Other services could replace these two, of course. It would be nice if sparql.org supported RDFa directly, and maybe it will in the future, but this is more fun (where 'fun' may depend on your definition of entertainment).

Egon,

You are using Virtuoso, so this feature is already built in via the Sponger Middleware layer.

Just do this:

SELECT DISTINCT * FROM <URL-of-the-RDFa-page> WHERE {?s ?p ?o}

For the most basic demo.

You can even test this live at: http://uriburner.com/sparql (note the sponger options in the drop-down).

Clearly we aren't doing a good job of explaining what the Sponger Middleware component of Virtuoso is all about :-(

Links:

  1. http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger -- Sponger Middleware Page
  2. http://bit.ly/aEUdUV -- Virtuoso SPARQL Tutorial that focuses on Sponger Pragmas for the Virtuoso SPARQL engine.

Kingsley

You should be able to do the same thing using the demo of my SPARQL engine that's part of dotNetRDF at http://www.dotnetrdf.org/demos/leviathan/. Just use a FROM clause pointing at the URI you want to extract RDFa from or enter the URI in the Default Graph URI box.

It's running on a fairly recent pre-release build of the library that includes the new RDFa parser I'll be rolling out in the next release.

http://www.dotnetrdf.org/demos/leviathan/?query=SELECT+*%0D%0AFROM+%3Chttp%3A%2F%2Fwww.slideshare.net%2Fmadrobby%2Fi-cant-believe-its-not-flash%3E%0D%0AWHERE%0D%0A{%0D%0A++%3Fsource+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fcreator%3E+%3Fcreator%0D%0A}&default-graph-uri=&timeout=10000

There's no explicit output option (the output format is based on the HTTP Accept header), but if you send application/sparql-results+json as the Accept header you'll get back JSON like you want.
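For anyone scripting this, setting the Accept header looks roughly like the following in Python. This is only a sketch: it assumes the demo endpoint above is still up and takes the query as a `query` parameter, as in the URL I gave; `json_request` and `fetch` are illustrative names.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Demo endpoint from this answer; it may no longer be online.
ENDPOINT = "http://www.dotnetrdf.org/demos/leviathan/"


def json_request(query: str) -> Request:
    """Build a GET request that asks for SPARQL results as JSON."""
    url = f"{ENDPOINT}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def fetch(request: Request) -> str:
    """Send the request; needs network access and a live endpoint."""
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


query = """SELECT *
FROM <http://www.slideshare.net/madrobby/i-cant-believe-its-not-flash>
WHERE
{
  ?source <http://purl.org/dc/terms/creator> ?creator
}"""

request = json_request(query)
print(request.get_header("Accept"))
# Call fetch(request) to actually run the query.
```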