If your focus is more on demonstrating the aggregation of data from lots of sources, you're probably going to need to handle (in some fashion) the semantics of owl:sameAs (although you might get away with just matching up labels).
Whenever I need similar examples, I always turn to old-school folk in the FOAF community, simply because they are the resources that pop up the most in different sources.
So, simple queries like "give me all relevant information about Tim Berners-Lee":
SELECT * WHERE {
  <http://www.w3.org/People/Berners-Lee/card#i> ?p ?o .
}
...should touch upon thousands of sources from a dozen or so domains (assuming you also consider [and trust] owl:sameAs links).
Same scenario for Dan Brickley.
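If the engine you're querying supports SPARQL 1.1 property paths, one rough way to take owl:sameAs into account without a reasoner is sketched below. It assumes the owl:sameAs links are actually present in the data being queried (it won't discover links that live elsewhere), and it trusts every link it finds:

```sparql
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?alias ?p ?o
WHERE {
  # Follow owl:sameAs in either direction, zero or more hops,
  # so ?alias is the original URI or any URI linked to it.
  <http://www.w3.org/People/Berners-Lee/card#i> (owl:sameAs|^owl:sameAs)* ?alias .
  ?alias ?p ?o .
}
```

The zero-or-more path means the original URI is always among the values of ?alias, so this returns everything the plain query does plus whatever is stated about the equivalent URIs.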
After you've found some interesting resources that are described in numerous datasets/endpoints, you can start pimping the query a bit... ask for details about the resource, or related resources, such as the names of people they know, or images... or other information where properties are well agreed upon. For fancier queries, you may find that there isn't enough agreement on property/class URIs, and you're going to have to use ugly UNIONs to get the job done (a little reasoning may help here).
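For instance, a minimal sketch of the UNION approach for the names of people Tim knows, assuming (purely for illustration) that some sources name people with foaf:name and others with rdfs:label:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?friend ?name
WHERE {
  <http://www.w3.org/People/Berners-Lee/card#i> foaf:knows ?friend .
  # Accept a name from either property; add one branch per extra property.
  { ?friend foaf:name ?name }
  UNION
  { ?friend rdfs:label ?name }
}
```

Each additional property that sources disagree on means another branch, which is where the ugliness comes from.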
If your focus is more on demonstrating aggregation of data from a few sources (or on federated querying), you can probably go for more ambitious queries. My suggestion would be to pick some of the more popular datasets from the LOD cloud (like DBpedia, Freebase, LinkedMDB, DBTune, DBLP) and see if you can again find well-known resources described in more than one of them... then find a way of relating them (by owl:sameAs or common label)... then figure out what queries require the combination of knowledge from the different sources.
Lee's SPARQL by Example gives the following federated query for getting the birthdates (DBpedia) of folks who acted in Star Trek (LinkedMDB):
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
PREFIX dbpedia: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?actor_name ?birth_date
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf> # placeholder graph
WHERE {
  # LinkedMDB: the actors in the film, with their names
  SERVICE <http://data.linkedmdb.org/sparql> {
    <http://data.linkedmdb.org/resource/film/675> movie:actor ?actor .
    ?actor movie:actor_name ?actor_name
  }
  # DBpedia: birth dates of actors, matched to LinkedMDB by name
  SERVICE <http://dbpedia.org/sparql> {
    ?actor2 a dbpedia:Actor ;
            foaf:name ?actor_name_en ;
            dbpedia:birthDate ?birth_date .
    FILTER(STR(?actor_name_en) = ?actor_name)
  }
}
...which can be tried here (some patience required... after all, it's remote querying).
Trying to create a couple of novel examples following the above steps:
...get me the papers (DBLP) of Fellows of the British Computer Society (DBpedia).
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?fellow_name ?paper_name
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf> # placeholder graph
WHERE {
  # DBpedia: the fellows, plus their owl:sameAs links into DBLP
  SERVICE <http://dbpedia.org/sparql> {
    ?fellow dcterms:subject <http://dbpedia.org/resource/Category:Fellows_of_the_British_Computer_Society> ;
            owl:sameAs ?dblp_fellow .
    FILTER ( REGEX( STR(?dblp_fellow), "dblp"))
  }
  # DBLP: names and papers for those linked authors
  SERVICE <http://www4.wiwiss.fu-berlin.de/dblp/sparql> {
    ?dblp_fellow foaf:name ?fellow_name .
    ?paper dc:creator ?dblp_fellow ;
           rdfs:label ?paper_name .
  }
}
...which this time uses some owl:sameAs plumbing and which again can be tried here (shouldn't require as much patience).
Pushing the boat out for the number of sources, here you're trying to find which of your asthma (Diseasome) tablets (DrugBank) might have given you that slight dose of acidosis (Sider)... and get some info on the contraindications if available (DailyMed).
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX drugbank: <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/>
PREFIX sider: <http://www4.wiwiss.fu-berlin.de/sider/resource/sider/>
PREFIX dailymed: <http://www4.wiwiss.fu-berlin.de/dailymed/resource/dailymed/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?drug_name ?brand_name ?drug ?contraindication
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf> # placeholder graph
WHERE {
  # Diseasome: the disease labelled "Asthma"
  SERVICE <http://www4.wiwiss.fu-berlin.de/diseasome/sparql> {
    ?disease rdfs:label "Asthma" .
  }
  # DrugBank: oral tablets that may target that disease
  SERVICE <http://www4.wiwiss.fu-berlin.de/drugbank/sparql> {
    ?drug drugbank:possibleDiseaseTarget ?disease ;
          drugbank:dosageForm <http://www4.wiwiss.fu-berlin.de/drugbank/resource/dosageforms/tabletOral> ;
          rdfs:label ?drug_name ;
          drugbank:brandName ?brand_name .
  }
  # Sider: the same drugs (via owl:sameAs) with "Acidosis" as a side effect
  SERVICE <http://www4.wiwiss.fu-berlin.de/sider/sparql> {
    ?siderdrug owl:sameAs ?drug ;
               sider:sideEffect ?sideeffect .
    ?sideeffect rdfs:label "Acidosis" .
  }
  # DailyMed: contraindications, if any, matched on the drug name
  SERVICE <http://www4.wiwiss.fu-berlin.de/dailymed/sparql> {
    OPTIONAL {
      ?moiety rdfs:label ?drug_name .
      ?branded_drug dailymed:activeMoiety ?moiety ;
                    dailymed:contraindication ?contraindication .
    }
  }
}
Again, you can try this here (again, with some patience). You can get a cleaner query and cleaner results by simply dropping the fourth service (the bit looking for warning labels from DailyMed is a bit messy).
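For reference, a sketch of that trimmed three-service version: the same query with the DailyMed block and ?contraindication removed, and the prefixes it no longer uses dropped. Nothing else is changed, so the same caveats about the results apply:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX drugbank: <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/>
PREFIX sider: <http://www4.wiwiss.fu-berlin.de/sider/resource/sider/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?drug_name ?brand_name ?drug
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf> # placeholder graph
WHERE {
  # Diseasome: the disease labelled "Asthma"
  SERVICE <http://www4.wiwiss.fu-berlin.de/diseasome/sparql> {
    ?disease rdfs:label "Asthma" .
  }
  # DrugBank: oral tablets that may target that disease
  SERVICE <http://www4.wiwiss.fu-berlin.de/drugbank/sparql> {
    ?drug drugbank:possibleDiseaseTarget ?disease ;
          drugbank:dosageForm <http://www4.wiwiss.fu-berlin.de/drugbank/resource/dosageforms/tabletOral> ;
          rdfs:label ?drug_name ;
          drugbank:brandName ?brand_name .
  }
  # Sider: the same drugs (via owl:sameAs) with "Acidosis" as a side effect
  SERVICE <http://www4.wiwiss.fu-berlin.de/sider/sparql> {
    ?siderdrug owl:sameAs ?drug ;
               sider:sideEffect ?sideeffect .
    ?sideeffect rdfs:label "Acidosis" .
  }
}
```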
Anyways, you get the idea.
[DISCLAIMER: I make no claims about the correctness/completeness of the results. ;)]