I noticed from the thread on What's missing from RDF databases? that one thing people were quite keen on was full-text search in SPARQL.
This is because it mitigates a classic alleged performance problem in SPARQL when people write queries like so:
SELECT ?s
WHERE
{
  ?s ?p ?o
  FILTER(REGEX(?o, "substring"))
}
It should be self-evident that writing a query like this is a bad idea, because you get a huge swathe of potential matches and then have to apply a regular expression to every one. Full-text search extensions to SPARQL let you write queries which achieve the same thing with huge performance advantages, for example the LARQ style syntax:
PREFIX pf: <http://jena.hpl.hp.com/ARQ/property#>
SELECT ?s
{
  ?lit pf:textMatch 'substring' .
  ?s ?p ?lit
}
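To make the performance argument concrete, here is a rough Python sketch of the two strategies. It is not how LARQ or any other engine is actually implemented. The toy data and the names `TRIPLES`, `regex_scan`, `build_index` and `index_lookup` are all made up for illustration. The first function mimics the FILTER(REGEX(...)) approach by testing every object, while the second answers from a prebuilt token index.

```python
import re
from collections import defaultdict

# Toy triple store of (subject, predicate, literal object); all data is hypothetical.
TRIPLES = [
    ("ex:doc1", "rdfs:label", "Full text search in SPARQL"),
    ("ex:doc2", "rdfs:comment", "Regular expressions are slow on large stores"),
    ("ex:doc3", "rdfs:label", "Indexing text literals"),
]


def regex_scan(triples, pattern):
    """FILTER(REGEX(...)) style: run the pattern against every object value."""
    rx = re.compile(pattern)
    return sorted({s for s, _, o in triples if rx.search(o)})


def build_index(triples):
    """Build a token -> subjects inverted index once, ahead of query time."""
    index = defaultdict(set)
    for s, _, o in triples:
        for token in re.findall(r"\w+", o.lower()):
            index[token].add(s)
    return index


def index_lookup(index, term):
    """Full-text style: one dictionary lookup instead of a scan over all triples."""
    return sorted(index.get(term.lower(), set()))


INDEX = build_index(TRIPLES)
print(regex_scan(TRIPLES, "text"))
print(index_lookup(INDEX, "text"))
```

Note that in this sketch the index matches whole tokens while the regex matches arbitrary substrings, so the two are not strictly equivalent: looking up "tex" in the index finds nothing, whereas the regex scan still matches. Real full-text indexes tend to have similar tokenisation trade-offs, so results may differ from a REGEX filter.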
However, one problem is that this isn't standardised and every vendor seems to have its own syntax. I'm aware of at least 3 different implementations, none of which are interoperable:
- 4store Text Indexing
- Jena LARQ - also supported by dotNetRDF
- Virtuoso Full Text Indexing
So I wondered several things:
- Are you using full text search with SPARQL or would you use it if it were available in your RDF database/SPARQL engine?
- Whose implementation do you use? Please add links to documentation and an example if it's not one I've listed or one someone else has already suggested.
- What's your use case for it?
I'm particularly interested in point 3. Maybe I'm being unimaginative, but I can't think of any use cases beyond the glaringly obvious text search ones, i.e. using it to find RDF graphs/triples containing certain text. If you have some other interesting or useful use case, please enlighten me!