Mathematical expressions in RDF

Are there any established ways of representing mathematical formulas or expressions in RDF?

A web search turned up a paper by Marchiori (2003) which provides an overview of relevant Semantic Web standards, and even contains an example of how a MathML fragment like this:

<apply>
  <csymbol encoding="text"
           definitionURL="http://www.mathsw.org/scalarplus">
           p
           </csymbol>
  <cn> 2 </cn>
  <cn> 6 </cn>
</apply>

could be represented in RDF like this:

:_1 <http://www.w3.org/TR/MathML2#apply> :_2
:_1 <http://www.w3.org/TR/MathML2#csymbol> "p"
:_1 <http://www.w3.org/TR/MathML2#definitionURL> 
                                 <http://www.mathsw.org/scalarplus>
:_1 <http://www.w3.org/TR/MathML2#encoding> "text"
:_2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#:_1> "2"
:_2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#:_2> "6"

However, that was just an example, and I couldn't find anything which would describe a complete framework for something like that.


Edit for clarification: I am looking for something that would make it possible to represent mathematical expressions which refer to other resources described in RDF as sub-expressions, and perhaps even to construct mathematical expressions with some sort of rule language (like SPARQL CONSTRUCT queries). Therefore, just using MathML (or OpenMath) markup in an XML literal does not seem like a good solution.

In terms of the direct question I don't know of an established way of doing this.

What do you want to achieve?

If you want to convey some mathematical markup over RDF but that maths is not "executable" or doesn't need to interact with the RDF itself then the obvious approach would be to use MathML to encode the content and embed the MathML fragments as rdf:XMLLiterals.
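As a rough illustration of the embedding approach, here is a minimal Python sketch (standard library only) that checks a MathML fragment for well-formedness and writes it out as one N-Triples line typed as rdf:XMLLiteral. The subject and predicate IRIs under example.org and the helper names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Datatype IRI of rdf:XMLLiteral.
RDF_XML_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"

def escape_ntriples(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))

def mathml_triple(subject_iri, predicate_iri, mathml):
    """Return one N-Triples line whose object is the MathML markup."""
    ET.fromstring(mathml)  # raises ParseError if the fragment is not well-formed
    return '<{}> <{}> "{}"^^<{}> .'.format(
        subject_iri, predicate_iri, escape_ntriples(mathml), RDF_XML_LITERAL)

# Hypothetical resource and property IRIs, for illustration only.
mathml = "<apply><plus/><cn>2</cn><cn>6</cn></apply>"
line = mathml_triple("http://example.org/expr/1",
                     "http://example.org/vocab#hasMathML",
                     mathml)
print(line)
```

The RDF layer sees the markup only as an opaque literal here, which is why this fits maths that does not need to interact with the rest of the graph.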

OTOH, if you are trying to express mathematics that will interact with the RDF data (for example, integrity constraint checking or formulae for deriving computed values), then you might want to consider an RDF rule language. RIF Core can express some interesting maths, and the semantics of RIF + RDF combinations are well defined, though there is not yet a standardized way of embedding RIF rules directly in RDF. SWRL is another alternative; it has been around for longer and has reasonable tool support.

I presented an approach for integrating mathematical expressions into RDF datasets, for the purpose of mathematical reasoning, at the OpenMath Workshop 2012.

The paper "Mathematical Computations for Linked Data Applications with OpenMath" covers an OpenMath content dictionary for RDF that makes it possible to reference RDF resources and their properties from mathematical expressions. This is complemented by an OWL ontology for OpenMath objects (available at http://numerateweb.org/vocab/math), which enables the encoding of mathematical expressions in RDF for cross-referencing between mathematical expressions and RDF data.

The following simple example, taken from our paper (in an extended Popcorn notation),

[foaf:Person] : 
@e:bmi = @e:mass / @e:height ^ 2

[foaf:Group] :
@e:aBMI = sum(@@foaf:member, $x -> @e:bmi($x)) / set1.size(@@foaf:member)

is based on the FOAF vocabulary and defines the properties e:mass and e:height for the class foaf:Person and uses them to compute the e:bmi properties of individual persons and their average e:aBMI over a group.

An interesting point is that we can use an RDF graph for storing intermediate results of computations. For example, e:bmi is stored as a property of a foaf:Person and later reused for computing e:aBMI.
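To make the idea of storing intermediate results concrete, here is a small Python sketch that mimics the two rules above on a toy graph held as a set of (subject, predicate, object) tuples. This is not the evaluation engine from the paper; the ex: resources, the sample values and the helper functions are assumptions made for illustration.

```python
# Toy graph: a set of (subject, predicate, object) tuples (sample data is made up).
graph = {
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "e:mass", 60.0),
    ("ex:alice", "e:height", 1.6),
    ("ex:bob", "rdf:type", "foaf:Person"),
    ("ex:bob", "e:mass", 80.0),
    ("ex:bob", "e:height", 2.0),
    ("ex:team", "rdf:type", "foaf:Group"),
    ("ex:team", "foaf:member", "ex:alice"),
    ("ex:team", "foaf:member", "ex:bob"),
}

def objects(g, subject, predicate):
    """All objects of the triples matching (subject, predicate, ?o)."""
    return [o for (s, p, o) in g if s == subject and p == predicate]

def instances(g, cls):
    """All resources typed with the given class, as a list (safe to iterate while adding)."""
    return sorted(s for (s, p, o) in g if p == "rdf:type" and o == cls)

# Rule for foaf:Person: e:bmi = e:mass / e:height ^ 2, written back into the graph.
for person in instances(graph, "foaf:Person"):
    mass = objects(graph, person, "e:mass")[0]
    height = objects(graph, person, "e:height")[0]
    graph.add((person, "e:bmi", mass / height ** 2))

# Rule for foaf:Group: e:aBMI = average of the stored e:bmi values of its members.
for group in instances(graph, "foaf:Group"):
    members = objects(graph, group, "foaf:member")
    total = sum(objects(graph, m, "e:bmi")[0] for m in members)
    graph.add((group, "e:aBMI", total / len(members)))

print(objects(graph, "ex:team", "e:aBMI"))
```

The second rule never touches e:mass or e:height; it only reads the e:bmi triples that the first rule added.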

I faced the same need to represent expressions in RDF: not only mathematical expressions, but any symbolic expressions and data structures (e.g., type expressions from programming languages, queries). I came up with a solution that I presented at ESWC'13 (as a poster, but a longer version is available as a technical report). It was designed to be simple, standard RDF, generic, and suitable for structural querying with SPARQL or another RDF query language.

The principle I followed was to reuse the structure of RDF containers, which are not much used in practice. The math expression $\int x^2 + 1 dx$ is represented in RDF/Turtle by (assuming math: is a namespace for mathematical symbols)

[ a math:Integral ;
  rdf:_1 [ a math:Addition ;
           rdf:_1 [ a math:Power ;
                    rdf:_1 _:x ;
                    rdf:_2 "2"^^xsd:integer ] ;
           rdf:_2 "1"^^xsd:integer ] ;
  rdf:_2 _:x ]

The type of the container is a mathematical symbol/operation/function, and its elements are its arguments. Typed literals are used for constants. Named blank nodes (here, _:x) are used to represent bound variables, and support alpha-equivalence (renaming of bound variables).
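A minimal Python sketch of this encoding, assuming expressions are given as nested tuples of the form (symbol, arg1, ..., argN), could look as follows. The tuple input format, the generated node labels (_:e1, _:e2, ...) and the function name are illustrative choices, not part of the technical report.

```python
from itertools import count

def to_triples(expr, triples, fresh=None):
    """Append container-style triples for expr to triples; return the node for expr.

    Tuples are compound expressions (symbol, arg1, ..., argN). Anything else is a
    leaf: a named blank node such as "_:x" for a variable, or a constant.
    """
    if fresh is None:
        fresh = count(1)
    if not isinstance(expr, tuple):
        return expr
    symbol, *args = expr
    node = "_:e{}".format(next(fresh))  # assumed not to clash with variable names
    triples.append((node, "rdf:type", symbol))
    for position, arg in enumerate(args, start=1):
        child = to_triples(arg, triples, fresh)
        triples.append((node, "rdf:_{}".format(position), child))
    return node

# The integral of x^2 + 1 with respect to x, as in the Turtle example above.
expr = ("math:Integral",
        ("math:Addition", ("math:Power", "_:x", 2), 1),
        "_:x")
triples = []
root = to_triples(expr, triples)
for triple in triples:
    print(triple)
```

Both occurrences of the variable map to the same node _:x, which is what lets a query later check that the integration variable and the variable inside the body coincide.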

To make Turtle notations lighter, I also proposed a new syntactic abbreviation for Turtle (which already has abbreviations for lists), allowing for functional notation of expressions.

[ a C ; rdf:_1 E1 ; ... ; rdf:_N EN ] can be written as C(E1,...,EN)

The above expression can now be represented as

math:Integral(math:Addition(math:Power(_:x,2),1),_:x)
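The abbreviation is mechanical enough that it can be sketched in a few lines of Python. The following illustration walks a list of triples and prints the functional form; the list-of-tuples graph, the node labels and the function name are assumptions made for the example, and the proposed Turtle extension itself is not implemented here.

```python
import re

# The container encoding of the integral, as plain (subject, predicate, object) tuples.
triples = [
    ("_:e1", "rdf:type", "math:Integral"),
    ("_:e1", "rdf:_1", "_:e2"),
    ("_:e1", "rdf:_2", "_:x"),
    ("_:e2", "rdf:type", "math:Addition"),
    ("_:e2", "rdf:_1", "_:e3"),
    ("_:e2", "rdf:_2", 1),
    ("_:e3", "rdf:type", "math:Power"),
    ("_:e3", "rdf:_1", "_:x"),
    ("_:e3", "rdf:_2", 2),
]

MEMBERSHIP = re.compile(r"rdf:_(\d+)$")

def functional(node, triples):
    """Render [ a C ; rdf:_1 E1 ; ... ; rdf:_N EN ] as C(E1,...,EN)."""
    types = [o for (s, p, o) in triples if s == node and p == "rdf:type"]
    if not types:
        return str(node)  # leaf: variable or constant
    args = {}
    for (s, p, o) in triples:
        match = MEMBERSHIP.match(p)
        if s == node and match:
            args[int(match.group(1))] = o  # keep arguments in numeric order
    rendered = ",".join(functional(args[i], triples) for i in sorted(args))
    return "{}({})".format(types[0], rendered)

print(functional("_:e1", triples))
# prints math:Integral(math:Addition(math:Power(_:x,2),1),_:x)
```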

Such representations allow for rich queries based on the structure of expressions. For example, it is possible to retrieve all integrals in x whose body contains x^2 as a sub-expression.

SELECT ?e WHERE { ?e is math:Integral(...math:Power(?x,2)..., ?x) }

This query uses additional abbreviations, and is equivalent to the following query, which only uses standard notations.

SELECT ?e
WHERE {
  ?e a math:Integral ;
     rdf:_1 [ rdfs:member* [ a math:Power ; rdf:_1 ?x ; rdf:_2 2 ] ] ;
     rdf:_2 ?x .
}

Using a SPARQL variable ?x for the bound variable of the integral makes it possible to retrieve the expression $\int y^2 - y dy$ as well as $\int x^2 + 1 dx$.
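For readers without a SPARQL engine at hand, the matching logic of that query can be imitated in plain Python. This is only a sketch: the graph is a list of tuples, math:Subtraction and the node labels are made up for the example, and the recursive walk stands in for the rdfs:member* path (it assumes expressions are trees, without cycles).

```python
triples = [
    # integral of x^2 + 1 dx
    ("_:i1", "rdf:type", "math:Integral"), ("_:i1", "rdf:_1", "_:a1"), ("_:i1", "rdf:_2", "_:x"),
    ("_:a1", "rdf:type", "math:Addition"), ("_:a1", "rdf:_1", "_:p1"), ("_:a1", "rdf:_2", 1),
    ("_:p1", "rdf:type", "math:Power"), ("_:p1", "rdf:_1", "_:x"), ("_:p1", "rdf:_2", 2),
    # integral of y^2 - y dy
    ("_:i2", "rdf:type", "math:Integral"), ("_:i2", "rdf:_1", "_:s2"), ("_:i2", "rdf:_2", "_:y"),
    ("_:s2", "rdf:type", "math:Subtraction"), ("_:s2", "rdf:_1", "_:p2"), ("_:s2", "rdf:_2", "_:y"),
    ("_:p2", "rdf:type", "math:Power"), ("_:p2", "rdf:_1", "_:y"), ("_:p2", "rdf:_2", 2),
    # integral of z^3 dz (should not match)
    ("_:i3", "rdf:type", "math:Integral"), ("_:i3", "rdf:_1", "_:p3"), ("_:i3", "rdf:_2", "_:z"),
    ("_:p3", "rdf:type", "math:Power"), ("_:p3", "rdf:_1", "_:z"), ("_:p3", "rdf:_2", 3),
]

def obj(node, predicate, triples):
    """First object of (node, predicate, ?o), or None."""
    for (s, p, o) in triples:
        if s == node and p == predicate:
            return o
    return None

def descendants(node, triples):
    """Yield node and everything reachable through rdf:_n, like rdfs:member*."""
    yield node
    for (s, p, o) in triples:
        if s == node and p.startswith("rdf:_"):
            yield from descendants(o, triples)

def integrals_with_square(triples):
    """Integrals whose body contains ?x^2, where ?x is the integration variable."""
    found = []
    for (s, p, o) in triples:
        if p != "rdf:type" or o != "math:Integral":
            continue
        body, var = obj(s, "rdf:_1", triples), obj(s, "rdf:_2", triples)
        for n in descendants(body, triples):
            if ((n, "rdf:type", "math:Power") in triples
                    and obj(n, "rdf:_1", triples) == var
                    and obj(n, "rdf:_2", triples) == 2):
                found.append(s)
                break
    return found

print(integrals_with_square(triples))
```

Comparing the base of the power with the integral's own second argument plays the role of the shared ?x in the query, so the match does not depend on the variable's name.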

The technical report provides more details, and also describes what can be done with such RDF expressions (pretty-printing, interactive exploration).

The simplest approach is to define a new datatype for MathML literals, say http://www.w3c.org/datatypes/mathMLLiteral (ideally it would be defined by the W3C). Other serialization datatypes could be introduced for LaTeX or Mathematica expressions.

In Turtle format, this would look like this:

@prefix math: <http://example.org/ont/math#> .
@prefix : <http://example.org/data#> .  # placeholder namespace for the example resource

:APlusB a math:Addition ;
    math:serialization """<apply> <csymbol cd="arith1">plus</csymbol>
<ci id='exampleontology#a'>a</ci>
<ci id='exampleontology#b'>b</ci>
</apply> """^^<http://www.w3c.org/datatypes/mathMLLiteral> .
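With this approach the expression is opaque to RDF tools, so any link to other resources has to be recovered by parsing the literal. A small Python sketch of that step, using only the standard library, might look like this; the function name is made up, and the datatype IRI is the hypothetical one suggested above.

```python
import xml.etree.ElementTree as ET

# Hypothetical datatype IRI from the suggestion above; not defined by any standard.
MATHML_LITERAL = "http://www.w3c.org/datatypes/mathMLLiteral"

def referenced_ids(lexical_form, datatype):
    """Parse a MathML literal and return the id attributes of its <ci> elements."""
    if datatype != MATHML_LITERAL:
        raise ValueError("unsupported datatype: " + datatype)
    root = ET.fromstring(lexical_form)
    return [ci.get("id") for ci in root.iter("ci") if ci.get("id") is not None]

literal = """<apply> <csymbol cd="arith1">plus</csymbol>
<ci id='exampleontology#a'>a</ci>
<ci id='exampleontology#b'>b</ci>
</apply> """

print(referenced_ids(literal, MATHML_LITERAL))
# prints ['exampleontology#a', 'exampleontology#b']
```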

A similar encoding pattern is used in the GeoSPARQL standard, where geometries are encoded in WKT or GML. GeoSPARQL introduces a separate datatype for each: http://www.opengis.net/ont/geosparql#wktLiteral and http://www.opengis.net/ont/geosparql#gmlLiteral.