justin2004

Reputation: 305

RDF: when a property is used the thing in the object position is a literal of datatype X

In quite a few ontologies we have triples like this:

gist:containedText
    a owl:DatatypeProperty ;
    rdfs:range xsd:string .

What they intend to express is that if you use that property (gist:containedText) the thing in the object position is a literal of datatype xsd:string.

For example, this triple conforms to that intention:

:thing0 gist:containedText "this is my text"^^xsd:string .
# "this is my text" is a literal of datatype xsd:string

I think several answers get this wrong by recommending rdfs:range:

In an ontology, how to define a property's value as a datetime

RDF : Is it possible to set the range of a property to a literal in Turtle

The simplest way of declaring that a property’s range is one of a limited number of literal values

But rdfs:range isn't about the datatype of literals -- it is about instances of classes.

Look at what an RDFS reasoner produces if you give it those triples as input:

apache-jena-5.0.0/bin/riot --formatted=turtle --rdfs=tbox.ttl abox.ttl 
<snip>
"this is my text"  rdf:type  xsd:string .
:thing0  gist:containedText  "this is my text" .

And notice that is not well-formed RDF (there is a literal in the subject position).

And we don't mean that "this is my text" is an instance of the class xsd:string.

Apache Jena confirms that the RDF is not well-formed:

apache-jena-5.0.0/bin/riot --rdfs=tbox.ttl abox.ttl | apache-jena-5.0.0/bin/riot --validate --syntax=turtle -
16:34:29 ERROR riot            :: [line: 2, col: 69] Subject is not a URI or blank node

The question is then: in RDF, how do we express that, if you use property Y, the thing in the object position is a literal of datatype X?

Upvotes: 1

Views: 227

Answers (2)

Chris Mungall

Reputation: 772

You should keep using rdfs:range in this way; it has the intended meaning under both RDFS and OWL2-DL semantics. Jena should be filtering out triples that have literals in the subject position. As @IS4 states, this is just syntax, and there are proposals to relax that restriction, approved by Tim Berners-Lee.

Things get more complicated when you start mixing owl:DatatypeProperty and owl:ObjectProperty, which is not permitted in OWL2-DL but is allowed in OWL-Full. However, in the case you provide, gist:containedText is syntactically constrained to be an owl:DatatypeProperty in OWL2-DL, and the semantics of range are as you intend. You can check this by replacing xsd:string with xsd:integer in the range: OWL-DL reasoners such as HermiT will then tell you that the ontology plus ABox is inconsistent. Don't be misled by Jena's confusing behavior.
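A minimal sketch of that check, assuming a stand-in namespace ex: (hypothetical, used here in place of the real gist: and : prefixes from the question). The only change from the question's data is the range, which is switched from xsd:string to xsd:integer while the literal stays a string. Depending on the tool, you may also need an owl:Ontology header before a DL reasoner will load the file.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # hypothetical stand-in namespace

# TBox: the range is deliberately changed from xsd:string to xsd:integer.
ex:containedText
    a owl:DatatypeProperty ;
    rdfs:range xsd:integer .

# ABox: the value is still a string literal, which now contradicts the range.
ex:thing0 ex:containedText "this is my text"^^xsd:string .
```

With the original xsd:string range the same data should be consistent, which is the behaviour the question is asking for.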

Upvotes: 3

IS4

Reputation: 13177

You are conflating syntax with semantics. OWL (and RDF Schema) gives an interpretation to RDF terms, but such an interpretation does not necessarily have to be encodable in RDF.

What they intend to express is that if you use that property (gist:containedText) the thing in the object position is a literal of datatype xsd:string.

I can't speak for the authors of the ontologies, but OWL gives a clear definition of what this expresses: that the individual identified by the literal has to be an instance of xsd:string.

You are right that this does not say anything about the literal itself ‒ :thing0 gist:containedText "en"^^xsd:language is also valid according to the ontology, because xsd:language rdfs:subClassOf xsd:string.

But rdfs:range isn't about the datatype of literals -- it is about instances of classes.

Correct. But a literal is not an object an OWL reasoner works with ‒ it is a term in RDF whose interpretation is an instance of rdfs:Literal ‒ a literal value. OWL reasoning does not operate on RDF terms, it operates on classes, properties, and individuals (with the exception of owl:DatatypeProperty which is meaningless in OWL Full anyway).

"this is my text"  rdf:type  xsd:string .
:thing0  gist:containedText  "this is my text" .

And notice that is not well-formed RDF (there is a literal in the subject position).

It is not syntactically valid (though it may certainly be one day) but that is irrelevant ‒ it is obvious what this means. That being said, it is not even necessary to emit "this is my text" rdf:type xsd:string . as that is already true even without using the property.

You could also fix the syntax issue very easily with owl:sameAs:

:thing0 gist:containedText _:literal .
_:literal a xsd:string ; owl:sameAs "this is my text" .

This is exactly the same thing, but well-formed and valid.

And we don't mean that "this is my text" is an instance of the class xsd:string.

Who exactly? Feel free to pick a different alternative if OWL/RDF Schema is not compatible with what you yourself need to express. SHACL, for example, operates on a lower level and should be able to constrain the datatype precisely, not as an individual. Otherwise you have to accept that (in RDF Schema and OWL Full) all of these graphs are also valid and in accordance with the ontology:

:thing0 gist:containedText [] .
:thing0 gist:containedText <http://example.com/> .
:thing0 gist:containedText [
  a rdfs:Literal
] .
:thing0 gist:containedText [
  a xsd:string
] .
:thing0 gist:containedText [
  a xsd:language
] .

While this is not:

:thing0 gist:containedText 1 .

I believe OWL actually works well in this situation ‒ when you express that a property links a thing to a string, this is exactly how you should express it, with rdfs:domain and rdfs:range. It also gives you a bit of very convenient leeway not to identify the literal directly, as it doesn't tell you anything about how the property is actually supposed to be used. That is what an ontology should do: give meaning to the terms you use, not restrict how you should use them. Restricting usage is the job of a system that accepts RDF graphs, which may well limit it to a subset of what an ontology permits.
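If the accepting system does need the stricter, syntactic guarantee, the SHACL option mentioned earlier could look roughly like the shape below. This is only a sketch: the ex: namespace and the shape name ex:ContainedTextShape are hypothetical stand-ins for the real gist terms.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .   # hypothetical stand-in namespace

# Applies to every node that is the subject of an ex:containedText triple.
ex:ContainedTextShape
    a sh:NodeShape ;
    sh:targetSubjectsOf ex:containedText ;
    sh:property [
        sh:path ex:containedText ;
        # Each value must be a literal whose datatype is exactly xsd:string.
        sh:datatype xsd:string
    ] .
```

Because sh:datatype checks the RDF term rather than its interpretation, a shape like this would reject the blank-node and IRI objects listed above, as well as "en"^^xsd:language, while the ontology itself stays unchanged.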

You can also do some pretty nice things with the way OWL works, like this:

:thing0 gist:containedText _:unknown .
:thing1 gist:containedText _:unknown .

You can express naturally that two things contain the same text without stating what that text is, which you couldn't do if you constrained the form and not the essence.

Upvotes: 2
