Reputation: 2261
I'm comparing two graphs, one from a Turtle file with simple literal objects, the other from a file with explicit datatype IRIs. The graphs are otherwise equal.
Graph A:
<s> <p> "o"
Graph B:
<s> <p> "o"^^xsd:string
According to RDF 1.1 (3.3 Literals), "[s]imple literals are syntactic sugar for abstract syntax literals with the datatype IRI http://www.w3.org/2001/XMLSchema#string". This is reflected in the concrete syntax specifications as well (N-Triples, Turtle, RDF XML).
So I'd expect both my graphs to consists of a single triple with a URI node s subject, a URI node p predicate, and a literal node o with type xsd:string object. Based on this I'd expect there to be no difference between the two.
However this is not the case in practice:
var graphStringA = "<http://example.com/subject> <http://example.com/predicate> \"object\".";
var graphStringB = "<http://example.com/subject> <http://example.com/predicate> \"object\"^^<http://www.w3.org/2001/XMLSchema#string>.";
var graphA = new Graph();
var graphB = new Graph();
StringParser.Parse(graphA, graphStringA);
StringParser.Parse(graphB, graphStringB);
var diff = graphA.Difference(graphB);
There's one added and one removed triple in the difference report. The graphs are different, because the datatypes for the object nodes are different: graphA.Triples.First().Object.Datatype
is nothing, while graphB.Triples.First().Object.Datatype
is the correct URI.
It appears to me that to modify this behaviour I'd have to either
A workaround is to remove the "default" datatypes:
private static void RemoveDefaultDatatype(IGraph g)
{
var triplesWithDefaultDatatype =
from triple in g.Triples
where triple.Object is ILiteralNode
let literal = triple.Object as ILiteralNode
where literal.DataType != null
where literal.DataType.AbsoluteUri == "http://www.w3.org/2001/XMLSchema#string" || literal.DataType.AbsoluteUri == "http://www.w3.org/2001/XMLSchema#langString"
select triple;
var triplesWithNoDatatype =
from triple in triplesWithDefaultDatatype
let literal = triple.Object as ILiteralNode
select new Triple(
triple.Subject,
triple.Predicate,
g.CreateLiteralNode(
literal.Value,
literal.Language));
g.Assert(triplesWithNoDatatype.ToArray());
g.Retract(triplesWithDefaultDatatype);
}
Is there a way in dotnetrdf to compare simple literals to typed literals in a way that's consistent with RDF 1.1, without resorting to major rewrite or workaround as above?
Upvotes: 4
Views: 524
Reputation: 2261
Using the Handlers API as per RobV's answer:
class StripStringHandler : BaseRdfHandler, IWrappingRdfHandler
{
protected override bool HandleTripleInternal(Triple t)
{
if (t.Object is ILiteralNode)
{
var literal = t.Object as ILiteralNode;
if (literal.DataType != null && (literal.DataType.AbsoluteUri == "http://www.w3.org/2001/XMLSchema#string" || literal.DataType.AbsoluteUri == "http://www.w3.org/2001/XMLSchema#langString"))
{
var simpleLiteral = this.CreateLiteralNode(literal.Value, literal.Language);
t = new Triple(t.Subject, t.Predicate, simpleLiteral);
}
}
return this.handler.HandleTriple(t);
}
private IRdfHandler handler;
public StripStringHandler(IRdfHandler handler) : base(handler)
{
this.handler = handler;
}
public IEnumerable<IRdfHandler> InnerHandlers
{
get
{
return this.handler.AsEnumerable();
}
}
protected override void StartRdfInternal()
{
this.handler.StartRdf();
}
protected override void EndRdfInternal(bool ok)
{
this.handler.EndRdf(ok);
}
protected override bool HandleBaseUriInternal(Uri baseUri)
{
return this.handler.HandleBaseUri(baseUri);
}
protected override bool HandleNamespaceInternal(string prefix, Uri namespaceUri)
{
return this.handler.HandleNamespace(prefix, namespaceUri);
}
public override bool AcceptsAll
{
get
{
return this.handler.AcceptsAll;
}
}
}
Usage:
class Program
{
static void Main()
{
var graphA = Load("<http://example.com/subject> <http://example.com/predicate> \"object\".");
var graphB = Load("<http://example.com/subject> <http://example.com/predicate> \"object\"^^<http://www.w3.org/2001/XMLSchema#string>.");
var diff = graphA.Difference(graphB);
Debug.Assert(diff.AreEqual);
}
private static IGraph Load(string source)
{
var result = new Graph();
var graphHandler = new GraphHandler(result);
var strippingHandler = new StripStringHandler(graphHandler);
var parser = new TurtleParser();
using (var reader = new StringReader(source))
{
parser.Load(strippingHandler, reader);
}
return result;
}
}
Upvotes: 3
Reputation: 28646
dotNetRDF is not RDF 1.1 compliant nor do we claim to be. There is a branch which is rewritten to be compliant but it is not remotely production ready.
Assuming that you control the parsing process you can customise the handling of incoming data using the RDF Handlers API. You can then strip the implicit xsd:string
type off literals as they come into the system by overriding the HandleTriple(Triple t)
method as desired.
Upvotes: 3