Reputation: 92
Given this RDF:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rdf:RDF [<!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<!ENTITY rdfs 'http://www.w3.org/2000/01/rdf-schema#'>
<!ENTITY xsd 'http://www.w3.org/2001/XMLSchema#'>]>
<rdf:RDF xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="Fadi" xml:startTime="00:01:38" xml:endTime="00:01:39">
<ns0:eat xmlns:ns0="http://example.org/">Apple</ns0:eat>
</rdf:Description>
</rdf:RDF>
when I execute this SPARQL query
SELECT *
WHERE {
?s ?p ?o .
FILTER (regex(?o, 'Apple','i'))
}
I get the subject and predicate:
s: http://example.org/Fadi , p: http://example.org/eat .
but when I execute
SELECT *
WHERE {
?s ?p ?o .
FILTER (regex(?s, 'Fadi','i'))
}
or
SELECT *
WHERE {
?s ?p ?o .
FILTER (regex(?s, 'http://example.org/Fadi','i'))
}
I get nothing. How can I query for subject or predicate?
And how can I query for startTime and endTime?
Upvotes: 3
Views: 2756
Reputation: 85863
REGEX is for querying text values, not for matching against resource IRIs. You could use the str function to get the IRI of a resource as a string, so your filter would look like
FILTER (regex( str( ?s ), 'http://example.org/Fadi','i'))
but that's really not what you want to do here. Since you are looking to retrieve triples of the form
<http://example.org/Fadi> ?p ?o
ask for them with a query like this:
SELECT *
WHERE {
<http://example.org/Fadi> ?p ?o .
}
You can define prefixes in SPARQL queries, too, so if you're using a bunch of terms from one namespace, you can save some typing by declaring a prefix, e.g.,
PREFIX ex: <http://example.org/>
SELECT *
WHERE {
ex:Fadi ?p ?o .
}
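To see why the regex filter is a blunt tool compared with naming the IRI directly, here is a rough sketch in Python. It only mimics the behaviour: Python's re module stands in for SPARQL's regex, and the second and third IRIs are made up for illustration. The point is that an unanchored, case-insensitive pattern matches any IRI containing the text, while an exact IRI matches one resource.

```python
import re

# Hypothetical IRIs for illustration; only the first appears in the question.
iris = [
    "http://example.org/Fadi",
    "http://example.org/Fadila",
    "http://example.org/notes/fadi-lunch",
]

# Rough analogue of FILTER (regex(str(?s), 'Fadi', 'i')): an unanchored,
# case-insensitive search over the string form of each IRI.
regex_hits = [iri for iri in iris if re.search("Fadi", iri, re.IGNORECASE)]

# Analogue of writing <http://example.org/Fadi> in the triple pattern:
# an exact comparison against one IRI.
exact_hits = [iri for iri in iris if iri == "http://example.org/Fadi"]

print(regex_hits)
print(exact_hits)
```

The regex version keeps all three IRIs, and the exact comparison keeps only the first.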
However, there's still another problem with your example. Because your RDF document doesn't have an XML base, the IRI for Fadi in <rdf:Description rdf:about="Fadi" ... is unpredictable. A SPARQL engine might resolve it against a filename, creating, for instance, /home/user/input.rdf/Fadi. Either specify an XML base, or use full IRIs for the rdf:about attribute. Assuming we add xml:base="http://www.example.org/" to the rdf:RDF element (note the www, so the subject becomes http://www.example.org/Fadi) and run those queries using the Jena ARQ command line tools, we get output containing the triples we expect, but also some warnings about those startTime and endTime attributes:
$ arq --data fadi.rdf --query fadi.sparql
12:13:21 WARN riot :: {W118} XML attribute: xml:startTime is not known and is being discarded.
12:13:21 WARN riot :: {W118} XML attribute: xml:endTime is not known and is being discarded.
----------------------------------------------------
| s | p | o |
====================================================
| <http://www.example.org/Fadi> | ex:eat | "Apple" |
----------------------------------------------------
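As an aside on the base IRI point, relative reference resolution can be sketched with Python's standard urllib.parse.urljoin. This is not what Jena runs internally, and the file path is hypothetical; it just shows that the same rdf:about value turns into different IRIs depending on the base, and that the exact fallback result depends on the tool and on where the file lives.

```python
from urllib.parse import urljoin

relative = "Fadi"  # the value of rdf:about in the question

# With an explicit base, the result is fixed no matter where the file lives.
with_base = urljoin("http://www.example.org/", relative)

# Without one, a parser typically falls back to the document's own location.
# The file path here is hypothetical.
without_base = urljoin("file:///home/user/input.rdf", relative)

print(with_base)
print(without_base)
```

The first value is stable across machines, and the second changes whenever the file moves, which is why a query written against a fixed IRI can silently stop matching.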
The startTime and endTime values need to be specified by elements within the rdf:Description element. I don't think that xml:startTime and xml:endTime are meaningful properties; whatever start time and end time mean here, they should probably be specified by different properties, but that's a modeling issue, not a syntax issue. At any rate, we can adjust the input file accordingly to get (with the xml:base attribute and the xml:(start|end)Time elements):
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rdf:RDF [<!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<!ENTITY rdfs 'http://www.w3.org/2000/01/rdf-schema#'>
<!ENTITY xsd 'http://www.w3.org/2001/XMLSchema#'>]>
<rdf:RDF xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xml:base="http://www.example.org/">
<rdf:Description rdf:about="Fadi">
<ns0:eat xmlns:ns0="http://example.org/">Apple</ns0:eat>
<xml:startTime>00:01:38</xml:startTime>
<xml:endTime>00:01:39</xml:endTime>
</rdf:Description>
</rdf:RDF>
Now when we run the query, we get
$ /usr/local/lib/apache-jena-2.10.0/bin/arq --data fadi.rdf --query fadi.sparql
------------------------------------------------------------------------------------------------
| s | p | o |
================================================================================================
| <http://www.example.org/Fadi> | <http://www.w3.org/XML/1998/namespaceendTime> | "00:01:39" |
| <http://www.example.org/Fadi> | <http://www.w3.org/XML/1998/namespacestartTime> | "00:01:38" |
| <http://www.example.org/Fadi> | ex:eat | "Apple" |
------------------------------------------------------------------------------------------------
which seems like what you wanted. More specific queries, e.g., for Fadi's start and end times, are easy to construct too. Using the startTime and endTime properties as they appear so far (even though they should be moved into a different namespace later), we have:
PREFIX ex: <http://www.example.org/>
PREFIX xml: <http://www.w3.org/XML/1998/namespace>
SELECT *
WHERE {
ex:Fadi xml:startTime ?start ;
xml:endTime ?end .
}
which produces
$ /usr/local/lib/apache-jena-2.10.0/bin/arq --data fadi.rdf --query fadi.sparql
---------------------------
| start | end |
===========================
| "00:01:38" | "00:01:39" |
---------------------------
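A note on the predicate IRIs in the three-column result table (…/namespacestartTime, with no separator): RDF/XML builds a property IRI by concatenating the element's namespace name and its local name, and the XML namespace doesn't end in / or #. The sketch below uses Python's standard ElementTree on a trimmed copy of the adjusted document (DOCTYPE and unused namespace declarations dropped) to show the same concatenation. It illustrates the rule and is not a real RDF parser.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the adjusted document (DOCTYPE and unused namespaces omitted).
doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xml:base="http://www.example.org/">
  <rdf:Description rdf:about="Fadi">
    <ns0:eat xmlns:ns0="http://example.org/">Apple</ns0:eat>
    <xml:startTime>00:01:38</xml:startTime>
    <xml:endTime>00:01:39</xml:endTime>
  </rdf:Description>
</rdf:RDF>"""

# The root is rdf:RDF; its first child is the rdf:Description element.
description = ET.fromstring(doc)[0]

# ElementTree names elements as "{namespace}local"; RDF/XML forms the
# predicate IRI by joining the two parts with nothing in between.
predicates = []
for child in description:
    namespace, local = child.tag[1:].split("}")
    predicates.append(namespace + local)

for iri in predicates:
    print(iri)
```

This is also why the PREFIX xml: declaration in the query has no trailing slash: it has to match the namespace exactly for xml:startTime to expand to the same IRI.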
Upvotes: 12
Reputation: 16630
?s is a URI and regex works on strings. Use the str function to get a string:
FILTER (regex(str(?s), 'Fadi','i'))
Upvotes: 7