I have a column of URIs from different domains. For example:
http://comicmeta.org/cbo/category
http://purl.org/dc/terms/hasVersion
http://schema.org/contributor
and so on. I want to extract the last part, i.e., the string after the last slash '/' in each URI.
Expected results on the above list of URIs:
category
hasVersion
contributor
How do I write a generic SPARQL query to extract this last part from any given URI?
This is what I have tried so far:
SELECT DISTINCT ?s ?x WHERE {
  ?s ?p ?o .
  BIND (STRBEFORE(STRAFTER(STR(?s), "/"), " ") AS ?x) .
  # To extract the part after the slash '/' and before the end of string indicated by a space ' '.
}
But this only returns empty strings ("").
How can I make this work? Can someone help me with this?
Upvotes: 1
Views: 1219
Using REPLACE is the way:
BIND (REPLACE(STR(?s), "^.*/([^/]*)$", "$1") as ?x)
This replaces the whole string with only the part found after the last / character. Note, however, that unlike in your examples, not all vocabularies use / as the delimiter; some use # instead. Something like http://www.w3.org/1999/02/22-rdf-syntax-ns#type will be turned into 22-rdf-syntax-ns#type.
If you do not want that, you could use something a bit more complicated:
BIND (REPLACE(STR(?s), "^.*?([_\\p{L}][-_\\p{L}\\p{N}]*)$", "$1") as ?x)
This selects the longest suffix of the string that looks like a valid XML name: a letter or underscore followed by letters, digits, hyphens, or underscores.
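To compare the two expressions side by side, here is a minimal self-contained sketch that should run on a SPARQL 1.1 endpoint without any data loaded. The VALUES block simply inlines the example URIs from the question plus the rdf:type URI, and the variable names ?uri, ?afterSlash and ?localName are made up for illustration.

```sparql
SELECT ?uri ?afterSlash ?localName WHERE {
  # Inline test data (assumption: stands in for your real ?s column)
  VALUES ?uri {
    <http://comicmeta.org/cbo/category>
    <http://purl.org/dc/terms/hasVersion>
    <http://schema.org/contributor>
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
  }

  # Everything after the last '/':
  #   category, hasVersion, contributor, 22-rdf-syntax-ns#type
  BIND (REPLACE(STR(?uri), "^.*/([^/]*)$", "$1") AS ?afterSlash)

  # Longest XML-name-like suffix:
  #   category, hasVersion, contributor, type
  BIND (REPLACE(STR(?uri), "^.*?([_\\p{L}][-_\\p{L}\\p{N}]*)$", "$1") AS ?localName)
}
```

For comparison, the attempt in the question yields "" because STRAFTER(STR(?s), "/") returns everything after the first slash, and STRBEFORE then looks for a space that never occurs in the URI; when the search string is not found, STRBEFORE returns an empty string. Also note that REPLACE leaves the input unchanged when the pattern does not match, so a URI ending in / would come through the second expression as the full URI.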
Upvotes: 4