AnonymousMe

Reputation: 569

SPARQL - Extract Last part of a URI

I have a column of URIs from different domains. For example:

http://comicmeta.org/cbo/category
http://purl.org/dc/terms/hasVersion
http://schema.org/contributor

and so on. I want to extract the last part, i.e., the string after the last slash '/' of each URI.

Expected results on the above list of URIs:

category
hasVersion
contributor

How do I write a generic SPARQL query to extract this last part from any given URI?

This is what I have tried so far:

SELECT distinct ?s ?x WHERE { 
    ?s ?p ?o .
    BIND (STRBEFORE(STRAFTER(STR(?s),"/"), " ") as ?x) .
    #To extract the part after the slash '/' and before the end of string indicated by a space ' '. 

}

But this only returns empty strings "".

How can I make this work? Can someone help me with this?

Upvotes: 1

Views: 1219

Answers (1)

IS4

Reputation: 13207

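Your attempt returns empty strings because STRAFTER cuts at the first /, and STRBEFORE then looks for a space that the URI does not contain, so it returns "". Using REPLACE is the way: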

BIND (REPLACE(STR(?s), "^.*/([^/]*)$", "$1") as ?x)

This replaces the whole string with only the part found after the last / character. Note, however, that unlike your examples, not all vocabularies use / as the delimiter; some use # instead. For example, http://www.w3.org/1999/02/22-rdf-syntax-ns#type would be turned into 22-rdf-syntax-ns#type.
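
To try this without a dataset, the binding can be wrapped in a query with inline test data. This is only a sketch, assuming a SPARQL 1.1 endpoint; the VALUES list is just the sample URIs from the question plus the rdf:type URI mentioned above.

```sparql
SELECT ?s ?x WHERE {
    # Inline sample data; swap in a real graph pattern such as ?s ?p ?o
    VALUES ?s {
        <http://comicmeta.org/cbo/category>
        <http://purl.org/dc/terms/hasVersion>
        <http://schema.org/contributor>
        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
    }
    # The greedy .* consumes everything up to the last "/",
    # so $1 holds whatever follows it
    BIND (REPLACE(STR(?s), "^.*/([^/]*)$", "$1") AS ?x)
}
```

The first three rows should come back as category, hasVersion and contributor, while the last one still contains the # fragment.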

If you do not want that, you could use something a bit more complicated:

BIND (REPLACE(STR(?s), "^.*?([_\\p{L}][-_\\p{L}\\p{N}]*)$", "$1") as ?x)

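This keeps the longest part at the end of the string that looks like what is usually a valid XML name: a letter or underscore followed by letters, digits, hyphens, or underscores. Both / and # therefore act as delimiters.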
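
A quick way to check this pattern is again a query over inline data. This is a sketch under the same SPARQL 1.1 assumption; the FOAF URI is an extra example added here, not one from the question.

```sparql
SELECT ?s ?x WHERE {
    # Inline sample data mixing "/" and "#" delimiters
    VALUES ?s {
        <http://schema.org/contributor>
        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
        <http://xmlns.com/foaf/0.1/name>
    }
    # The lazy .*? stops at the earliest point from which the rest of the
    # string is a name: a letter or "_" followed by letters, digits, "-" or "_"
    BIND (REPLACE(STR(?s), "^.*?([_\\p{L}][-_\\p{L}\\p{N}]*)$", "$1") AS ?x)
}
```

This should bind ?x to contributor, type and name. One caveat: when the pattern does not match at all, for instance for a URI that ends in a slash or in digits only, REPLACE returns the input string unchanged, so such rows keep the full URI.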

Upvotes: 4
