RadaKk

Reputation: 188

SPARQL: Distinct with different data types

I have a SPARQL query on DBpedia with which I would like to get actors together with their birth date, name, etc. For instance:

select ?actor ?name ?birthDate where { 
    ?actor <http://purl.org/linguistics/gold/hypernym> dbr:Actor ;     
    rdfs:label ?name ; dbo:birthDate ?birthDate .
    filter(?name = "Tom Cruise"@en)
} LIMIT 5

I get this result: [screenshot of the query results]

My concern is the data type of birthDate. I am looking for a way to make DISTINCT insensitive to the data type, so that the previous query returns only one result.

Any ideas?

Upvotes: 1

Views: 444

Answers (1)

Median Hilal

Reputation: 1531

First, I think that DBpedia uses only the xsd:date data type for birth dates. To make sure, you can try this query:

select distinct (datatype(?birthDate) as ?type) where { 
  ?actor <http://purl.org/linguistics/gold/hypernym> dbr:Actor ;     
  dbo:birthDate ?birthDate .   
} 

The problem is that some of the values are dirty, as mentioned in the comments, and need to be cleansed. There is a workaround for that, though I am not sure whether it is acceptable for you.

First, make sure that all representations of the same intended date are unified, so that DISTINCT can remove the duplicates.

For a reason I am not aware of, xsd:dateTime is more tolerant in practice than it should be: although it should only take values of the form yyyy-mm-dd ...., it also accepts values of the form yyyy-m-d ..... Consequently, you can convert ?birthDate to xsd:dateTime and then to xsd:date. For example, select xsd:date (xsd:dateTime ("2000-1-1")) {} returns "2000-01-01"^^xsd:date. Somehow, it just works.

Then, since some of the data are dirty, you have no option but to discard them, i.e., values like 2000-0-0 should be excluded. To do this, check that converting ?birthDate to the required format succeeds. To this end, coalesce(xsd:date(xsd:dateTime(?birthDate)), '!') would do it, as it returns '!' if ?birthDate cannot be cast.

I don't have a working query, but this should, in principle, help.
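As a rough, untested sketch, the steps above might be combined along these lines. It assumes the prefixes dbr:, dbo:, rdfs: and xsd: are predefined by the endpoint, as in the question's query. The str() call (to strip the original datatype before casting) and the variable name ?date are my own additions, not something verified against DBpedia.

```sparql
# Untested sketch: relies on the lenient xsd:dateTime cast described above,
# which a strictly standards-compliant engine may reject.
select distinct ?actor ?name ?date where {
  ?actor <http://purl.org/linguistics/gold/hypernym> dbr:Actor ;
         rdfs:label ?name ;
         dbo:birthDate ?birthDate .
  filter(?name = "Tom Cruise"@en)
  # Normalize via xsd:dateTime, then xsd:date; fall back to '!' if the cast fails.
  # str() is an assumption: it drops the original datatype before casting.
  bind(coalesce(xsd:date(xsd:dateTime(str(?birthDate))), '!') as ?date)
  # Drop the rows whose birth date could not be cast.
  filter(str(?date) != '!')
} LIMIT 5
```

Comparing str(?date) rather than ?date itself avoids comparing an xsd:date value with a plain string, which some engines treat as a type error.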

Upvotes: 2
