xhgf5

Reputation: 21

SPARQL query mixes up different statement nodes for Q-item

Edit: Please read the comments to this post - the post itself is answered.


The relevant part of my query (retrieving Nobel Prize winners who, at the time of winning, were head of state or head of government) is this:

?nobel_laureate p:P39 ?p39 .   # p39 = "position held"
?p39 ps:P39 ?status .
?p39 pq:P580 ?start_time .
OPTIONAL { ?p39 pq:P582 ?end_time . }
...
FILTER ( ?point_in_time >= ?start_time && ?point_in_time <= ?end_time )

I now want to retrieve the "start time" and "end time" qualifier values for the "position held" statements with value in ?status, e.g., "Prime Minister of Sweden" and "Monarch of Sweden" for Hjalmar Branting, who was a Swedish citizen.

He was Prime Minister of Sweden twice. I have the problem that if I want to check if ?point_in_time is between "start time" and "end time", the different Prime Minister of Sweden statements get mixed up.

What can I do to keep them apart?

Full query

I did not try anything because I am clueless.

Edit: There are several other people from this Wikipedia list who don't show up in the output, for reasons I can't find. Barack Obama and Abiy Ahmed are eliminated by the FILTER. For Mikhail Gorbachev and F.W. de Klerk, I don't know why they are missing.

Upvotes: 0

Views: 52

Answers (1)

Marijn

Reputation: 1925

Regarding the Abiy Ahmed Ali issue: the filter does not succeed if ?end_time is not bound by the OPTIONAL clause, because then the <= comparison results in an error.

SPARQL has the bound() function for this, which checks if the argument is bound or not. In this case something like (untested):

FILTER ( year(?point_in_time) >= year(?start_time) && (year(?point_in_time) <= year(?end_time) || ! bound(?end_time) ) )

However, QLever does not implement this, so you will get the error:

Invalid SPARQL query: Built-in function "bound" not yet implemented
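On an endpoint that does implement bound(), the check can be tried in isolation. The following is a minimal sketch, equally untested, that replaces the Wikidata patterns with hypothetical inline rows; the ?case labels and the dates are made up for illustration, and UNDEF stands in for a missing "end time" qualifier.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?case
WHERE {
  # Hypothetical rows: (?case ?start_time ?end_time ?point_in_time).
  # UNDEF leaves ?end_time unbound, like an OPTIONAL that did not match.
  VALUES (?case ?start_time ?end_time ?point_in_time) {
    ("term ended before the award" "1990-01-01T00:00:00Z"^^xsd:dateTime "1995-01-01T00:00:00Z"^^xsd:dateTime "2000-01-01T00:00:00Z"^^xsd:dateTime)
    ("award during the term"       "1990-01-01T00:00:00Z"^^xsd:dateTime "2005-01-01T00:00:00Z"^^xsd:dateTime "2000-01-01T00:00:00Z"^^xsd:dateTime)
    ("still in office"             "1990-01-01T00:00:00Z"^^xsd:dateTime UNDEF                                "2000-01-01T00:00:00Z"^^xsd:dateTime)
  }
  # Same shape as the filter above: the end-time test may raise an error,
  # but "error || true" still evaluates to true in SPARQL.
  FILTER ( year(?point_in_time) >= year(?start_time) && (year(?point_in_time) <= year(?end_time) || ! bound(?end_time) ) )
}
```

Under standard SPARQL 1.1 semantics this should keep "award during the term" and "still in office" and drop "term ended before the award".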

An alternative is to use COALESCE. This function takes any number of arguments and returns the value of the first one that evaluates without an error. Put the end-date condition first: if there is an end date, it yields true or false as required. If there is no end date, the comparison raises an error and COALESCE moves on to the second argument, which you choose so that it always evaluates to true (somebody who is still in office qualifies for the list if the start-date condition is met).
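To see what the COALESCE expression evaluates to, it can be bound to a variable instead of being used as a filter. This is a minimal sketch on hypothetical inline rows (the ?case labels, the dates, and the ?end_ok variable are made up for illustration), written for a standard SPARQL 1.1 endpoint; whether a given engine accepts UNDEF in VALUES is an assumption to check.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?case ?end_ok
WHERE {
  # Hypothetical rows: (?case ?end_time ?point_in_time).
  # UNDEF leaves ?end_time unbound, like an OPTIONAL that did not match.
  VALUES (?case ?end_time ?point_in_time) {
    ("term ended before the award" "1995-01-01T00:00:00Z"^^xsd:dateTime "2000-01-01T00:00:00Z"^^xsd:dateTime)
    ("award during the term"       "2005-01-01T00:00:00Z"^^xsd:dateTime "2000-01-01T00:00:00Z"^^xsd:dateTime)
    ("still in office"             UNDEF                                "2000-01-01T00:00:00Z"^^xsd:dateTime)
  }
  # First argument: the real end-date test (an error when ?end_time is unbound).
  # Second argument: the fallback that is used only when the first one errors.
  BIND ( COALESCE ( year(?point_in_time) <= year(?end_time), true ) AS ?end_ok )
}
```

With these rows, ?end_ok should come out as false, true, and true respectively: only the last row reaches the fallback argument.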

Code:

PREFIX p: <http://www.wikidata.org/prop/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT DISTINCT ?nobel_laureate_name ?nobel_prize_name ?country_name ?status_name ?year
WHERE
  { { ?country wdt:P31/(wdt:P279)* wd:Q6256 . } UNION { ?country wdt:P31/(wdt:P279)* wd:Q7275 . }
    { ?country wdt:P1906 ?status . } UNION { ?country wdt:P1313 ?status . }
    { ?nobel_laureate wdt:P27 ?country . }
    UNION
    { ?nobel_laureate wdt:P27/p:P31 ?p31 .
      ?p31 ps:P31 wd:Q11514315 .
      ?p31 pq:P642 ?country .
    }
    # "Position held" statement with its "start time" and optional "end time" qualifiers
    ?nobel_laureate p:P39 ?p39 .
    ?p39 ps:P39 ?status ;
         pq:P580 ?start_time .
    OPTIONAL { ?p39 pq:P582 ?end_time . }
    # Award statement for a prize that is part of the Nobel Prize (Q7191), with its "point in time" qualifier
    ?nobel_laureate p:P166 ?p166 .
    ?p166 ps:P166 ?nobel_prize .
    ?nobel_prize wdt:P361 wd:Q7191 .
    ?p166 pq:P585 ?point_in_time .
    # If ?end_time is unbound, the <= comparison errors and COALESCE falls back to 1 (true)
    FILTER ( year(?point_in_time) >= year(?start_time) && COALESCE ( year(?point_in_time) <= year(?end_time), 1 ) )
    BIND(year(?point_in_time) AS ?year)
    OPTIONAL { ?nobel_laureate wdt:P18 ?image }
    OPTIONAL { ?country wdt:P41 ?flag_image }
    ?nobel_laureate @en@rdfs:label ?nobel_laureate_name .
    ?nobel_prize @en@rdfs:label ?nobel_prize_name .
    ?status @en@rdfs:label ?status_name .
    ?country @en@rdfs:label ?country_name .
  }
ORDER BY ASC(?year)

Result (pictures removed to save space):

[Image: screenshot of the query result]

Upvotes: 1
