sap

Reputation: 1228

SPARQL BIND inside UNION is too slow

Given an IMDb ID, I want to get a list of directors and actors for that movie from Wikidata.

The problem is that I want to UNION the director and actor queries into a single column, while also adding a new column that holds the role (director or actor).

Pretty easy query overall: first I get the movie entity from the IMDb ID, then I get all the directors from that movie followed by getting all the actors from that movie and UNION them together while filling a new column (?role) with the role.

This is what I have:

PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?person ?personLabel ?role ?imdb WHERE
{
  ?movie wdt:P345 "tt0110912" .
  { ?movie p:P57 ?cast .
    ?cast ps:P57 ?person .
    BIND("director" as ?role) .
  } UNION {
    ?movie p:P161 ?cast .
    ?cast ps:P161 ?person .
    BIND("actor" as ?role) .
  }
  
  ?person wdt:P345 ?imdb .
  OPTIONAL { ?cast prov:wasDerivedFrom ?ref . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?person ?personLabel ?role ?imdb
ORDER BY DESC(?role)
LIMIT 100

This works and gives the result I want, but it takes about 10 seconds. If I remove the BINDs, it returns instantly, but then I don't get a column with the roles.

Upvotes: 3

Views: 759

Answers (2)

Joshua Taylor

Reputation: 85813

I'd write this using VALUES instead of BIND and UNION. The idea is that you're saying: when the properties are one pair, ?role is one value, and when the properties are another pair, ?role is another value. The easy way to do that with VALUES is something like:

select ?owner ?pet ?petType {
  values (?hasPet ?petType) { 
    (:hasCat "cat")
    (:hasDog "dog")
  }
  ?owner ?hasPet ?pet
}

In your case, this would be:

PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?person ?personLabel ?role ?imdb WHERE
{
  ?movie wdt:P345 "tt0110912" .

  values (?p ?ps ?role) {
    (p:P161 ps:P161 "actor")
    (p:P57 ps:P57 "director")
  }
  ?movie ?p ?cast .
  ?cast ?ps ?person .

  ?person wdt:P345 ?imdb .
  OPTIONAL { ?cast prov:wasDerivedFrom ?ref . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?person ?personLabel ?role ?imdb
ORDER BY DESC(?role)
LIMIT 100

When I run this at query.wikidata.org, it produces 35 results almost instantly.

Upvotes: 2

UninformedUser

Reputation: 8465

I guess that the BIND inside the UNION causes problems for the query optimizer. As an alternative, you can try binding the role outside of the UNION clause, i.e.

PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?person ?personLabel ?role ?imdb WHERE
{
  ?movie wdt:P345 "tt0110912" .
  ?person wdt:P345 ?imdb .
  { 
     ?movie p:P57 ?c1 . ?c1 ps:P57 ?person .
     ?movie p:P57 ?cast .
  } UNION {
     ?movie p:P161 ?c2 . ?c2 ps:P161 ?person .
     ?movie p:P161 ?cast . 
  }
  BIND(IF(bound(?c1), "director", "actor") as ?role)

  OPTIONAL { ?cast prov:wasDerivedFrom ?ref . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?person ?personLabel ?role ?imdb
ORDER BY DESC(?role)
LIMIT 100

(If you do not need the ?ref variable, you can omit the triple patterns that retrieve ?cast in the UNION clauses, along with the OPTIONAL that uses it.)
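For illustration, here is a sketch of what that trimmed-down variant might look like. It assumes ?ref really is not needed anywhere else, and that the wikibase: and bd: prefixes are predefined by the endpoint, as they are at query.wikidata.org. I have not timed this version, so treat it as a starting point rather than a measured improvement.

```sparql
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?person ?personLabel ?role ?imdb WHERE
{
  ?movie wdt:P345 "tt0110912" .
  ?person wdt:P345 ?imdb .
  {
     # Director branch: only this branch binds ?c1.
     ?movie p:P57 ?c1 . ?c1 ps:P57 ?person .
  } UNION {
     # Actor branch: ?c1 stays unbound here.
     ?movie p:P161 ?c2 . ?c2 ps:P161 ?person .
  }
  # Decide the role from which branch produced the row.
  BIND(IF(bound(?c1), "director", "actor") as ?role)

  # Assumption: wikibase: and bd: are predefined by the endpoint.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?person ?personLabel ?role ?imdb
ORDER BY DESC(?role)
LIMIT 100
```

The role is still derived from whether ?c1 is bound, so a person who is both director and actor appears once per role, as in the full query.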

Upvotes: 2
