Correct way to group several queries in SPARQL

Question

I have to retrieve quite a lot of data from a remote endpoint using SPARQL. The problem is that this is terribly slow. I'd like to group my requests to reduce the impact of network latency on overall performance.

My queries are very simple:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT * WHERE
{
   skos:prefLabel ?prefLabel
}

But I am not sure how to group them properly. For example, I guess that:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT * WHERE
{
  ?id skos:prefLabel ?prefLabel .
  FILTER(?id IN ('my_id1', 'my_id2', 'my_id3'))
}

is a terrible idea, since it would make the endpoint scan all the instances before filtering them.

Any hint on how to implement that request grouping will be greatly appreciated.

RobV · Accepted Answer

Assuming your endpoint supports SPARQL 1.1, you can use the VALUES clause like so:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT * WHERE
{
  VALUES ( ?id )
  {
    (  )
    (  )
    (  )
    # etc.
  }
  ?id skos:prefLabel ?prefLabel
}

Assuming the SPARQL engine behind your endpoint uses hash joins rather than nested-loop joins to evaluate joins on shared variables (I'd be very surprised if any up-to-date implementation did not), this should be significantly more performant than the equivalent FILTER (?id IN ( , , ) ) form.

NB - A good optimizer may translate the FILTER (?id IN ( )) form to something like the above so YMMV depending on the SPARQL engine behind your endpoint.
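
If the list of identifiers is long, the VALUES approach above can be combined with batching on the client side, so that each request carries many IDs but no single query becomes huge. The following is only a rough sketch of that idea: the helper names (chunked, build_values_query), the example.org IRIs, and the batch size are all made up for illustration, and sending the queries to the endpoint is left out since that depends on your client library.

```python
from typing import Iterable, Iterator, List

# Standard SKOS namespace; everything else below is a hypothetical helper.
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Characters that must not appear inside an IRI reference written as <...>.
_FORBIDDEN = set('<>"{}|^`\\')


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_values_query(iris: Iterable[str]) -> str:
    """Build one SELECT query that binds ?id to every IRI via VALUES."""
    rows = []
    for iri in iris:
        # Reject anything that could break out of the <...> delimiters.
        if any(ch in _FORBIDDEN or ch.isspace() for ch in iri):
            raise ValueError(f"not a safe IRI: {iri!r}")
        rows.append(f"    ( <{iri}> )")
    return (
        f"PREFIX skos: <{SKOS}>\n"
        "SELECT * WHERE\n"
        "{\n"
        "  VALUES ( ?id )\n"
        "  {\n"
        + "\n".join(rows) + "\n"
        "  }\n"
        "  ?id skos:prefLabel ?prefLabel\n"
        "}\n"
    )


# Hypothetical IDs; in practice these would be your own concept IRIs.
ids = [f"http://example.org/concept/{n}" for n in range(1, 6)]

# Batch size of 2 only to keep the demo small; tune it for your endpoint.
queries = [build_values_query(batch) for batch in chunked(ids, 2)]

print(queries[0])
```

Each element of queries can then be sent as a single request, so five lookups become three round trips here. The right batch size is something to measure against your endpoint, since some servers limit query length.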
