Philippe G.

Reputation: 155

What is the difference between UNION and EXISTS filters in a SPARQL query?

I'm experimenting with the DBpedia SPARQL endpoint and I've noticed a difference between two similar queries, one using a UNION and the other an EXISTS filter.

SELECT (COUNT(?w1) as ?nbWriter) WHERE {
    ?w1 a dbo:Writer ;
        dbo:spouse ?w2 .
    FILTER ( EXISTS {?w2 a dbo:Writer} || EXISTS {?w2 a yago:AmericanNovelists.} )
}

produces the result nbWriter=371,

while the query

SELECT (COUNT(?w1) as ?nbWriter) WHERE {
    ?w1 a dbo:Writer ;
        dbo:spouse ?w2 .
    {?w2 a dbo:Writer.} 
    UNION
    {?w2 a yago:AmericanNovelists.} 
}

produces the result nbWriter=414.

Why is there a difference between these two queries? Are they not equivalent (see the previous question and answer "Proper way to add OR clause to SPARQL query")?

Upvotes: 5

Views: 952

Answers (1)

svick

Reputation: 244988

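The second query does not count distinct writers. UNION produces one solution for each branch that matches, so a spouse who is both a writer and an American novelist contributes two rows. For example, it counts Robert Lowell four times because: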

  1. his spouse Lady Caroline Blackwood was a writer
  2. his spouse Jean Stafford was a writer
  3. his spouse Jean Stafford was an American novelist
  4. his spouse Elizabeth Hardwick was an American novelist

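But the first query is also incorrect. The FILTER only keeps or drops each (?w1, ?w2) solution, so a writer is still counted once per matching spouse. It counts Robert Lowell three times because: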

  1. his spouse Lady Caroline Blackwood was a writer
  2. his spouse Jean Stafford was a writer and an American novelist
  3. his spouse Elizabeth Hardwick was an American novelist

Using DISTINCT inside the COUNT of either query will give you the right answer (364):

SELECT (COUNT(DISTINCT ?w1) as ?nbWriter)
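
For reference, here is a sketch of what the complete first query might look like with that change applied. The PREFIX lines are an assumption: the public DBpedia endpoint predefines dbo: and yago:, and the IRIs shown are the ones it commonly maps them to, so check them if you run this elsewhere.

```sparql
# Assumed prefix IRIs; the public DBpedia endpoint predefines these.
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX yago: <http://dbpedia.org/class/yago/>

# Count each writer once, no matter how many matching spouses they have.
SELECT (COUNT(DISTINCT ?w1) AS ?nbWriter) WHERE {
    ?w1 a dbo:Writer ;
        dbo:spouse ?w2 .
    FILTER ( EXISTS { ?w2 a dbo:Writer } || EXISTS { ?w2 a yago:AmericanNovelists } )
}
```

The same COUNT(DISTINCT ?w1) works unchanged with the UNION form of the pattern, because DISTINCT collapses the duplicate rows before counting.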

In general, to find the cause of an error in queries like these, list all the results instead of just counting them.
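
As a rough sketch of that approach, the UNION query can be changed to return the rows themselves rather than a count. The PREFIX IRIs are again an assumption (the public DBpedia endpoint predefines them), and the ORDER BY is only there to put repeated rows next to each other.

```sparql
# Assumed prefix IRIs; the public DBpedia endpoint predefines these.
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX yago: <http://dbpedia.org/class/yago/>

# List every (writer, spouse) solution instead of counting them.
# A pair that appears twice matched both branches of the UNION.
SELECT ?w1 ?w2 WHERE {
    ?w1 a dbo:Writer ;
        dbo:spouse ?w2 .
    { ?w2 a dbo:Writer }
    UNION
    { ?w2 a yago:AmericanNovelists }
}
ORDER BY ?w1 ?w2
```

Writers that show up on more than one row are the ones inflating the count, which is how a case like Robert Lowell becomes visible.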

Upvotes: 5
