Jonas Sourlier

Reputation: 14435

What order does Wikidata's SPARQL endpoint use when there is no `ORDER BY` clause?

The following query loads twenty cities from Wikidata, together with the country or state of which they are the capital:

SELECT ?item ?capitalOf WHERE {
 ?item wdt:P31/wdt:P279* wd:Q515.
 OPTIONAL { ?item wdt:P1376 ?capitalOf. }
}
LIMIT 20 OFFSET 0

The results are Q60 (New York), Q62 (San Francisco), Q64 (Berlin), Q84 (London) etc.

Now set the OFFSET parameter to 1, and you get the same list starting at Q62. The first entry (index 0) is omitted, as expected. With OFFSET set to 2, you get the same list starting at index 2.

However, sometimes you get a completely different result. For example, I just got the list Q2807, Q2861, Q2900 ... when using OFFSET 70, but this list didn't overlap the list from OFFSET 60 at all, even though with LIMIT 20 the two pages should share ten entries. It seems that there is some randomness in how the LIMIT and OFFSET clauses are applied.

What is the default sort order of SPARQL results?

The reason I'm asking: We are using some queries that need to be loaded in pages with LIMIT and OFFSET, because the number of results is so big. However, these queries run into timeouts when using an ORDER BY.

Upvotes: 2

Views: 226

Answers (1)

TallTed

Reputation: 9434

Just as with SQL, there is no default sort order of SPARQL results.

The SPARQL processor is allowed to return solutions in any order, which may (but typically will not) vary with each execution of the query.

The only way to be certain of the order of solutions is to include an ORDER BY clause.
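
As a rough illustration of that point, here is a minimal Python sketch that builds one page of the question's query with an explicit `ORDER BY`, so that consecutive `OFFSET` values slice the same ordering. The names `build_page_query` and `page_size` are made up for this example, and sorting on `?item ?capitalOf` is just one possible choice of key. It only builds the query text; it does not contact any endpoint, and it does not avoid the sorting cost the question mentions.

```python
# Sketch only: builds the query string for one page of results.
# `build_page_query` and `page_size` are hypothetical names for illustration.

QUERY_TEMPLATE = """\
SELECT ?item ?capitalOf WHERE {{
  ?item wdt:P31/wdt:P279* wd:Q515.
  OPTIONAL {{ ?item wdt:P1376 ?capitalOf. }}
}}
ORDER BY ?item ?capitalOf
LIMIT {limit} OFFSET {offset}
"""


def build_page_query(page: int, page_size: int = 20) -> str:
    """Return the SPARQL text for the zero-based page number `page`."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    # Every page uses the same ORDER BY, so the pages are slices of one ordering.
    return QUERY_TEMPLATE.format(limit=page_size, offset=page * page_size)


if __name__ == "__main__":
    print(build_page_query(3))
```

Because every page carries the same sort key, page 3 and page 4 cannot overlap or skip rows as long as the underlying data does not change between requests.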

When running expensive (e.g., long-running) queries, the best way to address this is to set up your own local repository/processor/endpoint, instead of using a public endpoint, such as that at wikidata.org.

Upvotes: 2
