Reputation: 31
I'm trying to extract buildings from Wikidata using a recursive SPARQL query but I keep getting query timeouts. Is there a way to circumvent this?
This is my current query, selecting all buildings with either a Freebase ID or a Google Knowledge Graph ID, and a Dutch label:
SELECT DISTINCT ?building ?buildingLabel
WHERE {
  ?building p:P2671|p:P646 ?id;
            p:P31/ps:P31/wdt:P279* wd:Q41176;
            rdfs:label ?buildingLabel .
  FILTER(LANG(?buildingLabel) = 'nl') .
  FILTER (?building != ?buildingLabel) .
}
I've tried manually looking a few layers deep instead but, for some reason, I get no results for three or more layers deep even though those definitely exist. I've tried this using:
SELECT ?building
WHERE {
  ?building p:P31/ps:P31/wdt:P279 [p:P31/ps:P31/wdt:P279 [p:P31/ps:P31/wdt:P279 wd:Q41176]].
}
and using
SELECT ?building
WHERE {
  ?parent2 p:P31/ps:P31/wdt:P279 wd:Q41176.
  ?parent1 p:P31/ps:P31/wdt:P279 ?parent2.
  ?building p:P31/ps:P31/wdt:P279 ?parent1.
}
There are about 2.24 million buildings and about 18 million entities with either a Freebase ID or a Google Knowledge Graph ID on Wikidata. I've looked at this guide but couldn't quite figure out how to apply it to my query. I've also read the answer to this question but, unfortunately, using multiple queries isn't really an option for me.
Upvotes: 3
Views: 152
Reputation: 466
If your intention is to use the "recursive" property path to find things of type building and also things whose type is a subclass of building, your first query using wdt:P279* is right, while the later attempts at repeating the full p:P31/ps:P31/wdt:P279 pattern won't match any data.
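To illustrate why the fixed-depth attempts come back empty: repeating the whole p:P31/ps:P31/wdt:P279 pattern requires every intermediate class to have its own instance-of statement, whereas a fixed-depth version of the first query would take a single instance-of hop (P31) followed only by subclass-of hops (P279). Here is a minimal Python sketch that builds such a query string. The helper names fixed_depth_path and fixed_depth_query are hypothetical, not from the original posts, and the resulting query is only meant to show the shape of the path. It may still be slow on the public endpoint.

```python
# Hypothetical helpers (not part of the original posts) that build a
# fixed-depth variant of the query as a plain string.
BUILDING = "wd:Q41176"  # root class used in the question


def fixed_depth_path(depth: int) -> str:
    """Return a property path: one instance-of hop, then `depth` subclass-of hops."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return "/".join(["wdt:P31"] + ["wdt:P279"] * depth)


def fixed_depth_query(depth: int, root: str = BUILDING) -> str:
    """Return a SELECT query for items whose class sits exactly `depth` levels below `root`."""
    return "SELECT ?building WHERE {\n  ?building %s %s .\n}" % (
        fixed_depth_path(depth),
        root,
    )


# Build the query for classes three subclass levels below "building".
print(fixed_depth_query(3))
```

The wdt:P279* form in the first query covers all of these depths at once, so a fixed-depth query like this is mainly useful for checking that a particular level really contains data.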
By simplifying the first query a bit I was able to get this to run (returning 96,297 results in 39s):
SELECT DISTINCT ?building ?buildingLabel
WHERE {
  ?building p:P2671|p:P646 ?id;
            wdt:P31/wdt:P279* wd:Q41176 .
  ?building rdfs:label ?buildingLabel .
  FILTER(LANGMATCHES(LANG(?buildingLabel), "nl"))
}
Two notable changes:
- p:P31/ps:P31 is replaced by wdt:P31, removing one join from the query.
- The FILTER on ?building != ?buildingLabel is unnecessary, as ?building (a URI) and ?buildingLabel (a string) are necessarily going to be unequal.
Upvotes: 1