user2360545
user2360545

Reputation: 131

Get information from Wikidata

I have this Wikidata query that returns all the football stadiums with the names, coordinates, club labels and stuff like this. But I cannot figure out how to also get the country and city names where stadiums are located (and possibly the coordinates of the cities too).

Here is my query:

SELECT ?club ?clubLabel ?venue ?venueLabel ?coordinates
WHERE
{
    ?club wdt:P31 wd:Q476028 .
    ?club wdt:P115 ?venue .
    ?venue wdt:P625 ?coordinates .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

Link to test the query

EDIT 19th november 2020:

I need the timezone of the cities so I tried this query after looking at the documentation but it does not return the value. Just links like "wd:Q6723" :

SELECT DISTINCT ?timezone ?club ?locationLabel ?countryLabel ?clubLabel ?venue ?venueLabel ?coordinates 
WHERE
{
    ?venue (wdt:P421|wd:Q12143) ?timezone .
    ?club wdt:P31 wd:Q476028 .
    ?club wdt:P115 ?venue .
    ?venue wdt:P625 ?coordinates .
    OPTIONAL {?club wdt:P159|(wdt:P115/(wdt:P131|wdt:P276)) ?location . 
              OPTIONAL { ?location wdt:P17 ?country . }
              }     
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} limit 500

Upvotes: 0

Views: 387

Answers (1)

Karl Amort
Karl Amort

Reputation: 16394

Split over two now. Original query:

SELECT DISTINCT ?club ?locationLabel ?countryLabel ?clubLabel ?venue 
?venueLabel ?coordinates 
WHERE {
    ?club wdt:P31 wd:Q476028 .
    ?club wdt:P115 ?venue .
    ?venue wdt:P625 ?coordinates .
    OPTIONAL {
        ?club wdt:P159|(wdt:P115/(wdt:P131|wdt:P276)) ?location . 
        OPTIONAL { ?location wdt:P17 ?country . }
    }     
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} limit 500

(Update #2: Previously, I asked for the club's timezone. But, of course, that's not the sort of data usually recorded for a club. Instead, you have to go via the location/venue/headquarters or similar, and possibly a level up to region/country because some suburb also doesn't have timezone data.

This is the general idea how it should work, but it's running into a timeout, and so am I:

SELECT DISTINCT ?timezone ?timezoneLabel ?offset 
?club ?clubLabel 
WHERE {
       ?club wdt:P31 wd:Q476028 .
       
       # via country. not perfect, because some have multiple timezones, but shoud be faster  
       ?club wdt:P17/wdt:P421 ?timezone . 
       
        # what I really want to do; all sorts of alternatives 
       #?club wdt:P115?/(wdt:P159|wdt:P276)/wdt:P131?/wdt:P421 ?timezone . 

    ?timezone wdt:P2907 ?offset.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} limit 500

Short explanation:

This uses three new things. OPTIONAL makes the following statement, well, optional. Clubs where nothing can be found will still be included in the output. The second OPTIONAL ist nested in the first, as it's pointless to ask for the country of a location that we haven't found.

The pipe symbol (|) allows for alternatives. Here, I'm asking for "headquarter location (P159) or check for two different ways to specify the location of the stadium. The slash, used in the latter case, denotes a path (club / venue / "located in district|location").

If there is missing data (there will be missing data), you may want to look at examples and figure out if there are other common patterns that locations are recorded. You could, for example, move the inner OPTIONAL outside for cases where the club has a country statement but no other, more specific, location.

Update: I've included the timezone as requested in the comment. To note:

  • ?timezoneLabel gets the timezone's label (= name), just as ?clubLabel gets the club's. The apppended "...Label" is a "magic" function that translates from IDs to huma-readable labels. It is enabled by including that SERVICE wikibase:label... line.

  • As you might want to use these timezones, I've included the marked line that gets the numeric offset in hours.

  • The offset may vary because UTC doesn't have dalight savings time. There should be multiple lines in the results for such cases, and you would need to read the ''qualifiers'' to see when they apply. Alternatively, maybe substract the offset from some other timezone's offset (i. e. yours) and you might get lucky and they cancel out.

Upvotes: 1

Related Questions