Dominik

Reputation: 2813

extract city data from dbpedia or LinkedGeoData

I have been trying for a couple of hours to figure out how to get various pieces of information out of dbpedia or LinkedGeoData. I used this interface (http://dbpedia.org/snorql) and tried a few different approaches, but I never got the result that I need.

If I use something like this:

SELECT * WHERE {
    ?subject rdf:type <http://dbpedia.org/ontology/City>.
    OPTIONAL {
        ?subject <http://dbpedia.org/ontology/populationTotal> ?populationTotal.
    }
    OPTIONAL {
        ?subject <http://dbpedia.org/ontology/populationUrban> ?populationUrban.
    }
    OPTIONAL {
        ?subject <http://dbpedia.org/ontology/areaTotal> ?areaTotal.
    }
    OPTIONAL {
        ?subject <http://dbpedia.org/ontology/populationUrbanDensity> ?populationUrbanDensity.
    }
    OPTIONAL {
        ?subject <http://dbpedia.org/ontology/isPartOf> ?isPartOf.
    }
    OPTIONAL {
        ?subject <http://dbpedia.org/ontology/country> ?country.
    }
    OPTIONAL {
        ?subject <http://dbpedia.org/ontology/utcOffset> ?utcOffset.
    }
    OPTIONAL {
        ?subject <http://dbpedia.org/property/janHighC> ?janHighC.
    }
    OPTIONAL {
        ?subject <http://dbpedia.org/property/janLowC> ?janLowC.
    }
}
LIMIT 20

I run out of limits.

I also tried this:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT *  WHERE {
  ?subject rdf:type <http://dbpedia.org/ontology/City>.
  ?subject rdfs:label ?label.
FILTER ( lang(?label) = 'en'
}
LIMIT 100

But that gives me an error, which I don't understand. If I remove the FILTER, it works but gives me the labels in all languages...

What I'm looking for is something like http://dbpedia.org/page/Vancouver, but not all of the data, only some of it such as population, area, country, elevation, lat, long, timezone, label@en, abstract@en etc.

Can someone help me get a working syntax?

Thanks for all your help.


UPDATE:

I got it to work so far with:

SELECT DISTINCT *
WHERE {
   ?city rdf:type dbpedia-owl:Settlement ;
         rdfs:label ?label;
         dbpedia-owl:abstract ?abstract ;
         dbpedia-owl:populationTotal ?pop ;
         dbpedia-owl:country ?country ;
         dbpprop:website ?website .
   FILTER ( lang(?abstract) = 'en' && lang(?label) = 'en')
}
LIMIT 20

But I'm still running out of limits if I want to get all settlements. By the way, is there a way to get all cities and settlements in one table?

Upvotes: 1

Views: 1767

Answers (1)

mgs

Reputation: 86

By "run out of limits", do you mean the error "Bandwidth Limit Exceeded URI = '/!sparql/'"? I guess this is a limit set by dbpedia to make sure that it is not flooded with queries that take "forever" to run, and if so, there is probably not much you can do about it. You can, however, ask for the data in chunks, using OFFSET, LIMIT and ORDER BY, see http://www.w3.org/TR/rdf-sparql-query/#modOffset.

UPDATE: Yes, this seems to be the way to go: http://www.mail-archive.com/[email protected]/msg03368.html
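
As a rough sketch of what asking for data in chunks could look like: the query below fetches the first page of settlements with their English labels. The page size of 1000 is only an assumption (I don't know what the endpoint actually allows), and the prefixes are spelled out in case the interface does not predefine them.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

# First page: rows 0-999. The page size (1000) is an assumed value.
SELECT ?city ?label
WHERE {
  ?city rdf:type dbpedia-owl:Settlement ;
        rdfs:label ?label .
  FILTER ( lang(?label) = 'en' )
}
# ORDER BY gives the rows a predictable order, so that consecutive
# pages are slices of the same sequence.
ORDER BY ?city ?label
LIMIT 1000
OFFSET 0
```

For the next chunk, keep the query identical and only raise OFFSET by the page size (1000, 2000, ...) until a request returns fewer rows than LIMIT.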

In the second query the error is a missing parenthesis. This

FILTER ( lang(?label) = 'en'

should be

FILTER ( lang(?label) = 'en')

For your last question, a natural way to collect the results of several similar queries in one query/table is to use UNION, e.g.,

SELECT ?x
WHERE {
  { ?x rdf:type dbpedia-owl:City }
UNION
  { ?x rdf:type dbpedia-owl:Settlement }
}
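
The UNION can probably be combined with the other patterns from your question. The following is only a sketch, assuming the properties you already used (dbpedia-owl:populationTotal, dbpedia-owl:country) are the ones you want; the variable name ?place is just my choice.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

# DISTINCT drops the duplicate rows for resources typed as both
# City and Settlement.
SELECT DISTINCT ?place ?label ?pop ?country
WHERE {
  { ?place rdf:type dbpedia-owl:City }
  UNION
  { ?place rdf:type dbpedia-owl:Settlement }
  # The label is required; population and country are optional, so
  # places that lack them still show up (with empty cells).
  ?place rdfs:label ?label .
  OPTIONAL { ?place dbpedia-owl:populationTotal ?pop }
  OPTIONAL { ?place dbpedia-owl:country ?country }
  FILTER ( lang(?label) = 'en' )
}
LIMIT 20
```

Using OPTIONAL here instead of plain triple patterns means a place is not dropped from the table just because one of the values is missing.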

Upvotes: 1
