Dev-otchka

Reputation: 337

SPARQL query joining DBpedia and another graph returns fewer results than expected

I'm a beginner in SPARQL and I'm working with this endpoint: http://spcdata.digitpa.gov.it:8899/sparql. I'd like to join its data with data from the DBpedia graph. I'm using the property owl:sameAs to reference DBpedia resources.

I'd like to fetch the name and population values of all cities falling in the class pa:Comune for which a dbp:populationTotal value is defined. Here is my query:

PREFIX pa:  <http://spcdata.digitpa.gov.it/> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?label ?populationTotal WHERE {
  ?s a pa:Comune .
  ?s rdfs:label ?label .
  ?s owl:sameAs ?sameAs .
  ?sameAs dbp:populationTotal ?populationTotal .
}
ORDER BY ?label

Unfortunately, although the results are correct, I get only a very small subset of them. I've checked, and many more municipalities have a reference on DBpedia with a value for the property dbp:populationTotal. I've tried all the different sponge values, but the results are still the same. I guess the problem might be that I'm fetching data from another graph, but I don't know what to do.


EDIT: I've tried this query, following Ian Dickinson's suggestion, and it works!

PREFIX pa:  <http://spcdata.digitpa.gov.it/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?label ?sameAs ?populationTotal WHERE {
  ?s a pa:Comune .
  ?s rdfs:label ?label .
  ?s owl:sameAs ?sameAs .
  FILTER (REGEX(STR(?sameAs), "dbpedia", "i"))
  SERVICE <http://dbpedia.org/sparql> {
    ?sameAs dbp:populationTotal ?populationTotal .
  }
} LIMIT 1700

Unfortunately, there are 8000+ municipalities in Italy, so I had to cap the results (hence the LIMIT 1700, which is the highest number of hits I can get without a timeout).
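
One possible way to work through all the municipalities in batches is to page over the local triples in a subquery and send only one page at a time to DBpedia. This is a minimal sketch, assuming the endpoint accepts SPARQL 1.1 subqueries inside a federated query. The page size of 1000 and the ORDER BY key are arbitrary choices, and whether this actually avoids the timeout depends on the endpoint.

```sparql
PREFIX pa:   <http://spcdata.digitpa.gov.it/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp:  <http://dbpedia.org/ontology/>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
SELECT ?label ?sameAs ?populationTotal WHERE {
  {
    # Page over the local data only; a stable ORDER BY keeps pages consistent.
    SELECT DISTINCT ?label ?sameAs WHERE {
      ?s a pa:Comune ;
         rdfs:label ?label ;
         owl:sameAs ?sameAs .
      FILTER (REGEX(STR(?sameAs), "dbpedia", "i"))
    }
    ORDER BY ?sameAs
    LIMIT 1000   # assumed page size
    OFFSET 0     # increase by the page size for each following batch
  }
  # Only the resources in the current page are looked up on DBpedia.
  SERVICE <http://dbpedia.org/sparql> {
    ?sameAs dbp:populationTotal ?populationTotal .
  }
}
```

Running it repeatedly with OFFSET 0, 1000, 2000 and so on, until a page comes back empty, would cover the whole set.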

Upvotes: 0

Views: 543

Answers (1)

Ian Dickinson

Reputation: 13295

It's not clear to me what data source your Virtuoso endpoint is connected to, but there are not many places with a population total in your dataset. The following query returns only 28 results:

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT * WHERE {
  ?sa dbo:populationTotal ?total
}

As you observe, the same query run against the DBpedia SPARQL endpoint returns many more results. I can only surmise that only a subset of the DBpedia data has been loaded into the Virtuoso graph that you have put up at http://spcdata.digitpa.gov.it:8899/sparql.
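
To compare the two endpoints without pulling back every row, a count can be run on each. This is only a sketch; the variable name ?places is an arbitrary choice, and a public endpoint may still apply its own limits or timeouts to it.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
# Count the distinct subjects that carry a dbo:populationTotal value.
SELECT (COUNT(DISTINCT ?sa) AS ?places) WHERE {
  ?sa dbo:populationTotal ?total
}
```

A large gap between the two counts would support the idea that the local graph holds only part of the DBpedia data.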

Upvotes: 2
