Susmita Sadhu

Reputation: 67

SPARQL query to get all Person resources in DBpedia returns only some of them, not all

I am writing a SPARQL query to get every Person available in DBpedia. My query is:

PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX  dbo: <http://dbpedia.org/ontology/>
PREFIX  dbp: <http://dbpedia.org/property/>

SELECT ?resource ?name
WHERE {
    ?resource  rdf:type  dbo:Person;
               dbp:name ?name.  
    FILTER (lang(?name) = 'en')
  }
ORDER BY ASC(?name)

It returns around 10000 rows when I request the output in HTML, CSV, or spreadsheet format. But when I run a query to get the total count:

PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX  dbo: <http://dbpedia.org/ontology/>
PREFIX  dbp: <http://dbpedia.org/property/>

SELECT (COUNT(*) AS ?count)
WHERE{
    ?resource  rdf:type  dbo:Person;
               dbp:name ?name.  
    FILTER (lang(?name) = 'en')
 }

It returns 1783404.

Can anyone suggest a way to get all the Person rows available in DBpedia?

Upvotes: 2

Views: 2025

Answers (1)

scotthenninger

Reputation: 4001

DBpedia protects its servers from large queries by capping each result set at 10000 rows. Since you are ordering the results, you can use LIMIT and OFFSET to fetch them in sets of 10000. For example, to get the second set of 10000 results, use this:

PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX  dbo: <http://dbpedia.org/ontology/>
PREFIX  dbp: <http://dbpedia.org/property/>

SELECT ?resource ?name
WHERE {
  ?resource  rdf:type  dbo:Person;
             dbp:name ?name.  
  FILTER (lang(?name) = 'en')
}
ORDER BY ASC(?name)
LIMIT 10000 OFFSET 10000

Actually, since DBpedia already limits the results to 10000 matches, the LIMIT isn't strictly necessary; the OFFSET is what moves you to the next set.
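If you want to walk through every page rather than write each query by hand, the offsets can be generated in a loop. Here is a minimal Python sketch that only builds the query strings and sends nothing to the endpoint. The names `PAGE_SIZE`, `QUERY_TEMPLATE`, and `paged_queries` are made up for illustration, and it assumes the cap stays at 10000 rows and the total is the 1783404 reported in the question.

```python
import math

# Assumption: the endpoint returns at most 10000 rows per query.
PAGE_SIZE = 10000

# Same query as above; doubled braces are literal braces for str.format.
QUERY_TEMPLATE = """\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>

SELECT ?resource ?name
WHERE {{
  ?resource rdf:type dbo:Person;
            dbp:name ?name.
  FILTER (lang(?name) = 'en')
}}
ORDER BY ASC(?name)
LIMIT {limit} OFFSET {offset}
"""


def paged_queries(total_rows, page_size=PAGE_SIZE):
    """Yield one query per page, with offsets 0, page_size, 2 * page_size, ..."""
    pages = math.ceil(total_rows / page_size)
    for page in range(pages):
        yield QUERY_TEMPLATE.format(limit=page_size, offset=page * page_size)


# Total taken from the COUNT query in the question.
queries = list(paged_queries(1783404))
```

Each string in `queries` can then be submitted with whatever SPARQL client you already use. Keep the ORDER BY in every page, since without a stable ordering the pages are not guaranteed to line up.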

Upvotes: 5
