user3436624

Reputation: 764

sparql / dbpedia regarding extracting rdf:type person

I'd like to extract all the DBpedia entries of rdf:type Person using DBpedia and SPARQL, both of which I barely understand.

I was mostly successful with the following query (varying the offset between runs). Is there a better way? Basically, I'd like to get every person from the English Wikipedia.

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX : <http://dbpedia.org/resource/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?name ?birth ?description ?person WHERE {
     ?person dbo:birthDate ?birth .
     ?person foaf:name ?name .
     ?person rdfs:comment ?description .
     FILTER (LANG(?description) = 'en') .
}
ORDER BY ?name
OFFSET 100

Upvotes: 1

Views: 1893

Answers (1)

Joshua Taylor

Reputation: 85813

You're going about it in roughly the right way, though you should use both OFFSET and LIMIT so that you can paginate the results (and of course, for OFFSET and LIMIT to be useful, you need to keep the ORDER BY, since without a fixed order the pages aren't guaranteed to be consistent). You're declaring more prefixes than you need, though: the query only uses three (dbo:, foaf:, and rdfs:), so you only need to declare those three. Finally, you can ask specifically for things of type dbo:Person. There are 1649645 of them:

select (count(*) as ?n) where {
 ?person a dbo:Person 
}

1649645
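
As a sketch of the prefix and pagination points, here is the query from the question with only the three prefixes it actually uses and a LIMIT added. The body is otherwise unchanged, and the page size of 50 is just an assumption for illustration:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>

# Same body as the query in the question; only the unused prefix
# declarations are gone and a LIMIT has been added.
SELECT ?name ?birth ?description ?person WHERE {
  ?person dbo:birthDate ?birth .
  ?person foaf:name ?name .
  ?person rdfs:comment ?description .
  FILTER (LANG(?description) = 'en')
}
ORDER BY ?name
OFFSET 100
LIMIT 50   # arbitrary page size, chosen only for illustration
```

To walk through the results, keep LIMIT fixed and increase OFFSET by the page size on each request (0, 50, 100, and so on).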

You should also check the language of strings with langMatches, not =, because = only matches the exact tag 'en' and misses regional tags such as 'en-GB'. The public web service that you can query interactively predefines some prefixes, so I usually rely on those and omit the declarations. You might also want to select only English names, and it's probably better to order by the URI, since the names aren't always perfect:

select ?person ?name ?birth ?description where {
  ?person a dbo:Person ;
          foaf:name ?name ;
          dbo:birthDate ?birth ;
          dbo:abstract ?description
  filter langMatches(lang(?name),'en')
  filter langMatches(lang(?description),'en')
}
order by ?person
offset 100
limit 50

SPARQL results
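
To see the difference between = and langMatches in isolation, a small self-contained query can help. This is only a sketch: the three literals are made up for illustration and no DBpedia data is involved, so it should run on any SPARQL 1.1 endpoint:

```sparql
# Compare an exact tag comparison with a language-range match.
# The literals below are hypothetical sample values, not DBpedia data.
select ?label
       (lang(?label) = 'en' as ?exactMatch)
       (langMatches(lang(?label), 'en') as ?rangeMatch)
where {
  values ?label { "colour"@en-GB "color"@en "Farbe"@de }
}
```

The en-GB literal should fail the = comparison but pass langMatches, the plain en literal should pass both, and the de literal should fail both. That is why a filter written with = can silently drop regionally tagged English strings.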

Of course, if you want lots of data, you might want to just download it and store it locally. See DBpedia 2014 Downloads.

Upvotes: 2
