Reputation: 1532
How can I get a list of all organisations from DBpedia? By "organisation", I mean any entity whose type is either Organisation or one of its subclasses.
I found the question "How to get all companies from DBPedia?", but its query doesn't work on the current DBpedia SPARQL web endpoint, and I wasn't able to adapt it.
Upvotes: 4
Views: 1528
Reputation: 2431
You can get all organisations with a query like this, which also returns the English label and the Wikipedia page for those resources that have them:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX o: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?orgURI ?orgName ?Wikipedia_page
WHERE {
  ?orgURI a o:Organisation .
  # English label, if the resource has one
  OPTIONAL { ?orgURI rdfs:label ?orgName .
             FILTER (lang(?orgName) = "en") }
  # Wikipedia page whose primary topic is this resource, if any
  OPTIONAL { ?orgURI ^foaf:primaryTopic ?Wikipedia_page }
}
ORDER BY ?orgName
This currently returns 350033 results, namely the resources that are classified as http://dbpedia.org/ontology/Organisation.
To also get the members of subclasses of http://dbpedia.org/ontology/Organisation, you can change the first pattern by turning the property into a property path that goes through zero or more rdfs:subClassOf links:
?orgURI a/rdfs:subClassOf* o:Organisation
Upvotes: 3
Reputation: 2277
To simply get all resources that are an instance of dbo:Organisation or of one of its subclasses:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?org { ?org a/rdfs:subClassOf* dbo:Organisation . }
However, as the question you linked shows, DBpedia caps the number of results returned per query. So, as in the answer to that question, you can use a subquery with LIMIT and OFFSET to get all the results in chunks:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?org {
  SELECT DISTINCT ?org {
    ?org a/rdfs:subClassOf* dbo:Organisation .
  } ORDER BY ?org
}
LIMIT 10000 OFFSET 0
This would get you the first 10000 results. To get the next 10000, just add 10000 to the offset: LIMIT 10000 OFFSET 10000. Then get the next 10000 with OFFSET 20000, and so on.
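If you want to automate the chunking, here is a minimal Python sketch of the loop. It only builds the paged queries and stops once a page comes back shorter than the limit. The names `build_page_query`, `fetch_all` and `run_query` are mine, not part of any library: `run_query` is a placeholder for whatever function you use to send a query string to the endpoint and return the `?org` values as a list of strings.

```python
from typing import Callable, Iterator, List

# The paged query from above; doubled braces are literal braces for str.format.
PAGE_QUERY = """\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?org {{
  SELECT DISTINCT ?org {{
    ?org a/rdfs:subClassOf* dbo:Organisation .
  }} ORDER BY ?org
}}
LIMIT {limit} OFFSET {offset}
"""


def build_page_query(offset: int, limit: int = 10000) -> str:
    """Return the paged query text for one chunk."""
    return PAGE_QUERY.format(limit=limit, offset=offset)


def fetch_all(run_query: Callable[[str], List[str]],
              limit: int = 10000) -> Iterator[str]:
    """Yield every ?org value, requesting one chunk of `limit` rows at a time.

    `run_query` is a hypothetical caller-supplied function that executes a
    SPARQL query string and returns the ?org values of the result rows.
    """
    offset = 0
    while True:
        page = run_query(build_page_query(offset, limit))
        yield from page
        if len(page) < limit:  # a short (or empty) page means we are done
            break
        offset += limit
```

Passing the query runner in as a parameter keeps the sketch independent of any particular HTTP or SPARQL client, so you can plug in whichever one you already use.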
Upvotes: 5