Amir Pournasserian

Reputation: 1674

Extract all types and their labels in English from DBPedia

I'm trying to get all types from DBpedia using this SPARQL query:

select ?type {
   ?type a owl:Class .
}

Now, I want to also include the English label of each type returned by the query. What do I need to add to my query?

Upvotes: 2

Views: 5722

Answers (2)

Joshua Taylor

Reputation: 85883

This is a good opportunity to learn a bit more about how to retrieve arbitrary information from DBpedia. Your first query (with a limit added) is:

select ?type {
   ?type a owl:Class .
}
limit 10

One of the results is http://dbpedia.org/ontology/Animal, which you can actually visit in a web browser, and the corresponding page will display all of that resource's properties. For Animal, there aren't all that many, but the ones of interest to us are

rdfs:label  Tier
rdfs:label  animal
rdfs:label  animal
rdfs:label  žival
rdfs:label  동물

The property that we're interested in here is rdfs:label, so we can extend the query to

select ?type ?label {
   ?type a owl:Class .
   ?type rdfs:label ?label .
}
limit 10

which we can actually abbreviate a little bit, using the semicolon:

select ?type ?label {
   ?type a owl:Class ;
         rdfs:label ?label .
}
limit 10

That query, though, will return multiple results for each ?type (in fact, one per ?label), so we get results including:

http://dbpedia.org/ontology/Animal  "Tier"@de
http://dbpedia.org/ontology/Animal  "animal"@en

Notice that the labels aren't simply strings, but are RDF literals with language tags. In SPARQL, we can get the language tag of an RDF literal (if it has one) using the lang function. It is possible to compare the language tag to "en" with the = operator, but a more robust solution is to use langMatches, which will handle trickier cases like the one given in the documentation where

filter langMatches( lang(?title), "FR" )

can be used to select both of the following values for ?title, whereas filter( lang(?title) = "fr" ) would find only the first:

"Cette Série des Années Soixante-dix"@fr
"Cette Série des Années Septante"@fr-BE
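
To make the difference concrete, here is a rough Python sketch of the matching rule behind langMatches (case-insensitive prefix matching on language subtags), compared with plain equality. The function name lang_matches and the titles list are hypothetical, made up for illustration; this is a simplification and not how any SPARQL engine actually implements it.

```python
def lang_matches(tag: str, lang_range: str) -> bool:
    """Simplified stand-in for SPARQL's langMatches (hypothetical helper)."""
    tag = tag.lower()
    lang_range = lang_range.lower()
    if not tag:
        # A literal without a language tag matches nothing, not even "*".
        return False
    if lang_range == "*":
        # "*" matches any literal that has a language tag.
        return True
    # The range matches if it equals the tag, or is a prefix of the tag
    # that ends exactly at a "-" subtag boundary.
    return tag == lang_range or tag.startswith(lang_range + "-")


# Illustrative data: (lexical form, language tag) pairs from the example above.
titles = [
    ("Cette Série des Années Soixante-dix", "fr"),
    ("Cette Série des Années Septante", "fr-BE"),
]

# langMatches-style filtering keeps both titles.
matched_by_langmatches = [text for text, tag in titles if lang_matches(tag, "FR")]

# Plain equality on the tag keeps only the one tagged exactly "fr".
matched_by_equality = [text for text, tag in titles if tag == "fr"]
```

Note that the range has to end at a subtag boundary, so a range of "en" would match "en-US" but not a tag like "eng".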

Using langMatches, lang, and filter, we can update the query once more to

select ?type ?label {
   ?type a owl:Class ;
         rdfs:label ?label .
   filter(langMatches(lang(?label),"EN"))
}
limit 10

which retrieves DBpedia types and their English labels.
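
If you want to run that final query from a script rather than the web form, something along these lines should work. It is only a sketch using the Python standard library: it assumes the public endpoint lives at https://dbpedia.org/sparql and honours the standard SPARQL JSON results media type, and the helper names (build_request, parse_bindings, fetch_types) are my own. The PREFIX lines are spelled out so the query doesn't depend on prefixes the endpoint happens to predefine.

```python
import json
import urllib.parse
import urllib.request

# Assumption: DBpedia's public SPARQL endpoint.
ENDPOINT = "https://dbpedia.org/sparql"

QUERY = """
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?type ?label {
   ?type a owl:Class ;
         rdfs:label ?label .
   filter(langMatches(lang(?label), "EN"))
}
limit 10
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that asks for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def parse_bindings(payload):
    """Turn a SPARQL JSON results document into (type IRI, label) pairs."""
    rows = []
    for binding in payload["results"]["bindings"]:
        rows.append((binding["type"]["value"], binding["label"]["value"]))
    return rows


def fetch_types(query=QUERY):
    """Send the query over the network and return the parsed rows."""
    with urllib.request.urlopen(build_request(query), timeout=30) as response:
        return parse_bindings(json.load(response))
```

Calling fetch_types() returns a list of (type, label) tuples that you can loop over and print. Drop the limit 10 line from QUERY if you want every class, keeping in mind that public endpoints may cap the number of rows they return.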

Upvotes: 24

Antoine Zimmermann

Reputation: 5505

Try this:

SELECT ?type (STR(?l) AS ?label) {
   ?type a owl:Class;
         rdfs:label  ?l .
   FILTER (LANG(?l) = "en")
}

Upvotes: 6
