Suzan Cioc

Reputation: 30097

How to know exact name/URI of some entity in DBpedia?

In the examples section of the DBpedia article, there is an example query

PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX db: <http://dbpedia.org/resource/>
SELECT ?who ?WORK ?genre WHERE {
 db:Tokyo_Mew_Mew dbprop:author ?who .
 ?WORK  dbprop:author ?who .
 OPTIONAL { ?WORK dbprop:genre ?genre } .
}

about the manga series Tokyo Mew Mew. But how would I know that its URI is

http://dbpedia.org/resource/Tokyo_Mew_Mew

and that the "author" property URI is

http://dbpedia.org/property/author

and so on?

Is there some search engine for these URIs or something?

By comparison, in the Wikidata project I can search on the main site and deduce that the Tokyo Mew Mew URI suffix is Q392125, because it coincides with the last part of the web URL.

How to do the same with DBpedia?

Upvotes: 3

Views: 3589

Answers (3)

jimkont

Reputation: 923

The exact algorithm that translates a Wikipedia page to a DBpedia URI/IRI is described at http://wiki.dbpedia.org/uri-encoding. In most cases it is exactly the same name (as noted above), but the handling of special characters might differ a bit.

(disclaimer: a DBpedia dev)

Upvotes: 2

Jeen Broekstra

Reputation: 22042

One way to do this is using a SPARQL query. In this particular example, what you know beforehand is that you are looking for something called "Tokyo Mew Mew". A simple query like so:

   PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
   SELECT ?x 
   WHERE { 
        ?x rdfs:label ?name . 
        # Virtuoso full-text search; a multi-word phrase goes inside single quotes
        FILTER(bif:contains(?name, "'Tokyo Mew Mew'"))  
   }

will likely give you the desired result. The bif:contains bit in this query, by the way, is a Virtuoso-specific extension to the SPARQL language that does optimized full-text search.

(small disclaimer: at the time of writing, the DBpedia website is down for maintenance, so I have not been able to verify that these queries are 100% correct)

However, it is of course possible that such a search retrieves more than one hit. In that case, you can extend your query to narrow down the result. For example, since you know that you are looking for a comic, you could extend your query like this:

   PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
   PREFIX dbo: <http://dbpedia.org/ontology/> 
   SELECT ?x 
   WHERE { 
        # restrict the candidates to resources typed as comics
        ?x a dbo:Comic .
        ?x rdfs:label ?name . 
        FILTER(bif:contains(?name, "'Tokyo Mew Mew'"))  
   }

etc.
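If you want to run this kind of lookup from a script, here is a minimal Python sketch that only builds the query text and the request URL. The endpoint address https://dbpedia.org/sparql, the `format` request parameter, and the helper names `build_label_query` and `build_request_url` are assumptions made for illustration, and, like the queries above, it has not been verified against the live service.

```python
from urllib.parse import urlencode

# Assumption: the public DBpedia SPARQL endpoint lives at this address.
DBPEDIA_SPARQL_ENDPOINT = "https://dbpedia.org/sparql"


def build_label_query(name, rdf_type=None):
    """Build a SPARQL query that searches rdfs:label for a phrase.

    rdf_type is an optional class URI used to narrow down the hits.
    """
    # Quoting rules are not handled in this sketch, so refuse such input.
    if any(ch in name for ch in "\"'\\"):
        raise ValueError("quotes and backslashes are not supported here")
    lines = [
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT DISTINCT ?x",
        "WHERE {",
    ]
    if rdf_type is not None:
        lines.append(f"    ?x a <{rdf_type}> .")
    lines += [
        "    ?x rdfs:label ?name .",
        # bif:contains is Virtuoso-specific; the phrase sits in single quotes.
        f"    FILTER(bif:contains(?name, \"'{name}'\"))",
        "}",
    ]
    return "\n".join(lines)


def build_request_url(query, endpoint=DBPEDIA_SPARQL_ENDPOINT):
    """Return a GET URL for the query; 'format' is an assumed parameter."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)


query = build_label_query("Tokyo Mew Mew", "http://dbpedia.org/ontology/Comic")
url = build_request_url(query)
```

Fetching `url` with any HTTP client should then return the candidate URIs, provided the endpoint behaves as assumed.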

If you find using SPARQL for this kind of thing a bit daunting, DBpedia offers other ways to access the dataset, including a faceted search interface, which you can use to explore the data.

You can also just guess what the URI might be and then manually check whether you're right. For example, in many cases, the DBpedia URI will simply be the name of whatever you're looking for, tacked onto the namespace 'http://dbpedia.org/resource/' (with spaces replaced by underscores). The nice thing about Linked Data is that you can just go to that address and see what you get back. Thus, for Tokyo Mew Mew, the URI http://dbpedia.org/resource/Tokyo_Mew_Mew is a good guess, and when you go to this URI with your browser, you will get an overview of what that URI represents, which in this case turns out to be the precise thing you were looking for.

And if it turns out it isn't the exact one you're looking for, there is usually an entry on that page that tells you what other entries disambiguate to the resource you landed on. Clicking that and browsing a bit usually gets you to the resource you are looking for quite quickly. More generally speaking, browsing the DBpedia resources via your browser is a good way to familiarize yourself a bit with the data structure, as you can quickly see what properties and relations are available, what the typing hierarchy looks like, etc.

If you use this manual browsing technique, there is one caveat: DBpedia redirects requests for a resource to a page about that resource. So if you type in 'http://dbpedia.org/resource/Tokyo_Mew_Mew', you will be redirected to 'http://dbpedia.org/page/Tokyo_Mew_Mew'. The actual URI you need for the data resource, however, is the first one.
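The guess-and-check approach, and the step back from a /page/ address to the /resource/ URI, can be sketched in a few lines of Python. The function names are made up for illustration, and the guess only covers the simple case of replacing spaces with underscores; it does not implement DBpedia's full encoding rules for special characters.

```python
from urllib.parse import urlsplit

RESOURCE_NAMESPACE = "http://dbpedia.org/resource/"
PAGE_PATH_PREFIX = "/page/"


def guess_resource_uri(name):
    """Guess a resource URI by replacing runs of spaces with underscores."""
    return RESOURCE_NAMESPACE + "_".join(name.split())


def page_to_resource(url):
    """Turn the /page/ address shown in the browser into the /resource/ URI."""
    parts = urlsplit(url)
    if parts.netloc != "dbpedia.org" or not parts.path.startswith(PAGE_PATH_PREFIX):
        raise ValueError(f"not a DBpedia page address: {url}")
    return RESOURCE_NAMESPACE + parts.path[len(PAGE_PATH_PREFIX):]


guessed = guess_resource_uri("Tokyo Mew Mew")
from_page = page_to_resource("http://dbpedia.org/page/Tokyo_Mew_Mew")
```

Both calls yield http://dbpedia.org/resource/Tokyo_Mew_Mew, which is the form to use in queries.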

Upvotes: 2

Joshua Taylor

Reputation: 85823

In addition to Jeen Broekstra's fairly comprehensive answer, note that DBpedia information is extracted from Wikipedia data. In general, if there's a Wikipedia article with the name Foobar, with the URL

        https://en.wikipedia.org/wiki/Foobar,

then the corresponding DBpedia resource is

        http://dbpedia.org/resource/Foobar.
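As a rough illustration of that correspondence, the following Python sketch swaps the English Wikipedia prefix for the DBpedia resource namespace. The function name is hypothetical, and the sketch copies the title as-is, so titles with special characters may need the extra encoding rules that DBpedia documents.

```python
from urllib.parse import urlsplit

WIKI_PATH_PREFIX = "/wiki/"


def wikipedia_to_dbpedia(url):
    """Map an English Wikipedia article URL to the matching DBpedia resource."""
    parts = urlsplit(url)
    if parts.netloc != "en.wikipedia.org" or not parts.path.startswith(WIKI_PATH_PREFIX):
        raise ValueError(f"not an English Wikipedia article URL: {url}")
    title = parts.path[len(WIKI_PATH_PREFIX):]
    if not title:
        raise ValueError("the URL does not contain an article title")
    # The title is reused unchanged; special characters are not re-encoded.
    return "http://dbpedia.org/resource/" + title


resource = wikipedia_to_dbpedia("https://en.wikipedia.org/wiki/Foobar")
```

Here `resource` is http://dbpedia.org/resource/Foobar, the resource you can then open in the browser.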

Browsing that interactively (noting that in the browser, you'll get redirected to a /page/ URL instead of the /resource/ URL), you can see the properties. There are three main families of properties:

  • Raw Infobox properties with the namespace http://dbpedia.org/property/, which are sort of "dirty", in the sense that they're just the raw data values. You might get some interconnected links, but mostly you'll have literal values, and those might not be normalized, sanitized, etc.
  • Infobox ontology properties with the namespace http://dbpedia.org/ontology/. These are the results of more sophisticated infobox mappings, and the data in these are much cleaner, and generally preferred over the raw infobox properties, if they are available.
  • Everything else. These tend to be from well known vocabularies, like Dublin Core, FOAF, RDFS, OWL, etc.
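When you list the properties of a resource programmatically, a small helper can tell these three families apart by namespace. This is only a sketch; the function name and the family labels are invented for illustration.

```python
# Namespace prefixes for the two DBpedia-specific property families.
PROPERTY_FAMILIES = (
    ("http://dbpedia.org/property/", "raw infobox"),
    ("http://dbpedia.org/ontology/", "mapped ontology"),
)


def property_family(uri):
    """Classify a property URI by the namespace it starts with."""
    for namespace, family in PROPERTY_FAMILIES:
        if uri.startswith(namespace):
            return family
    # Anything else comes from another vocabulary (Dublin Core, FOAF, RDFS, ...).
    return "other vocabulary"


examples = [
    "http://dbpedia.org/property/author",
    "http://dbpedia.org/ontology/author",
    "http://www.w3.org/2000/01/rdf-schema#label",
]
families = [property_family(uri) for uri in examples]
```

If the same information appears under both DBpedia namespaces, the ontology version is usually the one to prefer, as noted above.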

Upvotes: 3
