Fernando Ferreira

Reputation: 808

Yago ontology for entity disambiguation

I am using the property rdf:type with the value dbpedia-owl:Organisation to select organizations in my SPARQL query:

 SELECT ?s
 WHERE {
     ?s a dbpedia-owl:Organisation .
 } LIMIT 10

I would like to also use the YAGO ontology so that I retrieve more of the real organizations. For example, the FBI (http://dbpedia.org/resource/Federal_Bureau_of_Investigation) is not typed as dbpedia-owl:Organisation but is tagged as yago:Organization108008335.

Note the "random" (at least for me) number at the end of the class name. Does anyone know what this number stands for? How am I supposed to know it a priori?

Moreover, when I look for more classes with this format (using the query below), I find three classes, two of them new: http://dbpedia.org/class/yago/Organization108008335, http://dbpedia.org/class/yago/Organization101008378, http://dbpedia.org/class/yago/Organization101136519

SELECT DISTINCT ?t WHERE {
    ?s a ?t
    FILTER(regex(str(?t), "http://dbpedia.org/class/yago/Organization\\d+"))
}

Are they different? Why aren't they all "yago:Organization"? Should I expect "new" organization classes as new versions of YAGO ontologies are made available? Is there any other class I should consider when selecting Organizations?

Upvotes: 3

Views: 598

Answers (1)

Daniel Garijo

Reputation: 959

I have been digging into this lately, so I'll try to answer all your questions one by one:

Note the "random" (at least for me) number at the end of the class name. Does anyone know what this number stands for? How am I supposed to know it a priori?

That number is the synset ID of the word in WordNet. For example, if you look up wordnet_organization_101136519 in WordNet (the DBpedia URI is not resolvable at the moment; something may have changed in the latest releases), you will see that it has the synset ID "101136519". I don't think you can know it a priori without looking into WordNet.
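To make the structure of these class names concrete, here is a minimal Python sketch that splits a YAGO class URI into its lemma and synset ID. The helper name `split_yago_class` is hypothetical. It also assumes, following the WordNet Prolog-style ID convention as I understand it, that the first digit encodes the part of speech (1 for nouns) and the remaining eight digits are the WordNet offset; please check that against your YAGO release before relying on it.

```python
import re

# Assumption: YAGO class URIs on DBpedia end in <Lemma><9-digit synset ID>,
# as in the three URIs quoted in the question.
YAGO_CLASS = re.compile(r"^http://dbpedia\.org/class/yago/([^/]+?)(\d{9})$")

# Assumption: the first digit of the synset ID encodes the part of speech.
POS_BY_DIGIT = {"1": "noun", "2": "verb", "3": "adjective", "4": "adverb"}


def split_yago_class(uri):
    """Hypothetical helper: split a YAGO class URI into lemma, synset ID, POS and offset."""
    match = YAGO_CLASS.match(uri)
    if match is None:
        raise ValueError(f"not a YAGO WordNet class URI: {uri}")
    lemma, synset_id = match.groups()
    return {
        "lemma": lemma,
        "synset_id": synset_id,
        "pos": POS_BY_DIGIT.get(synset_id[0], "unknown"),
        "offset": synset_id[1:],
    }


print(split_yago_class("http://dbpedia.org/class/yago/Organization108008335"))
```

The offset is what you would search for in WordNet to read the gloss of each sense.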

Are they different? Why aren't they all "yago:Organization"?

They are different because each one has a different definition in WordNet. For example:

wordnet_organization_101136519: "the activity or result of distributing or disposing persons or things properly or methodically 'his organization of the work force was very efficient'". Example of an instance: Bogo-Indian_Defence. See more details here

wordnet_organization_101008378: "the act of organizing a business or an activity related to a business 'he was brought in to supervise the organization of a new department'". Example of an instance: Adam_Smith_Foundation. See more details here

If you follow the links I provided, you can see more of their differences and similarities.

Should I expect "new" organization classes as new versions of YAGO ontologies are made available?

When YAGO was generated, every word in WordNet was associated with a URI. If more words about organizations are added, then I guess that you'll have more definitions. However, it is impossible to know beforehand.

Is there any other class I should consider when selecting Organizations?

You can look for all the classes with the label "organization" in WordNet and then add them to your query as alternatives, for example with OPTIONAL or UNION patterns (or issue one query per class, retrieving the different organizations you are interested in). These are the classes with the "organization" label in WordNet.
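As one way to combine several classes in a single query, here is a rough Python sketch that builds the SPARQL text from a list of class URIs. It uses a VALUES block rather than the optionals mentioned above, which assumes the endpoint supports SPARQL 1.1. The helper name `build_org_query` and the choice of classes in `ORG_CLASSES` are illustrative assumptions; pick whichever senses actually match the organizations you want.

```python
def build_org_query(class_uris, limit=10):
    """Hypothetical helper: SPARQL query selecting instances of any of the given classes."""
    if not class_uris:
        raise ValueError("at least one class URI is required")
    # Each URI becomes one entry in the VALUES block.
    values = " ".join(f"<{uri}>" for uri in class_uris)
    return (
        "SELECT DISTINCT ?s\n"
        "WHERE {\n"
        f"    VALUES ?class {{ {values} }}\n"
        "    ?s a ?class .\n"
        f"}} LIMIT {int(limit)}"
    )


# Illustrative selection: the DBpedia ontology class plus the YAGO class the FBI is tagged with.
ORG_CLASSES = [
    "http://dbpedia.org/ontology/Organisation",
    "http://dbpedia.org/class/yago/Organization108008335",
]

print(build_org_query(ORG_CLASSES))
```

DISTINCT is there because a resource typed with more than one of the listed classes would otherwise appear once per class.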

I hope it helps.

Upvotes: 4
