nicole.torres

Reputation: 327

How to classify Wikidata items?

I am trying to classify items into the main categories supported by Wikidata: Generic, Person, Organization, Events, Works, Terms, Place, Others. These categories are listed here: https://www.wikidata.org/wiki/Wikidata:List_of_properties

I could not find a property that specifies the main category. I looked into the P31 "instance of" property and P279 "subclass of" but they are not what I need.

For example, for "IBM", P31 returns "public company" and "software house", and for "Swiss International Air Lines" it returns "airline". So I cannot tell from these values that both are organizations.

Is there a way to do this?

One option would be to check which properties an item has: if it has P21 "sex or gender", then it's a human (or an animal). But I don't think that is reliable, since no property is mandatory.

I'm using the Wikidata Toolkit for my queries.

Upvotes: 4

Views: 925

Answers (1)

Addshore

Reputation: 598

Wikidata used to have a "main type" property (P107), but it was deleted in favour of "instance of" (P31) and a more flexible schema. You can see lots of archived discussion about the main type at https://www.wikidata.org/wiki/Property_talk:P107

You probably want to take a look at the SPARQL endpoint at http://query.wikidata.org

Q4830453 is business enterprise / company. To find all items that are an instance of company, or of any subclass of company, run:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT DISTINCT ?item
WHERE {
    ?item wdt:P31/wdt:P279* wd:Q4830453
}

The query takes a little while to run; there are currently about 150k results.
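To classify a single item rather than list every company, the same property path can be turned around into an ASK query per candidate root class. The following is only a rough sketch in Python, not a Wikidata Toolkit recipe. The names `CATEGORY_ROOTS`, `build_ask_query`, `run_ask` and `classify` are made up for illustration, and the root classes Q5 (human) and Q43229 (organization) are assumptions you should check and extend for the other categories. Whether a given item actually reaches a root depends on the subclass hierarchy at query time.

```python
import json
import urllib.parse
import urllib.request

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Hypothetical mapping from coarse category to a root class.
# These QIDs are assumptions; verify them and add the remaining categories.
CATEGORY_ROOTS = {
    "Person": "Q5",            # human
    "Organization": "Q43229",  # organization
}


def build_ask_query(item_qid, class_qid):
    """Build an ASK query: is the item an instance of the class or of any subclass?"""
    return (
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        f"ASK {{ wd:{item_qid} wdt:P31/wdt:P279* wd:{class_qid} }}"
    )


def run_ask(query, endpoint=SPARQL_ENDPOINT):
    """Send the query to the endpoint and return the boolean answer."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url, headers={"User-Agent": "category-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["boolean"]


def classify(item_qid, ask=run_ask):
    """Return the first category whose root class the item reaches, else 'Others'."""
    for category, class_qid in CATEGORY_ROOTS.items():
        if ask(build_ask_query(item_qid, class_qid)):
            return category
    return "Others"
```

Calling `classify("Q42")` issues one ASK query per category until one matches. The `ask` parameter is there so the lookup can be swapped out, for example for a cached or offline implementation.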

Upvotes: 8
