duhaime

Reputation: 27594

Retrieving all people from the Wikipedia API

I want to find all people in the Wikipedia database using the Wikipedia API. So far, my approach has been to use a query that fetches all pages belonging to a category, such as:

https://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=json&list=categorymembers&cmlimit=100&cmtitle=Category:French_revolutionaries

This approach requires me to know that French_revolutionaries is a category of people. My question is: How can one retrieve all people (not just the people in a category) from the API?
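For context, a single request like the one above returns at most `cmlimit` members, so a large category has to be paged through. Here is a minimal sketch of that paging, assuming the API's standard `continue` mechanism; the `fetch` callable and the `User-Agent` string are my own placeholders, not anything prescribed by the API:

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator

API_URL = "https://en.wikipedia.org/w/api.php"


def http_fetch(params: Dict[str, str]) -> dict:
    """Send one GET request to the API and decode the JSON response."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Placeholder User-Agent; replace with something that identifies your script.
    req = urllib.request.Request(url, headers={"User-Agent": "people-crawler-sketch/0.1"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def category_members(
    category: str,
    fetch: Callable[[Dict[str, str]], dict] = http_fetch,
    limit: int = 100,
) -> Iterator[dict]:
    """Yield every member of `category`, following continuation tokens."""
    params = {
        "action": "query",
        "format": "json",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": str(limit),
    }
    cont: Dict[str, str] = {}
    while True:
        data = fetch({**params, **cont})
        yield from data.get("query", {}).get("categorymembers", [])
        # The response carries a "continue" object while more results remain;
        # its keys are merged into the next request.
        cont = data.get("continue", {})
        if not cont:
            break
```

Calling `list(category_members("Category:French_revolutionaries"))` would then collect the whole category rather than only the first 100 entries.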

One approach I considered is to start with an arbitrary category, such as French_revolutionaries. For each member of that category, I would retrieve their information and the other categories to which they belong, then fetch the members of those categories in the same way, repeating until there are no new categories to fetch. This won't reach everyone if the category network isn't fully connected, though, and it is less elegant than I'd like.
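To make the idea concrete, here is a rough sketch of that traversal as a breadth-first search. The two callables `members_of` and `categories_of` are hypothetical stand-ins for the actual API requests, and `max_categories` is a safety cap I added. Note that the sketch does not check whether a member is actually a person, so in practice it would also collect non-person pages:

```python
from collections import deque
from typing import Callable, Iterable, Set


def crawl_people(
    seed: str,
    members_of: Callable[[str], Iterable[str]],
    categories_of: Callable[[str], Iterable[str]],
    max_categories: int = 1000,
) -> Set[str]:
    """Collect pages reachable from `seed` by alternating category -> members -> categories."""
    seen_categories = {seed}
    people: Set[str] = set()
    queue = deque([seed])
    visited = 0
    while queue and visited < max_categories:
        category = queue.popleft()
        visited += 1
        for page in members_of(category):
            if page in people:
                continue  # already expanded this page
            people.add(page)
            for other in categories_of(page):
                if other not in seen_categories:
                    seen_categories.add(other)
                    queue.append(other)
    return people
```

With a toy graph in which one category shares no members with the others, the crawl never reaches that category, which is exactly the connectivity problem described above.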

Is there a straightforward way to find all people in the Wikipedia database? Can DBpedia provide that list? I'm downloading a SQL dump of the Wikipedia category data right now, but wanted to raise the question in case others know of a faster solution. Any help would be much appreciated!

Upvotes: 1

Views: 1417

Answers (2)

innovimax

Reputation: 560

Perhaps with Wikidata, where P31 is "instance of" and Q5 is "human":

SELECT ?person WHERE { ?person wdt:P31 wd:Q5 }
LIMIT 100

https://query.wikidata.org/#SELECT%20%3Fperson%20WHERE%20%7B%20%3Fperson%20wdt%3AP31%20wd%3AQ5%20%7D%0Alimit%20100
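If you want to run this from a script instead of the web UI, something like the following might work. It is only a sketch: it builds request URLs for the `https://query.wikidata.org/sparql` endpoint and pages with `LIMIT`/`OFFSET`. The helper names and the page size are my own choices. Paging without an `ORDER BY` is not guaranteed to be stable, and the public endpoint enforces query time limits, so for the complete set of humans a dump may be more practical:

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def people_query(limit: int, offset: int) -> str:
    """Return the SPARQL text for one page of items that are instances of human (Q5)."""
    return (
        "SELECT ?person WHERE { ?person wdt:P31 wd:Q5 } "
        f"LIMIT {limit} OFFSET {offset}"
    )


def page_url(limit: int = 100, offset: int = 0) -> str:
    """Build the GET URL for one page of results, requesting JSON output."""
    params = {"query": people_query(limit, offset), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


# The first three pages of 100 results each.
urls = [page_url(limit=100, offset=i * 100) for i in range(3)]
```

Each URL can then be fetched with any HTTP client; remember to send a descriptive `User-Agent` header.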

Upvotes: 3

Joshua Taylor

Reputation: 85853

The question is tagged with SPARQL, so I assume you're open to SPARQL-based solutions. Is there a problem with a query like

select * { ?person a dbo:Person }

SPARQL Results
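If you request the results as JSON, they come back in the standard SPARQL JSON results layout, which is easy to flatten. A small sketch, assuming that layout; the `PREFIX` line is only needed on endpoints that don't predefine `dbo:`, and the function name and sample URI are made up for illustration:

```python
from typing import List

# Same query as above, with the dbo: prefix spelled out explicitly.
PERSON_QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT * { ?person a dbo:Person }
"""


def binding_values(results: dict, var: str) -> List[str]:
    """Extract the values bound to `var` from a SPARQL JSON results document."""
    rows = results["results"]["bindings"]
    return [row[var]["value"] for row in rows if var in row]


# Hand-written sample in the SPARQL JSON results format (not real endpoint output).
sample = {
    "head": {"vars": ["person"]},
    "results": {
        "bindings": [
            {"person": {"type": "uri", "value": "http://dbpedia.org/resource/Example_A"}},
            {"person": {"type": "uri", "value": "http://dbpedia.org/resource/Example_B"}},
        ]
    },
}
people = binding_values(sample, "person")
```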

Upvotes: 2
