Petter

Reputation: 783

Wikipedia API - get random page(s)

I'm trying to get a JSON result with a set of random pages from Wikipedia, including their titles, content and images.

I've played around with their API sandbox, and so far the best I've got is this:

https://en.wikipedia.org/w/api.php?action=query&list=random&format=json&rnnamespace=0&rnlimit=10

But this only includes the namespace, ID, and title of ten random pages. I would also like to get the content and the images.

Does anyone know how?

Alternatively, I could make do with the title, content and image URLs of a single random page. The best I've got here is:

https://en.wikipedia.org/w/api.php?action=query&generator=random&format=json

Upvotes: 18

Views: 17754

Answers (2)

mancini0

Reputation: 4703

If you'd rather use their REST API, you can request the summary of a random page:

curl -X GET "https://en.wikipedia.org/api/rest_v1/page/random/summary"

Documentation
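For reference, here is a minimal Python sketch of the same request, in case curl isn't an option. It assumes the summary response carries `title`, `extract`, and an optional `thumbnail` or `originalimage` object with a `source` URL; the helper names and the User-Agent string are made up for illustration.

```python
import json
from urllib.request import Request, urlopen

# Same endpoint as the curl command above.
SUMMARY_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"


def parse_summary(data):
    """Pick title, extract and an image URL out of a summary response.

    Assumes the field names described above; image_url is None when the
    page has no image.
    """
    image = data.get("originalimage") or data.get("thumbnail") or {}
    return {
        "title": data.get("title"),
        "extract": data.get("extract"),
        "image_url": image.get("source"),
    }


def fetch_random_summary():
    """Fetch one random page summary (urllib follows the redirect)."""
    # Hypothetical User-Agent value; replace it with one describing your app.
    request = Request(SUMMARY_URL, headers={"User-Agent": "random-summary-example/0.1"})
    with urlopen(request) as response:
        return parse_summary(json.load(response))


if __name__ == "__main__":
    print(fetch_random_summary())
```

Note that the summary only contains a short extract, not the full page content.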

Upvotes: 11

svick

Reputation: 244848

You're close. generator=random is the right way to go. You can then use various prop values to get the info you want:

  • Page title is always included.

  • To get the text, use prop=revisions along with rvprop=content.

  • To get all images used on the page, use prop=images.

    Note that this will often include images you're probably not interested in, like icons and flags. To avoid that, you could try prop=pageimages instead, though it doesn't always seem to work. Or you could try using both.

So, the final query could look like this:

https://en.wikipedia.org/w/api.php?format=json&action=query&generator=random&grnnamespace=0&prop=revisions|images&rvprop=content&grnlimit=10
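As a rough illustration, the query above could be built and parsed from Python like this. This is only a sketch: it assumes the default JSON format, where each page sits under `query.pages`, the wikitext is under the `"*"` key of the first revision, and `images` is a list of objects with a `title`. The function names and the User-Agent string are made up for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def build_random_query(limit=10):
    """Build the URL for the query shown above, with a configurable limit."""
    params = {
        "format": "json",
        "action": "query",
        "generator": "random",
        "grnnamespace": 0,
        "grnlimit": limit,
        "prop": "revisions|images",
        "rvprop": "content",
    }
    # urlencode escapes the "|" as %7C, which the API accepts.
    return API_URL + "?" + urlencode(params)


def extract_pages(data):
    """Return a list of {title, text, images} dicts from a parsed response.

    Assumes the response layout described above; text is None when a page
    comes back without revision content.
    """
    results = []
    for page in data.get("query", {}).get("pages", {}).values():
        revisions = page.get("revisions", [])
        text = revisions[0].get("*") if revisions else None
        images = [image["title"] for image in page.get("images", [])]
        results.append({"title": page.get("title"), "text": text, "images": images})
    return results


def fetch_random_pages(limit=10):
    """Run the query and return the extracted pages."""
    # Hypothetical User-Agent value; replace it with one describing your app.
    request = Request(build_random_query(limit), headers={"User-Agent": "random-pages-example/0.1"})
    with urlopen(request) as response:
        return extract_pages(json.load(response))


if __name__ == "__main__":
    for page in fetch_random_pages(3):
        print(page["title"], len(page["text"] or ""), page["images"])
```

Keep in mind that prop=images returns file titles (such as File:Example.png), not URLs. To get the actual URLs, you would need a follow-up query on those titles with prop=imageinfo and iiprop=url.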

Upvotes: 23
