G. L. Merebet

Reputation: 97

How to get all links and their Wikidata IDs for a Wikipedia page?

(When) will the following be possible?

Upvotes: 5

Views: 3185

Answers (1)

Termininja

Reputation: 7036

To get all links on a Wikipedia page you have to use the Wikipedia API, and to get all properties of a Wikidata item you need the Wikidata API, so a single query cannot combine requests to both APIs. However, the first part of your question is already possible. As for the second part, you didn't say what information you need from Wikidata.

You can get Wikidata IDs and a lot of other information for every page linked from a Wikipedia page: coordinates, references, internal and external links, images, text content, contributors, history, page rights, categories, templates, and so on. The Wikipedia API alone is enough for this, because the entry point is the Wikipedia page and the API's generator feature applies the requested properties to each linked page.

For example, this is how to get the Wikidata ID, a short intro text, and the main image for the first 20 internal links on the Dolphin Wikipedia page:

https://en.wikipedia.org/w/api.php?action=query&generator=links&format=xml&redirects=1&titles=Dolphin&prop=pageprops|extracts|pageimages&gpllimit=20&ppprop=wikibase_item&exintro=1&exlimit=20&piprop=name&pilimit=20

Main query parameters:

  • action=query&format=xml&redirects=1&titles=Dolphin
  • generator=links - to get all page links (works together with gpllimit=20)
  • prop=pageprops|extracts|pageimages - what to get from the links

Properties:

  • pageprops - to get Wikidata ID (works with ppprop=wikibase_item)
  • extracts - to get first text lines from that page (works with exintro=1 and exlimit=20)
  • pageimages - to get main image (works with piprop=name and pilimit=20)

In the same way you can get any of the other information available through the prop parameter.
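As a rough sketch, the same request can be built and read from Python with only the standard library. It assumes format=json instead of format=xml, and it assumes the JSON response nests results under query → pages keyed by page ID. The function names and the User-Agent string are made up for illustration, and the sketch reads only the first batch of results.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(title, limit=20):
    """Build the generator=links query from the answer, asking for JSON."""
    params = {
        "action": "query",
        "format": "json",  # assumption: JSON instead of the XML used above
        "redirects": 1,
        "titles": title,
        "generator": "links",
        "gpllimit": limit,
        "prop": "pageprops|extracts|pageimages",
        "ppprop": "wikibase_item",
        "exintro": 1,
        "exlimit": limit,
        "piprop": "name",
        "pilimit": limit,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def wikidata_ids(response):
    """Map each linked page title to its Wikidata ID, skipping pages without one."""
    pages = response.get("query", {}).get("pages", {})
    result = {}
    for page in pages.values():
        item = page.get("pageprops", {}).get("wikibase_item")
        if item is not None:
            result[page["title"]] = item
    return result


def fetch_link_ids(title, limit=20):
    """Send the request and return {title: Wikidata ID} for the first batch."""
    request = urllib.request.Request(
        build_query_url(title, limit),
        # Hypothetical User-Agent; replace with one that identifies your tool.
        headers={"User-Agent": "link-id-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as reply:
        return wikidata_ids(json.load(reply))
```

Calling fetch_link_ids("Dolphin") should return a dictionary from linked page titles to their Wikidata IDs. Pages that have no Wikidata item are left out.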

Upvotes: 5
