Reputation: 2126
I'm trying to use Wikidata as an intermediary to get from a unique identifier listed in Wikidata (for example VIAF ID) to a Wikipedia description.
I've managed to piece together this query to get the Wikipedia page ID from a given VIAF ID ("153672966" below is the VIAF ID for "Southern Illinois University Press"):
SELECT ?pageid WHERE {
  ?item wdt:P214 "153672966" .
  [ schema:about ?item ; schema:name ?name ;
    schema:isPartOf <https://en.wikipedia.org/> ]
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "en.wikipedia.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "allpages" .
    bd:serviceParam mwapi:gapfrom ?name .
    bd:serviceParam mwapi:gapto ?name .
    ?pageid wikibase:apiOutput "@pageid" .
  }
}
This results in the pageid 9393762, which I am able to look up in the Wikipedia API to get the introduction text I need using this request:
https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&pageids=9393762
The resulting response includes a plain-text description (explaintext) taken from the first section of the Wikipedia article, so this gets me where I need to be, provided the language is English.
Now the problem is that I need to use this on an internationalized site where I might not even know upfront which languages will be used in the future. The query against Wikidata is supposed to run as a batch job on the backend, while fetching the actual descriptions from Wikipedia will be done from the frontend and rendered asynchronously.
Ideally I would want the Wikidata query to return a pageid for each language in which a Wikipedia article is available. On the frontend I would then check whether the currently active language has a pageid associated with it, and either call the Wikipedia API or render a fallback if no pageid is given.
In the future I would need to make similar queries with other library-related identifiers, such as ISNI, but I don't imagine that being much different from the current use case.
Is this a reasonable way to get the job done and how can I expand it to support multiple languages?
Upvotes: 1
Views: 403
Reputation: 2826
To get the explaintext you don't necessarily need the pageid; the page title is enough.
You can get the page titles in all languages from Wikidata with the following query:
SELECT ?item ?title ?site WHERE {
  ?item wdt:P214 "153672966" .
  [ schema:about ?item ; schema:name ?title ;
    schema:isPartOf ?site ] .
}
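Note that ?site is not restricted here, so the result can also contain sitelinks to other Wikimedia projects. As a rough sketch of how the batch job might turn the result into a language-to-title lookup, assuming the standard SPARQL JSON result format and keeping only hosts of the form <lang>.wikipedia.org (the function name and the sample rows are made up for illustration):

```python
from urllib.parse import urlparse


def titles_by_language(bindings):
    """Map Wikipedia language code -> article title.

    `bindings` is the list found under results.bindings in a SPARQL JSON
    response, with the ?site and ?title variables from the query above.
    """
    result = {}
    for row in bindings:
        host = urlparse(row["site"]["value"]).hostname or ""
        parts = host.split(".")
        # Keep only <lang>.wikipedia.org; skip Commons, Wikiquote, etc.
        if len(parts) == 3 and parts[1:] == ["wikipedia", "org"]:
            result[parts[0]] = row["title"]["value"]
    return result


# Hypothetical sample rows, not real query output.
sample = [
    {"site": {"value": "https://en.wikipedia.org/"},
     "title": {"value": "Southern Illinois University Press"}},
    {"site": {"value": "https://commons.wikimedia.org/"},
     "title": {"value": "Category:Example"}},
]
lookup = titles_by_language(sample)
```

The frontend can then check whether the active language is a key in this lookup and render the fallback if it is not.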
Afterwards you can use the Wikipedia API to get the explaintext:
https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&titles=Southern%20Illinois%20University%20Press
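For other languages, the same request can be built by swapping the subdomain and URL-encoding the title. A minimal sketch, assuming the language code maps directly onto the Wikipedia subdomain (the helper name extract_url is made up; the boolean flags are sent as exintro=1 and explaintext=1, which the API treats the same as the bare flags):

```python
from urllib.parse import urlencode


def extract_url(lang, title):
    """Build the extracts request for one language edition and page title."""
    params = {
        "format": "json",
        "action": "query",
        "prop": "extracts",
        "exintro": 1,       # only the intro section
        "explaintext": 1,   # plain text instead of HTML
        "redirects": 1,
        "titles": title,    # urlencode takes care of spaces and non-ASCII
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


url = extract_url("en", "Southern Illinois University Press")
```

Calling this from a browser additionally needs the origin=* parameter for CORS, which is left out here.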
The downside of working with page titles is that they are not stable. So you will need to run your batch job regularly to check for renamings of articles.
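One way to spot renamings is to compare the language-to-title mapping of the previous batch run with the current one. A small sketch, assuming both runs are stored as plain dicts (the function name and the sample values are hypothetical):

```python
def changed_titles(previous, current):
    """Return {lang: (old_title, new_title)} for every language that differs.

    A value of None means the sitelink was missing in that run.
    """
    changes = {}
    for lang in previous.keys() | current.keys():
        old, new = previous.get(lang), current.get(lang)
        if old != new:
            changes[lang] = (old, new)
    return changes


# Hypothetical data from two batch runs.
last_run = {"en": "Old title", "de": "Titel"}
this_run = {"en": "New title", "de": "Titel", "fr": "Titre"}
diff = changed_titles(last_run, this_run)
```

Only the entries in the diff would need to be written back to the stored data.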
Upvotes: 2