Python_Learner

Reputation: 1637

Get user attribution link from Wikipedia API

I'm a newbie who's been working with the Wikipedia API and have figured out many things, but this last one is driving me crazy.

I've been able to find the wiki pages I need and then, following the documentation here, use the page IDs to get direct links to the images.

import requests

page_id = '1649237'
image_url_base = 'https://ja.wikipedia.org/w/api.php'
image_params = {
    "action": "query",
    "format": "json",
    "prop": "images",
    "pageids": page_id
}
r = requests.get(url=image_url_base, params=image_params).json()

# Title of the first file listed for the page
image_file_name = str(r['query']['pages'][page_id]['images'][0]['title'])

I've then been able to use image_file_name to build a link to the file itself, like so:

https://upload.wikimedia.org/wikipedia/commons/9/9e/Flag_of_Japan.svg

This seems to give odd results. For my script I really wanted the top image on a page, but this seems to return various results.

Where I'm stuck is I can't figure out how to get the documentation on this page to work:

https://www.mediawiki.org/wiki/API:Imageinfo

What I'm really after is both a direct link URL for the image and also a link for attribution. If the attribution link gets too complicated I'd even be happy with being able to get the link to the images like this page:

https://en.wikipedia.org/w/index.php?curid=32376184

It seems like this ImageInfo API would work but I can't get it to work... I'm sure it's me...

Thank you for helping me.

Upvotes: 0

Views: 410

Answers (1)

KC.

Reputation: 3107

I think you have read the Python example at https://www.mediawiki.org/wiki/API:Imageinfo. If you open the URL given in the documentation in a browser, it is easy to see how it works, because what you see there is exactly what you get back through the requests library. (https://www.mediawiki.org/wiki/Special:MyLanguage/API:Imageinfo appears to work badly, but the parameters below work well.)
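To make the "open the URL in a browser" step concrete, here is a minimal sketch that builds the full query URL from a parameter dict using only the standard library. The helper name build_query_url is hypothetical; the endpoint and parameters are the ones from the question.

```python
from urllib.parse import urlencode

API_URL = "https://ja.wikipedia.org/w/api.php"


def build_query_url(api_url, params):
    """Return the full GET URL for the given API parameters (hypothetical helper)."""
    return api_url + "?" + urlencode(params)


params = {
    "action": "query",
    "format": "json",
    "prop": "images",
    "pageids": "1649237",
}
url = build_query_url(API_URL, params)
print(url)  # paste this into a browser to inspect the raw JSON
```

Whatever JSON the browser shows for that URL should match what `requests.get(API_URL, params=params).json()` returns.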

import requests
from pprint import pprint

page_id = '1649237'
image_url_base = 'https://ja.wikipedia.org/w/api.php'
image_params = {
    "action": "query",
    "format": "json",
    "prop": "images",
    "pageids": page_id
}

# Step 1: list the files used on the page
resp = requests.get(image_url_base, params=image_params)
my_data = resp.json()
pprint(my_data)

first_img = my_data["query"]["pages"][page_id]["images"][0]["title"]

# Step 2: ask the imageinfo module for details about that file
for_img_details = "https://www.mediawiki.org/w/api.php"  # alternatively: https://ja.wikipedia.org/w/api.php
details_params = {
    "action": "query",
    # drop the localized namespace prefix and use the canonical "File:" one
    "titles": "File:{}".format(first_img.split(":")[-1]),
    "prop": "imageinfo",
    "format": "json",
    "iiprop": "timestamp|user|url"
}
# Another query form from the docs: action=query&generator=images&titles=Main%20Page&prop=info
resp2 = requests.get(for_img_details, params=details_params)
pprint(resp2.json())
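Building on the imageinfo query above, the following is a rough sketch of how the direct image URL and an attribution link might be pulled out of the response. It assumes that requesting "url" in iiprop returns the fields url, descriptionurl and descriptionshorturl, and that "extmetadata" returns entries such as Artist and LicenseShortName; please check these against a live response. The helper names build_imageinfo_params and extract_attribution are hypothetical.

```python
def build_imageinfo_params(file_title):
    """Build imageinfo query parameters for a file title (hypothetical helper).

    Any localized namespace prefix (e.g. the Japanese one) is replaced by
    the canonical "File:" prefix.
    """
    name = file_title.split(":", 1)[-1]
    return {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "titles": "File:" + name,
        # extmetadata is assumed to carry artist and license details
        "iiprop": "url|user|extmetadata",
    }


def extract_attribution(response_json):
    """Return URL and attribution fields from an imageinfo response, or None.

    Field names are assumptions about the response shape; missing fields
    come back as None rather than raising.
    """
    pages = response_json.get("query", {}).get("pages", {})
    for page in pages.values():
        infos = page.get("imageinfo")
        if not infos:
            continue  # page entry without image data, e.g. a missing file
        info = infos[0]
        meta = info.get("extmetadata", {})
        return {
            "image_url": info.get("url"),
            "attribution_url": info.get("descriptionurl"),
            "short_url": info.get("descriptionshorturl"),
            "uploader": info.get("user"),
            "artist": meta.get("Artist", {}).get("value"),
            "license": meta.get("LicenseShortName", {}).get("value"),
        }
    return None
```

To use it, pass `build_imageinfo_params(first_img)` as the params of the second `requests.get` call and feed `resp2.json()` to `extract_attribution`. The attribution_url value is the file description page, which is usually what an attribution link should point to. The artist value often contains HTML markup, so it may need cleaning before display.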

Upvotes: 0
