Reputation: 77
I go through the living people category on Wikipedia and collect page images. The problem is that some images are stored on the Wikimedia Commons site, whereas others are stored on the original wikipedia:en site. I want to know where an image is stored (in case it is stored somewhere other than en:wiki and Commons).
import pywikibot
enwiki = pywikibot.Site("en", "wikipedia")
commons = pywikibot.Site("commons","commons")
page1 = pywikibot.Page(enwiki, "50 Cent")
page2 = pywikibot.Page(enwiki, "0010x0010")
pageimage1 = page1.page_image()
pageimage2 = page2.page_image()
pageimage1.exists()  # False (the 50 Cent page image is stored on Commons)
pageimage2.exists()  # True (the 0010x0010 page image is stored on wikipedia:en)
This is fine: I can check Commons if .exists() on the Wikipedia file page returns False, but I'm worried about a situation where the image is stored on a different site.
I've tried the Page.image_repository attribute, but it returns Commons even when the page image does not exist there and is stored on wikipedia:en.
Is there a way to get the original site from the Page object? The only other way I know of is to download the HTML page and parse it, which is way too complicated.
Upvotes: 0
Views: 176
Reputation: 333
As noted by Tgr, the best way is to use the FilePage.file_is_shared() method. To upcast the file you may do:
import pywikibot
def repo_file(filepage):
    """Return a FilePage residing on repository."""
    if filepage.file_is_shared():
        filepage = pywikibot.FilePage(filepage.site.image_repository(), filepage.title())
    return filepage
Using your first sample it will work like this:
site = pywikibot.Site('wikipedia:en')
page1 = pywikibot.Page(site, '50 Cent')
page2 = pywikibot.Page(site, '0010x0010')
img1 = page1.page_image()
img2 = page2.page_image()
Test the site:
img1.site
img2.site
will give
APISite("en", "wikipedia")
APISite("en", "wikipedia")
Now upcast it:
img1 = repo_file(img1)
img2 = repo_file(img2)
Again test the site:
img1.site
img2.site
will give
APISite("commons", "commons")
APISite("en", "wikipedia")
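If you only need the hosting site and not a new FilePage, the same check can be reduced to a small helper. This is just a sketch: hosting_site is a name I made up, and it assumes the argument behaves like a pywikibot FilePage, i.e. it has file_is_shared() and a site whose image_repository() returns the shared repository.

```python
def hosting_site(filepage):
    """Return the site that hosts the file behind ``filepage``.

    Hypothetical helper: assumes ``filepage`` is a pywikibot FilePage,
    for example the result of Page.page_image().
    """
    if filepage.file_is_shared():
        # The file lives on the wiki's shared repository
        # (Commons in the case of Wikipedia).
        return filepage.site.image_repository()
    # Otherwise the file was uploaded to the local wiki itself.
    return filepage.site

# Intended use with the objects from the sample above:
#   hosting_site(page1.page_image())
#   hosting_site(page2.page_image())
```

For the two sample pages this should give the same sites as repo_file(...).site, without constructing a second FilePage.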
Upvotes: 1