Sumanth

Reputation: 507

Get links from summary section of wikipedia page

I am trying to extract the links from the summary (lead) section of a Wikipedia page. I have tried the following approaches.

This URL returns all the links on the Deep learning page: https://en.wikipedia.org/w/api.php?action=query&prop=links&titles=Deep%20learning

To extract the links belonging to a particular section, I can filter by section index. For example:

For the Definition section of the same page I can use this URL: https://en.wikipedia.org/w/api.php?action=parse&prop=links&page=Deep%20learning&section=1

For the Overview section of the same page I can use this URL: https://en.wikipedia.org/w/api.php?action=parse&prop=links&page=Deep%20learning&section=2

But I am unable to figure out how to extract only the links from the summary section.


I even tried using Pywikibot to extract the linked pages and adjusting the plnamespace parameter, but I couldn't get the links for the summary section only.

Upvotes: 1

Views: 751

Answers (2)

xqt

Reputation: 333

You can use Pywikibot with the following commands:

>>> import pywikibot
>>> from pywikibot import textlib
>>> site = pywikibot.Site('wikipedia:en')  # create a Site object
>>> page = pywikibot.Page(site, 'Deep learning')  # create a Page object
>>> sect = textlib.extract_sections(page.text, site)  # divide content into sections
>>> links = sorted(link.group('title') for link in pywikibot.link_regex.finditer(sect.header))  # header = text before the first heading

Now links is a list containing all link titles in alphabetical order. If you prefer Page objects as the result, you can create them with

>>> pages = [pywikibot.Page(site, title) for title in links]

It's up to you to build a script from these code snippets.

Upvotes: 2

smartse

Reputation: 1721

You need to use https://en.wikipedia.org/w/api.php?action=parse&prop=links&page=Deep%20learning&section=0
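As a rough sketch of how that request could be scripted, the snippet below builds the same section=0 URL and pulls the link titles out of a decoded JSON response. The helper names `lead_links_url` and `link_titles` are hypothetical, and the response layout is an assumption: the legacy JSON format is taken to store each title under "*", while formatversion=2 is taken to use "title".

```python
from urllib.parse import quote, urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def lead_links_url(title, section=0):
    """Build an action=parse URL listing the links of one section (0 = lead)."""
    params = {
        "action": "parse",
        "prop": "links",
        "page": title,
        "section": section,
        "format": "json",
    }
    # quote_via=quote encodes spaces as %20, matching the URL above.
    return API_URL + "?" + urlencode(params, quote_via=quote)


def link_titles(response, namespace=0):
    """Return sorted link titles from a decoded action=parse response.

    Assumption: each entry in response["parse"]["links"] carries its
    namespace under "ns" and its title under "*" (legacy format) or
    "title" (formatversion=2).
    """
    links = response.get("parse", {}).get("links", [])
    return sorted(
        link.get("title", link.get("*"))
        for link in links
        if link.get("ns") == namespace
    )


# Possible usage (performs a network request, so it is left commented out):
# import json, urllib.request
# with urllib.request.urlopen(lead_links_url("Deep learning")) as resp:
#     print(link_titles(json.load(resp)))
```

Filtering on namespace 0 keeps only article links and drops template or help pages, though it does not by itself remove article links that come from sidebar templates.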

Note, however, that the section=0 result also includes the links from the {{machine learning bar}} and {{Artificial intelligence|Approaches}} templates (shown on the right of the page).
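One possible way around the template links is to work on the raw wikitext of the lead instead, since a template appears there only as a `{{...}}` call and not as its expanded links. The sketch below assumes the wikitext has already been fetched (for example with action=parse&prop=wikitext&section=0). The names `strip_templates` and `lead_link_targets` are hypothetical, and the regex is a simplification that does not handle nested links such as those in image captions.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# group 1 is the target page without any section anchor.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def strip_templates(wikitext):
    """Remove {{...}} template calls, including nested ones."""
    out, depth, i = [], 0, 0
    while i < len(wikitext):
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}" and depth:
            depth -= 1
            i += 2
        else:
            if depth == 0:  # keep only text outside every template
                out.append(wikitext[i])
            i += 1
    return "".join(out)


def lead_link_targets(wikitext):
    """Return sorted, de-duplicated targets of links written directly in the text."""
    body = strip_templates(wikitext)
    return sorted({m.group(1).strip() for m in WIKILINK.finditer(body)})
```

Links passed as template parameters (for instance inside an infobox) are dropped along with the template, which may or may not be what you want.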

Upvotes: 2
