Equation_Charmer

Reputation: 85

Using wikipedia module in python

I am using the wikipedia module in my Python code. I would like to take a search term from the user, look it up on Wikipedia, and print the first 2 sentences of its summary. Since there might be many topics with the same name, I wrote it like this:

import wikipedia
value=input("Enter what u want to search")
m=wikipedia.search(value,3)
print(wikipedia.summary(m[0],sentences=2))

Running this produces about 3 pages of exceptions. What is wrong with it?

Edit: As suggested by @Ruperto, I changed the code to this:

import wikipedia
import random
value=input("Enter the words: ")
try:
    p=wikipedia.page(value)
    print(p)
except wikipedia.exceptions.DisambiguationError as e:
    s=random.choice(e.options)
    p=wikipedia.summary(s,sentences=2)
    print(p)

Now the error I get is,

Traceback (most recent call last):
  File "C:\Users\vdhan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 160, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "C:\Users\vdhan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\util\connection.py", line 84, in create_connection
    raise err
  File "C:\Users\vdhan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\util\connection.py", line 74, in create_connection
    sock.connect(sa)
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\vdhan\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 677, in urlopen
    chunked=chunked,
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x03AEEAF0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

What should I do now?

Upvotes: 2

Views: 3195

Answers (3)

just_another_beginner

Reputation: 159

I encountered a similar problem, and after a lot of head-scratching and googling, found this solution:

import wikipediaapi as api
import wikipedia as wk

# Wikipediaapi 'initialization'
wiki_wiki = api.Wikipedia('en')


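# Return the first `sentences` sentences of the page summary
def summary(pg, sentences=5):
    # Naive split: assumes sentences are separated by '. '
    parts = pg.summary.split('. ')
    summ = '. '.join(parts[:sentences])
    # Add the closing period only if the split removed it
    if not summ.endswith('.'):
        summ += '.'
    return summ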


s_term = 'apple'  # Any term, ambiguous or not
wk_res = wk.search(s_term)
page = wiki_wiki.page(wk_res[0])
print("Page summary", summary(page))

Basically, from what I've seen, you don't get a good solution with just the wikipedia module. For example, if I searched for 'India', I'd never be able to get the page for India the country, which was what I wanted. This happens because the Wikipedia page for India (the country) is titled just 'India', and the module treats that title as invalid because of the number of things it could refer to. The same applies to a lot of other topics.

However, wiki_wiki.page can get a page with an ambiguous title, which is what this code relies on.

Upvotes: 1

think-maths

Reputation: 967

The error above is caused by an internet connectivity issue. The code below works, however:

import wikipedia
import random

value=input("Enter the words: ")
try:
    m=wikipedia.search(value,3)
    print(wikipedia.summary(m[0],sentences=2))
except wikipedia.exceptions.DisambiguationError as e:
    s=random.choice(e.options)
    p=wikipedia.summary(s,sentences=2)
    print(p)

One note of caution: since this is part of a larger code block, it would be better to do abstractive or extractive summarization with an NLP library. The wikipedia package just uses beautifulsoup and soupsieve for web scraping and returns the first few lines, which is not really summarization. Also, the content on Wikipedia can change frequently.
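To illustrate what "extractive" means here, the following is a minimal sketch that scores sentences by word frequency and keeps the highest-scoring ones in their original order. The function name `extractive_summary`, the regex-based sentence split, and the "words longer than 3 letters" filter are all assumptions made for illustration; a real NLP library would handle tokenization and stop words far better.

```python
import re
from collections import Counter


def extractive_summary(text, max_sentences=2):
    """Keep the sentences whose words are most frequent in the text (hypothetical helper)."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if len(sentences) <= max_sentences:
        return " ".join(sentences)

    # Count word frequencies; skipping short words is a crude stop-word filter.
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if len(w) > 3)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    # Rank sentence indices by score, then restore document order for the winners.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

You could pass it the full page text (for example the `content` attribute of a page object) instead of relying on the first lines of the summary.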

Upvotes: 1

ashraful16

Reputation: 2782

It may be due to no or poor internet connection, as your error says:

A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

You can check or change your internet connection and try again. Otherwise, it may be a problem with your Python environment. My implementation is:

import warnings
warnings.filterwarnings("ignore")

import wikipedia
import random


value=input("Enter the words: ")
try:
    m=wikipedia.search(value,3)
    print(wikipedia.summary(m[0],sentences=2))
    # print(p)
except wikipedia.exceptions.DisambiguationError as e:
    s=random.choice(e.options)
    p=wikipedia.summary(s,sentences=2)
    print(p)

Output:

Enter the words: programming
Program management or programme management is the process of managing several related projects, often with the intention of improving an organization's performance. In practice and in its aims, program management is often closely related to systems engineering, industrial engineering, change management, and business transformation.

It works fine in Google Colab; you can find my Colab notebook here.
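If the connection is only flaky rather than completely down, retrying the request a few times may help. Below is a minimal sketch, not part of the wikipedia module: `retry_on_network_error` is a hypothetical helper name, and it assumes the network failure surfaces as an `OSError` subclass (the built-in `TimeoutError` does, and as far as I know the exceptions raised by requests, which the wikipedia module uses, derive from `IOError`/`OSError` too).

```python
import time


def retry_on_network_error(fetch, retries=3, delay=2.0):
    """Call fetch() up to `retries` times, sleeping `delay` seconds between tries."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except OSError as err:  # assumption: network failures are OSError subclasses
            last_error = err
            print(f"Attempt {attempt} of {retries} failed: {err}")
            if attempt < retries:
                time.sleep(delay)
    # Every attempt failed: report it, keeping the last error as the cause.
    raise ConnectionError(f"giving up after {retries} attempts") from last_error
```

With the code above you would wrap the network call, for example `retry_on_network_error(lambda: wikipedia.summary(m[0], sentences=2))`. If it still fails after every attempt, the problem is most likely the connection itself (firewall, proxy, or no internet) and not the code.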

Upvotes: 2
