Michael

Reputation: 23

Python Wikipedia library

I'm using the Python library wikipedia to parse data. When it gets to the second part of the code, I'm getting page errors.

import wikipedia


print ("1: Searching Wikipedia for 'List of Lexus vehicles'")
try:
    print (wikipedia.page('List of Lexus'))
    print ('-' * 60)
except wikipedia.exceptions.DisambiguationError as e:
    print (str(e))
    print ('+' * 60)
    print ('DisambiguationError: The page name is ambiguous')
print


print ("2: Searching Wikipedia for 'List of Lexus (vehicles)'")
print (wikipedia.page('List of Lexus_(vehicles)'))
print


result = wikipedia.page('List of Lexus_(vehicles)').content.encode('UTF8')
print ("3: Result of searching Wikipedia for 'List of Lexus_(vehicles)':")
print (result)
print

lexus_count = result.count('ct','lfa','rx')
print


print ("The Wikipedia page for 'Lexus_(company)' has " + \
    "{} occurrences of the word 'Lexus'".format(lexus_count))
print

Update: I'm now able to parse the page data, but I'm getting a TypeError on count:

23 print
24
25 lexus_count = result.count('ct','lfa','rx')
26 print
TypeError: slice indices must be integers or None or have an __index__ method

Upvotes: 0

Views: 1158

Answers (2)

Rohin Gopalakrishnan

Reputation: 664

There were multiple issues with your program. Here is an updated version with the errors fixed and marked.

import wikipedia


print ("1: Searching Wikipedia for 'Lexus'")
try:
    print (wikipedia.page('Lexus'))
    print ('-' * 60)
except wikipedia.exceptions.DisambiguationError as e:
    print (str(e))
    print ('+' * 60)
    print ('DisambiguationError: The page name is ambiguous')
print


print ("2: Searching Wikipedia for 'Lexus (company)'")
result = wikipedia.page('Lexus (company)') 
# FIX: the page name is separated by a space, not an underscore.
# With the underscore the page is not found, which raises a PageError.
print (result)
print


result = result.content
print ("3: Result of searching Wikipedia for 'Lexus (company)':")
print (result)
print

lexus_count = result.count('Lexus')
# Renamed the variable from orange_count to lexus_count, which the print call below references.
# str.count is case-sensitive, so counting 'lexus' finds no occurrences; count 'Lexus' instead.
print


print ("The Wikipedia page for 'Lexus_(company)' has " + \
    "{} occurrences of the word 'Lexus'".format(lexus_count))
print
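
On the TypeError from the update: str.count takes one substring plus optional integer start and end positions, so in result.count('ct','lfa','rx') the strings 'lfa' and 'rx' are read as slice indices. Here is a minimal sketch of counting several terms one at a time, assuming you want a separate total per model name. The helper names and the sample string are made up for illustration, and the sample stands in for the page content.

```python
import re


def count_substrings(text, terms):
    """Count each term as a plain substring, ignoring case."""
    lowered = text.lower()
    return {term: lowered.count(term.lower()) for term in terms}


def count_words(text, terms):
    """Count each term only where it appears as a whole word, ignoring case."""
    counts = {}
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Stand-in text; in the program above this would be result (the page content).
sample = "Lexus CT, Lexus RX and the LFA. The RX is a production model."
terms = ["ct", "lfa", "rx"]

print(count_substrings(sample, terms))
print(count_words(sample, terms))
```

The substring version also counts the 'ct' inside 'production', which is why the whole-word version is probably closer to what you want for short names like these.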

Hope this helps.

Upvotes: 1

Alex C.

Reputation: 31

Which page error exactly are you getting?

According to the wikipedia documentation: https://wikipedia.readthedocs.io/en/latest/quickstart.html#quickstart

But watch out - wikipedia.summary will raise a DisambiguationError if the page is a disambiguation page, or a PageError if the page doesn’t exist (although by default, it tries to find the page you meant with suggest and search.):
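
To see which of the two errors you are hitting, you could catch both around the lookup. This is only a sketch: load_page is a hypothetical helper, and passing the imported module in as the wiki argument is my own choice so the function can be tried without network access. It assumes wikipedia.page, wikipedia.exceptions.PageError and wikipedia.exceptions.DisambiguationError (with its options list) behave as in the library's documentation.

```python
def load_page(title, wiki):
    """Return the page for title, or None if the lookup fails.

    wiki is expected to be the imported wikipedia module (an assumption
    of this sketch; it is passed in rather than imported here).
    """
    try:
        return wiki.page(title)
    except wiki.exceptions.DisambiguationError as err:
        # The title matches several pages; show a few of the candidates.
        print("Ambiguous title {!r}; candidates: {}".format(title, err.options[:5]))
    except wiki.exceptions.PageError:
        # No page matched the title.
        print("No page found for {!r}".format(title))
    return None
```

With the library installed, usage would look like load_page('Lexus', wikipedia) after import wikipedia. A None result together with the printed message tells you whether the title was missing or ambiguous.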

Upvotes: 0
