Binita Ghimire

Reputation: 11

BeautifulSoup assignment fails with AttributeError: module 'urllib' has no attribute 'urlopen'

I am working on an assignment: write a Python program that expands on http://www.py4e.com/code3/urllinks.py. The program uses urllib to read the HTML from the data files below, extracts the href= values from the anchor tags, scans for the tag at a particular position relative to the first name in the list, follows that link, repeats the process a number of times, and reports the last name found.

Actual problem: Start at http://py4e-data.dr-chuck.net/known_by_Kylen.html. Find the link at position 18 (the first name is 1). Follow that link. Repeat this process 7 times. The answer is the last name that you retrieve. Hint: the first character of the name of the last page that you will load is P.

#Code I used:
import re
import urllib 
import urllib.request
import urllib.parse
import urllib.error
from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter URL:')
count = int(input('Enter count:'))
position = int(input('Enter position:'))-1
html = urllib.urlopen(url, context=ctx).read()

soup = BeautifulSoup(html,"html.parser")
href = soup('a')
#print href

for i in range(count):
    link = href[position].get('href', None)
    print(href[position].contents[0])
    html = urllib.urlopen(link).read()
    soup = BeautifulSoup(html,"html.parser")
    href = soup('a')

But I got an error on the line html = urllib.urlopen(url, context=ctx).read():

AttributeError: module 'urllib' has no attribute 'urlopen'

Can anyone provide solutions for this?

Upvotes: 1

Views: 341

Answers (1)

haigou

Reputation: 28

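You already imported urlopen (from urllib.request import urlopen) but never used it. Instead you called urllib.urlopen, which does not exist in Python 3; the function lives in the urllib.request submodule.

Replace urllib.urlopen with plain urlopen in both places: the initial fetch before the loop and the call inside the loop.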
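To see the difference for yourself, a quick check along these lines may help. It only inspects the modules and makes no network request; the variable names has_top_level and same_function are just illustrative.

```python
import urllib
import urllib.request
from urllib.request import urlopen

# On Python 3 the top-level urllib package has no urlopen attribute,
# which is exactly what the AttributeError is complaining about.
has_top_level = hasattr(urllib, "urlopen")

# The name imported above is the same object as urllib.request.urlopen.
same_function = urlopen is urllib.request.urlopen

print(has_top_level, same_function)
```

On Python 3 this should print False True, so either urlopen(...) or urllib.request.urlopen(...) works, but urllib.urlopen(...) does not.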

Example:

from urllib.request import urlopen

# before: html = urllib.urlopen(url, context=ctx).read()
html = urlopen(url, context=ctx).read()
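
For reference, here is a rough sketch of how the whole loop could look with that fix applied everywhere. It is not the official assignment solution: the helper names fetch and follow_links, and the fetch_page parameter, are made up for this example. It assumes bs4 is installed and keeps the certificate-skipping SSL context from the question.

```python
import ssl
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch(url):
    """Download a page, ignoring certificate errors as in the question's code."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return urlopen(url, context=ctx).read()


def follow_links(url, count, position, fetch_page=fetch):
    """Pick the link at 1-based `position` on each page, `count` times.

    Returns the anchor text of the last link picked. `fetch_page` is a
    hypothetical hook so the loop can be tried without network access.
    """
    name = None
    for _ in range(count):
        soup = BeautifulSoup(fetch_page(url), "html.parser")
        anchor = soup("a")[position - 1]  # positions start at 1
        name = anchor.get_text()
        url = anchor.get("href")          # next page to load
    return name
```

With the values from the question you would call follow_links("http://py4e-data.dr-chuck.net/known_by_Kylen.html", 7, 18) and print the result. Unlike the original loop, this version does not download the final page, since only the name on the last link is needed.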

Upvotes: 1
