Reputation: 692
I am trying to read only the lines of a web page that contain a URL. Every time I run the code I get this error:
AttributeError: module 'urllib' has no attribute 'urlopen'
My code is below
import os
import subprocess
import urllib
datasource = urllib.urlopen("www.google.com")
while 1:
    line = datasource.readline()
    if line == "": break
    if (line.find("www") > -1) :
        print (line)
li = ['www.apple.com', 'www.google.com']
os.chdir('..')
os.chdir('..')
os.chdir('..')
os.chdir('Program Files (x86)\\LinkChecker')
for s in li:
    os.system('Start .\linkchecker ' + s)
Upvotes: 1
Views: 426
Reputation: 4093
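The AttributeError occurs because, in Python 3, the function is urllib.request.urlopen, not urllib.urlopen. The import should also be import urllib.request, since import urllib on its own does not reliably load the request submodule.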
Apart from the AttributeError mentioned in the question, I ran into two more errors.
ValueError: unknown url type: 'www.google.com'
Solution: Rewrite the line defining datasource as follows, with the https:// scheme included:
datasource = urllib.request.urlopen("https://www.google.com")
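As a rough illustration of the scheme fix above, a small helper can add a default scheme to bare host names before they are passed to urlopen. The name ensure_scheme and its default argument are hypothetical, not part of urllib, and the check is deliberately simple.

```python
def ensure_scheme(url, default="https"):
    """Return url unchanged if it already names a scheme, else prefix one.

    Hypothetical helper for illustration only; it just looks for "://".
    """
    if "://" in url:
        return url
    return f"{default}://{url}"


# The bare host names from the question become full URLs.
candidates = ["www.apple.com", "https://www.google.com"]
full_urls = [ensure_scheme(u) for u in candidates]
print(full_urls)
```

The resulting strings can then be handed to urllib.request.urlopen without triggering the "unknown url type" ValueError.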
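TypeError: a bytes-like object is required, not 'str', raised by the line 'if (line.find("www") > -1) :'. The response object returns bytes, so the data has to be converted to str before it is searched with a str argument.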
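The bytes-versus-str mismatch can be reproduced without any network access. The sketch below assumes the page is UTF-8 encoded; the function name lines_containing and the sample bytes are made up for illustration.

```python
def lines_containing(raw, needle="www", encoding="utf-8"):
    """Decode raw bytes and return the text lines that contain needle.

    Hypothetical helper; the UTF-8 default is an assumption about the page.
    """
    text = raw.decode(encoding, errors="replace")
    return [line for line in text.splitlines() if needle in line]


# Stand-in for what datasource.read() would return: a bytes object.
sample = b'<a href="https://www.apple.com">Apple</a>\n<p>no link here</p>\n'
matches = lines_containing(sample)
print(matches)
```

Calling sample.find("www") directly would raise the TypeError quoted above, because sample is bytes and "www" is str. Decoding first keeps both sides of the comparison as str.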
The overall solution code is:
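import os
import urllib.request

# urlopen needs the full URL, including the scheme
datasource = urllib.request.urlopen("https://www.google.com")
for raw_line in datasource:
    # the response yields bytes, so decode each line before searching it
    line = raw_line.decode("utf-8", errors="replace")
    if "www" in line:
        print(line)
li = ['www.apple.com', 'www.google.com']
os.chdir('..')
os.chdir('..')
os.chdir('..')
os.chdir('Program Files (x86)\\LinkChecker')
for s in li:
    # backslash escaped so that '\l' is not read as an escape sequence
    os.system('Start .\\linkchecker ' + s)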
Upvotes: 0
Reputation: 4920
This is a very simple example.
It works in Python 3.2 and later.
import urllib.request
with urllib.request.urlopen("http://www.apple.com") as url:
    r = url.read()
print(r)
For reference, see the related question "Urlopen attribute error".
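Since the original goal was to pick out only the URLs on a page, one possible follow-up is to parse the downloaded HTML with the standard library's html.parser rather than searching line by line. This is only a sketch: LinkCollector, extract_links and the sample markup are illustrative names, and the text passed in is assumed to be already decoded to str.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (illustrative class)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return all href values found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Stand-in for a decoded page, e.g. url.read().decode("utf-8")
sample_html = '<p>Intro</p><a href="https://www.apple.com">Apple</a> <a href="/about">About</a>'
print(extract_links(sample_html))
```

Relative links such as /about come back as written; urllib.parse.urljoin can resolve them against the page URL if absolute URLs are needed.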
Upvotes: 1