Reputation: 31
I'm trying to download images via URLs stored in a .txt file using Python 3, and I'm getting an error when trying to do so on some websites. This is the error I get:
Traceback (most recent call last):
  File "C:/Scripts/ImageScraper/ImageScraper.py", line 14, in <module>
    dl()
  File "C:/Scripts/ImageScraper/ImageScraper.py", line 10, in dl
    urlretrieve(URL, IMAGE)
  File "C:\Python34\lib\urllib\request.py", line 186, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "C:\Python34\lib\urllib\request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 469, in open
    response = meth(req, response)
  File "C:\Python34\lib\urllib\request.py", line 579, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python34\lib\urllib\request.py", line 507, in error
    return self._call_chain(*args)
  File "C:\Python34\lib\urllib\request.py", line 441, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 587, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
using this code:
from urllib.request import urlretrieve

def dl():
    with open('links.txt', 'r') as input_file:
        for line in input_file:
            URL = line
            IMAGE = URL.rsplit('/', 1)[1]
            urlretrieve(URL, IMAGE)

if __name__ == '__main__':
    dl()
I'm assuming it's because they do not allow 'bots' to access their website, but with some research I found out there is a way around it, at least when using urlopen. However, I can't manage to apply that workaround to my code since I'm using urlretrieve. Is it possible to get this to work?
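For reference, the workaround I found for urlopen builds a Request object with a browser-style User-Agent header, since many sites reject the default Python-urllib agent. The URL below is just a placeholder:

```python
from urllib.request import Request, urlopen

# Pretend to be a browser; sites that block the default
# "Python-urllib" agent often return 403 otherwise.
# (Placeholder URL -- substitute one from links.txt.)
req = Request('http://www.example.com/some_image.jpg',
              headers={'User-Agent': 'Mozilla/5.0'})
# data = urlopen(req).read()  # uncomment to actually fetch the bytes
```

But I don't see how to pass such a header through urlretrieve.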
Upvotes: 3
Views: 249
Reputation: 3585
I think the error is an actual HTTP error, 403, saying access to that URL is forbidden. You might want to print each URL before it is accessed and then try opening it in your browser; you should get the same 403 Forbidden error there. Learn more about HTTP status codes, and specifically 403 Forbidden.
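To see which URLs are being refused without aborting the whole run, you could catch the HTTPError and print the status code together with the offending URL. A sketch, assuming your links.txt layout (one URL per line):

```python
from urllib.error import HTTPError
from urllib.request import urlretrieve

def dl():
    with open('links.txt', 'r') as input_file:
        for line in input_file:
            url = line.strip()               # drop the trailing newline
            image = url.rsplit('/', 1)[1]
            try:
                urlretrieve(url, image)
            except HTTPError as err:
                # Show the status code and the URL that was refused
                print(err.code, url)
```

That will tell you whether every site returns 403 or only some of them.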
Upvotes: 1