Error when trying to scrape images

Question

I'm trying to download images from URLs stored in a .txt file using Python 3, and I get an error on some websites. This is the error I get:

 File "C:/Scripts/ImageScraper/ImageScraper.py", line 14, in <module>
 dl()
 File "C:/Scripts/ImageScraper/ImageScraper.py", line 10, in dl
 urlretrieve(URL, IMAGE)
 File "C:\Python34\lib\urllib\request.py", line 186, in urlretrieve
 with contextlib.closing(urlopen(url, data)) as fp:
 File "C:\Python34\lib\urllib\request.py", line 161, in urlopen
 return opener.open(url, data, timeout)
 File "C:\Python34\lib\urllib\request.py", line 469, in open
 response = meth(req, response)
 File "C:\Python34\lib\urllib\request.py", line 579, in http_response
 'http', request, response, code, msg, hdrs)
 File "C:\Python34\lib\urllib\request.py", line 507, in error
 return self._call_chain(*args)
 File "C:\Python34\lib\urllib\request.py", line 441, in _call_chain
 result = func(*args)
 File "C:\Python34\lib\urllib\request.py", line 587, in http_error_default
 raise HTTPError(req.full_url, code, msg, hdrs, fp)
 urllib.error.HTTPError: HTTP Error 403: Forbidden

I'm using this code:

from urllib.request import urlretrieve

def dl():
    with open('links.txt', 'r') as input_file:
        for line in input_file:
            URL = line.strip()  # drop the trailing newline read from the file
            IMAGE = URL.rsplit('/', 1)[1]  # last path segment as the file name
            urlretrieve(URL, IMAGE)


if __name__ == '__main__':
    dl()

I'm assuming it's because they don't allow 'bots' to access their website. With some research I found out there is a way around this, at least when using urlopen, but I can't manage to apply the workaround to my code, which uses urlretrieve. Is it possible to get this to work?
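
For reference, the urlopen workaround mentioned above could look roughly like the sketch below. It assumes the 403 comes from the site rejecting urllib's default User-Agent, which may not be the actual cause on every site. The helper names (build_request, filename_from_url, download) and the "Mozilla/5.0" value are made up for illustration and are not from the original script.

```python
import shutil
from urllib.parse import urlsplit
from urllib.request import Request, urlopen

# Assumed value: a browser-like string. Some sites reject urllib's default User-Agent.
USER_AGENT = "Mozilla/5.0"


def build_request(url):
    """Wrap the URL in a Request that carries a custom User-Agent header."""
    return Request(url, headers={"User-Agent": USER_AGENT})


def filename_from_url(url):
    """Use the last path segment of the URL (without any query string) as the file name."""
    return urlsplit(url).path.rsplit("/", 1)[-1]


def download(url, dest):
    """Stream the response body to dest, similar to urlretrieve but with custom headers."""
    with urlopen(build_request(url)) as response, open(dest, "wb") as out_file:
        shutil.copyfileobj(response, out_file)


def dl(links_file="links.txt"):
    """Download every non-empty URL listed in links_file, one URL per line."""
    with open(links_file, "r") as input_file:
        for line in input_file:
            url = line.strip()
            if url:  # skip blank lines
                download(url, filename_from_url(url))
```

Calling dl() then does the same job as the original loop. If keeping urlretrieve matters, another option is to create an opener with urllib.request.build_opener(), set its addheaders list to include a User-Agent entry, and register it with urllib.request.install_opener(); urlretrieve goes through urlopen, so it should pick up the installed opener.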
