dmitriys

Reputation: 317

Catching SSLError due to insecure URL with requests in Python?

I have a list of a few thousand URLs and noticed that one of them throws an SSLError when passed to requests.get(). Below is my attempt to work around this, combining a solution suggested in this similar question with an attempt (which failed) to catch the error in a try/except block using ssl.SSLError:

import ssl

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

url = 'https://archyworldys.com/lidl-recalls-puff-pastry/'

session = requests.Session()
retry = Retry(connect=3, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

try:
    response = session.get(url, allow_redirects=False, verify=True)
except ssl.SSLError:
    pass

The error returned at the very end is:

SSLError: HTTPSConnectionPool(host='archyworldys.com', port=443): Max retries exceeded with url: /lidl-recalls-puff-pastry/ (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",),))

When I open the URL in Chrome, I get a "Not Secure" / "Privacy Error" page that blocks the site. However, if I try the URL with HTTP instead of HTTPS (e.g. 'http://archyworldys.com/lidl-recalls-puff-pastry/'), it works just fine in my browser. Per this question, setting verify to False solves the problem, but I would prefer a more secure workaround.

While I understand that a simple solution would be to remove the URL from my data, I'm trying to find a solution that lets me proceed (e.g. in a for loop) by simply skipping this bad URL and moving on to the next one.

Upvotes: 2

Views: 5913

Answers (1)

Steffen Ullrich

Reputation: 123330

The error I get when running your code is:

requests.exceptions.SSLError:
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)

Based on this, one needs to catch requests.exceptions.SSLError rather than ssl.SSLError, i.e.:

try:
    response = session.get(url, allow_redirects=False, verify=True)
except requests.exceptions.SSLError:
    pass
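
As a quick check of why the original handler never fires, the class relationships can be inspected directly. This is only a sketch against the requests exception hierarchy; the variable names are made up for illustration.

```python
import ssl

import requests

# requests raises its own SSLError type; check how it relates to ssl.SSLError
# and to the requests base classes (variable names are illustrative only).
caught_by_ssl_handler = issubclass(requests.exceptions.SSLError, ssl.SSLError)
is_connection_error = issubclass(
    requests.exceptions.SSLError, requests.exceptions.ConnectionError
)
is_request_exception = issubclass(
    requests.exceptions.SSLError, requests.exceptions.RequestException
)

print(caught_by_ssl_handler, is_connection_error, is_request_exception)
```

If the first value is False, an `except ssl.SSLError` clause cannot match the exception that requests raises, while catching requests.exceptions.SSLError (or one of its requests base classes) can.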

The error you get looks different, but this is probably because the code you show is not exactly the code you are running. In any case, look at the exact error message you get and work out from it which exception to catch. You can also catch a more general exception, as below, and print its type to find the exact exception class you need:

try:
    response = session.get(url, allow_redirects=False, verify=True)
except Exception as x:
    print(type(x), x)
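
For the loop case mentioned in the question, the same handler can wrap each request so that a URL with a bad certificate is recorded and skipped. The following is a minimal sketch, not a definitive implementation: `fetch_all` and its `get` parameter are hypothetical names introduced here, and `get` exists only so that `session.get` (or a stub) can be passed in.

```python
import requests


def fetch_all(urls, get=requests.get):
    """Fetch each URL, skipping those that fail certificate verification.

    Hypothetical helper for illustration; pass `session.get` as `get`
    to reuse a Session that already has retry adapters mounted.
    """
    responses = {}
    skipped = []
    for url in urls:
        try:
            responses[url] = get(url, allow_redirects=False, verify=True)
        except requests.exceptions.SSLError as exc:
            # Verification stays enabled; the bad URL is recorded and skipped.
            skipped.append((url, exc))
    return responses, skipped
```

With the session from the question, this could be called as `responses, skipped = fetch_all(url_list, get=session.get)`, where `url_list` is assumed to be your list of URLs. Keeping the skipped URLs alongside their exceptions makes it possible to review them afterwards instead of silently dropping them.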

Upvotes: 6
