David

Reputation: 47

URLError and ssl.CertificateError

I have the following code:

from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from bs4 import BeautifulSoup

# target = "https://www.rolcruise.co.uk/cruise-detail/1158731-hawaii-round-trip-honolulu-2020-05-23"
target = "https://www.rolcruise.co.uk"

try:
    html = urlopen(target)
except HTTPError as e:
    print("You got an HTTP Error. Something wrong with the path.")
    print("Here is the error code: " + str(e.code))
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
except URLError as e:
    print("You got a URL Error. Something wrong with the URL.")
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
else:
    bs_obj = BeautifulSoup(html, features="lxml")
    print(bs_obj)

If I deliberately mistype certain parts of the URL, the URLError handling works fine, e.g. if I type "htps" instead of "https", "ww" instead of "www", or "u" instead of "uk". For example:

target = "https://www.rolcruise.co.u"

However, if the typo is in the hostname ("rolcruise") or in the "co" part of the URL, the URLError handler is not triggered and I get an ssl.CertificateError instead. For example:

target = "https://www.rolcruise.c.uk"

  1. Why doesn't URLError cover every case where there is a typo somewhere in the URL?

  2. Given that this happens, how should I handle the ssl.CertificateError?

Thanks for your help!

Upvotes: 1

Views: 99

Answers (1)

JacobIRR

Reputation: 8946

Start by importing ssl so the exception class is in your namespace:

import ssl

Then you can catch that kind of exception:

try:
    html = urlopen(target)
except HTTPError as e:
    print("You got an HTTP Error. Something wrong with the path.")
    print("Here is the error code: " + str(e.code))
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
except URLError as e:
    print("You got a URL Error. Something wrong with the URL.")
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
except ssl.CertificateError as e:
    # Handle the certificate failure here, e.g. report it and stop.
    print("You got a certificate error: " + str(e))
else:
    bs_obj = BeautifulSoup(html, features="lxml")
    print(bs_obj)
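
As a rough sketch of the same idea wrapped in a reusable function: depending on the Python version, a certificate failure may be raised directly as ssl.CertificateError (as in the question) or may arrive wrapped inside a URLError with the ssl exception stored in e.reason, so the version below checks for both. The function name fetch and its opener parameter are my own additions for illustration (the parameter only exists so the function can be exercised without a network connection); they are not part of urllib.

```python
import ssl
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch(url, opener=urlopen):
    """Return (body, error_message); exactly one of the two is None.

    `opener` is a hypothetical hook so the function can be tested
    offline; by default it is urllib.request.urlopen.
    """
    try:
        with opener(url) as response:
            return response.read(), None
    except HTTPError as e:
        # HTTPError is a subclass of URLError, so it must be caught first.
        return None, f"HTTP error {e.code}: {e.reason}"
    except URLError as e:
        # A certificate failure may arrive wrapped in URLError,
        # with the underlying ssl exception stored in e.reason.
        if isinstance(e.reason, ssl.CertificateError):
            return None, f"Certificate error: {e.reason}"
        return None, f"URL error: {e.reason}"
    except ssl.CertificateError as e:
        # Raised directly (unwrapped), as described in the question.
        return None, f"Certificate error: {e}"
```

Usage would look something like this, with the parsing step kept from the original code:

```python
body, error = fetch("https://www.rolcruise.co.uk")
if error:
    print(error)
else:
    print(BeautifulSoup(body, features="lxml"))
```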

Upvotes: 1
