Reputation: 47
I have the following code:
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from bs4 import BeautifulSoup

# target = "https://www.rolcruise.co.uk/cruise-detail/1158731-hawaii-round-trip-honolulu-2020-05-23"
target = "https://www.rolcruise.co.uk"

try:
    html = urlopen(target)
except HTTPError as e:
    # The server answered, but with an error status (e.g. 404).
    print("You got a HTTP Error. Something wrong with the path.")
    print("Here is the error code: " + str(e.code))
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
except URLError as e:
    # The request never got a valid response (bad scheme, unknown host, ...).
    print("You got a URL Error. Something wrong with the URL.")
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
else:
    bs_obj = BeautifulSoup(html, features="lxml")
    print(bs_obj)
If I deliberately mistype certain parts of the URL, the URLError handling works fine, e.g. if I type "htps" instead of "https", "ww" instead of "www", or "u" instead of "uk":
target = "https://www.rolcruise.co.u"
However, if the typo is in the hostname ("rolcruise") or in the "co" part of the URL, the URLError handler does not catch it, and I get an ssl.CertificateError instead, e.g.
target = "https://www.rolcruise.c.uk"
I do not understand why URLError doesn't cover every case where there is a typo somewhere in a URL.
Given that it is happening, what is the next move to handle the ssl.CertificateError?
Thanks for your help!
Upvotes: 1
Views: 99
Reputation: 8946
First, import ssl so that the exception class is in your namespace:
import ssl
Then you can catch that kind of exception:
try:
    html = urlopen(target)
except HTTPError as e:
    print("You got a HTTP Error. Something wrong with the path.")
    print("Here is the error code: " + str(e.code))
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
except URLError as e:
    print("You got a URL Error. Something wrong with the URL.")
    print("Here is the error reason: " + str(e.reason))
    print("Happy for the program to end here")
except ssl.CertificateError as e:
    # Do your stuff here...
    print("You got a certificate error: " + str(e))
else:
    bs_obj = BeautifulSoup(html, features="lxml")
    print(bs_obj)
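As for why URLError did not catch it: as far as I know, before Python 3.7 ssl.CertificateError was a ValueError subclass, so urlopen did not wrap it in URLError. From 3.7 on it is an alias for ssl.SSLCertVerificationError, which is an OSError, and urlopen generally re-raises those as URLError with the original exception in e.reason. A mistyped hostname can still resolve to some server whose certificate does not match the name you typed, which is why you see a certificate error rather than a lookup failure. Here is a minimal sketch that copes with both behaviours. The helper name describe_fetch_error is hypothetical, not part of urllib.

```python
import ssl
from urllib.error import HTTPError, URLError


def describe_fetch_error(exc):
    """Return a short label for an exception raised by urlopen().

    describe_fetch_error is a made-up helper for illustration only.
    """
    # HTTPError is a subclass of URLError, so it must be tested first.
    if isinstance(exc, HTTPError):
        return "http error {}".format(exc.code)
    # Older Pythons let ssl.CertificateError escape from urlopen() directly.
    if isinstance(exc, ssl.CertificateError):
        return "certificate error"
    if isinstance(exc, URLError):
        # Newer Pythons may wrap the certificate failure; look inside .reason.
        if isinstance(exc.reason, ssl.CertificateError):
            return "certificate error"
        return "url error"
    return "unexpected error"
```

You would call it from a single handler, for example `except (URLError, ssl.CertificateError) as e: print(describe_fetch_error(e))`, so the same code behaves consistently whichever Python version raises the error.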
Upvotes: 1