Reputation: 2390
I am having trouble opening links that use the HTTPS protocol with urllib. I'm running Python 2.7.1 on Ubuntu, on a home network (no proxy).
It returns an error every time, but it works if I change to HTTP. What am I missing here?
Code sample:
from BeautifulSoup import *
import urllib
url = "https://path/file.html"
html = urllib.urlopen(url).read()
Returned Error:
Traceback (most recent call last):
File "/home/.../links.py", line 4, in <module>
html = urllib.urlopen(url).read()
File "/usr/lib/python2.7/urllib.py", line 87, in urlopen
return opener.open(url)
File "/usr/lib/python2.7/urllib.py", line 213, in open
return getattr(self, name)(url)
File "/usr/lib/python2.7/urllib.py", line 443, in open_https
h.endheaders(data)
File "/usr/lib/python2.7/httplib.py", line 1048, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 892, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 854, in send
self.connect()
File "/usr/lib/python2.7/httplib.py", line 1273, in connect
server_hostname=server_hostname)
File "/usr/lib/python2.7/ssl.py", line 352, in wrap_socket
_context=self)
File "/usr/lib/python2.7/ssl.py", line 579, in __init__
self.do_handshake()
File "/usr/lib/python2.7/ssl.py", line 808, in do_handshake
self._sslobj.do_handshake()
IOError: [Errno socket error] [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)
Upvotes: 0
Views: 2093
Reputation: 2390
I found the solution: an SSL context has to be passed to urlopen.
This part was missing in my code!
import ssl
import urllib
url = "https://path/file.html"
# Pass an explicit SSL context (the context argument needs Python 2.7.9 or later)
scontext = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
req = urllib.urlopen(url, context=scontext)
html = req.read()
This way it is possible to open HTTPS URLs.
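As a side note, it may help to see why passing a hand-built context makes the error go away. The sketch below is an illustration and not part of the original answer. It compares a context created directly from a protocol constant with one created by ssl.create_default_context(); the names bare_context and default_context are made up for this example. A context built directly from PROTOCOL_TLSv1_2 should start with certificate verification switched off, which would explain why the handshake no longer fails.

```python
import ssl
import warnings

# Context built directly from a protocol constant, as in the answer above.
# Newer Python versions flag this constant as deprecated, so the warning
# is silenced here to keep the comparison quiet.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    bare_context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)

# Context built by the standard-library helper, which enables
# certificate and hostname verification.
default_context = ssl.create_default_context()

# Compare the verification settings of the two contexts.
print(bare_context.verify_mode, bare_context.check_hostname)
print(default_context.verify_mode, default_context.check_hostname)
```

If the goal is to keep verification on, passing default_context instead would be the closer match, though it would presumably still fail against a server whose certificate is really invalid.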
Upvotes: 3
Reputation: 142641
This is not the best answer.
I had this problem only with servers that use an incorrect SSL certificate, such as https://pygame.org/.
In requests there is an option to disable certificate verification.
import requests
r = requests.get("https://pygame.org", verify=False)
html = r.content
With verification the script doesn't work. Without verification the script shows a warning but works.
But I didn't find this option in urllib.
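For what it's worth, something close to verify=False seems possible with the standard library by passing an SSL context that has verification turned off. This is only a sketch under assumptions: the helper names make_unverified_context and fetch_without_verification are hypothetical, and the context argument of urlopen needs Python 2.7.9 or later (or Python 3.4.3 or later). Disabling verification removes the protection against impersonated servers, so it is best kept for testing.

```python
import ssl

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2.7.9 or later


def make_unverified_context():
    """Return an SSL context that skips certificate checks (hypothetical helper)."""
    context = ssl.create_default_context()
    # check_hostname has to be disabled before verify_mode can be CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_without_verification(url):
    """Download url without verifying the server certificate (hypothetical helper)."""
    return urlopen(url, context=make_unverified_context()).read()
```

A call would then look like html = fetch_without_verification("https://pygame.org"), mirroring the requests example above.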
Upvotes: 0