Alg_D

Reputation: 2390

python - Opening HTTPS links fails with urllib

I am having trouble opening links that use the HTTPS protocol with urllib. I'm running Python 2.7.1 on Ubuntu, on a home network (no proxy).

It returns an error every time, but it works if I change the URL to HTTP. What am I missing here?

Code sample:

from BeautifulSoup import *

import urllib

url = "https://path/file.html"

html = urllib.urlopen(url).read()

Returned Error:

Traceback (most recent call last): 
  File "/home/.../links.py", line 4, in <module>
    html = urllib.urlopen(url).read()
  File "/usr/lib/python2.7/urllib.py", line 87, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 443, in open_https
    h.endheaders(data)
  File "/usr/lib/python2.7/httplib.py", line 1048, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 892, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 854, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 1273, in connect
    server_hostname=server_hostname)
  File "/usr/lib/python2.7/ssl.py", line 352, in wrap_socket
    _context=self)
  File "/usr/lib/python2.7/ssl.py", line 579, in __init__
    self.do_handshake()
  File "/usr/lib/python2.7/ssl.py", line 808, in do_handshake
    self._sslobj.do_handshake()
IOError: [Errno socket error] [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)

Upvotes: 0

Views: 2093

Answers (2)

Alg_D

Reputation: 2390

I found the solution: an SSL context has to be passed to urlopen.

This part was missing from my code:

import urllib
import ssl

# Build a TLS 1.2 context and hand it to urlopen via the context argument
scontext = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
req = urllib.urlopen(url, context=scontext)
html = req.read()

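This way it's possible to open HTTPS URLs. Note that an SSLContext created directly like this does not verify the server certificate by default, which is why the error goes away.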
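If certificate verification should stay enabled, a context from ssl.create_default_context() may be a safer starting point. The following is only a minimal sketch, assuming Python 2.7.9 or later (where urlopen accepts a context argument). The names build_verifying_context and fetch, and the cafile parameter, are hypothetical and only for illustration.

```python
import ssl

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2.7.9+ (assumed)


def build_verifying_context(cafile=None):
    """Return a context that checks the certificate chain and the hostname.

    cafile is an optional path to a PEM bundle of trusted CAs
    (hypothetical parameter; None means use the system defaults).
    """
    return ssl.create_default_context(cafile=cafile)


def fetch(url, cafile=None):
    """Open url with verification enabled and return the response body."""
    context = build_verifying_context(cafile)
    response = urlopen(url, context=context)
    try:
        return response.read()
    finally:
        response.close()
```

With this approach a server with a broken certificate would still raise CERTIFICATE_VERIFY_FAILED, so it only helps when the failure comes from a missing or outdated CA bundle on the client side.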

Upvotes: 3

furas

Reputation: 142641

This is not the best answer.

I have had this problem only with servers that use an incorrect SSL certificate, such as https://pygame.org/.

In requests there is an option to disable certificate verification.

import requests

r = requests.get("https://pygame.org", verify=False)

html = r.content

With verification enabled the script doesn't work. Without verification the script shows a warning but works.

But I didn't find this option in urllib.
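One possible way to get a similar effect with urllib, assuming Python 2.7.9 or later (where urlopen accepts a context argument), is to pass a context with verification switched off. This is only a sketch; build_unverified_context and fetch_without_verification are hypothetical names, and disabling verification removes protection against man-in-the-middle attacks.

```python
import ssl

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2.7.9+ (assumed)


def build_unverified_context():
    """Return a context that skips certificate and hostname checks."""
    context = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be CERT_NONE
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_without_verification(url):
    """Open url without verifying the server certificate; return the body."""
    response = urlopen(url, context=build_unverified_context())
    try:
        return response.read()
    finally:
        response.close()
```

For example, fetch_without_verification("https://pygame.org") would play the same role as the requests call with verify=False shown above.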

Upvotes: 0
