Reputation: 101083
I am trying to open an https URL using the urlopen method in Python 3's urllib.request module. It seems to work fine, but the documentation warns that "[i]f neither cafile nor capath is specified, an HTTPS request will not do any verification of the server's certificate".
I am guessing I need to specify one of those parameters if I don't want my program to be vulnerable to man-in-the-middle attacks, problems with revoked certificates, and other vulnerabilities.
cafile and capath are supposed to point to a list of certificates. Where am I supposed to get this list from? Is there any simple and cross-platform way to use the same list of certificates that my OS or browser uses?
Upvotes: 11
Views: 25478
Reputation: 11
I was looking for a way to make this work out-of-the-box, without installing new modules.
I noticed that pip itself maintains an internal certifi module (see Lib/site-packages/pip/_vendor/certifi). Using this one removes the need to install certifi yourself (pip is still required, but it's likely that everyone has it):
import ssl
from urllib import request
from pip._vendor import certifi  # use embedded pip._vendor.certifi
ctx = ssl.create_default_context(cafile=certifi.where())
with request.urlopen('https://your-url', context=ctx) as req:
    req.read()
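Since pip._vendor is an internal package rather than a supported API, its layout may change between pip releases. One option is to prefer a normally installed certifi and only fall back to the vendored copy. A minimal sketch, where the helper name find_certifi_bundle is made up for illustration:

```python
import importlib


def find_certifi_bundle():
    """Return the path to a certifi CA bundle, or None if none is importable."""
    # Prefer a regular certifi install; fall back to the copy vendored in pip.
    for module_name in ("certifi", "pip._vendor.certifi"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        return module.where()
    return None


bundle_path = find_certifi_bundle()
print(bundle_path)
```

If the result is None, neither package could be imported and the caller has to find a bundle some other way.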
Upvotes: 1
Reputation: 7995
To open an https URL in Python with validation using system certificates (e.g. on Windows or macOS), use:
import ssl
from urllib.request import urlopen
ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
response = urlopen("https://www.example.com", context=ctx)
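To see what this default context trusts on a given machine, the ssl module can report its verification settings and the locations OpenSSL searches. A small sketch using only the standard library; the printed values differ per platform, and on Windows the certificates come from the system store rather than from these paths:

```python
import ssl

ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)

# Verification is on by default for a context created this way.
print(ctx.verify_mode == ssl.CERT_REQUIRED)
print(ctx.check_hostname)

# File and directory that OpenSSL consults for trusted CA certificates.
paths = ssl.get_default_verify_paths()
print(paths.cafile, paths.capath)

# CA certificates already loaded into the context. Certificates in a
# capath directory are loaded lazily, so this number can be 0.
print(ctx.cert_store_stats()["x509_ca"])
```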
If there are no system certificates or they aren't in a reliable location, you can use certificates bundled with the certifi package:
import certifi  # new
import ssl
from urllib.request import urlopen
ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
ctx.load_verify_locations(cafile=certifi.where())  # new
response = urlopen("https://www.example.com", context=ctx)
If you additionally want to allow users to specify their own certificates (in case the certifi-bundled certificates become out of date), you can let them point the $SSL_CERT_FILE environment variable at a certificate bundle (a convention originating from the OpenSSL library):
import certifi
import os  # new
import ssl
from urllib.request import urlopen
ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
ctx.load_verify_locations(cafile=certifi.where())
if (cafile := os.environ.get('SSL_CERT_FILE')) is not None:  # new
    ctx.load_verify_locations(cafile=cafile)  # new
response = urlopen("https://www.example.com", context=ctx)
All of the above should work on Python 3.8+, or on Python 3.4+ if you rewrite the use of the walrus operator (:=).
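The three steps can also be folded into one helper. The sketch below is one possible arrangement, not the only one: the function name build_ssl_context is invented here, certifi is treated as optional, and the walrus operator is avoided so that it also runs on older Python 3 versions.

```python
import os
import ssl


def build_ssl_context():
    """Build a verifying context from system, certifi, and user-supplied CAs."""
    # Start from the system trust store.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)

    # Add certifi's bundle when the package is installed.
    try:
        import certifi
    except ImportError:
        pass
    else:
        ctx.load_verify_locations(cafile=certifi.where())

    # Let the user add a bundle through the SSL_CERT_FILE variable.
    cafile = os.environ.get("SSL_CERT_FILE")
    if cafile:
        ctx.load_verify_locations(cafile=cafile)
    return ctx
```

The returned context can then be passed to urlopen through its context argument, as in the snippets above.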
Upvotes: 0
Reputation: 1999
Different Linux distributions use different names for the certificate bundle file. I tested on CentOS and Ubuntu. These certificate bundles are updated along with the system, so you can simply detect which bundle is available and use it with urlopen.
import os
cafile = None
for i in [
    '/etc/ssl/certs/ca-bundle.crt',
    '/etc/ssl/certs/ca-certificates.crt',
]:
    if os.path.exists(i):
        cafile = i
        break
if cafile is None:
    raise RuntimeError('System CA-certificates bundle not found')
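Once a bundle path is found, it can be handed to ssl.create_default_context. A sketch under the assumption that only the two paths listed in this answer matter; the function names are hypothetical, and other distributions may keep their bundle elsewhere:

```python
import os
import ssl

CANDIDATE_BUNDLES = [
    "/etc/ssl/certs/ca-bundle.crt",
    "/etc/ssl/certs/ca-certificates.crt",
]


def find_ca_bundle(candidates=CANDIDATE_BUNDLES):
    """Return the first existing file among candidates, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def context_for_bundle(cafile):
    """Create a verifying context that trusts the CAs in cafile."""
    return ssl.create_default_context(cafile=cafile)


cafile = find_ca_bundle()
if cafile is not None:
    ctx = context_for_bundle(cafile)
    # ctx can now be passed as urlopen(url, context=ctx)
```

Returning None instead of raising leaves it to the caller to decide whether a missing bundle is fatal.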
Upvotes: 1
Reputation: 1685
import certifi
import ssl
import urllib.request
try:
    from urllib.request import HTTPSHandler
    context = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
    context.options |= ssl.OP_NO_SSLv2
    context.verify_mode = ssl.CERT_REQUIRED
    context.load_verify_locations(certifi.where(), None)
    https_handler = HTTPSHandler(context=context, check_hostname=True)
    opener = urllib.request.build_opener(https_handler)
except ImportError:
    opener = urllib.request.build_opener()
# YOUR_USER_AGENT is a placeholder for your own user-agent string
opener.addheaders = [('User-agent', YOUR_USER_AGENT)]
urllib.request.install_opener(opener)
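On current Python 3 versions, a similar opener can be built with ssl.create_default_context, which already sets CERT_REQUIRED and hostname checking. This is a sketch rather than a drop-in replacement: it trusts the system store instead of certifi, and the user-agent string is a placeholder.

```python
import ssl
import urllib.request

# create_default_context enables certificate and hostname verification.
context = ssl.create_default_context()

https_handler = urllib.request.HTTPSHandler(context=context)
opener = urllib.request.build_opener(https_handler)
opener.addheaders = [("User-Agent", "example-client/0.1")]  # placeholder value

# After this call, plain urllib.request.urlopen() uses the opener above.
urllib.request.install_opener(opener)
```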
Upvotes: 2
Reputation: 432
Works in Python 2.7 (with ssl, certifi and urllib2 imported; urllib2 is the Python 2 counterpart of urllib.request):
context = ssl.create_default_context(cafile=certifi.where())
req = urllib2.urlopen(urllib2.Request(url, body, headers), context=context)
Upvotes: 12
Reputation: 2817
Elias Zamaria's answer still works, but gives a deprecation warning:
DeprecationWarning: cafile, capath and cadefault are deprecated, use a custom context instead.
I was able to solve the same problem this way instead (using Python 3.7.0):
import ssl
import urllib.request
# create_default_context() enables certificate and hostname verification
ssl_context = ssl.create_default_context()
response = urllib.request.urlopen("https://www.example.com", context=ssl_context)
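Whichever way a context is constructed, it is worth checking that it really verifies the server, since a bare ssl.SSLContext does not necessarily enable verification. A minimal sketch; the helper name verifies_server is made up for this example:

```python
import ssl


def verifies_server(ctx):
    """Return True if ctx requires a valid certificate and checks the hostname."""
    return ctx.verify_mode == ssl.CERT_REQUIRED and ctx.check_hostname


default_ctx = ssl.create_default_context()
print(verifies_server(default_ctx))

# Turning verification off by hand, as a contrast (not recommended).
insecure_ctx = ssl.create_default_context()
insecure_ctx.check_hostname = False  # must be disabled before CERT_NONE
insecure_ctx.verify_mode = ssl.CERT_NONE
print(verifies_server(insecure_ctx))
```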
Upvotes: 5
Reputation: 101083
I found a library that does what I'm trying to do: Certifi. It can be installed by running pip install certifi from the command line.
Making requests and verifying them is now easy:
import certifi
import urllib.request
urllib.request.urlopen("https://example.com/", cafile=certifi.where())
As I expected, this returns an HTTPResponse object for a site with a valid certificate and raises an ssl.CertificateError exception for a site with an invalid certificate.
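On recent Python 3 versions, a verification failure raised inside urlopen usually reaches the caller wrapped in urllib.error.URLError, with the SSL exception in its reason attribute. The following sketch shows one way to tell certificate failures from other errors; is_certificate_failure is a hypothetical helper, and the exact exception type may vary between Python versions.

```python
import ssl
import urllib.error


def is_certificate_failure(exc):
    """Return True if exc is, or wraps, a certificate verification error."""
    # Direct case: the SSL error was raised without being wrapped.
    if isinstance(exc, ssl.CertificateError):
        return True
    # Wrapped case: URLError keeps the underlying error in .reason.
    reason = getattr(exc, "reason", None)
    return isinstance(reason, ssl.CertificateError)
```

It could be called from an except urllib.error.URLError as exc: handler around urlopen to decide whether to report an untrusted certificate or a generic network problem.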
Upvotes: 8
Reputation: 123320
You can download Mozilla's certificates in a format usable by urllib (e.g. PEM format) at http://curl.haxx.se/docs/caextract.html
Upvotes: 2