Reputation: 2677
I'm having trouble downloading https pages with the urllib2 module, which seems to result from urllib2's inability to access the system's certificate store.
To get around this issue, one possible solution is to download HTTPS pages with pycurl, using the CA bundle shipped with the certifi module. The following is an example of doing so:
def download_web_page_with_curl(url_website):
    from pycurl import Curl, CAINFO, URL
    from certifi import where
    from cStringIO import StringIO

    response = StringIO()
    curl = Curl()
    # Point pycurl at certifi's CA bundle so the server certificate can be verified.
    curl.setopt(CAINFO, where())
    curl.setopt(URL, url_website)
    # Collect the response body in the StringIO buffer.
    curl.setopt(curl.WRITEFUNCTION, response.write)
    curl.perform()
    curl.close()
    return response.getvalue()
Is there a way to use certifi with urllib2 (in a fashion comparable to the pycurl example above), which will permit me to download https sites? Alternatively, is there another feasible urllib2-based workaround which will remedy the permissions issue, without compromising security?
Upvotes: 4
Views: 1419
Reputation: 976
Would recommend using requests per my other answer. However, to answer the original question of how to do this with urllib2:
import urllib2
import certifi

def download_web_page_with_urllib2(url_website):
    # The cafile argument (available since Python 2.7.9) makes urlopen verify
    # the server's certificate against certifi's CA bundle.
    t = urllib2.urlopen(url_website, cafile=certifi.where())
    return t.read()

text = download_web_page_with_urllib2('https://www.google.com/')
The same recommendations about error checking apply.
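For reference, here is a minimal sketch of that error checking around the urllib2 version (the wrapper name is just for illustration): urlopen raises HTTPError for non-2xx responses and URLError for network problems, bad URLs, and failed certificate validation.

import urllib2
import certifi

def download_web_page_with_urllib2_checked(url_website):
    try:
        t = urllib2.urlopen(url_website, cafile=certifi.where())
        return t.read()
    except urllib2.HTTPError as e:
        # The server answered, but with a non-OK status code (e.g. 404).
        raise RuntimeError('Server returned status %d for %s' % (e.code, url_website))
    except urllib2.URLError as e:
        # Covers unreachable hosts, malformed URLs, and certificate failures.
        raise RuntimeError('Could not fetch %s: %s' % (url_website, e.reason))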
Upvotes: 4
Reputation: 976
Expanding on the comment to use requests (which is built on urllib3):
def download_web_page_with_requests(url_website):
    import requests
    # requests verifies HTTPS certificates by default, using certifi's
    # CA bundle when it is available.
    r = requests.get(url_website)
    return r.text
This is so much easier than anything else and properly handles SSL verification independently of the platform's own cert lists. If certifi is found, requests will use it automatically. If not, it silently falls back to a more limited, possibly older set of bundled root certs. If ensuring that certifi is used matters to you, you can do this:
r = requests.get(url_website, verify=certifi.where())
Note that the above code does not do the error checking that you should probably do. requests.get() can throw a number of exceptions for invalid URLs, unreachable sites, communication errors, and failed certificate validation, so you should be prepared to catch and deal with those. If it does successfully talk to a server, but the server returns a non-OK status code (such as for a non-existent page), then an exception won't be thrown, so you'd also want to check that r.status_code == 200.
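A minimal sketch of what that checking might look like (the wrapper name and the timeout value are just illustrative choices):

import requests

def download_web_page_checked(url_website):
    try:
        # Every requests failure mode (bad URL, unreachable host, timeout,
        # failed certificate validation) raises a subclass of RequestException.
        r = requests.get(url_website, timeout=10)
    except requests.exceptions.RequestException as e:
        raise RuntimeError('Could not fetch %s: %s' % (url_website, e))
    # A non-OK status (e.g. 404) does not raise an exception, so check it here.
    if r.status_code != 200:
        raise RuntimeError('Unexpected status %d for %s' % (r.status_code, url_website))
    return r.text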
Upvotes: 2