Wizzard

Reputation: 12712

urllib2 failing with https websites

Using urllib2 and trying to get an https page, it keeps failing with

Invalid url, unable to resolve

The url is https://www.domainsbyproxy.com/default.aspx but I have this happening on multiple https sites.

I am using Python 2.7, and below is the code I am using to set up the connection

opener = urllib2.OpenerDirector()
opener.add_handler(urllib2.HTTPHandler())
opener.add_handler(urllib2.HTTPDefaultErrorHandler())
opener.addheaders = [('Accept-encoding', 'gzip')]
fetch_timeout = 12
response = opener.open(url, None, fetch_timeout)

I am setting up the handlers manually because I don't want redirects followed (and that part works fine). The code above works for http requests; https requests, however, fail.

Any clues?

Upvotes: 1

Views: 5603

Answers (2)

Burhan Khalid

Reputation: 174748

If you don't mind external libraries, consider the excellent requests module. It takes care of these urllib2 quirks for you.

Your code, using requests, becomes:

import requests
# requests follows redirects by default; pass allow_redirects=False
# to match your no-redirect setup
r = requests.get(url, headers={'Accept-encoding': 'gzip'}, timeout=12,
                 allow_redirects=False)

Upvotes: 2

drewag

Reputation: 94803

You should be using HTTPSHandler in addition to HTTPHandler. Your opener only registers a handler for the http scheme, so it has no idea how to open https URLs.
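A minimal sketch of the question's setup with the https handler added (the try/except import is only so the same snippet also runs under Python 3, where these classes live in urllib.request; the actual fetch is commented out since it needs network access):

```python
try:
    import urllib2  # Python 2.7, as in the question
except ImportError:
    import urllib.request as urllib2  # same classes under Python 3

opener = urllib2.OpenerDirector()
opener.add_handler(urllib2.HTTPHandler())
opener.add_handler(urllib2.HTTPSHandler())  # without this, https URLs fail
opener.add_handler(urllib2.HTTPDefaultErrorHandler())
opener.addheaders = [('Accept-encoding', 'gzip')]

# response = opener.open('https://www.domainsbyproxy.com/default.aspx', None, 12)
```

Redirects still aren't followed, because no HTTPRedirectHandler is registered.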

Upvotes: 6
