Double AA

Reputation: 5969

Sensing password protected sites when using urllib in python

Hi, I have a long list of image URLs (e.g. site.com/pic.jpg) which my program (Python v2.6) retrieves in order using urllib.urlretrieve(). Sometimes a URL prompts me for a username and password. I wrapped urllib.urlretrieve() in a try/except to skip those URLs, but I still have to type in a fake username and password to trigger the error that the try/except catches. Is there a way to detect a password request and skip the URL automatically? It's a very long list and I don't want to sit here the whole time pressing Enter occasionally. Thanks

Upvotes: 2

Views: 1205

Answers (1)

Brent Newey

Reputation: 4509

If the site has HTTP authentication, you need to add a header to your request to insert a username and password (fake or otherwise). Here's how you can do this using urllib2.

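import base64
import urllib2

url = 'http://site.com/pic.jpg'  # example URL from the question

# b64encode, unlike encodestring, does not append a newline that would corrupt the header
credentials = base64.b64encode('[username]:[password]')
headers = {'Authorization': 'Basic ' + credentials}
# No data argument is passed, so this stays a GET request
req = urllib2.Request(url, headers=headers)
resp = urllib2.urlopen(req).read()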

This will raise urllib2.HTTPError: HTTP Error 401: Unauthorized if the username/password is incorrect, but the server will ignore the Authorization header if authentication is not required.
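
To tie this back to the question, here is a minimal sketch of skipping protected URLs by catching the 401 error. The function name fetch_unprotected and its opener parameter are hypothetical and only there for illustration; the import fallback is an assumption so the same sketch also runs on Python 3, where these names live in urllib.request and urllib.error.

```python
try:  # Python 2, as used in the question
    from urllib2 import urlopen, HTTPError
except ImportError:  # Python 3 moved these names
    from urllib.request import urlopen
    from urllib.error import HTTPError


def fetch_unprotected(urls, opener=urlopen):
    """Fetch each URL in order, skipping any that answer 401 Unauthorized.

    Returns (fetched, skipped): fetched maps url -> response body,
    skipped lists the URLs that asked for credentials.
    `opener` is a hypothetical hook so the network call can be swapped out.
    """
    fetched, skipped = {}, []
    for url in urls:
        try:
            fetched[url] = opener(url).read()
        except HTTPError as err:
            if err.code == 401:
                skipped.append(url)  # password protected: move on
            else:
                raise  # other HTTP errors are not silently swallowed
    return fetched, skipped
```

As far as I know, urllib2.urlopen does not prompt on the terminal the way urllib.urlretrieve does, so a loop like this should run unattended. Something like fetched, skipped = fetch_unprotected(list_of_urls) would then leave the protected URLs in skipped for later inspection.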

Upvotes: 2
