Matthew H

Reputation: 5879

How to get the URL of a redirect with Python

In Python, I'm using urllib2 to open a URL. This URL redirects to another URL, which redirects to yet another URL.

I want to print the URL after each redirect.

For example

-> = redirects to

A -> B -> C -> D

I want to print the URL of B, C and D (A is already known because it's the start URL).

Upvotes: 35

Views: 43307

Answers (3)

jadelord

Reputation: 1765

In Python 3, getting the final URL with urllib is much simpler (this returns only the last URL in the chain, not the intermediate ones):

import urllib.request


def resolve(url):
    return urllib.request.urlopen(url).geturl()

Upvotes: 5

Wooble

Reputation: 89897

Probably the best way is to subclass urllib2.HTTPRedirectHandler. Dive Into Python's chapter on redirects may be helpful.

Upvotes: 10

chmullig

Reputation: 13406

You can easily get D by just asking for the current URL.

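import urllib2

# datagen and headers are the optional request body and extra headers; omit both for a plain GET
req = urllib2.Request(starturl, datagen, headers)
res = urllib2.urlopen(req)  # follows every redirect automatically
finalurl = res.geturl()     # URL of the final response (D)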

To capture the intermediate redirects, you'll probably need to build your own opener with an HTTPRedirectHandler subclass that records each redirect.
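
A recording handler along those lines could look roughly like this. It is only a sketch, written against Python 3's urllib.request (in Python 2 the same class and method live in urllib2). The names RedirectRecorder and trace_redirects are made up for illustration and are not part of the library.

```python
import urllib.request


class RedirectRecorder(urllib.request.HTTPRedirectHandler):
    """Redirect handler that remembers every URL it is redirected to."""

    def __init__(self):
        super().__init__()
        self.redirects = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # The default handler calls this once per redirect response;
        # newurl is the already-resolved (absolute) redirect target.
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is not None:
            self.redirects.append(newurl)
        return new_req


def trace_redirects(url):
    """Open url and return (list of redirect targets, final URL)."""
    recorder = RedirectRecorder()
    # Passing an HTTPRedirectHandler subclass instance replaces the default one.
    opener = urllib.request.build_opener(recorder)
    with opener.open(url) as response:
        final_url = response.geturl()
    return recorder.redirects, final_url
```

For the chain A -> B -> C -> D from the question, trace_redirects(A) should give back the list [B, C, D] plus D as the final URL, so printing each entry of the list prints every hop. Redirect loops and overly long chains are still rejected by the base class, since the override only adds bookkeeping.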

Upvotes: 48
