Nemo

Reputation: 77

Using urllib2 to fetch internet resources returns HTTP 402 error

I tried to use urllib2 to fetch a zip file from a subtitle website.

The example website is http://sub.makedie.me and I tried to download this file http://sub.makedie.me/download/601943/Game%20of%20Thrones%20-%2005x08%20-%20Hardhome.KILLERS.English.HI.C.orig.Addic7ed.com.zip

I printed the URL from my script and it was correct. When I pasted it into a web browser, the file downloaded successfully.

At first, the script looked like this:

    import urllib2

    try:
        f = urllib2.urlopen(example_url)
        f.read()
        # something...
    except urllib2.HTTPError as e:
        # only HTTPError (a URLError subclass) carries a status code
        print e.code

But I got a 403 error code. After searching, I changed the request header to {'User-Agent': 'Mozilla/5.0'}, so the code became:

    try:
        req = urllib2.Request(example_url, headers={'User-Agent': 'Mozilla/5.0'})
        f = urllib2.urlopen(req)
        # something...
    except urllib2.HTTPError as e:
        print e.code

Then I got a 402 error. Is this caused by the website's settings or by an error in my code?

Upvotes: 0

Views: 1067

Answers (2)

Steve Barnes

Reputation: 28405

I would try with:

    urllib.urlretrieve(url, outname)

as you are trying to download the file rather than to open it.
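
A minimal sketch of that approach, written for Python 3, where `urllib.urlretrieve` lives at `urllib.request.urlretrieve`. The helper name `download` and the default `user_agent` value are assumptions for illustration. Installing an opener with a custom header is one way to make `urlretrieve` send a User-Agent, and it may or may not be enough for this particular site.

```python
import urllib.request


def download(url, outname, user_agent="Mozilla/5.0"):
    """Save the resource at `url` to the local path `outname`.

    Hypothetical helper: the name and the default header are illustrative.
    """
    opener = urllib.request.build_opener()
    # Replace the default "Python-urllib/x.y" User-Agent header.
    opener.addheaders = [("User-Agent", user_agent)]
    # urlretrieve goes through the globally installed opener,
    # so this changes process-wide state.
    urllib.request.install_opener(opener)
    path, _headers = urllib.request.urlretrieve(url, outname)
    return path
```

Calling `download(example_url, "subtitles.zip")` would write the response body straight to disk instead of holding it in memory.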

Upvotes: 1

Meghdeep Ray

Reputation: 5537

402 is the Payment Required status code.

It is reserved for future use.

From http://en.wikipedia.org/wiki/List_of_HTTP_status_codes :

402 Payment Required

Reserved for future use. The original intention was that this code might be used as part of some form of digital cash or micropayment scheme, but that has not happened, and this code is not usually used. YouTube uses this status if a particular IP address has made excessive requests, and requires the person to enter a CAPTCHA.

Hence there might be a CAPTCHA involved which is causing the issue.
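
To see what the server is actually sending back, it can help to look at the error response itself. The sketch below uses Python 3's `urllib.request` (the successor to `urllib2`); the function names `describe_failure` and `fetch` are hypothetical. The body of an error page may mention a CAPTCHA or rate limit, though that depends entirely on the site.

```python
import urllib.error
import urllib.request


def describe_failure(exc):
    """Return a one-line summary of a urllib failure (hypothetical helper)."""
    if isinstance(exc, urllib.error.HTTPError):
        # The server answered, so a status code and reason phrase exist.
        return "HTTP %d %s" % (exc.code, exc.reason)
    # No HTTP response at all (DNS failure, refused connection, ...).
    return "no response: %s" % (exc.reason,)


def fetch(url, user_agent="Mozilla/5.0"):
    """Return the response body as bytes, or None if the request failed."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()
    except urllib.error.URLError as exc:  # HTTPError is a subclass of URLError
        print(describe_failure(exc))
        if isinstance(exc, urllib.error.HTTPError):
            # An HTTPError is also a readable response; show the start of the body.
            print(exc.read()[:200])
        return None
```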

Check the robots.txt file for the site: www.domain_name.com/robots.txt
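
The check can also be done in code. This is a sketch using Python 3's standard `urllib.robotparser`; the helper `is_allowed` and the `SAMPLE_ROBOTS` rules are made up for illustration and are not the real rules of any site.

```python
import urllib.robotparser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /download/
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

To test against a live site instead of a string, `RobotFileParser.set_url(...)` followed by `read()` fetches and parses the real file. Note that robots.txt is advisory, so a disallowed path hints at the site's policy but does not by itself explain a 402 response.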

Upvotes: 1
