Why does Python requests get a 404 error?

Question

I am trying to use the requests library to get content from a URL. In more detail, I do it in the following way:

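import requests

url = 'http://example.com/'  # placeholder; the actual URL is not shown in the question
# Route plain-HTTP requests through the proxy
proxies = {'http': 'my_proxy.blabla.com/'}
r = requests.get(url, proxies=proxies)
print(r.text)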

As a result I get the following:

  404 - Not Found
  404 - Not Found

So, it looks like the proxy let me through and I reached the server. However, the web server could not interpret my request (wrong path or similar) and did not know what content to return. Am I interpreting this correctly?

What could be the reason for this? I do get the expected content if I open the URL in one of my browsers.

ADDED

It has been suggested in the comments that the root of the problem is in the headers. So, I used this web site: http://www.procato.com/my+headers/ to find out which headers my browser sends. I used these values to set the headers argument passed to the requests.get function. I set values for the following keys: 'User-Agent', 'Accept', 'Referer', 'Accept-Encoding', 'Accept-Language', 'X-Forwarded-For', 'Cache-Control', 'Connection'. Unfortunately, this did not resolve the problem. I still get the same 404 response.
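For reference, a minimal sketch of how such headers can be passed to requests.get. The helper names build_browser_headers and fetch are hypothetical, and the header values are illustrative assumptions, not the exact values my browser sent:

```python
def build_browser_headers(user_agent, referer=None):
    """Return a dict of browser-like headers for requests.get(..., headers=...)."""
    # All values below are illustrative assumptions, not taken from a real browser.
    headers = {
        'User-Agent': user_agent,
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.5',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
    }
    if referer is not None:
        headers['Referer'] = referer
    return headers


def fetch(url, proxies, headers):
    """Send a GET request through the given proxies with the given headers."""
    import requests  # imported here so the helper above needs no third-party package
    return requests.get(url, proxies=proxies, headers=headers)
```

Calling fetch(url, proxies, build_browser_headers('Mozilla/5.0')) would send the request with these headers through the proxy.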

ADDED 2

I have tested my function with two different URLs and got exactly the same response. So, my previous assumption that the response (the XML that I see) comes from the web server is probably wrong. It is unlikely that two completely different web servers (one of them was Google) generate the same response.

So now I do not understand where the XML comes from. Could it come from the proxy server?
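One way to look into this might be to inspect the response headers, since 'Server' and 'Via' often hint at which machine produced the reply. The sketch below is only an assumption about what could help: the function names origin_hints and fetch_and_report are hypothetical, the list of header names is a guess, and a proxy is not required to set any of them.

```python
def origin_hints(status_code, headers):
    """Collect response headers that hint at which machine produced the reply."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    hints = {'status': status_code}
    # Assumed list of informative headers; a proxy may set none of them.
    for name in ('server', 'via', 'proxy-agent', 'x-cache'):
        if name in lowered:
            hints[name] = lowered[name]
    return hints


def fetch_and_report(url, proxies):
    """Fetch url through the proxies and return the hints for the response."""
    import requests  # imported here so origin_hints needs no third-party package
    r = requests.get(url, proxies=proxies)
    return origin_hints(r.status_code, r.headers)
```

If the hints are identical for two unrelated URLs, that would support the idea that the proxy, not the target server, generated the 404 page.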
