Reputation: 24731
I'm using the Requests library in Python. In the browser, my URL loads okay. In Python, the same request comes back with a 403:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access /admin/license.php on this server.</p>
<p>Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.</p>
</body></html>
This is my own site, and I don't have any robot protection on it that I know of. I made the PHP file that I'm loading and it's just a simple database query. In the root of the site, I have a WordPress site with default settings. However, I'm not sure if that's relevant.
My code:
import requests
url = "myprivateurl.com"
r = requests.get(url)
print r.text
Does anyone have any guesses why it returns a 403 from Python but not from the browser?
Thanks so much.
Upvotes: 4
Views: 4452
Reputation: 4406
Adding headers to the request worked for me:
import urllib.request

def fetch_html(url):
    req = urllib.request.Request(url)
    # Send a browser-like User-Agent instead of urllib's default one
    req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7')
    response = urllib.request.urlopen(req)
    data = response.read() # a `bytes` object
    html = data.decode('utf-8') # a `str`; this step can't be used if data is binary
    return html
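Since the question uses Requests rather than urllib, the same idea might look like the sketch below. The `BROWSER_HEADERS` dict and the `fetch` helper are names I made up for illustration, and the User-Agent string is just an example value. Whether a browser-like header is enough depends on what the server is actually filtering on.

```python
import requests

# Illustrative browser-like headers; the exact User-Agent value is an assumption.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) "
        "Gecko/2009021910 Firefox/3.0.7"
    ),
}


def fetch(url, timeout=10):
    """GET `url` with browser-like headers and return the Response object."""
    response = requests.get(url, headers=BROWSER_HEADERS, timeout=timeout)
    # Show what was actually sent and what came back, which helps when
    # comparing against the request the browser makes.
    print(response.status_code)
    print(response.request.headers.get("User-Agent"))
    return response
```

Calling `fetch("http://myprivateurl.com/")` and comparing the printed status code with and without the `headers=` argument should show whether the User-Agent is what the server objects to.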
Upvotes: 4
Reputation: 24731
After I contacted my web host and had the ticket escalated to level 2 support, they disabled mod_security, and it works fine now. I'm not sure whether disabling it is a bad thing, but that fixed it.
Upvotes: 3
Reputation: 189337
myprivateurl.com is not a valid URL. Firefox goes through a number of user-friendly behaviors to guess at what you actually mean, and (depending somewhat on resolver results, etc.) eventually ends up at something like http://myprivateurl.com/. Requests does not do this; you have to pass in a real, valid URL.
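A minimal sketch of making the scheme explicit before handing the URL to Requests. `ensure_scheme` is a hypothetical helper, not part of Requests, and defaulting to `http` is an assumption; use `https` if that is what the site serves.

```python
def ensure_scheme(url, default_scheme="http"):
    """Return `url` unchanged if it already has a scheme, else prepend one."""
    if "://" in url:
        return url
    return f"{default_scheme}://{url}"


# The bare host from the question gets an explicit scheme;
# a URL that already has one is left alone.
print(ensure_scheme("myprivateurl.com"))
print(ensure_scheme("https://myprivateurl.com/admin/license.php"))
```

The result can then be passed straight to `requests.get(...)`.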
Upvotes: 0