Mechanize response returns no content

Question

I'm using Mechanize in Python to do some web scraping. Most of the website works, but one particular page comes back with no content: the response body is empty.

My settings are

self._browser = mechanize.Browser()
self._browser.set_handle_refresh(True)  
self._browser.set_debug_responses(True)
self._browser.set_debug_redirects(True)  
self._browser.set_debug_http(True)

and the code to execute is:

response = self._browser.open(url)

This is the debug output:

add_cookie_header
Checking xyz.com for cookies to return
- checking cookie path=/
 - checking cookie 
   it's a match
send: 'GET /page.aspx?leagueID=39 HTTP/1.1
Accept-Encoding: identity
Host: xyz.com
Cookie: ASP.NET_SessionId=aapg9wnavh3yqyrtg1v3ar45
Connection: close
User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2

'
reply: 'HTTP/1.1 200 OK
'
header: Date: Tue, 07 Feb 2012 19:04:37 GMT
header: Pragma: no-cache
header: Expires: -1
header: Connection: close
header: Cache-Control: no-cache
header: Content-Length: 0
extract_cookies: Date: Tue, 07 Feb 2012 19:04:37 GMT
Pragma: no-cache
Expires: -1
Connection: close
Cache-Control: no-cache
Content-Length: 0

I've tried with and without redirect handling, to no avail. Any ideas?

I might add the page works fine in a browser.

jcollado · Accepted Answer

The usual procedure for finding the problem is:

1. Capture your web browser's traffic when it successfully opens the URL.
2. Capture the Python traffic when your script tries to open the URL.
3. Compare the two captures and look for differences in the requests.

For the first step, there are many tools available. For example, in Firefox, HttpFox and Live HTTP Headers can be quite useful.

For the second step, programmatically logging the headers being sent/received should be enough.

For both steps, you can also capture traffic on your network interface with something like Wireshark.
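
Once both captures are available, the comparison can be done by hand or with a few lines of Python. The following is only a minimal sketch: `diff_headers` is a hypothetical helper, the script headers are copied from the debug output in the question, and the browser headers are made-up placeholder values that you would replace with what your own capture shows.

```python
def diff_headers(browser_headers, script_headers):
    """Compare two request header dicts.

    Returns (missing, changed):
      missing - headers the browser sent that the script did not
      changed - headers both sent, but with different values
    """
    # HTTP header names are case-insensitive, so normalize them first.
    browser = {name.lower(): value for name, value in browser_headers.items()}
    script = {name.lower(): value for name, value in script_headers.items()}

    missing = {name: value for name, value in browser.items()
               if name not in script}
    changed = {name: (browser[name], script[name])
               for name in browser
               if name in script and browser[name] != script[name]}
    return missing, changed


USER_AGENT = ("Mozilla/5.0 (Windows NT 6.0) AppleWebKit/535.2 "
              "(KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2")

# Headers sent by mechanize, taken from the debug output in the question.
script_headers = {
    "Accept-Encoding": "identity",
    "Host": "xyz.com",
    "Cookie": "ASP.NET_SessionId=aapg9wnavh3yqyrtg1v3ar45",
    "Connection": "close",
    "User-Agent": USER_AGENT,
}

# ASSUMPTION: placeholder values standing in for a real browser capture.
browser_headers = {
    "Host": "xyz.com",
    "Connection": "keep-alive",
    "User-Agent": USER_AGENT,
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Encoding": "gzip,deflate",
    "Accept-Language": "en-US,en;q=0.8",
    "Referer": "http://xyz.com/",
    "Cookie": "ASP.NET_SessionId=aapg9wnavh3yqyrtg1v3ar45",
}

missing, changed = diff_headers(browser_headers, script_headers)

for name, value in sorted(missing.items()):
    print("missing from script: %s: %s" % (name, value))
for name, (browser_value, script_value) in sorted(changed.items()):
    print("differs: %s: browser=%r script=%r" % (name, browser_value, script_value))
```

Any header that shows up as missing or different is a candidate cause. With mechanize, extra request headers can be supplied through the browser's `addheaders` list, so you can add them one at a time and see which one makes the page return content.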
