Ryflex

Reputation: 5769

Mechanize erroring out randomly now and again?

I'm using the mechanize library to emulate a browser and fetch a page's HTML as shown below, but every now and again I get an error.

Code erroring out:

import mechanize

post_url = "http://www.stackoverflow.com/"
browser = mechanize.Browser()
browser.set_handle_robots(False)
browser.addheaders = [('User-agent', 'Firefox')]

html = browser.open(post_url).read().decode('UTF-8')

Error:

Traceback (most recent call last):
  File "C:\test.py", line 1538, in <module>
    periodically(180, -60, +60, getData)
  File "C:\test.py", line 262, in periodically
    s.run()
  File "C:\Python27\lib\sched.py", line 117, in run
    action(*argument)
  File "C:\test.py", line 1241, in getData
    html = browser.open(post_url).read().decode('UTF-8')
  File "build\bdist.win32\egg\mechanize\_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "build\bdist.win32\egg\mechanize\_mechanize.py", line 255, in _mech_open
    raise response
httperror_seek_wrapper: HTTP Error 500: Internal Server Error
>>> 

Does anyone know how to fix or work around this error?

Upvotes: 1

Views: 2466

Answers (2)

sphere

Reputation: 1350

HTTP Error 500 means "Internal Server Error".

I assume the error does not occur with exactly the sample code you provided, correct?

Two possible reasons:

  1. Server has an error e.g. overload/corrupt-database/etc. (unlikely)
  2. You are posting "strange data" and the server (application) doesn't handle it correctly (likely). In theory the server (application) should validate the posted data and send 400 "Bad Request".

I don't think it's related to the mechanize library.
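
To make the 400 versus 500 distinction concrete, here is a small sketch that labels a status code by its class. status_class is a hypothetical helper name used only for illustration; it is not something mechanize provides.

```python
def status_class(code):
    """Return a coarse label for an HTTP status code (hypothetical helper)."""
    code = int(code)
    if 400 <= code < 500:
        # The server rejected the request itself (e.g. 400 Bad Request).
        return "client error"
    if 500 <= code < 600:
        # The server failed while handling the request (e.g. 500).
        return "server error"
    return "not an error"
```

A 500 falls in the second branch, which is why the cause is usually on the server side even when the request triggered it.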

EDIT: If you don't care about the reason for the error and just want to catch the exception, you could use:

try:
    html = browser.open(post_url).read().decode('UTF-8')
except mechanize.HTTPError as e:
    # Handle HTTP errors explicitly by status code.
    if int(e.code) == 500:
        # Do nothing. You may need to set "html" to an empty string here.
        pass
    else:
        raise  # any other HTTP error code: re-raise with the original traceback
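
Since the error shows up only now and again, retrying after a short pause is another option. The following is a minimal sketch, not part of mechanize: fetch_with_retry and its parameters are hypothetical names, and it assumes the raised exception exposes an integer code attribute, as the HTTPError caught above does.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch() and retry on HTTP 5xx errors (hypothetical helper).

    fetch is any zero-argument callable. An exception is treated as
    retryable only if it has an integer "code" attribute in the 5xx range.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as e:
            code = getattr(e, "code", None)
            transient = isinstance(code, int) and 500 <= code < 600
            if not transient or attempt == attempts:
                # Not a server error, or no attempts left: give up.
                raise
            # Wait a little longer after each failed attempt.
            sleep(delay * attempt)
```

With the browser from the question it could be called as fetch_with_retry(lambda: browser.open(post_url).read().decode('UTF-8')). Retrying only helps if the server failure is temporary; any other error is re-raised immediately.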

Upvotes: 2

4d4c

Reputation: 8159

You can't fix it on your side; you can only double-check that the data you are parsing is correct.

To work around it, use try/except:

import mechanize
from urllib2 import HTTPError

try:
    post_url = "http://www.stackoverflow.com/"
    browser = mechanize.Browser()
    browser.set_handle_robots(False)
    browser.addheaders = [('User-agent', 'Firefox')]
    html = browser.open(post_url).read().decode('UTF-8')
except HTTPError as e:
    # Report the status code instead of letting the exception stop the script.
    print "Got error code", e.code

Upvotes: 0
