GangstaGraham

Reputation: 9355

Getting JSON From URLOpen

I cannot consistently get JSON from a given URL. It works only about 60% of the time:

jsonurl = urlopen('http://www.reddit.com/r/funny/hot.json?limit=16')
r_content = json.load(jsonurl)['data']['children']

The program sometimes crashes on the second line, because the data from the URL is not retrieved properly for some reason.

With some debugging, I found that printing the result of the first line gives the following, which I took to be an error:

<addinfourl at 4321460952 whose fp = <socket._fileobject object at 0x10185b050>>

This happens about 40% of the time; the other 60% of the time, the code works perfectly. What am I doing wrong? How do I make the URL opening more consistent?

Upvotes: 2

Views: 831

Answers (1)

pyfunc

Reputation: 66709

This is usually not a client-side issue. Your code behaves consistently, but the server's response can vary.

I ran your code a few times, and it does fail occasionally:

>>> jsonurl = urlopen('http://www.reddit.com/r/funny/hot.json?limit=16')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 406, in open
    response = meth(req, response)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 444, in error
    return self._call_chain(*args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 429: Unknown

You have to handle cases where the server responds with anything other than HTTP 200. Here the status is 429 (Too Many Requests), which means the server is rate limiting you. Wrap your code in a try/except block, and pass jsonurl to json.load() only when the request succeeds.
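As a rough sketch of that try/except approach, the helper below retries when the server answers 429 and re-raises any other HTTP error. The function name fetch_children, the opener parameter, and the retries and delay values are illustrative assumptions, not part of the original code; the import fallback is only there so the sketch runs on both Python 2 (urllib2, as in the traceback) and Python 3.

```python
import json
import time

try:  # Python 3
    from urllib.request import urlopen
    from urllib.error import HTTPError
except ImportError:  # Python 2, as in the traceback above
    from urllib2 import urlopen, HTTPError


def fetch_children(url, opener=urlopen, retries=3, delay=2.0):
    """Return data.children from a listing, retrying on HTTP 429.

    `opener`, `retries` and `delay` are hypothetical knobs for this sketch.
    """
    for attempt in range(retries):
        try:
            response = opener(url)
        except HTTPError as err:
            # Retry only when rate limited, and only while attempts remain.
            if err.code != 429 or attempt == retries - 1:
                raise
            time.sleep(delay)
            continue
        # read() returns raw bytes, so decode and use json.loads here.
        payload = json.loads(response.read().decode('utf-8'))
        return payload['data']['children']
```

It would be called as fetch_children('http://www.reddit.com/r/funny/hot.json?limit=16'). A fixed delay is the simplest choice; if the server keeps returning 429, a longer or growing delay is probably more appropriate.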

Also, urlopen returns a file-like object. If you print jsonurl, you simply get the value of jsonurl.__repr__(), which is not an error. See below:

>>> jsonurl.__repr__()
'<addinfourl at 4393153672 whose fp = <socket._fileobject object at 0x105978450>>'

What you have to check is the status code:

>>> jsonurl.getcode()
200
>>> 

Only if it is 200 should you process the data obtained from the request.
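A minimal sketch of that status check follows, assuming the response object exposes getcode() and read() as the urlopen result does. The name children_if_ok is hypothetical. Note that urllib2.urlopen normally raises HTTPError for statuses outside the 2xx range, so this check matters most with Python 2's older urllib.urlopen, which hands back the response object even for error statuses.

```python
import json


def children_if_ok(response):
    """Parse the body only when the status is 200; otherwise return None."""
    if response.getcode() != 200:
        # Not a successful response: do not try to parse it as JSON.
        return None
    # read() returns raw bytes, so decode before parsing.
    payload = json.loads(response.read().decode('utf-8'))
    return payload['data']['children']
```

With the question's code, this would be used as r_content = children_if_ok(jsonurl), followed by a check for None before using the result.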

Upvotes: 1
