thefragileomen

Reputation: 1547

Error handling in Python with JSON and a dictionary

I currently have a Python 2.7 script that scrapes Facebook and captures some JSON data from each page. The JSON data contains personal information. A sample of the JSON data is below:

{
   "id": "4",
   "name": "Mark Zuckerberg",
   "first_name": "Mark",
   "last_name": "Zuckerberg",
   "link": "http://www.facebook.com/zuck",
   "username": "zuck",
   "gender": "male",
   "locale": "en_US"
}

The JSON values can vary from page to page. The example above lists all the possible keys, but sometimes a key such as 'username' may be missing, and I may encounter JSON data such as:

{
   "id": "6",
   "name": "Billy Smith",
   "first_name": "Billy",
   "last_name": "Smith",
   "gender": "male",
   "locale": "en_US"
}

With this data, I want to populate a database table. My code is as follows:

        results_json = simplejson.loads(scraperwiki.scrape(profile_url))
        for result in results_json:
            profile = dict()
            try:
                profile['id'] = int(results_json['id'])
            except:
                profile['id'] = ""
            try:
                profile['name'] = results_json['name']
            except:
                profile['name'] = ""
            try:
                profile['first_name'] = results_json['first_name']
            except:
                profile['first_name'] = ""
            try:
                profile['last_name'] = results_json['last_name']
            except:
                profile['last_name'] = ""
            try:
                profile['link'] = results_json['link']
            except:
                profile['link'] = ""
            try:
                profile['username'] = results_json['username']
            except:
                profile['username'] = ""
            try:
                profile['gender'] = results_json['gender']
            except:
                profile['gender'] = ""
            try:
                profile['locale'] = results_json['locale']
            except:
                profile['locale'] = ""

The reason I have so many try/excepts is to account for keys that don't exist on the webpage. Nonetheless, this seems to be a really clumsy and messy way to handle the issue.

If I remove these try/except clauses and my scraper encounters a missing key, it raises a KeyError such as "KeyError: 'username'" and my script stops running.

Any suggestions for a smarter way to handle these errors so that the script continues when a missing key is encountered?

I've tried creating a list of the JSON keys and iterating through it with an if clause, but I just can't figure it out.

Upvotes: 4

Views: 4805

Answers (1)

Martijn Pieters

Reputation: 1121854

Use the .get() method instead:

>>> a = {'bar': 'eggs'}
>>> a['foo']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'foo'
>>> a.get('foo', 'default value')
'default value'
>>> a.get('bar', 'default value')
'eggs'

The .get() method returns the value for the requested key, or the default value if the key is missing.
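Applied to the profile from the question, this could look roughly like the sketch below. The names FIELDS and build_profile are hypothetical, chosen only for illustration, and the sample data is parsed from a literal string rather than scraped. The 'id' key is handled separately because the question converts it with int().

```python
import json

# Hypothetical list of the text fields from the question's sample JSON.
FIELDS = ['name', 'first_name', 'last_name', 'link',
          'username', 'gender', 'locale']


def build_profile(data):
    """Copy the known fields out of data, using '' for missing keys."""
    # .get() returns '' instead of raising KeyError when a key is absent.
    profile = dict((key, data.get(key, '')) for key in FIELDS)
    # 'id' needs an int() conversion, so it keeps one narrow try/except.
    try:
        profile['id'] = int(data['id'])
    except (KeyError, ValueError, TypeError):
        profile['id'] = ''
    return profile


# Sample data with several keys missing, as in the question.
sample = json.loads('{"id": "6", "name": "Billy Smith", "gender": "male"}')
profile = build_profile(sample)
```

Catching only KeyError, ValueError and TypeError, rather than using a bare except, means unrelated errors are not silently swallowed.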

Or you can create a new dict with empty strings for each key and use .update() on it:

profile = dict.fromkeys('id name first_name last_name link username gender locale'.split(), '')
profile.update(result)

dict.fromkeys() creates a dictionary with all the keys you request set to a given default value ('' in the above example). We then use .update() to copy all keys and values from the result dictionary, replacing anything already there. Note that .update() copies every key in result, including any that are not in the original list.
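If only the listed keys should end up in the profile (for a fixed set of database columns, say), the update can be restricted to them. This is a minimal sketch, not part of the original answer: FIELDS and build_profile are hypothetical names, and the 'hometown' key in the sample is invented to stand in for an unexpected extra field.

```python
# Hypothetical list of the keys the database table expects.
FIELDS = 'id name first_name last_name link username gender locale'.split()


def build_profile(data):
    """Start from empty-string defaults, then copy only the known keys."""
    profile = dict.fromkeys(FIELDS, '')
    # .update() also accepts an iterable of (key, value) pairs, so the
    # incoming data can be filtered before it is merged in.
    profile.update((key, value) for key, value in data.items()
                   if key in profile)
    return profile


# 'hometown' is an invented extra key that should be ignored.
sample = {'id': '6', 'name': 'Billy Smith', 'hometown': 'Springfield'}
profile = build_profile(sample)
```

Unlike the question's code, this leaves 'id' as whatever value the JSON supplied; an int() conversion would have to be applied separately.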

Upvotes: 10
