Reputation: 16479
I am trying to find out which portion of my code raises a KeyError in my events list. events is a list that contains JSON elements. I want to put timestamp, event_sequence_number, and device_id in their respective variables. However, each JSON object is different, and some do not contain the timestamp, event_sequence_number, or device_id keys. How can I change my code so that it outputs which specific key(s) are missing?
Example:
When timestamp is missing:
"timestamp key is missing"
When timestamp and device_id are missing:
"timestamp key is missing"
"device_id key is missing"
etc.
Code:
for event in events:
    try:
        timestamp = event["event"]["timestamp"]
        event_sequence_num = event["event"]["properties"]["event_sequence_number"]
        device_id = event["application"]["mobile"]["device_id"]
        event_identifier = str(device_id) + "_" + str(timestamp) + "_" + str(event_sequence_num)
        event_dict[event_identifier] = 1
    except KeyError:
        # Any of the three lookups above may have failed
        print("JSON Key does not exist")
Upvotes: 1
Views: 65
Reputation: 22473
Simeon Visser's answer is spot-on. Reporting the key causing the KeyError is probably the best that can be done in bare, straightforward Python. If you're only accessing the JSON structure once, that's the way to go.
I offer a longer alternative, however, for situations where you need to access the multi-level event data repeatedly. If you're accessing it often, your program can afford a few more lines of setup and infrastructure. Consider:
def getpath(obj, path, post=str):
    """
    Use path as sequence of keys/indices into obj. Return the value
    there, filtered through the post (postprocessing function).
    If there is no such value, raise KeyError displaying the
    partial path to the point where there is no index/key.
    """
    c = obj
    try:
        for i, p in enumerate(path):
            c = c[p]
        return post(c) if post else c
    except (KeyError, IndexError) as e:
        msg = "JSON keys {0!r} don't exist".format(path[:i+1])
        raise KeyError(msg)
        # raise type(e)(msg)  # Alternative if you want more exception variety

EID_COMPONENTS = [('application', 'mobile', 'device_id'),
                  ('event', 'timestamp'),
                  ('event', 'properties', 'event_sequence_number')]

for event in events:
    event_identifier = '_'.join(getpath(event, p) for p in EID_COMPONENTS)
    event_dict[event_identifier] = 1
There is more preparation here, with a separate getpath function and a globally defined specification of which paths into the JSON data to get. On the plus side, the assembly of event_identifier is much shorter (if it were wrapped in a function, it'd be about 1/3 the size in either source lines or bytecodes).
If an attempted access fails, it raises an exception with a more complete error message, giving the path into the structure up to that point, not just the final key that was missing. In complex JSON with duplicated keys in different sub-structures (multiple timestamps, for example), knowing which attempted access failed can save you much debugging effort. You may also notice that the code is prepared to use integer indices and gracefully handle IndexError; in JSON, array values are common.
This is abstraction in action: more framework and more setup, but if you need to do a lot of deep structure accesses, the smaller code and better error reporting would benefit multiple parts of your program, making it potentially a good investment.
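To make the difference in error reporting concrete, here is a small self-contained sketch. It repeats getpath from above in a form that also runs on Python 3, and applies it to a made-up event (the sample data is hypothetical, not taken from the question) whose properties dict lacks event_sequence_number:

```python
def getpath(obj, path, post=str):
    """Follow path (a sequence of keys/indices) into obj and return the value."""
    c = obj
    try:
        for i, p in enumerate(path):
            c = c[p]
        return post(c) if post else c
    except (KeyError, IndexError):
        # Report the path up to and including the step that failed.
        raise KeyError("JSON keys {0!r} don't exist".format(path[:i + 1]))


# Hypothetical sample event: 'properties' exists but has no sequence number.
sample_event = {
    "event": {"timestamp": 1400000000, "properties": {}},
    "application": {"mobile": {"device_id": "abc123"}},
}

try:
    getpath(sample_event, ("event", "properties", "event_sequence_number"))
except KeyError as exc:
    failed_message = exc.args[0]
    print(failed_message)
```

The message names the whole path that was being followed, so a missing timestamp under event cannot be confused with a timestamp key elsewhere in the structure.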
Upvotes: 1
Reputation: 122476
You can print the exception, as that will include the key for which the KeyError was raised:
except KeyError as exc:
    print("JSON Key does not exist: " + str(exc))
You can also access the key by looking at exc.args[0]:
except KeyError as exc:
    print("JSON Key does not exist: " + str(exc.args[0]))
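Both variants report only the first missing key, because the try block stops at the first lookup that fails. To get one message per missing key, as in the question's example, each lookup has to be attempted separately. A minimal sketch, assuming the three paths from the question's code; the helper name missing_keys and the sample event are made up for illustration:

```python
# Paths taken from the question's code; each is a sequence of nested keys.
REQUIRED_PATHS = [
    ("event", "timestamp"),
    ("event", "properties", "event_sequence_number"),
    ("application", "mobile", "device_id"),
]


def missing_keys(event, paths=REQUIRED_PATHS):
    """Return the first absent key along each path that cannot be resolved."""
    missing = []
    for path in paths:
        current = event
        for key in path:
            try:
                current = current[key]
            except (KeyError, TypeError):
                # TypeError covers a non-dict value in the middle of a path.
                missing.append(key)
                break
    return missing


# Hypothetical event lacking both timestamp and device_id.
sample_event = {
    "event": {"properties": {"event_sequence_number": 7}},
    "application": {"mobile": {}},
}

for key in missing_keys(sample_event):
    print("{0} key is missing".format(key))
```

If an intermediate key such as event is itself absent, that key is the one reported, since the lookup cannot go any deeper along that path.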
Upvotes: 1