Reputation: 2747
I have a few dictionaries containing similar data.
Most queries would be resolved in a single search of one dictionary.
So is it better performance wise to leave out preliminary checks for existence of a key in a dict and have a try at the next dict in an except clause catching key error?
Or maybe something like
# d1, d2, d3 = bunch of dictionaries
value = d1.get(key, d2.get(key, d3.get(key, 0)))
?
Upvotes: 8
Views: 5986
Reputation: 63727
It seems in almost all cases, using get
would be faster. Here is my test run using try..except
and get
>>> def foo1(n):
spam = dict(zip(range(-99,100,n),[1]*200))
s = 0
for e in range(1,100):
try:
s += spam[e]
except KeyError:
try:
s += spam[-e]
except KeyError:
s += 0
return s
>>> def foo2(n):
spam = dict(zip(range(-99,100,n),[1]*200))
s = 0
for e in range(1,100):
s += spam.get(e, spam.get(-e,0))
return s
>>> for i in range(1,201,10):
res1 = timeit.timeit('foo1({})'.format(i), setup = "from __main__ import foo1", number=1000)
res2 = timeit.timeit('foo2({})'.format(i), setup = "from __main__ import foo2", number=1000)
print "{:^5}{:10.5}{:10.5}{:^10}{:^10}".format(i,res1,res2,foo1(i),foo2(i))
1 0.075102 0.082862 99 99
11 0.25096 0.054272 9 9
21 0.2885 0.051398 10 10
31 0.26211 0.060171 7 7
41 0.26653 0.053595 5 5
51 0.2609 0.052511 4 4
61 0.2686 0.052792 4 4
71 0.26645 0.049901 3 3
81 0.26351 0.051275 3 3
91 0.26939 0.051192 3 3
101 0.264 0.049924 2 2
111 0.2648 0.049875 2 2
121 0.26644 0.049151 2 2
131 0.26417 0.048806 2 2
141 0.26418 0.050543 2 2
151 0.26585 0.049787 2 2
161 0.26663 0.051136 2 2
171 0.26549 0.048601 2 2
181 0.26425 0.050964 2 2
191 0.2648 0.048734 2 2
>>>
Upvotes: 6
Reputation: 16212
try...except
usually takes longer than using get
but it depends on a few things...
Try making use of the timeit module to test performance in your particular situation like so:
def do_stuff():
blah
timeit.timeit('testfunc()', 'from __main__ import do_stuff as testfunc')
Upvotes: 1
Reputation: 309919
Since you say that most of the queries will be resolved by looking at the first dict, your fastest solution would be to do something like:
try:
item = d1[key]
except KeyError:
try:
item = d2[key]
except KeyError:
...
However, that is certainly not the most maintainable of solutions and I don't recommend using it. You could create a function:
def get_from(item,dicts):
for d in dicts:
try:
return d[item]
except KeyError:
pass
else:
raise KeyError("No item in dicts")
which you would call like:
get_from(key,(d1,d2,d3))
(this is a simplified, slightly less clean, version of the already very simple Chained map recipe suggested by @MartijnPieters in the comments on the original question -- I would advocate using that over this code posted here. This code is only to demonstrate the concept in a more simplified way.)
Finally, perhaps a hybrid solution would work best in practice. Factor the first try
out of the loop -- This is a little ugly, but it avoids the overhead of the loop
most of the time. Only if the first try
raises a KeyError
do you enter the loop type solution I suggested above on the remaining dicts. e.g.:
try:
item = d1[key]
except KeyError:
item = get_from(key,(d2,d3))
again, only do this if you can reliably demonstrate (think timeit
) that it makes a measureable difference
The important thing to know is that in python, try
is cheap, but except
costs a decent amount of time. If your code is expected to succeed, use try
-except
. If it isn't expected to succeed, often it's best to use try-except
anyway, but in that case, you should evaluate whether performance is really an issue and only if you can demonstrate that it is an issue should you resort to "looking before you leap".
One final note, If the dictionaries are relatively static, it might be worth combining them into 1 dict
:
d1.update(d2)
d1.update(d3)
Now you can just use d1
-- It has all the information from d2
and d3
. (of course, the order of the updates matters if the dicts have keys that are the same but have different values).
Upvotes: 1
Reputation: 2670
The difference between checking the conditional
if 'key' in a_dict
or similarly,
if a_dct.get('key') == None
and handling the KeyError
thrown when not 'key' in a_dict
is generally considered trivial, and likely to depend upon the implementation of python you are using.
Using the conditional form is undoubtedly more pythonic, and is generally considered more expressive than catching the exception, often leading to cleaner code. However, if your dictionary may contain arbitrary data, and you cannot know that a value of None
, or some other magic value indicates that your key is not found, using the conditional form will require two lookups, as you first check if the key is in the dictionary, and then retrieve the value. I.E.:
if 'key': in a_dict:
val = a_dcit['key']
Given the situation you describe, the code you have provided is the slowest possible option, as the key
will be looked up in each of the dictionaries. A faster option is to guess the dictionary it will be in, and sequentially search through the other dictionaries:
my_val = d1.get(key,None)
if my_val == None:
my_val = d2.get(key,None)
if my_val == None:
my_val = d3.get(key,None)
if my_val == None:
return False #handle not found in any case
However, your particular use case sounds interesting and strange. Why are there multiple dictionaries with similar data? How are these dictionaries stored? If you already have a list or some other data structure holding these dictionaries, it would be even more expressive to loop through the dictionaries.
dict_list = [{},{},{}] #pretend you have three dicts in a list
for d in dict_list:
val = d.get('key',None)
if val == None:
break
#val is now either None, or found.
Upvotes: 0
Reputation: 91017
You could as well do
sentinel = object()
values = (d.get(key, sentinel) for d in (d1, d2, d3))
value = next(v for v in values if v is not sentinel)
If none of the dicts contain the key, this raises a StopIteration
rather than a KeyError
.
Upvotes: 0
Reputation: 9441
Depends on the keys in the dictionaries.
If you predict with confidence that it is more common for keys to be missing use then use get.
If you predict with confidence that it is more common for keys to be there use try except.
Upvotes: 4