Reputation: 2747

Check if in dict or try/except which has better performance in python?

I have a few dictionaries containing similar data.

Most queries would be resolved in a single search of one dictionary.

So is it better performance wise to leave out preliminary checks for existence of a key in a dict and have a try at the next dict in an except clause catching key error?

Or maybe something like

# d1, d2, d3 = bunch of dictionaries

value = d1.get(key, d2.get(key, d3.get(key, 0)))

Upvotes: 8

Answers (6)

Abhijit

Reputation: 63727

It seems in almost all cases, using get would be faster. Here is my test run using try..except and get

>>> def foo1(n):
    spam = dict(zip(range(-99,100,n),[1]*200))
    s = 0
    for e in range(1,100):
        try:
            s += spam[e]
        except KeyError:
            try:
                s += spam[-e]
            except KeyError:
                s += 0
    return s

>>> def foo2(n):
    spam = dict(zip(range(-99,100,n),[1]*200))
    s = 0
    for e in range(1,100):
        s += spam.get(e, spam.get(-e,0))
    return s


>>> for i in range(1,201,10):
    res1 =  timeit.timeit('foo1({})'.format(i), setup = "from __main__ import foo1", number=1000)
    res2 =  timeit.timeit('foo2({})'.format(i), setup = "from __main__ import foo2", number=1000)
    print "{:^5}{:10.5}{:10.5}{:^10}{:^10}".format(i,res1,res2,foo1(i),foo2(i))


  1    0.075102  0.082862    99        99    
 11     0.25096  0.054272    9         9     
 21      0.2885  0.051398    10        10    
 31     0.26211  0.060171    7         7     
 41     0.26653  0.053595    5         5     
 51      0.2609  0.052511    4         4     
 61      0.2686  0.052792    4         4     
 71     0.26645  0.049901    3         3     
 81     0.26351  0.051275    3         3     
 91     0.26939  0.051192    3         3     
 101      0.264  0.049924    2         2     
 111     0.2648  0.049875    2         2     
 121    0.26644  0.049151    2         2     
 131    0.26417  0.048806    2         2     
 141    0.26418  0.050543    2         2     
 151    0.26585  0.049787    2         2     
 161    0.26663  0.051136    2         2     
 171    0.26549  0.048601    2         2     
 181    0.26425  0.050964    2         2     
 191     0.2648  0.048734    2         2     
>>>

Upvotes: 6

Sheena

Reputation: 16212

try...except usually takes longer than using get but it depends on a few things...

Try making use of the timeit module to test performance in your particular situation like so:

def do_stuff():
    blah

timeit.timeit('testfunc()', 'from __main__ import do_stuff as testfunc')

Upvotes: 1

mgilson

Reputation: 309919

Since you say that most of the queries will be resolved by looking at the first dict, your fastest solution would be to do something like:

try:
    item = d1[key]
except KeyError:
    try:
        item = d2[key]
    except KeyError:
        ...

However, that is certainly not the most maintainable of solutions and I don't recommend using it. You could create a function:

def get_from(item,dicts):
    for d in dicts:
        try:
           return d[item]
        except KeyError:
           pass
    else:
        raise KeyError("No item in dicts")

which you would call like:

get_from(key,(d1,d2,d3))

(this is a simplified, slightly less clean, version of the already very simple Chained map recipe suggested by @MartijnPieters in the comments on the original question -- I would advocate using that over this code posted here. This code is only to demonstrate the concept in a more simplified way.)

Finally, perhaps a hybrid solution would work best in practice. Factor the first try out of the loop -- This is a little ugly, but it avoids the overhead of the loop most of the time. Only if the first try raises a KeyError do you enter the loop type solution I suggested above on the remaining dicts. e.g.:

try:
   item = d1[key]
except KeyError:
   item = get_from(key,(d2,d3))

again, only do this if you can reliably demonstrate (think timeit) that it makes a measureable difference

The important thing to know is that in python, try is cheap, but except costs a decent amount of time. If your code is expected to succeed, use try-except. If it isn't expected to succeed, often it's best to use try-except anyway, but in that case, you should evaluate whether performance is really an issue and only if you can demonstrate that it is an issue should you resort to "looking before you leap".

One final note, If the dictionaries are relatively static, it might be worth combining them into 1 dict:

d1.update(d2)
d1.update(d3)

Now you can just use d1 -- It has all the information from d2 and d3. (of course, the order of the updates matters if the dicts have keys that are the same but have different values).

Upvotes: 1

Cory Dolphin

Reputation: 2670

The difference between checking the conditional

if 'key' in a_dict or similarly, if a_dct.get('key') == None

and handling the KeyError thrown when not 'key' in a_dict is generally considered trivial, and likely to depend upon the implementation of python you are using.

Using the conditional form is undoubtedly more pythonic, and is generally considered more expressive than catching the exception, often leading to cleaner code. However, if your dictionary may contain arbitrary data, and you cannot know that a value of None, or some other magic value indicates that your key is not found, using the conditional form will require two lookups, as you first check if the key is in the dictionary, and then retrieve the value. I.E.:

if 'key': in a_dict:
   val = a_dcit['key']

Given the situation you describe, the code you have provided is the slowest possible option, as the key will be looked up in each of the dictionaries. A faster option is to guess the dictionary it will be in, and sequentially search through the other dictionaries:

my_val = d1.get(key,None)

if my_val == None:
    my_val = d2.get(key,None)
    if my_val == None:
        my_val = d3.get(key,None)
        if my_val == None:
            return False #handle not found in any case

However, your particular use case sounds interesting and strange. Why are there multiple dictionaries with similar data? How are these dictionaries stored? If you already have a list or some other data structure holding these dictionaries, it would be even more expressive to loop through the dictionaries.

dict_list = [{},{},{}] #pretend you have three dicts in a list

for d in dict_list:
   val = d.get('key',None)
   if val == None:
      break
#val is now either None, or found.

Upvotes: 0

glglgl

Reputation: 91017

You could as well do

sentinel = object()
values = (d.get(key, sentinel) for d in (d1, d2, d3))
value = next(v for v in values if v is not sentinel)

If none of the dicts contain the key, this raises a StopIteration rather than a KeyError.

Upvotes: 0

Rich Tier

Reputation: 9441

Depends on the keys in the dictionaries.

If you predict with confidence that it is more common for keys to be missing use then use get.

If you predict with confidence that it is more common for keys to be there use try except.

Upvotes: 4

Check if in dict or try/except which has better performance in python?

Answers (6)

Related Questions