Jobs

Reputation: 3377

Best way (performance-wise) to iterate over a dictionary

This suggests that .items() is the best way to iterate over a dictionary because it's pythonic.

Performance-wise, which of the below is the best and why?

  1. for key in dic:
        value = dic[key]
    
  2. for key, value in dic.items():
    
  3. for key in dic.keys():
        value = dic[key]
    

Upvotes: 0

Views: 937

Answers (4)

benvc

Reputation: 15120

These sorts of generic timings are rarely useful, because performance is driven by a much wider array of variables specific to the data, application and environment in question. For the sake of providing some simple comparison data, though, here are a few tests that you can copy and try in your own environment.

Just iterating without accessing dict values unsurprisingly suggests that there is very little performance difference between the methods, except that dict.items() runs slightly behind since it has to produce both the key and the value for each entry (while the other methods shown deal with only one or the other).

from timeit import timeit

loop = """
d = dict(zip(range(1000), reversed(range(1000))))
for k in d: pass"""
print(timeit(stmt=loop, number=10000))
# 1.0733639170002789

keys = """
d = dict(zip(range(1000), reversed(range(1000))))
for k in d.keys(): pass"""
print(timeit(stmt=keys, number=10000))
# 1.0360493710004448

values = """
d = dict(zip(range(1000), reversed(range(1000))))
for v in d.values(): pass"""
print(timeit(stmt=values, number=10000))
# 1.0380961279997791

items = """
d = dict(zip(range(1000), reversed(range(1000))))
for v in d.items(): pass"""
print(timeit(stmt=items, number=10000))
# 1.2011308679993817

Iterating while also accessing the value for each key, in an attempt to level the playing field, suggests that dict.items() is slightly faster when you need both the keys and the values.

from timeit import timeit

loop = """
d = dict(zip(range(1000), reversed(range(1000))))
for k in d: d[k]"""
print(timeit(stmt=loop, number=10000))
# 1.4128917540001567

keys = """
d = dict(zip(range(1000), reversed(range(1000))))
for k in d.keys(): d[k]"""
print(timeit(stmt=keys, number=10000))
# 1.3668724469998779

items = """
d = dict(zip(range(1000), reversed(range(1000))))
for v in d.items(): pass"""
print(timeit(stmt=items, number=10000))
# 1.1864945030001763
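One caveat with the snippets above is that each timed statement also rebuilds the dict, so construction cost is mixed into every figure. A minimal sketch of the same comparison with the dict built once in `setup` follows; the names `SETUP`, `STATEMENTS` and `best_times` are made up for illustration, and the numbers will depend on your machine and Python version.

```python
from timeit import repeat

# Build the dict once in setup so that only the loop itself is timed.
SETUP = "d = dict(zip(range(1000), reversed(range(1000))))"

# The three loop styles from the question, each touching key and value.
STATEMENTS = {
    "for k in d": "for k in d: d[k]",
    "for k in d.keys()": "for k in d.keys(): d[k]",
    "for k, v in d.items()": "for k, v in d.items(): pass",
}


def best_times(number=2000, rounds=5):
    """Return the best-of-`rounds` time in seconds for each loop style."""
    return {
        label: min(repeat(stmt=stmt, setup=SETUP, number=number, repeat=rounds))
        for label, stmt in STATEMENTS.items()
    }


if __name__ == "__main__":
    for label, seconds in best_times().items():
        print(f"{label:<24} {seconds:.4f}")
```

Taking the minimum over several rounds is a common way to reduce noise from other processes; it does not make the result any more representative of a real workload.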

Upvotes: 2

rassar

Reputation: 5660

Based on timeit results:

To get both keys and values:

from timeit import timeit

d = {i: i for i in range(100)}

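def a():
    # Iterate over the dict, then look each value up
    for key in d:
        d[key]

def b():
    # Iterate over (key, value) pairs directly
    for key, value in d.items():
        pass

def c():
    # Iterate over the keys view, then look each value up
    for key in d.keys():
        d[key]

# Number the results to match the options in the question
for i, fn in enumerate([a, b, c], start=1):
    print(f"Solution {i}", timeit(fn))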


Solution 1 7.8113735559999995
Solution 2 2.6758934780000008
Solution 3 5.499667492

To get just the keys:

from timeit import timeit

d = {i: i for i in range(100)}

def a():
    for key in d:
        pass

def b():
    for key, value in d.items():
        pass

def c():
    for key in d.keys():
        pass

# Number the results to match the options in the question
for i, fn in enumerate([a, b, c], start=1):
    print(f"Solution {i}", timeit(fn))

Solution 1 1.5981329149999999
Solution 2 2.649033456
Solution 3 1.6517934609999996

So if you only need the keys, the first solution is fastest (with the third close behind), but if you need both keys and values, the second solution is fastest.

Upvotes: 2

ShadowRanger

Reputation: 155353

If you need both keys and values, use for k, v in mydict.items(): instead of iterating the keys and then looking each value up, which means unnecessarily looking up information you could have gotten for free with .items(). For very short dicts, .items() might be slower than for k in mydict: followed by a lookup (simply because there is a very small cost to creating the items view that may exceed that of a few lookups), but for dicts of moderate length, .items() will always win.
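To make the "unnecessary lookup" point concrete, here is a small sketch that counts explicit `d[key]` lookups. `CountingDict`, `pairs_via_lookup` and `pairs_via_items` are hypothetical helpers written for this illustration, not part of any library; the sketch relies on CPython's `dict.items()` reading entries directly rather than going through an overridden `__getitem__`.

```python
class CountingDict(dict):
    """Hypothetical dict subclass that counts explicit d[key] lookups."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lookups = 0

    def __getitem__(self, key):
        self.lookups += 1
        return super().__getitem__(key)


def pairs_via_lookup(d):
    # Iterate the keys, then perform one extra lookup per key.
    return [(k, d[k]) for k in d]


def pairs_via_items(d):
    # The items view hands over key and value together.
    return [(k, v) for k, v in d.items()]


if __name__ == "__main__":
    d = CountingDict({i: i * i for i in range(5)})

    pairs_via_lookup(d)
    print("lookups with d[k]:", d.lookups)

    d.lookups = 0
    pairs_via_items(d)
    print("lookups with .items():", d.lookups)
```

Both functions return the same pairs; the difference is only in how many separate lookups are performed to get them.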

If you only need the keys, for k in mydict: and for k in mydict.keys(): are roughly identical. The latter is a tiny bit slower in Py3 (due to needing to construct the keys view) and can be significantly slower in Py2 (where it builds a temporary list holding a copy of the keys, whereas the former approach lazily iterates the dict directly).
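As a quick illustration of the Python 3 behaviour described here, the following sketch shows that `.keys()` returns a live view rather than a list copy. The function name `keys_view_demo` is made up for this example, and the key order assumes Python 3.7+, where dicts preserve insertion order.

```python
def keys_view_demo():
    """Show that dict.keys() is a live view and matches plain iteration."""
    d = {"a": 1, "b": 2}
    keys = d.keys()      # a view object; no list copy is made in Python 3
    d["c"] = 3           # mutate the dict after taking the view
    # The existing view reflects the new key, and iterating the dict
    # directly yields the same keys in the same order.
    return list(keys), list(d)


if __name__ == "__main__":
    via_view, via_dict = keys_view_demo()
    print(via_view)
    print(via_view == via_dict)
```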

Upvotes: 1

damaredayo

Reputation: 1079

Option 2 is your best choice, for two reasons:

1) You get both the key and the value in one step, so you save the cost of looking the value up again when you need it.

2) As you said, it is 'pythonic', and it is the fastest of the three methods.

Upvotes: -1
