Reputation: 46493
I have often used collections.defaultdict so that I can append an element to d[key] without first initializing it to [] (the benefit: no need for if key not in d: d[key] = []):
import collections, random
d = collections.defaultdict(list)
for i in range(100):
    j = random.randint(0, 20)
    d[j].append(i)  # if d[j] does not exist yet, it is initialized to [], so append works directly
Now I realize we can simply use a normal dict and setdefault:
import random
d = {}
for i in range(100):
    j = random.randint(0, 20)
    d.setdefault(j, []).append(i)
Question: when using a dict whose values are lists, is there a good reason to use a collections.defaultdict instead of the second method (a plain dict and setdefault), or are the two purely equivalent?
Upvotes: 4
Views: 1276
Reputation: 8520
With defaultdict you can also use in-place addition:
import collections, random
d = collections.defaultdict(list)
for i in range(100):
    j = random.randint(0, 20)
    d[j] += [i]
There is no equivalent construction with setdefault: d.setdefault(j, []) += [i] raises SyntaxError: cannot assign to function call.
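To illustrate the point above, here is a minimal sketch, assuming Python 3. It compiles the augmented assignment from a string so the failure can be caught, then shows that mutating the returned list in place works without any assignment. The key and values are made up for the example.

```python
d = {}

# The augmented assignment is rejected at compile time, so nothing executes.
try:
    compile("d.setdefault(1, []) += [0]", "<example>", "exec")
    compiled = True
except SyntaxError:
    compiled = False

# Mutating the returned list in place has the same effect and needs no assignment.
d.setdefault(1, []).extend([0])
d.setdefault(1, []).extend([5])
```

So the limitation is purely syntactic: setdefault can still grow the list through extend or append, just not through +=.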
Upvotes: 1
Reputation: 453
In addition to the answer by Chris_Rands, I want to emphasize that a primary reason to use defaultdict is that you want key accesses to always succeed, inserting the default value if there was none. This can be for any reason, and a completely valid one is the convenience of being able to write d[key] instead of having to call dict.setdefault before every access.
Also note that key in default_dict still returns False if that key has never been accessed, so you can append to the lists without checking whether they exist and still test for their existence when necessary.
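A short sketch of this behaviour, with illustrative keys and values: a membership test or .get() does not trigger the default factory, while indexing does.

```python
import collections

d = collections.defaultdict(list)

# A membership test does not call the default factory, so nothing is inserted.
before = 3 in d
size_before = len(d)

# Indexing a missing key creates the empty list, and append then fills it.
d[3].append("x")
after = 3 in d

# .get() also leaves the dictionary untouched for a missing key.
missing = d.get(7)
```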
Upvotes: 2
Reputation: 41168
collections.defaultdict is generally more performant: it is implemented in C and optimised for exactly this task, whereas d.setdefault(j, []) builds a new empty list on every call, even when the key already exists. However, you should use dict.setdefault if you want accessing an absent key in your resulting dictionary to raise a KeyError rather than insert an empty list. This is the most important practical difference.
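The practical difference can be sketched as follows, using made-up keys. The last line shows one possible way, not mentioned above, to get KeyError behaviour back after filling a defaultdict: copying it into a plain dict.

```python
import collections

plain = {}
plain.setdefault("a", []).append(1)

dd = collections.defaultdict(list)
dd["a"].append(1)

# Reading an absent key from the plain dict raises KeyError.
try:
    plain["b"]
    plain_raised = False
except KeyError:
    plain_raised = True

# The same read on the defaultdict silently inserts an empty list.
value = dd["b"]

# Assumption for illustration: copy into a plain dict once filling is done,
# so that later lookups of absent keys raise KeyError again.
frozen = dict(dd)
```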
Upvotes: 3