Reputation: 537
Say I have a list of dictionaries. They mostly have the same keys in each row, but a few don't match and have extra key/value pairs. Is there a fast way to get a set of all the keys in all the rows?
Right now I'm using this loop:
def get_all_keys(dictlist):
    keys = set()
    for row in dictlist:
        keys = keys.union(row.keys())
    return keys
It just seems terribly inefficient to do this on a list with hundreds of thousands of rows, but I'm not sure how to do it better.
Thanks!
Upvotes: 2
Views: 3901
Reputation: 309899
A fun one that works on Python 3.x [1] relies on reduce and the fact that dict.keys() now returns a set-like object:
>>> from functools import reduce
>>> dicts = [{1:2},{3:4},{5:6}]
>>> reduce(lambda x,y:x | y.keys(),dicts,{})
{1, 3, 5}
For what it's worth,
>>> reduce(lambda x,y:x | y.keys(),dicts,set())
{1, 3, 5}
works too, or, if you want to avoid a lambda (and the initializer), you could even do:
>>> import operator
>>> reduce(operator.or_, (d.keys() for d in dicts))
Very neat.
This really shines when you only have two elements. Then, instead of doing something like set(a) | set(b), you can do a.keys() | b.keys(), which seems a little nicer to me.
[1] It can be made to work on Python 2.7 as well: use dict.viewkeys instead of dict.keys.
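To make the two ideas above concrete, here is a minimal sketch assuming Python 3. The helper name union_keys and the sample dicts a and b are made up for illustration; passing set() as the initializer is a choice made here so that an empty list does not raise.

```python
import operator
from functools import reduce

def union_keys(dicts):
    """Return the set of keys found in any dict of `dicts` (hypothetical helper)."""
    # Starting from set() keeps the call safe for an empty list, which
    # reduce() without an initializer would reject with a TypeError.
    return reduce(operator.or_, (d.keys() for d in dicts), set())

# Illustrative sample data, not from the original question.
a = {"id": 1, "name": "x"}
b = {"id": 2, "email": "y"}

# Key views support set operators directly, so no set() conversion is needed.
both = a.keys() | b.keys()
shared = a.keys() & b.keys()
```

Both results are ordinary sets, so they can be passed on to anything that expects one.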
Upvotes: 4
Reputation: 123453
Sets are like dictionaries and have an update() method, so this would work in your loop:
keys.update(row.iterkeys())
(iterkeys() exists only on Python 2; on Python 3, keys.update(row) does the same thing.)
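Put back into the loop from the question, the update() approach might look like the sketch below. It assumes Python 3 and passes each row directly rather than calling a keys method; the rows list is made-up sample data.

```python
def get_all_keys(dictlist):
    """Collect every key that appears in any row of `dictlist`."""
    keys = set()
    for row in dictlist:
        # Iterating a dict yields its keys, so no keys()/iterkeys() call is
        # needed; update() grows the existing set instead of building a new
        # one on each pass.
        keys.update(row)
    return keys

# Illustrative sample data: mostly matching keys, one extra, one empty row.
rows = [{"a": 1, "b": 2}, {"a": 3, "c": 4}, {}]
all_keys = get_all_keys(rows)
```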
Upvotes: 1
Reputation: 21595
You can do:
from itertools import chain
def get_all_keys(dictlist):
    return set(chain.from_iterable(dictlist))
As @Jon Clements noted, this keeps only the required data in memory, in contrast to using the * operator with either chain or union, which unpacks the whole list into an argument tuple first.
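The memory point is easiest to see when the rows come from a generator rather than a list. The following is a sketch under that assumption; iter_rows is a hypothetical row source invented for the example.

```python
from itertools import chain

def iter_rows(n):
    """Hypothetical row source that yields dicts one at a time."""
    for i in range(n):
        row = {"id": i, "value": i * 2}
        if i % 1000 == 0:
            row["note"] = "extra"  # occasional extra key, as in the question
        yield row

# chain.from_iterable pulls one row at a time from the generator, so the
# rows never have to exist together as a list or an argument tuple.
keys = set(chain.from_iterable(iter_rows(5000)))
```

With a list that is already in memory the difference is smaller, since only the tuple of references is avoided.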
Upvotes: 3
Reputation: 142136
You could try:
def all_keys(dictlist):
    return set().union(*dictlist)
Avoids imports, and will make the most of the underlying implementation of set. Will also work with anything iterable.
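As a quick illustration of the "anything iterable" remark, here is a sketch with made-up inputs of different types. The parameter is renamed to iterables here only to reflect that; the body is the same one-liner.

```python
def all_keys(iterables):
    # union() accepts any number of iterables; a dict contributes its keys.
    return set().union(*iterables)

# Illustrative inputs: a dict, a list, a tuple and a string.
mixed = [{"a": 1}, ["b", "c"], ("c", "d"), "ef"]
result = all_keys(mixed)

# With no arguments at all, union() simply returns a copy of the empty set.
empty = all_keys([])
```

Note that a string is iterated character by character, so "ef" contributes "e" and "f".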
Upvotes: 11
Reputation: 5817
If you worry about performance, you should quit the dict.keys() method, since on Python 2 it creates a list in memory (on Python 3 it returns a lightweight view instead). And you can use set.update() instead of union, but I don't know if it is faster than set.union().
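One way to settle the update() versus union() question is to measure it. Below is a rough sketch using timeit from the standard library; the function names and the sample data are invented for the example, and the numbers will vary by machine and Python version.

```python
import timeit

def keys_by_union(dictlist):
    keys = set()
    for row in dictlist:
        keys = keys.union(row)  # builds a new set on every iteration
    return keys

def keys_by_update(dictlist):
    keys = set()
    for row in dictlist:
        keys.update(row)  # mutates the existing set in place
    return keys

def time_both(dictlist, number=5):
    """Return total seconds for `number` runs of each approach."""
    return {
        fn.__name__: timeit.timeit(lambda: fn(dictlist), number=number)
        for fn in (keys_by_union, keys_by_update)
    }

# Illustrative sample: mostly the same keys, with an occasional extra one.
sample = [{"a": i, "b": i, "k%d" % (i % 50): i} for i in range(20000)]
timings = time_both(sample)
```

Printing timings shows the two totals side by side; both functions return the same set, so only the cost differs.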
Upvotes: 0