Reputation: 2269
What is the most Pythonic way to take a list of dicts and sum up all the values for matching keys from every row in the list?
I did this but I suspect a comprehension is more Pythonic:
from collections import defaultdict
demandresult = defaultdict(int) # new blank dict to store results
for d in demandlist:
for k,v in d.iteritems():
demandresult[k] = demandresult[k] + v
In Python - sum values in dictionary the question involved the same key all the time, but in my case, the key in each row might be a new key never encountered before.
Upvotes: 2
Views: 1446
Reputation: 129
I suppose you want to return a list of summed values of each dictionary.
list_of_dict = [
{'a':1, 'b':2, 'c':3},
{'d':4, 'e':5, 'f':6}
]
sum_of_each_row = [sum(v for v in d.values()) for d in list_of_dict] # [6,15]
If you want to return the total sum, just simply wrap sum() to "sum_of_each_row".
EDIT:
The main problem is that you don't have a default value for each of the keys, so you can make use of the method dict.setdefault() to set the default value when there's a new key.
list_of_dict = [
{'a':1, 'b':1},
{'b':1, 'c':1},
{'a':2}
]
d = {}
d = {k:d[k]+v if k in d.keys() else d.setdefault(k,v)
for row in list_of_dict for k,v in row.items()} # {'a':3, 'b':2, 'c':1}
Upvotes: 0
Reputation: 53029
Here is another one-liner (ab-)using collections.ChainMap
to get the combined keys:
>>> from collections import ChainMap
>>> {k: sum(d.get(k, 0) for d in demand_list) for k in ChainMap(*demand_list)}
{'2018-04-17': 1, '2018-04-21': 1, '2018-05-01': 1, '2018-04-30': 1, '2018-04-19': 1, '2018-04-29': 1, '2018-04-18': 1}
This is easily the slowest of the methods proposed here.
Upvotes: 1
Reputation: 13498
I think that your method is quite pythonic. Comprehensions are nice but they shouldn't really be overdone, and they can lead to really messy one-liners, like the one below :).
If you insist on a dict comp:
demand_list = [{u'2018-04-29': 1, u'2018-04-30': 1, u'2018-05-01': 1},
{u'2018-04-21': 1},
{u'2018-04-18': 1, u'2018-04-19': 1, u'2018-04-17' : 1}]
d = {key:sum(i[key] for i in demand_list if key in i)
for key in set(a for l in demand_list for a in l.keys())}
print(d)
>>>{'2018-04-21': 1, '2018-04-17': 1, '2018-04-29': 1, '2018-04-30': 1, '2018-04-19': 1, '2018-04-18': 1, '2018-05-01': 1}
Upvotes: 2
Reputation: 6927
The only thing that seemed unclear in your code was the double-for-loop. It may be clearer to collapse the demandlist
into a flat iterable—then the loopant presents the logic as simply as possible. Consider:
demandlist = [{
u'2018-04-29': 1,
u'2018-04-30': 1,
u'2018-05-01': 1
}, {
u'2018-04-21': 1
}, {
u'2018-04-18': 1,
u'2018-04-19': 1,
u'2018-04-17': 1
}]
import itertools as it
from collections import defaultdict
demandresult = defaultdict(int)
for k, v in it.chain.from_iterable(map(lambda d: d.items(), demandlist)):
demandresult[k] = demandresult[k] + v
(With this, print(demandresult)
prints defaultdict(<class 'int'>, {'2018-04-29': 1, '2018-04-30': 1, '2018-05-01': 1, '2018-04-21': 1, '2018-04-18': 1, '2018-04-19': 1, '2018-04-17': 1})
.)
Imagining myself reading this for the first time (or a few months later), I can see myself thinking, "Ok, I'm collapsing demandlist
into a key-val iterable, I don't particularly care how, and then summing values of matching keys."
It's unfortunate that I need that map
there to ensure the final iterable has key-val pairs… it.chain.from_iterable(demandlist)
is a key-only iterable, so I need to call items
on each dict.
Note that unlike many of the answers proposed, this implementation (like yours!) minimizes the number of scans over the data to just one—performance win (and I try to pick up as many easy performance wins as I can).
Upvotes: 0