Mark Ginsburg

Reputation: 2269

How to sum a list of dicts

What is the most Pythonic way to take a list of dicts and sum up all the values for matching keys from every row in the list?

I did this but I suspect a comprehension is more Pythonic:

from collections import defaultdict

demandresult = defaultdict(int)  # missing keys start at 0
for d in demandlist:
    for k, v in d.items():  # use d.iteritems() on Python 2
        demandresult[k] += v

In the question "Python - sum values in dictionary", every row had the same key, but in my case each row may contain a key that has never been encountered before.

Upvotes: 2

Views: 1446

Answers (4)

K.Marker

Reputation: 129

I suppose you want to return a list of summed values of each dictionary.

list_of_dict = [
    {'a':1, 'b':2, 'c':3},
    {'d':4, 'e':5, 'f':6}
]

sum_of_each_row = [sum(d.values()) for d in list_of_dict]  # [6, 15]

If you want the grand total, wrap that list in sum(): sum(sum_of_each_row).

EDIT:

The main problem is that you don't have a default value for each of the keys, so you can make use of the method dict.setdefault() to set the default value when there's a new key.

list_of_dict = [
    {'a':1, 'b':1},
    {'b':1, 'c':1},
    {'a':2}
]

d = {}
for row in list_of_dict:
    for k, v in row.items():
        d.setdefault(k, 0)  # new key: start its total at 0
        d[k] += v
# d == {'a': 3, 'b': 2, 'c': 1}
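
To see what dict.setdefault() does on its own, here is a minimal sketch (the name totals is only illustrative). It inserts the default when the key is missing and otherwise leaves the stored value untouched; in both cases it returns the value now stored under the key.

```python
totals = {}

first = totals.setdefault('a', 0)   # key missing: inserts 0 and returns it
totals['a'] += 5
second = totals.setdefault('a', 0)  # key present: leaves 5 alone and returns it

print(first, second, totals)  # 0 5 {'a': 5}
```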

Upvotes: 0

Paul Panzer

Reputation: 53029

Here is another one-liner (ab-)using collections.ChainMap to get the combined keys:

>>> from collections import ChainMap
>>> {k: sum(d.get(k, 0) for d in demand_list) for k in ChainMap(*demand_list)}
{'2018-04-17': 1, '2018-04-21': 1, '2018-05-01': 1, '2018-04-30': 1, '2018-04-19': 1, '2018-04-29': 1, '2018-04-18': 1}

This is easily the slowest of the methods proposed here, since it probes every dict once for each distinct key.
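
If you want to check the speed difference yourself, a rough sketch along these lines should do. The data, the function names and the repeat count are all made up for illustration, and the timings will vary by machine, so none are shown.

```python
from collections import ChainMap, defaultdict
from timeit import timeit

# Hypothetical test data: 200 rows, each holding 5 of 50 possible keys.
rows = [{f"k{(i + j) % 50}": 1 for j in range(5)} for i in range(200)]

def chainmap_sum(rows):
    # One pass over every row for each distinct key.
    return {k: sum(d.get(k, 0) for d in rows) for k in ChainMap(*rows)}

def loop_sum(rows):
    # A single pass over all key-value pairs.
    out = defaultdict(int)
    for d in rows:
        for k, v in d.items():
            out[k] += v
    return dict(out)

# Both approaches must agree before timing them is meaningful.
assert chainmap_sum(rows) == loop_sum(rows)

print("ChainMap:", timeit(lambda: chainmap_sum(rows), number=20))
print("loop:    ", timeit(lambda: loop_sum(rows), number=20))
```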

Upvotes: 1

Primusa

Reputation: 13498

I think your method is quite Pythonic. Comprehensions are nice, but they shouldn't be overdone: they can lead to really messy one-liners, like the one below :).

If you insist on a dict comp:

demand_list = [{u'2018-04-29': 1, u'2018-04-30': 1, u'2018-05-01': 1}, 
               {u'2018-04-21': 1},
               {u'2018-04-18': 1, u'2018-04-19': 1, u'2018-04-17' : 1}]

d = {key:sum(i[key] for i in demand_list if key in i) 
     for key in set(a for l in demand_list for a in l.keys())}

print(d)
# {'2018-04-21': 1, '2018-04-17': 1, '2018-04-29': 1, '2018-04-30': 1, '2018-04-19': 1, '2018-04-18': 1, '2018-05-01': 1}
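
The sample data has no repeated keys, so that output doesn't actually show any summing. Here is a small sketch of the same idea on made-up rows whose keys overlap; set().union(*rows) is just a shorter way to collect the keys, since iterating a dict yields its keys.

```python
# Hypothetical rows with overlapping keys.
rows = [{'a': 1, 'b': 1}, {'b': 1, 'c': 1}, {'a': 2, 'b': 1}]

all_keys = set().union(*rows)  # every key that appears in any row
totals = {key: sum(row[key] for row in rows if key in row)
          for key in all_keys}

print(sorted(totals.items()))  # [('a', 3), ('b', 3), ('c', 1)]
```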

Upvotes: 2

Ahmed Fasih

Reputation: 6927

The only thing that seemed unclear in your code was the double for-loop. It may be clearer to collapse demandlist into a flat iterable; then the loop presents the logic as simply as possible. Consider:

demandlist = [{
    u'2018-04-29': 1,
    u'2018-04-30': 1,
    u'2018-05-01': 1
}, {
    u'2018-04-21': 1
}, {
    u'2018-04-18': 1,
    u'2018-04-19': 1,
    u'2018-04-17': 1
}]

import itertools as it
from collections import defaultdict

demandresult = defaultdict(int)

for k, v in it.chain.from_iterable(map(lambda d: d.items(), demandlist)):
    demandresult[k] = demandresult[k] + v

(With this, print(demandresult) prints defaultdict(<class 'int'>, {'2018-04-29': 1, '2018-04-30': 1, '2018-05-01': 1, '2018-04-21': 1, '2018-04-18': 1, '2018-04-19': 1, '2018-04-17': 1}).)

Imagining myself reading this for the first time (or a few months later), I can see myself thinking, "Ok, I'm collapsing demandlist into a key-val iterable, I don't particularly care how, and then summing values of matching keys."

It's unfortunate that I need that map there to ensure the final iterable has key-val pairs… it.chain.from_iterable(demandlist) is a key-only iterable, so I need to call items on each dict.
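
A quick sketch of the difference, on made-up rows: chaining the dicts directly yields only keys, while chaining their items() yields the pairs the loop needs.

```python
import itertools as it

rows = [{'x': 1, 'y': 2}, {'x': 10}]  # illustrative data

keys_only = list(it.chain.from_iterable(rows))
pairs = list(it.chain.from_iterable(d.items() for d in rows))

print(keys_only)  # ['x', 'y', 'x']
print(pairs)      # [('x', 1), ('y', 2), ('x', 10)]
```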

Note that, unlike many of the proposed answers, this implementation (like yours!) scans the data only once, which is an easy performance win (and I try to pick up as many of those as I can).
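
Another single-scan option, not one of the approaches above, is collections.Counter: its update() method adds the values for matching keys instead of replacing them. A minimal sketch on made-up rows, assuming all values are numeric:

```python
from collections import Counter

rows = [{'a': 1, 'b': 1}, {'b': 1, 'c': 1}, {'a': 2}]  # illustrative data

totals = Counter()
for d in rows:
    totals.update(d)  # adds each value to the running total for its key

print(dict(totals))  # {'a': 3, 'b': 2, 'c': 1}
```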

Upvotes: 0
