user2286041

Reputation: 43

Sum the nested dictionary values in python

I have a dictionary like this:

data={11L: [{'a': 2, 'b': 1},{'a': 2, 'b': 3}],
22L: [{'a': 3, 'b': 2},{'a': 2, 'b': 5},{'a': 4, 'b': 2},{'a': 1, 'b': 5}, {'a': 1, 'b': 0}],
33L: [{'a': 1, 'b': 2},{'a': 3, 'b': 5},{'a': 5, 'b': 2},{'a': 1, 'b': 3}, {'a': 1, 'b': 6},{'a':2,'b':0}],
44L: [{'a': 4, 'b': 2},{'a': 4, 'b': 5},{'a': 3, 'b': 1},{'a': 3, 'b': 3}, {'a': 2, 'b': 3},{'a':1,'b':2},{'a': 1, 'b': 0}]}

I want to discard the outer keys and use new keys 1, 2, 3, and so on, summing the inner dictionaries position by position. The result I want is shown below:

result={1:{'a':10,'b':7},2:{'a':11,'b':18},3:{'a':12,'b':5},4:{'a':5,'b':11},5:{'a':4,'b':9},6:{'a':3,'b':2},7:{'a':1,'b':0}}

I tried something like this, but I didn't get the required result:

from collections import defaultdict

d = defaultdict(int)
for dct in data.values():
  for k,v in dct.items():
    d[k] += v
print dict(d)

I want the keys of the result dictionary to be dynamic. In the data dictionary above, key 44 has the longest list, with 7 dictionaries, so the result dictionary has 7 keys.

Upvotes: 3

Views: 2885

Answers (3)

Adler Santos

Reputation: 405

First find the length of the longest list among all the values (which are lists):

max_length = 0
for key in data.keys():
    if max_length < len(data[key]):
        max_length = len(data[key])

In your case, max_length = 7. Now iterate as follows:

result = {}
for i in range(max_length):
    result[i+1] = {'a': 0, 'b': 0} # i + 1 since the result starts with key = 1
    for key in data.keys():
        if i < len(data[key]):
            result[i+1]['a'] += data[key][i]['a']
            result[i+1]['b'] += data[key][i]['b']

You should get:

print result
{1: {'a': 10, 'b': 7}, 2: {'a': 11, 'b': 18}, 3: {'a': 12, 'b': 5}, 4: {'a': 5, 'b': 11}, 5: {'a': 4, 'b': 9}, 6: {'a': 3, 'b': 2}, 7: {'a': 1, 'b': 0}}
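
As a hedged sketch, the two steps above can be condensed under Python 3 (an assumption; the code in this answer is Python 2). The outer keys are written as plain ints because the `L` suffix no longer exists, `max()` replaces the manual search for the longest list, and `print` becomes a function:

```python
# Input from the question, with Python 2 long literals (11L) written as plain ints.
data = {
    11: [{'a': 2, 'b': 1}, {'a': 2, 'b': 3}],
    22: [{'a': 3, 'b': 2}, {'a': 2, 'b': 5}, {'a': 4, 'b': 2},
         {'a': 1, 'b': 5}, {'a': 1, 'b': 0}],
    33: [{'a': 1, 'b': 2}, {'a': 3, 'b': 5}, {'a': 5, 'b': 2},
         {'a': 1, 'b': 3}, {'a': 1, 'b': 6}, {'a': 2, 'b': 0}],
    44: [{'a': 4, 'b': 2}, {'a': 4, 'b': 5}, {'a': 3, 'b': 1},
         {'a': 3, 'b': 3}, {'a': 2, 'b': 3}, {'a': 1, 'b': 2},
         {'a': 1, 'b': 0}],
}

# Length of the longest list decides how many keys the result gets.
max_length = max(len(rows) for rows in data.values())

result = {}
for i in range(max_length):
    totals = {'a': 0, 'b': 0}
    for rows in data.values():
        if i < len(rows):  # shorter lists simply stop contributing
            totals['a'] += rows[i]['a']
            totals['b'] += rows[i]['b']
    result[i + 1] = totals  # keys start at 1, as in the question

print(result)
```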

Edit: @user2286041 If you'd like the result dict to be reduced to

reduced_result = {'a': [10, 11, 12, 5, 4, 3, 1], 'b': [7, 18, 5, 11, 9, 2, 0]}

then you can try the following code:

reduced_result = {}
inner_keys = ['a', 'b']
for inner_key in inner_keys:
    temp = []
    for outer_key in sorted(result):  # sorted so list positions follow keys 1, 2, 3, ...
        temp.append(result[outer_key][inner_key])
    reduced_result[inner_key] = temp

I'm not sure though how to get the inner_keys in a more general way, aside from explicitly specifying them.
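
One possible way to avoid hard-coding the inner keys is to collect them from the dictionaries themselves. The sketch below assumes Python 3 and uses a shortened stand-in for `result`; it is an illustration, not part of the original answer:

```python
# Shortened stand-in for the `result` dict built above (assumption for illustration).
result = {
    1: {'a': 10, 'b': 7},
    2: {'a': 11, 'b': 18},
    3: {'a': 12, 'b': 5},
}

# Gather every inner key that occurs anywhere; sorted() gives a stable order.
inner_keys = sorted({key for totals in result.values() for key in totals})

# Walk the outer keys in sorted order so list positions match keys 1, 2, 3, ...
# .get(key, 0) covers an inner dict that happens to lack one of the keys.
reduced_result = {
    inner_key: [result[outer_key].get(inner_key, 0) for outer_key in sorted(result)]
    for inner_key in inner_keys
}

print(reduced_result)
```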

Upvotes: 0

Martijn Pieters

Reputation: 1121952

You want a list here, and perhaps Counter() objects to make the summing that much easier:

from collections import Counter

result = []
for dcts in data.values():
    for i, dct in enumerate(dcts):
        if i >= len(result):
            result.append(Counter(dct))
        else:
            result[i].update(dct)

Result:

>>> result
[Counter({'a': 10, 'b': 7}), Counter({'b': 18, 'a': 11}), Counter({'a': 12, 'b': 5}), Counter({'b': 11, 'a': 5}), Counter({'b': 9, 'a': 4}), Counter({'a': 3, 'b': 2}), Counter({'a': 1, 'b': 0})]

Counter() objects are subclasses of dict, so they otherwise behave as dictionaries. If you have to have dict values afterwards, add the following line:

result = [dict(r) for r in result]

Taking inspiration from Eric, you can transform the above into a one-liner:

from collections import Counter
from itertools import izip_longest

result = [sum(map(Counter, col), Counter()) 
    for col in izip_longest(*data.values(), fillvalue={})]

This version differs slightly from the loop above: Counter addition keeps only positive counts, so keys whose total is 0 are dropped when summing. If you want to keep 'b': 0 in the last counter, use:

[reduce(lambda c, d: c.update(d) or c, col, Counter())
    for col in izip_longest(*data.values(), fillvalue={})]

This uses .update() again, which keeps zero counts.
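
The difference between the two one-liners can be checked on a small input. This is a sketch assuming Python 3, where `izip_longest` is named `zip_longest` and `reduce` lives in `functools`; the sample data and the helper `merge` are made up for illustration:

```python
from collections import Counter
from functools import reduce
from itertools import zip_longest

# Shortened sample in the shape of the question's data (assumption, not the full input).
data = {
    11: [{'a': 2, 'b': 1}],
    22: [{'a': 3, 'b': 2}, {'a': 1, 'b': 0}],
}

# Transpose the lists; {} pads the shorter ones.
columns = list(zip_longest(*data.values(), fillvalue={}))

# Counter addition keeps only positive counts, so 'b': 0 disappears.
summed = [sum(map(Counter, col), Counter()) for col in columns]

def merge(total, cell):
    """Hypothetical helper: add cell into total in place and return total."""
    total.update(cell)
    return total

# Counter.update adds in place and keeps zero counts.
updated = [reduce(merge, col, Counter()) for col in columns]

print([dict(c) for c in summed])
print([dict(c) for c in updated])
```

Comparing via `dict(...)` is deliberate: recent Python versions treat a missing key and a zero count as equal when two Counter objects are compared directly.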

Upvotes: 5

Eric

Reputation: 97591

izip_longest lets you transpose the lists, so that each column holds the dictionaries found at one position:

from itertools import izip_longest

print [
    {
       'a': sum(cell['a'] for cell in column), 
       'b': sum(cell['b'] for cell in column)
    }
    for column in izip_longest(*data.values(), fillvalue={'a': 0, 'b': 0})
]
[{'a': 10, 'b': 7}, {'a': 11, 'b': 18}, {'a': 12, 'b': 5}, {'a': 5, 'b': 11}, {'a': 4, 'b': 9}, {'a': 3, 'b': 2}, {'a': 1, 'b': 0}]

Or combining that with counters:

from collections import Counter

print [
    sum((Counter(cell) for cell in column), Counter())  # generator needs its own parentheses
    for column in izip_longest(*data.values(), fillvalue={})
]
[Counter({'a': 10, 'b': 7}), Counter({'b': 18, 'a': 11}), Counter({'a': 12, 'b': 5}), Counter({'b': 11, 'a': 5}), Counter({'b': 9, 'a': 4}), Counter({'a': 3, 'b': 2}), Counter({'a': 1})]
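
If the 1-based dictionary from the question is wanted rather than a list, the transposition can be combined with `enumerate`. This is a minimal sketch assuming Python 3 (`zip_longest` instead of `izip_longest`), a shortened sample input, and inner keys fixed to 'a' and 'b':

```python
from itertools import zip_longest

# Shortened sample in the shape of the question's data (assumption, not the full input).
data = {
    11: [{'a': 2, 'b': 1}, {'a': 2, 'b': 3}],
    22: [{'a': 3, 'b': 2}, {'a': 2, 'b': 5}, {'a': 4, 'b': 2}],
}

# zip_longest pads the shorter lists with {} so every column has one cell per list.
columns = zip_longest(*data.values(), fillvalue={})

# enumerate(..., start=1) supplies the keys 1, 2, 3, ...;
# cell.get(key, 0) makes the padding cells contribute nothing.
result = {
    position: {key: sum(cell.get(key, 0) for cell in column) for key in ('a', 'b')}
    for position, column in enumerate(columns, start=1)
}

print(result)
```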

Upvotes: 2
