Reputation: 850
This is my input. I have a list of dictionaries:
[{'name1':'a', 'name2':'b','val1':10,'val2':20},
{'name1':'a', 'name2':'b','val1':15,'val2':25},
{'name1':'r', 'name2':'s','val1':30,'val2':20}]
If two dictionaries have the same values for both name1 and name2, then their val1 and val2 should be added together.
Here is the expected output:
[{'name1':'a', 'name2':'b','val1':25,'val2':45},
{'name1':'r', 'name2':'s','val1':30,'val2':20}]
In the first and second dicts, name1 is a and name2 is b in both, so we add their values.
I tried doing this with a loop but did not get anywhere.
Upvotes: 0
Views: 87
Reputation: 15872
You can use collections.Counter and itertools.groupby:
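>>> from collections import Counter
>>> from itertools import groupby
>>> dicts = [{'name1':'a', 'name2':'b','val1':10,'val2':20},
...          {'name1':'a', 'name2':'b','val1':15,'val2':25},
...          {'name1':'r', 'name2':'s','val1':30,'val2':20}]
>>> new_dicts = []
>>> # Read the keys with d[...] rather than d.pop(...) so that dicts is left unchanged
>>> for k, groups in groupby(dicts, key=lambda d: (d['name1'], d['name2'])):
...     new_d = {
...         'name1': k[0],
...         'name2': k[1],
...         # Sum only the numeric fields of every dict in the group
...         **sum((Counter({'val1': g['val1'], 'val2': g['val2']}) for g in groups), Counter())
...     }
...     new_dicts.append(new_d)
...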
>>> new_dicts
[{'name1': 'a', 'name2': 'b', 'val1': 25, 'val2': 45},
{'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]
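One caveat with summing Counter objects, sketched below: Counter addition keeps only keys whose total is positive, so a field that sums to zero (or less) disappears from the result. The helper name sum_counters_keep_all and the sample rows are made up for illustration; Counter.update() is used as an alternative that keeps such totals.

```python
from collections import Counter

def sum_counters_keep_all(dicts):
    """Sum numeric fields key by key, keeping zero and negative totals."""
    total = Counter()
    for d in dicts:
        total.update(d)  # update() adds to existing counts instead of replacing them
    return dict(total)

# Hypothetical rows in which val2 sums to zero
rows = [{'val1': 10, 'val2': 0}, {'val1': 5, 'val2': 0}]

# Counter + Counter discards keys whose total is not positive
dropped = dict(sum((Counter(r) for r in rows), Counter()))

# update() keeps them
kept = sum_counters_keep_all(rows)

print(dropped)
print(kept)
```

If your values are always positive, the shorter sum(...) form is fine.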
On the other hand, if you use pandas:
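>>> import pandas as pd
>>> # 'records' is spelled out; the short form 'r' is no longer accepted by recent pandas versions
>>> pd.DataFrame(dicts).groupby(['name1', 'name2']).sum().reset_index().to_dict('records')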
[{'name1': 'a', 'name2': 'b', 'val1': 25, 'val2': 45},
{'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]
If you want to do this without modules, you can try:
>>> new_dicts = []
>>> for d in dicts:
...     if not new_dicts:
...         new_dicts.append(dict(d))  # copy so the input dicts are not modified
...     else:
...         last_dict = new_dicts[-1]
...         if (last_dict['name1'], last_dict['name2']) == (d['name1'], d['name2']):
...             last_dict['val1'] += d['val1']
...             last_dict['val2'] += d['val2']
...         else:
...             new_dicts.append(dict(d))
...
>>> new_dicts
[{'name1': 'a', 'name2': 'b', 'val1': 25, 'val2': 45},
{'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]
NOTE:
The first and third solutions assume that your list is sorted, i.e. that entries with the same name1 and name2 appear consecutively. If that is not the case, you can add this line at the beginning:
>>> dicts = sorted(dicts, key=lambda x: (x['name1'], x['name2']))
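To see why the sort matters, here is a small sketch with a deliberately shuffled copy of the data (unsorted_rows and group_keys are names made up for this illustration). itertools.groupby only merges adjacent items, so the same key can show up in more than one group until the list is sorted.

```python
from itertools import groupby

def name_key(d):
    """Grouping key: the (name1, name2) pair of a row."""
    return (d['name1'], d['name2'])

def group_keys(rows):
    """Return the sequence of group keys that itertools.groupby produces."""
    return [k for k, _ in groupby(rows, key=name_key)]

# Same rows as in the question, but with the two ('a', 'b') entries separated
unsorted_rows = [
    {'name1': 'a', 'name2': 'b', 'val1': 10, 'val2': 20},
    {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20},
    {'name1': 'a', 'name2': 'b', 'val1': 15, 'val2': 25},
]

before = group_keys(unsorted_rows)                        # ('a', 'b') appears twice
after = group_keys(sorted(unsorted_rows, key=name_key))   # one group per key

print(before)
print(after)
```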
Upvotes: 2
Reputation: 11938
Run it through pandas, which is very good at this type of thing. (And yes, this could probably be collapsed down to one or two chained statements.) The session below assumes pandas has been imported as pd:
In [37]: a
Out[37]:
[{'name1': 'a', 'name2': 'b', 'val1': 10, 'val2': 20},
{'name1': 'a', 'name2': 'b', 'val1': 15, 'val2': 25},
{'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]
In [38]: df = pd.DataFrame(a)
In [39]: df
Out[39]:
  name1 name2  val1  val2
0     a     b    10    20
1     a     b    15    25
2     r     s    30    20
In [40]: grouped_sum = df.groupby(['name1', 'name2']).sum()
In [41]: grouped_sum
Out[41]:
             val1  val2
name1 name2
a     b        25    45
r     s        30    20
In [42]: grouped_sum.reset_index(inplace=True)
In [43]: data = grouped_sum.to_dict('records')
In [44]: data
Out[44]:
[{'name1': 'a', 'name2': 'b', 'val1': 25, 'val2': 45},
{'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]
Upvotes: 1
Reputation: 1057
I suggest you post the code you tried when asking for help, so that others can suggest changes to it. But something like this may help you:
di = [{'name1': 'a', 'name2': 'a', 'val1': 10, 'val2': 20},
{'name1': 'a', 'name2': 'b', 'val1': 15, 'val2': 25},
{'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]
for i in di:
    if i['name1'] == i['name2']:
        print("sum:", i['val1'] + i['val2'])
It prints the sum of val1 and val2 for each dictionary in which name1 and name2 are equal.
Upvotes: -1
Reputation: 11939
You can just iterate and use an intermediate dictionary where (name1, name2) is the key to achieve linear time complexity.
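>>> l = [{'name1': 'a', 'name2': 'b', 'val1': 10, 'val2': 20},
...      {'name1': 'a', 'name2': 'b', 'val1': 15, 'val2': 25},
...      {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]
>>> res = {}  # maps (name1, name2) to the running (val1, val2) totals
>>> for d in l:
...     name1, name2, val1, val2 = d['name1'], d['name2'], d['val1'], d['val2']
...     if (name1, name2) in res:
...         res[(name1, name2)] = res[(name1, name2)][0] + val1, res[(name1, name2)][1] + val2
...     else:
...         res[(name1, name2)] = (val1, val2)
...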
>>> res
{('a', 'b'): (25, 45), ('r', 's'): (30, 20)}
>>> output = [{'name1': k[0], 'name2': k[1], 'val1': v[0], 'val2': v[1]} for k,v in res.items()]
>>> output
[{'name1': 'a', 'name2': 'b', 'val1': 25, 'val2': 45}, {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20}]
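The same idea can be sketched a bit more generally with collections.defaultdict, so that the value fields do not have to be listed by hand. This is only a sketch: merge_by_names and its key_fields parameter are names invented here, and it assumes every field other than the key fields holds a number.

```python
from collections import defaultdict

def merge_by_names(rows, key_fields=('name1', 'name2')):
    """Sum all non-key fields of rows that share the same key field values."""
    # key tuple -> {field name -> running total}
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        for field, value in row.items():
            if field not in key_fields:
                totals[key][field] += value
    # Rebuild one flat dict per key, in order of first appearance
    return [{**dict(zip(key_fields, key)), **sums} for key, sums in totals.items()]

data = [
    {'name1': 'a', 'name2': 'b', 'val1': 10, 'val2': 20},
    {'name1': 'r', 'name2': 's', 'val1': 30, 'val2': 20},
    {'name1': 'a', 'name2': 'b', 'val1': 15, 'val2': 25},
]

merged = merge_by_names(data)
print(merged)
```

Because the lookup is by dictionary key, the input does not need to be sorted, and each row is visited once.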
Upvotes: 1