Intelligently merging dicts

Question

I am trying to merge some dicts according to specific requirements. Here is some example data:

data = [{"nid": 363, "cid": "509cd9aaad4d5", "count": 57, "value": 12.5},
        {"nid": 363, "cid": "509cd9aaad4d5", "count": 57, "value": 22},
        {"nid": 363, "cid": "cd9aaad4d5", "count": 57, "value": 49},
        {"nid": 570, "cid": "cd9aaad4d5", "count": 58, "value": 62},
    ]

I need to merge all the dicts that share the same nid and cid and sum the value, but leave the count as it is.

So the above example would be returned as follows (or something similar; I did it by hand, so it might contain a mistake):

[
    {'count': 58, 'value': 62, 'nid': 570, 'cid': 'cd9aaad4d5'},
    {'count': 57, 'value': 34.5, 'nid': 363, 'cid': '509cd9aaad4d5'},
    {'count': 57, 'value': 49, 'nid': 363, 'cid': 'cd9aaad4d5'}
]

My attempt so far is ugly, and I could really do with some guidance:

from collections import defaultdict

tmp = defaultdict(lambda: defaultdict(lambda: [0, 0]))
for d in data:
    tmp[d["nid"]][d["cid"]][1] = d["count"]
    tmp[d["nid"]][d["cid"]][0] += d["value"]

print tmp

new_data = []

for key in tmp:
    for cid in tmp[key]:
        new_data.append({"nid": key, "cid": cid, "count": tmp[key][cid][1], "value": tmp[key][cid][0]})

print new_data

Can anyone help me identify a cleaner, more intelligent way of merging the list of dicts?

Martijn Pieters · Accepted Answer

You can improve a little on your attempt by using a compound key:

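from collections import defaultdict

# Key on the (nid, cid) tuple so a single flat dict replaces the nested one.
tmp = defaultdict(lambda: {'value': 0})
for d in data:
    tmp[d["nid"], d["cid"]]['count'] = d["count"]   # last count seen wins
    tmp[d["nid"], d["cid"]]['value'] += d["value"]  # running sum per key

# Python 2 spelling; on Python 3 use tmp.items() instead of tmp.iteritems().
new_data = [{'nid': nid, 'cid': cid, 'count': v['count'], 'value': v['value']}
            for (nid, cid), v in tmp.iteritems()]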

The alternative would be to sort data and use itertools.groupby(), but the sort makes that more costly: sorting is O(n log n), while the defaultdict loop is a single O(n) pass.
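
For comparison, the groupby() alternative could be sketched roughly as follows. This is only an illustration, written in Python 3 syntax and reusing the sample data from the question; the names key and rows are introduced here for the example. Because sorted() is stable, taking the count from the last row of each group should mirror the "last count seen" behaviour of the defaultdict loop.

```python
from itertools import groupby
from operator import itemgetter

# Sample data copied from the question.
data = [
    {"nid": 363, "cid": "509cd9aaad4d5", "count": 57, "value": 12.5},
    {"nid": 363, "cid": "509cd9aaad4d5", "count": 57, "value": 22},
    {"nid": 363, "cid": "cd9aaad4d5", "count": 57, "value": 49},
    {"nid": 570, "cid": "cd9aaad4d5", "count": 58, "value": 62},
]

# Hypothetical helper name: extracts the compound (nid, cid) key from a dict.
key = itemgetter("nid", "cid")

new_data = []
# groupby() only merges adjacent items, so the data must be sorted by the same key first.
for (nid, cid), group in groupby(sorted(data, key=key), key=key):
    rows = list(group)
    new_data.append({
        "nid": nid,
        "cid": cid,
        "count": rows[-1]["count"],               # last count seen, as in the defaultdict loop
        "value": sum(r["value"] for r in rows),   # sum of value within the group
    })
```

Unlike the dict-based version, this returns the merged entries in sorted key order, which may or may not matter for your use.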
