Mostafa Abedi

Reputation: 541

How to merge duplicated keys in an array of dictionaries in Python

I have an array of dictionaries in Python:

[
   {'id':1, 'name':'foo', 'email':'[email protected]'},
   {'id':2, 'name':'foo', 'email':'[email protected]'},
   {'id':3, 'name':'bar', 'email':'[email protected]'},
   {'id':3, 'name':'bar', 'email':'[email protected]'},
   {'id':3, 'name':'bar', 'email':'[email protected]'},
 ]

Expected Output:

[
   {'id':1, 'name':'foo', 'email':['[email protected]', '[email protected]']},
   {'id':2, 'name':'bar', 'email':['[email protected]', '[email protected]', '[email protected]']}
]

Is there a short way to achieve the expected output? Thanks.

Upvotes: 0

Views: 81

Answers (2)

Mohamed Ali JAMAOUI

Reputation: 14689

This problem is a perfect application of itertools.groupby. To solve it, you simply use groupby to group the entries in your list by the key "name", then format the results however you want.

Here's concretely how to do that:

from itertools import groupby

d = [
   {'id':1, 'name':'foo', 'email':'[email protected]'},
   {'id':2, 'name':'foo', 'email':'[email protected]'},
   {'id':3, 'name':'bar', 'email':'[email protected]'},
   {'id':3, 'name':'bar', 'email':'[email protected]'},
   {'id':3, 'name':'bar', 'email':'[email protected]'},
 ]

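result = []
# groupby only merges consecutive items with the same key. The sample data is
# already grouped by name; if yours is not, sort it first (this orders the
# groups alphabetically, so 'bar' would come before 'foo'):
# d = sorted(d, key=lambda x: x["name"])

for idx, (name, group) in enumerate(groupby(d, key=lambda x: x["name"]), start=1):
    result.append(
        {"id": idx,
         "name": name,
         "email": [x["email"] for x in group]}
    )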

Output:

[{'id': 1, 'name': 'foo', 'email': ['[email protected]', '[email protected]']},
 {'id': 2,
  'name': 'bar',
  'email': ['[email protected]', '[email protected]', '[email protected]']}]
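
One caveat worth illustrating: groupby only merges adjacent items with equal keys, which is why sorting matters when equal names are not consecutive. A minimal sketch, using made-up rows and placeholder addresses (not the data from the question; the helper by_name is also just an illustrative name):

```python
from itertools import groupby

# Hypothetical rows: the two 'foo' entries are NOT adjacent.
rows = [
    {"name": "foo", "email": "a@example.com"},
    {"name": "bar", "email": "b@example.com"},
    {"name": "foo", "email": "c@example.com"},
]


def by_name(row):
    """Key function shared by sorted() and groupby()."""
    return row["name"]


# Without sorting, 'foo' shows up as two separate groups.
unsorted_names = [name for name, _ in groupby(rows, key=by_name)]

# After sorting, each name forms exactly one group (in alphabetical order).
sorted_names = [name for name, _ in groupby(sorted(rows, key=by_name), key=by_name)]

print(unsorted_names)  # ['foo', 'bar', 'foo']
print(sorted_names)    # ['bar', 'foo']
```

Note that sorting trades the original order of first appearance for alphabetical order, so the generated ids may differ from what you expect.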

Upvotes: 1

abc

Reputation: 11929

You can iterate over the list and group items with the same name:

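# `records` is the list of dictionaries from the question
res = {}
unique_id = 1

for d in records:
    if d['name'] in res:
        # name already seen: add this email to its list
        res[d['name']]['email'].append(d['email'])
    else:
        # first occurrence: create the merged entry with a new sequential id
        res[d['name']] = {'id': unique_id, 'name': d['name'], 'email': [d['email']]}
        unique_id += 1
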
>>> print(*res.values(), sep='\n')
{'id': 1, 'name': 'foo', 'email': ['[email protected]', '[email protected]']}
{'id': 2, 'name': 'bar', 'email': ['[email protected]', '[email protected]', '[email protected]']}
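
The same loop can be wrapped in a small helper. This is only a sketch: the function name merge_by_name and the placeholder addresses are assumptions for illustration, not part of the question. It relies on dicts preserving insertion order (Python 3.7+), so groups come out in order of first appearance even when equal names are not adjacent, and no sorting is needed.

```python
def merge_by_name(records):
    """Merge dicts sharing a 'name', collecting their emails into a list."""
    merged = {}
    for record in records:
        name = record["name"]
        if name not in merged:
            # First time this name is seen: assign the next sequential id.
            merged[name] = {"id": len(merged) + 1, "name": name, "email": []}
        merged[name]["email"].append(record["email"])
    return list(merged.values())


# Placeholder data; the addresses are made up.
records = [
    {"id": 1, "name": "foo", "email": "foo1@example.com"},
    {"id": 2, "name": "bar", "email": "bar1@example.com"},
    {"id": 3, "name": "foo", "email": "foo2@example.com"},
]
result = merge_by_name(records)
print(result)
```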

Upvotes: 3
