Reputation: 13
I have a set of dictionaries (a list, shown below), and I want to merge the dictionaries that share the same value for the key userId. I know that the list contains at most two dictionaries with the same userId. I am only interested in the merged dictionaries. My code works, but I would like to know whether there is a more elegant way to do it. In my example (below) the list holds only a few dictionaries, each with a few entries. I want to use this on a very large list of dictionaries, where I expect about 30 entries in each dictionary after merging.
set_of_dict=[
{'prop1':'firstName','prop2':'lastname','userId':'100','prop3':'somefield'},
{'prop1':'value1','prop2':'value2','userId':'200','prop3':'value3'},
{'prop4':'email','prop5':'www','userId':'100','prop6':'blah'},
{'prop4':'abc','prop5':'qaq','userId':'200','prop6':'xx'},
{'prop1':'value1','prop2':'value2','userId':'400','prop3':'value3'},
{'prop4':'value4','prop5':'ssss','userId':'484','prop6':'val66'}]
"""
#output:
result=[
{'prop1':'firstName','prop2':'lastname','userId':'100','prop3':'somefield','prop4':'email','prop5':'www','prop6':'blah'},
{'prop1':'value1','prop2':'value2','userId':'200','prop3':'value3','prop4':'abc','prop5':'qaq','prop6':'xx'}
]
"""
result=[]
list_of_merged_id=[]
lastStep=[]
# First pass: keep the first dict seen for each userId, merge later ones into it
for j in set_of_dict:
    if not any(b['userId'] == j['userId'] for b in result):
        result.append(j)
    else:
        for item in result:
            if 'userId' in item and item['userId']==j.get('userId'):
                item.update(j)
                list_of_merged_id.append(j.get('userId'))
# Second pass: keep only the dicts that were actually merged
for one in result:
    if one['userId'] in list_of_merged_id:
        lastStep.append(one)
    else:
        print(str(one['userId']) + ": not merged - some data has been lost")
for a in lastStep:
    print(a)
Upvotes: 1
Views: 45
Reputation: 95908
Fundamentally, you want a grouping operation, in which case it is easiest to use another dict to do the grouping:
>>> from collections import defaultdict
>>> grouped = defaultdict(dict)
>>> set_of_dict=[
... {'prop1':'firstName','prop2':'lastname','userId':'100','prop3':'somefield'},
... {'prop1':'value1','prop2':'value2','userId':'200','prop3':'value3'},
... {'prop4':'email','prop5':'www','userId':'100','prop6':'blah'},
... {'prop4':'abc','prop5':'qaq','userId':'200','prop6':'xx'},
... {'prop1':'value1','prop2':'value2','userId':'400','prop3':'value3'},
... {'prop4':'value4','prop5':'ssss','userId':'484','prop6':'val66'}]
>>> for d in set_of_dict:
...     grouped[d['userId']].update(d)
...
>>> from pprint import pprint
>>> pprint(list(grouped.values()))
[{'prop1': 'value1',
  'prop2': 'value2',
  'prop3': 'value3',
  'prop4': 'abc',
  'prop5': 'qaq',
  'prop6': 'xx',
  'userId': '200'},
 {'prop1': 'firstName',
  'prop2': 'lastname',
  'prop3': 'somefield',
  'prop4': 'email',
  'prop5': 'www',
  'prop6': 'blah',
  'userId': '100'},
 {'prop1': 'value1', 'prop2': 'value2', 'prop3': 'value3', 'userId': '400'},
 {'prop4': 'value4', 'prop5': 'ssss', 'prop6': 'val66', 'userId': '484'}]
>>>
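One thing worth keeping in mind with this approach: dict.update overwrites keys that already exist, so if two dicts with the same userId also share some other key, the value from the later dict wins. The sample data has no such overlap, so the following is only a small sketch with made-up records (the 'name' and 'email' keys and their values are hypothetical) to show that behaviour:

```python
from collections import defaultdict

# Hypothetical records: both share 'userId' and also the 'name' key.
records = [
    {'userId': '1', 'name': 'first', 'email': 'a@example.com'},
    {'userId': '1', 'name': 'second'},
]

grouped = defaultdict(dict)
for d in records:
    # update() copies every key of d; keys already present are overwritten
    grouped[d['userId']].update(d)

merged = grouped['1']
print(merged)  # 'name' now holds the value from the second record
```

If overlapping keys can occur in the real data and the earlier value should be kept instead, the merge step would need to be adapted.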
If you only want the "merged" dicts, then it's probably easiest to do it in two steps. You can still group using a dict, but group into a list first, and only merge those that have more than one dict:
>>> grouped = defaultdict(list)
>>> for d in set_of_dict:
...     grouped[d['userId']].append(d)
...
>>> result = []
>>> for v in grouped.values():
...     if len(v) > 1:
...         temp = {}
...         for d in v:
...             temp.update(d)
...         result.append(temp)
...
>>> pprint(result)
[{'prop1': 'value1',
  'prop2': 'value2',
  'prop3': 'value3',
  'prop4': 'abc',
  'prop5': 'qaq',
  'prop6': 'xx',
  'userId': '200'},
 {'prop1': 'firstName',
  'prop2': 'lastname',
  'prop3': 'somefield',
  'prop4': 'email',
  'prop5': 'www',
  'prop6': 'blah',
  'userId': '100'}]
>>>
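For use on a larger list, the two steps above could be wrapped in a small function. This is only a sketch: the name merge_duplicates, its key parameter, and the second return value (the IDs that had nothing to merge with, mirroring the message printed in the question) are made up for illustration, and it assumes every dict contains the key.

```python
from collections import defaultdict


def merge_duplicates(dicts, key='userId'):
    """Hypothetical helper: return (merged, unmerged_ids).

    merged holds one combined dict per key value that occurs more than once;
    unmerged_ids holds the key values that occurred only once.
    """
    grouped = defaultdict(list)
    for d in dicts:
        grouped[d[key]].append(d)  # assumes every dict has `key`

    merged, unmerged_ids = [], []
    for value, group in grouped.items():
        if len(group) > 1:
            combined = {}
            for d in group:
                combined.update(d)  # later dicts overwrite earlier keys
            merged.append(combined)
        else:
            unmerged_ids.append(value)
    return merged, unmerged_ids


set_of_dict = [
    {'prop1': 'firstName', 'prop2': 'lastname', 'userId': '100', 'prop3': 'somefield'},
    {'prop1': 'value1', 'prop2': 'value2', 'userId': '200', 'prop3': 'value3'},
    {'prop4': 'email', 'prop5': 'www', 'userId': '100', 'prop6': 'blah'},
    {'prop4': 'abc', 'prop5': 'qaq', 'userId': '200', 'prop6': 'xx'},
    {'prop1': 'value1', 'prop2': 'value2', 'userId': '400', 'prop3': 'value3'},
    {'prop4': 'value4', 'prop5': 'ssss', 'userId': '484', 'prop6': 'val66'},
]

merged, unmerged_ids = merge_duplicates(set_of_dict)
for m in merged:
    print(m)
print("not merged:", unmerged_ids)
```

Unlike the loop in the question, this builds new dicts for the merged results, so the dicts in the input list are left unmodified.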
Upvotes: 1