Reputation: 60
I have this example list:
my_list = [
    {
        'data': {
            'color': 'white'
        },
        'name': 'item1'
    },
    {
        'data': {
            'color': 'white',
            'property': 'value1'
        },
        'name': 'item2'
    },
    {
        'data': {
            'color': 'white',
            'property': 'value1'
        },
        'name': 'item3'
    },
    {
        'data': {
            'color': 'black',
            'property': 'value1'
        },
        'name': 'item4'
    },
    {
        'data': {
            'color': 'white',
            'property': 'value1',
            'custom': 'valueA'
        },
        'name': 'item5'
    },
    {
        'data': {
            'color': 'white'
        },
        'name': 'item6'
    },
]
I want to group the 'name' values of the items that share the same 'data' value.
So I want to get this result for this specific example:
result = [('item1', 'item6'), ('item2', 'item3')]
Update upon request: I have tried to separate them using groupby, with no success:
import itertools
for key, group in itertools.groupby([item["data"] for item in my_list]):
    print("-"*10)
    for data in group:
        print(data)
Upvotes: 1
Views: 61
Reputation: 5939
You can use itertools.groupby, after sorting the list by the same key:
from itertools import groupby
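# Sorted (key, value) pairs of 'data', so the key order inside each dict does not matter
keyfunc = lambda x: sorted(x['data'].items())
# groupby only groups adjacent items, so sort by the same key first
ls = sorted(my_list, key=keyfunc)
groups = []
for k, g in groupby(ls, keyfunc):
    groups.append([n['name'] for n in g])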
>>> groups
[['item4'], ['item1', 'item6'], ['item5'], ['item2', 'item3']]
Sample run:
>>> sorted(map(keyfunc, my_list))
[[('color', 'black'), ('property', 'value1')],
[('color', 'white')],
[('color', 'white')],
[('color', 'white'), ('custom', 'valueA'), ('property', 'value1')],
[('color', 'white'), ('property', 'value1')],
[('color', 'white'), ('property', 'value1')]]
>>> ls
[{'data': {'color': 'black', 'property': 'value1'}, 'name': 'item4'},
{'data': {'color': 'white'}, 'name': 'item1'},
{'data': {'color': 'white'}, 'name': 'item6'},
{'data': {'color': 'white', 'property': 'value1', 'custom': 'valueA'}, 'name': 'item5'},
{'data': {'color': 'white', 'property': 'value1'}, 'name': 'item2'},
{'data': {'color': 'white', 'property': 'value1'}, 'name': 'item3'}]
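The sort step matters because itertools.groupby only merges adjacent items whose keys are equal. A minimal sketch of that behaviour, using a made-up list (colors is only an illustration, it is not from the question):

```python
from itertools import groupby

# Hypothetical sample data, not taken from the question.
colors = ['white', 'black', 'white']

# Unsorted: the two 'white' entries are not adjacent, so they end up in separate groups.
unsorted_groups = [(key, len(list(group))) for key, group in groupby(colors)]

# Sorted: equal keys are adjacent, so each key forms exactly one group.
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(colors))]

print(unsorted_groups)  # [('white', 1), ('black', 1), ('white', 1)]
print(sorted_groups)    # [('black', 1), ('white', 2)]
```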
Remark: itertools.groupby can yield groups that contain only a single item. You can discard them with:
>>> [n for n in groups if len(n) > 1]
[['item1', 'item6'], ['item2', 'item3']]
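If you want tuples, as in the expected result from the question, the steps above could be wrapped up roughly like this. This is only a sketch: group_names_by_data and the shortened sample list are names made up for illustration, and it assumes the keys and values inside 'data' can be compared with each other (e.g. they are all strings).

```python
from itertools import groupby

def group_names_by_data(items):
    """Return tuples of names whose 'data' dicts are equal (hypothetical helper)."""
    # Sorted (key, value) pairs give an order-independent key for each dict.
    # Assumes keys and values inside 'data' are mutually comparable, e.g. all strings.
    def keyfunc(item):
        return sorted(item['data'].items())

    ordered = sorted(items, key=keyfunc)
    result = []
    for _, group in groupby(ordered, key=keyfunc):
        names = tuple(item['name'] for item in group)
        if len(names) > 1:  # skip 'data' values that occur only once
            result.append(names)
    return result

# Shortened, illustrative input in the same shape as my_list.
sample = [
    {'data': {'color': 'white'}, 'name': 'item1'},
    {'data': {'color': 'black', 'property': 'value1'}, 'name': 'item4'},
    {'data': {'color': 'white'}, 'name': 'item6'},
]
print(group_names_by_data(sample))  # [('item1', 'item6')]
```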
Upvotes: 1
Reputation: 195418
out = {}
for d in my_list:
    out.setdefault(tuple(d['data'].items()), []).append(d['name'])
out = [v for v in out.values() if len(v) > 1]
print(out)
Prints:
[['item1', 'item6'], ['item2', 'item3']]
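One caveat: tuple(d['data'].items()) follows the insertion order of each dict, so two 'data' dicts with the same content but a different key order would get different keys. If that can happen in your data, a frozenset key is one possible workaround. A rough sketch, assuming every value inside 'data' is hashable (group_names_unordered and sample are made-up names for illustration):

```python
from collections import defaultdict

def group_names_unordered(items):
    """Group names by 'data' content regardless of key order (hypothetical helper)."""
    groups = defaultdict(list)
    for item in items:
        # frozenset ignores insertion order; assumes every value in 'data' is hashable.
        groups[frozenset(item['data'].items())].append(item['name'])
    # Keep only the 'data' values shared by more than one item.
    return [tuple(names) for names in groups.values() if len(names) > 1]

# Illustrative input: item2 and item3 hold the same data with a different key order.
sample = [
    {'data': {'color': 'white', 'property': 'value1'}, 'name': 'item2'},
    {'data': {'property': 'value1', 'color': 'white'}, 'name': 'item3'},
    {'data': {'color': 'black'}, 'name': 'item4'},
]
print(group_names_unordered(sample))  # [('item2', 'item3')]
```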
Upvotes: 1