Arda Kutlu

Reputation: 60

Group dictionary items in a list by their specific keys

I have this example list:

my_list = [
    {
        'data': {
            'color': 'white'
        },
        'name': 'item1'
    },
    {
        'data': {
            'color': 'white',
            'property': 'value1'
        },
        'name': 'item2'
    },
    {
        'data': {
            'color': 'white',
            'property': 'value1'
        },
        'name': 'item3'
    },
    {
        'data': {
            'color': 'black',
            'property': 'value1'
        },
        'name': 'item4'
    },
    {
        'data': {
            'color': 'white',
            'property': 'value1',
            'custom': 'valueA'
        },
        'name': 'item5'
    },
    {
        'data': {
            'color': 'white'
        },
        'name': 'item6'
    },
]

I want to group the name values of the dictionary items that share the same 'data' value.

So I want to get this result for this specific example: result = [('item1', 'item6'), ('item2', 'item3')]

Update upon request: I have tried to separate them using groupby with no success:

import itertools
for key, group in itertools.groupby([item["data"] for item in my_list]):
    print("-"*10)
    for data in group:
        print(data)

Upvotes: 1

Views: 61

Answers (2)

mathfux

Reputation: 5939

You can use itertools.groupby. It only groups consecutive items, so the list has to be sorted by the same key first:

from itertools import groupby

# Order-independent key: the sorted (key, value) pairs of 'data'
keyfunc = lambda x: sorted(x['data'].items())
ls = sorted(my_list, key=keyfunc)
groups = []
for k, g in groupby(ls, keyfunc):
    groups.append([n['name'] for n in g])

>>> groups
[['item4'], ['item1', 'item6'], ['item5'], ['item2', 'item3']]

Sample run:

>>> sorted(map(keyfunc, my_list))
[[('color', 'black'), ('property', 'value1')], 
[('color', 'white')], 
[('color', 'white')], 
[('color', 'white'), ('custom', 'valueA'), ('property', 'value1')], 
[('color', 'white'), ('property', 'value1')], 
[('color', 'white'), ('property', 'value1')]]

>>> ls
[{'data': {'color': 'black', 'property': 'value1'}, 'name': 'item4'},
 {'data': {'color': 'white'}, 'name': 'item1'}, 
 {'data': {'color': 'white'}, 'name': 'item6'}, 
 {'data': {'color': 'white', 'property': 'value1', 'custom': 'valueA'}, 'name': 'item5'}, 
 {'data': {'color': 'white', 'property': 'value1'}, 'name': 'item2'},
 {'data': {'color': 'white', 'property': 'value1'}, 'name': 'item3'}]

Remark: itertools.groupby also yields groups that contain only a single item. You can filter them out with:

>>> [n for n in groups if len(n) > 1]
[['item1', 'item6'], ['item2', 'item3']]
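
Putting the steps above together, here is a minimal self-contained sketch. The function name group_names_by_data and the small sample list are my own, not from the question, and it assumes the values inside 'data' can be compared with each other (plain strings here), since sorting has to compare them.

```python
from itertools import groupby

def group_names_by_data(items):
    """Return lists of names whose 'data' dicts are equal, skipping singletons."""
    # Sorted (key, value) pairs do not depend on dict insertion order.
    keyfunc = lambda x: sorted(x['data'].items())
    # groupby only merges adjacent items, so sort by the same key first.
    ordered = sorted(items, key=keyfunc)
    groups = ([n['name'] for n in g] for _, g in groupby(ordered, keyfunc))
    return [names for names in groups if len(names) > 1]

# Hypothetical sample, shaped like my_list in the question
sample = [
    {'data': {'color': 'white'}, 'name': 'item1'},
    {'data': {'color': 'black'}, 'name': 'item2'},
    {'data': {'color': 'white'}, 'name': 'item3'},
]
print(group_names_by_data(sample))  # [['item1', 'item3']]
```

Because sorted is stable, names inside each group keep their original relative order.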

Upvotes: 1

Andrej Kesely

Reputation: 195418

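You can also collect the names in a dictionary keyed by the contents of 'data':

# Map each distinct 'data' content to the list of names that have it
out = {}
for d in my_list:
    out.setdefault(tuple(d['data'].items()), []).append(d['name'])

# Keep only the groups with more than one name
out = [v for v in out.values() if len(v) > 1]
print(out)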

Prints:

[['item1', 'item6'], ['item2', 'item3']]
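
One caveat, as far as I can tell: tuple(d['data'].items()) depends on the order in which keys were inserted, so two 'data' dicts that compare equal but were built in a different order would end up in different groups. A possible variant is sketched below with a frozenset as the key. The function name group_names and the sample list are made up for illustration, and it still assumes every value in 'data' is hashable.

```python
from collections import defaultdict

def group_names(items):
    groups = defaultdict(list)
    for d in items:
        # A frozenset of (key, value) pairs ignores key insertion order.
        groups[frozenset(d['data'].items())].append(d['name'])
    # Dicts preserve insertion order, so groups appear in first-seen order.
    return [tuple(names) for names in groups.values() if len(names) > 1]

# Hypothetical sample: 'a' and 'b' hold the same data in a different key order
sample = [
    {'data': {'color': 'white', 'property': 'value1'}, 'name': 'a'},
    {'data': {'property': 'value1', 'color': 'white'}, 'name': 'b'},
    {'data': {'color': 'black'}, 'name': 'c'},
]
print(group_names(sample))  # [('a', 'b')]
```

This version returns tuples, which matches the result format shown in the question.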

Upvotes: 1
