Reputation: 551
I'd like to group by the values of the following dictionary:
my_dict = {"Q1": {0: "no", 1: "yes"}, "Q2": {0: "no", 1: "yes"},
           "Q3": {1: "animal", 2: "vehicle"}, "Q4": {1: "animal", 2: "vehicle"}}
The result should look like this:
result = {("Q1", "Q2"): {0: "no", 1: "yes"},
("Q3", "Q4"): {1: "animal", 2: "vehicle"}}
I've tried the solutions listed here: Grouping Python dictionary keys as a list and create a new dictionary with this list as a value
Using collections.defaultdict does not work, because the dictionaries I want to group by would end up as keys of the result dictionary, like this:
result = {{0: "no", 1: "yes"}: ["Q1", "Q2"] ,
{1: "animal", 2: "vehicle"}: ["Q3", "Q4"]}
Of course this does not work, because dictionary keys have to be hashable and a dict is mutable and therefore unhashable. So I would need something like a frozendict, which is not available in Python's standard library.
Using itertools.groupby does not work either, because it requires the data to be sorted, and sorting with operator.itemgetter as the key fails because dictionaries cannot be ordered. It says:
TypeError: '<' not supported between instances of 'dict' and 'dict'
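For reference, a minimal sketch that reproduces both failures described above. The two-entry my_dict is a shortened copy of the data in the question, and the names unhashable_error and ordering_error are only illustrative.

```python
from operator import itemgetter

# Shortened copy of the question's data (two entries are enough to fail).
my_dict = {"Q1": {0: "no", 1: "yes"}, "Q2": {0: "no", 1: "yes"}}

# A dict is mutable and therefore unhashable, so it cannot be a dict key.
try:
    {my_dict["Q1"]: ["Q1", "Q2"]}
    unhashable_error = None
except TypeError as exc:
    unhashable_error = str(exc)

# Dicts define no ordering, so sorting the items by their value fails too.
try:
    sorted(my_dict.items(), key=itemgetter(1))
    ordering_error = None
except TypeError as exc:
    ordering_error = str(exc)

print(unhashable_error)
print(ordering_error)
```

The exact wording of the first message can differ between Python versions, but both attempts raise TypeError.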
Therefore, I'd like to know a Pythonic way of solving this problem! Thank you for your help :)
Upvotes: 0
Views: 757
Reputation: 14216
Here is another way using both frozenset and groupby:
from operator import itemgetter
from itertools import groupby
first = itemgetter(0)
second = itemgetter(1)
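# Hash each inner dict's items (not just its keys), so that dicts with the
# same keys but different values do not end up in the same group.
my_hashes = sorted([(k, hash(frozenset(v.items()))) for k, v in my_dict.items()], key=second)
d = dict()
for k, v in groupby(my_hashes, key=second):
    items = list(v)
    d[tuple(map(first, items))] = my_dict.get(first(first(items)))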
print(d)
{('Q3', 'Q4'): {1: 'animal', 2: 'vehicle'}, ('Q1', 'Q2'): {0: 'no', 1: 'yes'}}
Upvotes: 0
Reputation: 26039
Assuming the dictionary is already ordered so that equal values are adjacent, you can use itertools.groupby:
{tuple(g): k for k, g in groupby(my_dict, key=my_dict.get)}
Code:
from itertools import groupby
my_dict = {"Q1": {0: "no", 1: "yes"}, "Q2": {0: "no", 1: "yes"},
"Q3": {1: "animal", 2: "vehicle"}, "Q4": {1: "animal", 2: "vehicle"}}
print({tuple(g): k for k, g in groupby(my_dict, key=my_dict.get)})
# {('Q1', 'Q2'): {0: 'no', 1: 'yes'}, ('Q3', 'Q4'): {1: 'animal', 2: 'vehicle'}}
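The adjacency assumption matters because groupby only merges consecutive items with equal keys. As a rough illustration, here is the same one-liner applied to a hypothetical reordering of the data (the name shuffled and the ordering are made up for this sketch):

```python
from itertools import groupby

# Hypothetical reordering of the question's data: equal values are no
# longer next to each other.
shuffled = {"Q1": {0: "no", 1: "yes"}, "Q3": {1: "animal", 2: "vehicle"},
            "Q2": {0: "no", 1: "yes"}, "Q4": {1: "animal", 2: "vehicle"}}

# groupby starts a new group every time the value changes, so each key
# ends up in a group of its own instead of being paired with its twin.
fragmented = {tuple(g): k for k, g in groupby(shuffled, key=shuffled.get)}
print(fragmented)
```

With this ordering the result has four single-key groups rather than two pairs, so the approach is only safe when the input order is known in advance.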
Upvotes: 3
Reputation: 15035
Instead of using frozendict, you can use frozensets of the dictionaries' items:
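from collections import defaultdict

intermediate_dict = defaultdict(list)
for k, v in my_dict.items():
    # A frozenset of (key, value) pairs is hashable, so it can be a dict key
    intermediate_dict[frozenset(v.items())].append(k)
result = {tuple(v): dict(k) for k, v in intermediate_dict.items()}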
Output:
{('Q1', 'Q2'): {0: 'no', 1: 'yes'}, ('Q3', 'Q4'): {1: 'animal', 2: 'vehicle'}}
The above does not assume or require sorted input, making it O(n) for all cases, while sorting is O(n log n).
Upvotes: 4
Reputation: 4689
So I would require something like a frozendict which is not available in the standard library of python.
Could you elaborate on this? While frozendict is not in the language standard, there's an extension available that you could install: https://pypi.org/project/frozendict/
Alternatively, you can turn the dictionaries into a tuple of (key-sorted) (key, value) items to get an immutable, canonical and reversible representation that can be used as a dictionary key.
(Note that if the dictionaries can have further mutable values inside them, you might need to do this recursively.)
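A minimal sketch of the sorted-tuple idea, assuming the inner dictionaries are flat and their keys can be compared with each other. The helper name as_key is just for illustration:

```python
from collections import defaultdict

my_dict = {"Q1": {0: "no", 1: "yes"}, "Q2": {0: "no", 1: "yes"},
           "Q3": {1: "animal", 2: "vehicle"}, "Q4": {1: "animal", 2: "vehicle"}}

def as_key(d):
    """Return a hashable, canonical key: the dict's items sorted by key."""
    return tuple(sorted(d.items()))

# Collect the outer keys under the canonical form of their inner dict.
groups = defaultdict(list)
for name, inner in my_dict.items():
    groups[as_key(inner)].append(name)

# dict(key) reverses as_key, turning the tuple of pairs back into a dict.
result = {tuple(names): dict(key) for key, names in groups.items()}
print(result)
# {('Q1', 'Q2'): {0: 'no', 1: 'yes'}, ('Q3', 'Q4'): {1: 'animal', 2: 'vehicle'}}
```

Sorting makes the key independent of insertion order, so {0: "no", 1: "yes"} and {1: "yes", 0: "no"} land in the same group.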
Edit: Or use a frozenset() for the items, as the other answer points out. Note that this also requires recursively ensuring the values of the inner dictionary are immutable.
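If the inner dictionaries can contain dicts, lists or sets themselves, the recursive step could look roughly like this. The function freeze and the sample data are hypothetical, and the conversion is one-way (a frozen dict and a frozen set of pairs look alike afterwards), so treat it as a sketch for building keys rather than a full frozendict replacement:

```python
def freeze(obj):
    """Recursively convert dicts, lists, tuples and sets into hashable values."""
    if isinstance(obj, dict):
        return frozenset((k, freeze(v)) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(v) for v in obj)
    if isinstance(obj, set):
        return frozenset(freeze(v) for v in obj)
    return obj  # assumed to be hashable already (str, int, None, ...)

# Two equal nested dicts, written in different insertion orders.
nested_a = {"labels": {0: "no", 1: "yes"}, "tags": ["x", "y"]}
nested_b = {"tags": ["x", "y"], "labels": {1: "yes", 0: "no"}}

# Equal dicts freeze to equal keys, so one can look up the other's entry.
lookup = {freeze(nested_a): "group 1"}
print(lookup[freeze(nested_b)])  # group 1
```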
Upvotes: 0