Reputation: 1341
In Python, I have the following dictionary of sets:
{
    1: {'Hello', 'Bye'},
    2: {'Bye', 'Do', 'Action'},
    3: {'Not', 'But', 'No'},
    4: {'No', 'Yes'}
}
My goal is to combine the keys whose sets contain matching values (in this example, "Bye" and "No"), so the result will look like this:
{
    1: {'Hello', 'Bye', 'Do', 'Action'},
    3: {'Not', 'But', 'No', 'Yes'}
}
Is there a way to do this?
Upvotes: 1
Views: 1087
Reputation: 180411
If there are overlapping matches and you want the longest matches:
from collections import defaultdict

d = {
    1: {'Hello', 'Bye'},
    2: {'Bye', 'Do', 'Action'},
    3: {'Not', 'But', 'No'},
    4: {'No', 'Yes'}
}

grp = defaultdict(list)
# first group all keys with common words
for k, v in d.items():
    for val in v:
        grp[val].append(k)

# sort the values by lengths to find longest matches.
for v in sorted(grp.values(), key=len, reverse=True):
    for val in v[1:]:
        if val not in d:
            continue
        # use first ele as the key and union to existing values
        d[v[0]] |= d[val]
        del d[val]

print(d)
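To see what the grouping step produces, here is a small sketch that builds the same word-to-keys index and keeps only the words shared by more than one key. The names `index` and `shared` are illustrative and not part of the code above; the key order inside each list assumes Python 3.7+, where dicts preserve insertion order.

```python
from collections import defaultdict

d = {
    1: {'Hello', 'Bye'},
    2: {'Bye', 'Do', 'Action'},
    3: {'Not', 'But', 'No'},
    4: {'No', 'Yes'},
}

# Invert the mapping: each word points to the keys whose set contains it.
index = defaultdict(list)
for key, words in d.items():
    for word in words:
        index[word].append(key)

# Only words that occur under more than one key trigger a merge.
shared = {word: keys for word, keys in index.items() if len(keys) > 1}
print(shared)
```

Here only 'Bye' (keys 1 and 2) and 'No' (keys 3 and 4) end up in `shared`, which is why the merge step combines exactly those two pairs.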
If you don't have overlaps, you can skip the sorting and the membership check:
grp = defaultdict(list)
for k, v in d.items():
    for val in v:
        grp[val].append(k)

for v in grp.values():
    for val in v[1:]:
        d[v[0]] |= d[val]
        del d[val]
Or if you want a new dict:
new_d = {}
for v in grp.values():
    if len(v) > 1:
        k = v[0]
        # copy the set on first use so the sets stored in d are not mutated
        new_d.setdefault(k, set(d[k]))
        for val in v[1:]:
            new_d[k] |= d[val]
All three give you the following result for the example data, though the order of keys and of set elements may differ:
{1: set(['Action', 'Do', 'Bye', 'Hello']), 3: set(['Not', 'Yes', 'But', 'No'])}
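The snippets above merge keys that directly share a word. If overlaps can chain (key 1 shares a word with key 2, and key 2 shares a different word with key 3), one possible way to handle that is a union-find pass over the keys. The following is only a sketch under that assumption: `merge_overlapping` is a hypothetical helper name, it returns a new dict without touching the input, and it stores each group under its smallest key, which assumes the keys are mutually comparable.

```python
def merge_overlapping(d):
    """Return a new dict in which keys whose sets share any value are merged.

    Each merged group is stored under its smallest key (an arbitrary choice
    for this sketch; it requires the keys to be comparable).
    """
    parent = {k: k for k in d}

    def find(k):
        # Walk up to the group's representative, shortening the path as we go.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    owner = {}  # value -> first key seen that contains it
    for k, values in d.items():
        for val in values:
            if val in owner:
                a, b = find(owner[val]), find(k)
                if a != b:
                    # Attach the larger representative to the smaller one.
                    parent[max(a, b)] = min(a, b)
            else:
                owner[val] = k

    merged = {}
    for k, values in d.items():
        merged.setdefault(find(k), set()).update(values)
    return merged


d = {
    1: {'Hello', 'Bye'},
    2: {'Bye', 'Do', 'Action'},
    3: {'Not', 'But', 'No'},
    4: {'No', 'Yes'},
}
print(merge_overlapping(d))
```

For the example data this yields the same two groups under keys 1 and 3. For a chained input such as `{1: {'a', 'b'}, 2: {'b', 'c'}, 3: {'c', 'd'}}` it collapses everything into a single group under key 1.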
Upvotes: 2
Reputation: 23484
If there are no overlapping matches:
a = {1: {'Hello', 'Bye'}, 2: {'Bye', 'Do', 'Action'}, 3: {'Not', 'But', 'No'}, 4: {'No', 'Yes'}}
output = {}
for k, v in a.items():
    if output:
        for k_o, v_o in output.items():
            if v_o.intersection(v):
                output[k_o].update(v)
                break
        else:
            output[k] = v
    else:
        output[k] = v
print(output)
Output:
{1: {'Action', 'Bye', 'Do', 'Hello'}, 3: {'But', 'No', 'Not', 'Yes'}}
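Note that the loop above stores the original set objects in `output`, so `update` also changes the sets inside `a`. If that matters, a variant along these lines copies each set before storing it. This is a sketch only: `merge_first_match` is a hypothetical name, and it keeps the same first-match strategy, so it still assumes each new set overlaps at most one existing group. It also relies on Python 3.7+ dict ordering for which key ends up as the group key.

```python
def merge_first_match(a):
    """Merge each set into the first existing group it overlaps with.

    Returns a new dict and leaves the sets in `a` untouched.
    """
    output = {}
    for k, v in a.items():
        for merged in output.values():
            if merged & v:
                # Shares at least one value: fold v into this group.
                merged |= v
                break
        else:
            # No overlap with any group so far: start a new one from a copy.
            output[k] = set(v)
    return output


a = {1: {'Hello', 'Bye'}, 2: {'Bye', 'Do', 'Action'}, 3: {'Not', 'But', 'No'}, 4: {'No', 'Yes'}}
print(merge_first_match(a))
```

The `for ... else` already covers the empty-`output` case, because the `else` branch runs whenever the loop finishes without a `break`, including when there is nothing to iterate over.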
Upvotes: 1