Reputation: 358
As per title, given n sets like the following:
set1 = {"aa","ab","ba","cc","ca"},
set2 = {"fr","fa","po","pl","cc"},
set3 = {"fr","xz","hn","fa"},
set4 = {"jq","we","hn","ca","aa","fk"},
set5 = {"jp","wx","we","fr","ba"}
I want to get the exclusively intersections between them, conceptually like a Venn Diagram representation. So, for example:
The intersection between the set2
and set3
, despite sharing {"fr","fa"}
, will only be {"fa"}
, as the string "fr"
is also present in the intersection between set2
,set3
and set5
.
The brilliant solution proposed in this answer managed correctly all the permutations of given sets, but didn't manage to account for exclusivity, considering each overlap independently. So, in the previous example, it'll return {"fr","fa"}
I tried to handle this situation in a very slow way, and want to know if there's a more proficient way to do the job. By creating a list of sets
set_list = [set1, set2, set3, set4, set5]
I start from finding the whole sets intersection
whole_intersect = set.intersections(*map(set,set_list))
Then I'm moving in every permutation of length n-1
using list indices
isect_1_2_3_4 = set.intersections(*map(set, map(set_list.__getitem__, [0,1,2,3])) - whole_intersect
isect_1_2_3_5 = set.intersections(*map(set, map(set_list.__getitem__, [0,1,2,5])) - whole_intersect
..
Then by n-2
isect_1_2_3 = set.intersections(*map(set, map(set_list.__getitem__, [0,1,2])) - whole_intersect - isect_1_2_3_4 - isect_1_2_3_5
..
Till filling intersections of 2 sets
isect_1_2 = set.intersections(*map(set, map(set_list.__getitem__, [0,1,2])) - whole_intersect - isect_1_2_3_4 - isect_1_2_3_5 - isect_1_2_3 - isect_1_2_4 - isect_1_2_5
..
As expectable, this approach is a true pain. How could I manage to do it in a less handcrafted and pythonic way?
Upvotes: 1
Views: 508
Reputation: 6369
I think I have done it by first making a dictionary of all of the intersections, of every combination, and then finding the unique intersection by taking the intersection and subtracting the union of every other intersection:
import itertools
set1 = {"aa","ab","ba","cc","ca"}
set2 = {"fr","fa","po","pl","cc"}
set3 = {"fr","xz","hn","fa"}
set4 = {"jq","we","hn","ca","aa","fk"}
set5 = {"jp","wx","we","fr","ba"}
sets = {
1: set1,
2: set2,
3: set3,
4: set4,
5: set5
}
intersections = {}
for n_combinations in range(2, len(sets) + 1):
tmp = list(map(dict, itertools.combinations(sets.items(), n_combinations)))
tmp = {tuple(x.keys()):set.intersection(*list(x.values())) for x in tmp}
intersections.update(tmp)
unique_in_intersection = {}
for n_combinations in range(2, len(sets)+1):
for lookup_set in itertools.combinations(range(1, len(sets)+1), n_combinations):
s1_intersection_s2 = intersections[lookup_set]
union_other_intersections = set.union(*[v for k, v in intersections.items() if k != lookup_set and len(k) > len(lookup_set)])
unique_in_intersection[lookup_set] = s1_intersection_s2 - union_other_intersections
result:
{
(1, 2): {'cc'},
(1, 3): set(),
(1, 4): {'ca', 'aa'},
(1, 5): {'ba'},
(2, 3): {'fa'},
(2, 4): set(),
(2, 5): set(),
(3, 4): {'hn'},
(3, 5): set(),
(4, 5): {'we'},
(1, 2, 3): set(),
(1, 2, 4): set(),
(1, 2, 5): set(),
(1, 3, 4): set(),
(1, 3, 5): set(),
(1, 4, 5): set(),
(2, 3, 4): set(),
(2, 3, 5): {'fr'},
(2, 4, 5): set(),
(3, 4, 5): set(),
(1, 2, 3, 4): set(),
(1, 2, 3, 5): set(),
(1, 2, 4, 5): set(),
(1, 3, 4, 5): set(),
(2, 3, 4, 5): set(),
(1, 2, 3, 4, 5): set()
}
Upvotes: 1