Shred
Shred

Reputation: 358

Find all possible exclusively intersection of multiple sets python

As per title, given n sets like the following:

set1 = {"aa","ab","ba","cc","ca"},
set2 = {"fr","fa","po","pl","cc"},
set3 = {"fr","xz","hn","fa"},
set4 = {"jq","we","hn","ca","aa","fk"},
set5 = {"jp","wx","we","fr","ba"}

I want to get the exclusively intersections between them, conceptually like a Venn Diagram representation. So, for example:


The intersection between the set2 and set3, despite sharing {"fr","fa"}, will only be {"fa"}, as the string "fr" is also present in the intersection between set2,set3 and set5.


The brilliant solution proposed in this answer managed correctly all the permutations of given sets, but didn't manage to account for exclusivity, considering each overlap independently. So, in the previous example, it'll return {"fr","fa"}

Tryout

I tried to handle this situation in a very slow way, and want to know if there's a more proficient way to do the job. By creating a list of sets

set_list = [set1, set2, set3, set4, set5]

I start from finding the whole sets intersection

whole_intersect = set.intersections(*map(set,set_list))

Then I'm moving in every permutation of length n-1 using list indices

isect_1_2_3_4 = set.intersections(*map(set, map(set_list.__getitem__, [0,1,2,3])) - whole_intersect
isect_1_2_3_5 = set.intersections(*map(set, map(set_list.__getitem__, [0,1,2,5])) - whole_intersect
..

Then by n-2

isect_1_2_3 = set.intersections(*map(set, map(set_list.__getitem__, [0,1,2])) - whole_intersect - isect_1_2_3_4 - isect_1_2_3_5
..

Till filling intersections of 2 sets

isect_1_2 = set.intersections(*map(set, map(set_list.__getitem__, [0,1,2])) - whole_intersect - isect_1_2_3_4 - isect_1_2_3_5 - isect_1_2_3 - isect_1_2_4 - isect_1_2_5
..

As expectable, this approach is a true pain. How could I manage to do it in a less handcrafted and pythonic way?

Upvotes: 1

Views: 508

Answers (1)

Tom McLean
Tom McLean

Reputation: 6369

I think I have done it by first making a dictionary of all of the intersections, of every combination, and then finding the unique intersection by taking the intersection and subtracting the union of every other intersection:

import itertools

set1 = {"aa","ab","ba","cc","ca"}
set2 = {"fr","fa","po","pl","cc"}
set3 = {"fr","xz","hn","fa"}
set4 = {"jq","we","hn","ca","aa","fk"}
set5 = {"jp","wx","we","fr","ba"}
sets = {
    1: set1,
    2: set2,
    3: set3,
    4: set4,
    5: set5
}

intersections = {}
for n_combinations in range(2, len(sets) + 1):
    tmp = list(map(dict, itertools.combinations(sets.items(), n_combinations)))
    tmp = {tuple(x.keys()):set.intersection(*list(x.values())) for x in tmp}
    intersections.update(tmp)

unique_in_intersection = {}
for n_combinations in range(2, len(sets)+1):
    for lookup_set in itertools.combinations(range(1, len(sets)+1), n_combinations):
        s1_intersection_s2 = intersections[lookup_set]
        union_other_intersections = set.union(*[v for k, v in intersections.items() if k != lookup_set and len(k) > len(lookup_set)])
        unique_in_intersection[lookup_set] = s1_intersection_s2 - union_other_intersections

result:

{
    (1, 2): {'cc'},
    (1, 3): set(),
    (1, 4): {'ca', 'aa'},
    (1, 5): {'ba'},
    (2, 3): {'fa'},
    (2, 4): set(),
    (2, 5): set(),
    (3, 4): {'hn'},
    (3, 5): set(),
    (4, 5): {'we'},
    (1, 2, 3): set(),
    (1, 2, 4): set(),
    (1, 2, 5): set(),
    (1, 3, 4): set(),
    (1, 3, 5): set(),
    (1, 4, 5): set(),
    (2, 3, 4): set(),
    (2, 3, 5): {'fr'},
    (2, 4, 5): set(),
    (3, 4, 5): set(),
    (1, 2, 3, 4): set(),
    (1, 2, 3, 5): set(),
    (1, 2, 4, 5): set(),
    (1, 3, 4, 5): set(),
    (2, 3, 4, 5): set(),
    (1, 2, 3, 4, 5): set()
}

Upvotes: 1

Related Questions