patha
patha

Reputation: 75

An efficient way of finding pairs of sets in a set of sets

I have a set of sets 𝐴: {𝐴1,𝐴2,𝐴3...𝐴𝑛}, and I want to find all pairs (𝐴𝑖, 𝐴𝑗) such that 𝐴𝑖∩𝐴𝑗≠∅.

The only way I could come up with was repeating condition check Σ𝑛 time.

If I can represent this with a python-ish pseudo code:

from copy import deepcopy

setA = {A1, A2, A3, A4....}
setACopy = deepcopy(setA)

intersectingPairs = set()

for Ai in setA:
    for Aj in setACopy:
        if isIntersect(Ai, Aj):
            intersectingPairs.add((Ai, Aj))
    setACopy.remove(Ai)

I expect this to be very time consuming as the size of setA increases.

Is there a better algorithm that I can refer to?

Upvotes: 2

Views: 170

Answers (1)

Kenny Ostrom
Kenny Ostrom

Reputation: 5871

Rather than checking each pair individually, you can linearly parse the data to build a lookup table. If the connections are sparse, this can pretty quickly yield the unique pairs you wanted.

from itertools import combinations
from collections import defaultdict
from pprint import pprint

A1 = {1, 2, 3}
A2 = {2, 4, 6}
A3 = {1, 3, 5}
A4 = {2, 3, 5}
A5 = {8, 9}

# Change them to tuples so they are hashable
superset = set((tuple(subset) for subset in [A1, A2, A3, A4, A5]))

# build families of all sets with a specific element
lookup = defaultdict(list)
for subset in superset:
    for element in subset:
        lookup[element].append(subset)

# If the families are small, brute force from there
pairs = set()
for __, family in lookup.items():
    for result in combinations(family, 2):
        pairs.add(result)

pprint (pairs)

If the connections are not sparse, this might even be worse than looking at every pair. But if they are sparse, it can be really fast.

Upvotes: 2

Related Questions