Sim
Sim

Reputation: 13528

Computing unique intersections of subsets

Given a set S = {si : {zj : z ∈ N} }, what is a time-efficient algorithm for computing the unique sets of intersections of the subsets of S?

For background, I am dealing with several versions of this problem, some larger than others. In the smallest one |S| ≈ 1,000, |si| ≈ 10,000 and the values are zip codes.

Tiny example for clarity:

Input: S = {{},{1},{2,3},{3,4,5,6,7,8,9,10}}
Output: {{},{1},{2,3},{3,4,5,6,7,8,9,10},{3}}

|S| = 4 and there are 24 = 16 subsets of S. However, there are only five unique sets of subset intersections. The first four are the members of S themselves. The fifth is {3}. The empty set is already a member of S. All other 10 subset intersections produce empty sets also.

Upvotes: 3

Views: 270

Answers (2)

David Eisenstat
David Eisenstat

Reputation: 65437

Here's a fast preprocessing step that might be worthwhile.

Call elements x and y equivalent if, for every set si, either both or neither of x and y belong to si. Simplify the problem by deleting all elements except one representative of each equivalence class. At the end, expand each representative to its class.

To simplify, scan sets one by one while maintaining a map from each element to a label for its equivalence class, where equivalence is determined with respect to the sets processed so far. Initially, all elements are in the same class, so this map sends each element to the same label. To process a set, initialize an empty map of new labels. For each element x in the set, let k be x's current label. If the key k exists in the new label map, then the value corresponding to k becomes x's new label. Otherwise, we allocate a new label and assign it to x and add a mapping from k to the new label.

Here's the execution on your input.

  1. Initially label = {1: a, 2: a, 3: a, 4: a, 5: a, 6: a, 7: a, 8: a, 9: a, 10: a}.
  2. Scan {}. Nothing happens.
  3. Scan {1}. Change label[1] to b.
  4. Scan {2, 3}. Change label[2] and label[3] to c.
  5. Scan {3, 4, 5, 6, 7, 8, 9, 10}. Change label[3] to d and label[4..10] to e.
  6. At the end, label = {1: b, 2: c, 3: d, 4: e, 5: e, 6: e, 7: e, 8: e, 9: e, 10: e}. Select 1 (b) and 2 (c) and 3 (d) and 4 (e) to represent their classes. All others are deleted.

Upvotes: 2

Khaled.K
Khaled.K

Reputation: 5940

Asymptotic Time Complexity:

n: number of sets, which will change during execution

m: average set size

Time: T(Search-Matching-Sets) x T(Intersection) x SUM { k } for k: 1..n

Time: 2m x 2m x 1/2 n(n+1)

Time: O(4 m^2 x 1/2 n x (n+1)) ~ O(n^2 m^2)

Proposed Algorithm:

FOR i: 1..n
{

    FOR j: i..n
    {

        IF i = j THEN SKIP

        INTERSECTION := FIND-INTERSECTION ( SETS[i], SETS[j] )

        IF INTERSECTION ⊄ SETS[] THEN ADD INTERSECTION TO SETS[]

    }

}

Upvotes: 0

Related Questions