Reputation: 13528
Given a set S = {si : {zj : z ∈ N} }, what is a time-efficient algorithm for computing the unique sets of intersections of the subsets of S?
For background, I am dealing with several versions of this problem, some larger than others. In the smallest one |S| ≈ 1,000, |si| ≈ 10,000 and the values are zip codes.
Tiny example for clarity:
Input: S = {{},{1},{2,3},{3,4,5,6,7,8,9,10}} Output: {{},{1},{2,3},{3,4,5,6,7,8,9,10},{3}}
|S| = 4 and there are 24 = 16 subsets of S. However, there are only five unique sets of subset intersections. The first four are the members of S themselves. The fifth is {3}. The empty set is already a member of S. All other 10 subset intersections produce empty sets also.
Upvotes: 3
Views: 270
Reputation: 65437
Here's a fast preprocessing step that might be worthwhile.
Call elements x and y equivalent if, for every set si, either both or neither of x and y belong to si. Simplify the problem by deleting all elements except one representative of each equivalence class. At the end, expand each representative to its class.
To simplify, scan sets one by one while maintaining a map from each element to a label for its equivalence class, where equivalence is determined with respect to the sets processed so far. Initially, all elements are in the same class, so this map sends each element to the same label. To process a set, initialize an empty map of new labels. For each element x in the set, let k be x's current label. If the key k exists in the new label map, then the value corresponding to k becomes x's new label. Otherwise, we allocate a new label and assign it to x and add a mapping from k to the new label.
Here's the execution on your input.
Upvotes: 2
Reputation: 5940
Asymptotic Time Complexity:
n: number of sets, which will change during execution
m: average set size
Time: T(Search-Matching-Sets) x T(Intersection) x SUM { k } for k: 1..n
Time: 2m x 2m x 1/2 n(n+1)
Time: O(4 m^2 x 1/2 n x (n+1)) ~ O(n^2 m^2)
Proposed Algorithm:
FOR i: 1..n
{
FOR j: i..n
{
IF i = j THEN SKIP
INTERSECTION := FIND-INTERSECTION ( SETS[i], SETS[j] )
IF INTERSECTION ⊄ SETS[] THEN ADD INTERSECTION TO SETS[]
}
}
Upvotes: 0