Reputation: 697
We have N sets, each having N integers.
We take every pair of sets, and find their intersection set, S.
Now, we are interested to find the cardinality of the intersection set S that has maximum cardinality.
Example
For example, let N be 4 and we have 4 sets each of 4 elements:
A= {1,2,5,6}, B= {2,5,7,6}, C= {3,4,2,6}, D= {1,4,7,8}
Now we take pairwise intersection of these sets and an intersection set having maximum cardinality= {2,5,6}
So, we return 3.
A brute force solution can be done with Θ(N3) time. Can we do it more efficiently with an other method ?
Upvotes: 4
Views: 430
Reputation: 76297
The following can reduce the running time if:
There is some α > 0 such that (for fixed α and increasing n), there are two sets with at least an α fraction of overlap.
Many pairs are disjoint or nearly disjoint (made more precise later).
You are willing to use a high-probability Monte-Carlo algorithm (with probability of error decreasing with n).
Suppose you first transform each set into a hash table. This takes (expected) Θ(n2) time. Now for each pair of sets (i.e., an n2 loop), select √n random elements from the first set, and count the number of hits in the second set (expected O(√ n) time).
By a combination of the multiplicative Chernoff bound and the union bound, you can retain only pairs for which the number of hits was at least 1/2 α √ n.
So, with high probability, you can narrow down n2 pairs to some β n2 pairs, with time only O(n2.5). From here on continue with brute force. This is either useful or not depending on how β grows as a function as n.
Upvotes: 2