Reputation: 1018
First of all: no, this is not homework! This fits into a solver I'm writing for a game. I just reduced the problem to this concise statement:
Given a set of sets S, find and remove any elements of S which are subsets of other elements of S.
Domain
1 <= |S| <= C^K
1 <= |Si| <= K
2 <= C <= 10
10 <= K <= 500
Details
Si is a subset of [0, K)
min(|Si|) >= logC(|S|)
My current approach is to keep each set inside S sorted via what I call a "NatSet" which is simply a bool[K]
. I then sort S by |Si| and do a O(|S|^2) search to find elements which are subsets of other elements. This, unfortunately, is too slow for my target values C=6 and K=16*9.
Upvotes: 3
Views: 910
Reputation: 6158
Considering I can't try it (the input of set S
is invisible for me) and decide whether the following are helpful, I can only call them some tips for you:
When comparing two set: 'diminished' binary search
: when you compare whether set A (x1...xn) is the subset of set B( y1...ym ), suppose you find yk = x1
(n <= m)
y1, y2, ...,yk, yk+1 ... ym
|
x1
then you can search x2
in the range of [yk+1, ym]
. And the other are the same.
s1
and sk
, then s1
and sk-1
), you said you sort it by size and the large one is more likely to contain it.S
can improve your performance and you can try without sorting or using a max-heap
(see tip 4)If leaf doesn't removed (ie, isn't the subset of it's father), just remove it from heap and you can swap it to the place where the leaf is removed. (as following show)
Notice: Using heap can't guarantee all subset are removed.
More:
1.And you said you are pruning tree, do you want Alpha–beta pruning ?
2.Efficient list intersection algorithm
Hope this can be helpful.
Upvotes: 2
Reputation: 511
The inherent issue here is that given two sets, one with p elements and the other with q elements (without loss of generality, we can say p < q), the fastest way you can get a Union of the two is O(p). This is because if you use HashSets you can reduce your lookup to O(1).
As I understand it, you are doing an O(K^2) search so you get a total run of O((n^2)*(K^2)), where n is the number of elements of S. You could reduce it to O((n^2)K) using the HashSet approach. Here's a link to Java http://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html and almost all major languages have some implementation in their standard libraries. Although if your set S is anywhere near n= 6^(16*9) then I don't think this will help much (I'm not even sure if you can fit any relevant data in modern computers of that size).
If you could drive k down to <= 64 (or 32 in 32 bit systems) you could use bit maps to store each set inside S and parallel bit operations to reduce your time down to O(n^2). It would be equivalent to your current approach but you would just do a bitwise or and check if it equals the same number : ((a1 | a2) == a1) || ((a1 | a2) == a2). This way you get a union operation in constant time.
Upvotes: 0