Reputation: 2246
Recently I had to develop a naive recursive algorithm for the set problem outlined below, but would now like to know what the formal description/name of the problem is and whether there's an algorithm that allows the problem to be solved more efficiently (I suspect there is).
I know there are separate algorithms for finding intersects and disjoints, but I haven't recognized any that cover this problem as a whole.
The problem:
To take an indefinite number of sets and return one set for each intersect and non-intersect of all the sets.
For example, given 4 sets as input A,B,C,D:
A {1,14,2,10,13,12,8,9}
B {2,3,4,15,11,13,9}
C {13,10,4,15,5,6,12,11}
D {7,8,9,13,11,6,12}
The result should be the following 13 sets:
{1,14},{2},{3},{4,15},{5},{6},{7},{8},{9},{10},{11},{12},{13}
The algorithm I developed was naive in that it compared all set permutations recursively (I wanted to be able to implement this relatively easily in XSLT 2.0) until no further intersects were found.
Some context, I needed to perform this operation to describe how multiple versions of a tree modify the original tree, at each point on the tree. The 'tree' is actually an XML document.
[Update] Showing the 1st algorithm in the accepted answer run against the 3 sets: {1,2,3},{2,3,4},{2,3,5}
Compared to my original nieve algorithm run over the 3 sets:
Upvotes: 2
Views: 244
Reputation: 11938
The problem becomes quite easy, once some set property are clear.
A int B int C = ( A int B ) int C
what this tells us is that we can solve the problem for 2 sets, and then intersect every result with the the third set, and so on.
By solving hte problem, I mean simply that, A int B
generates 3 sets:
A op B => A int B , A - (A int B) , B - (A int B)
once this is clear, the implementation comes simply, and should look like:
resultset = emptyset
while ( queue not empty )
tmpset = emptyset
A = queue.pop()
residual = A
for B in resultset
tmpset = tmpset + {A int B , B - (A int B)}
residual = residual - B
tmpset = tempset + residual
resultset = tmpset
obviously if any of the set is empty, it doesn't need to be added to the solution. This is some C# code that solves the problem using the dictionary idea:
var query = arr.Select((x, i) => x.Select(y => new { Elem = y, Index = i }))
.SelectMany(x => x)
.GroupBy(x => x.Elem)
.ToDictionary(x => x.Key, y => new HashSet<int>(y.Select(z => z.Index).ToList()))
.GroupBy(x => x.Value, HashSet<int>.CreateSetComparer())
.Select(x=>x.Select(y=>y.Key).ToList())
.ToList();
Upvotes: 1