genichm

Reputation: 535

Finding overlapping data in arrays

We are writing a C# application that will help remove unnecessary data repeaters. A repeater can be removed only if all the data it receives is also received by other repeaters. The first step we need is explained below:

I have a collection of int arrays, for example:

a. {1, 2, 3, 4, 5}

b. {2, 4, 6, 7}

c. {1, 3, 5, 8, 11, 100}

There may be thousands of such arrays. I need to find the arrays that can be removed. An array can be removed only if all of its numbers are included in other arrays. In the example above, array a can be removed because its numbers 2 and 4 are in array b and its numbers 1, 3, and 5 are in array c.

What is the best way to do this?

Upvotes: 10

Views: 336

Answers (2)

David Eisenstat

Reputation: 65506

Getting the minimum number of remaining arrays (as opposed to a subset of arrays from which no more arrays can be removed) is the NP-hard set cover problem. Even with thousands of arrays, however, there's a good chance that a mixed integer programming solver, applied to the formulation in the linked Wikipedia article, will be able to find the optimal solution.
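To make the set cover framing concrete, here is a minimal sketch in Python (not the asker's C#) that finds a smallest subcollection of arrays whose union still contains every number. It uses exhaustive search as a stand-in for the solver mentioned above, so it is only usable on very small inputs. The function name `minimum_cover` is hypothetical.

```python
from itertools import combinations


def minimum_cover(arrays):
    """Return indices of a smallest subcollection covering every number.

    Exhaustive search: exponential in the number of arrays, so this is an
    illustration of the problem, not a substitute for a MIP solver.
    """
    sets = [set(a) for a in arrays]
    universe = set().union(*sets)  # every number that must stay covered
    # Try subcollections of increasing size; the first hit is a minimum.
    for k in range(len(sets) + 1):
        for combo in combinations(range(len(sets)), k):
            covered = set().union(*(sets[i] for i in combo))
            if covered == universe:
                return list(combo)


arrays = [
    [1, 2, 3, 4, 5],          # a
    [2, 4, 6, 7],             # b
    [1, 3, 5, 8, 11, 100],    # c
]
print(minimum_cover(arrays))  # indices of the arrays to keep
```

For the example in the question this keeps b and c (indices 1 and 2): 6 and 7 occur only in b, and 8, 11, and 100 occur only in c.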

Upvotes: 4

Ali Sepehri.Kh

Reputation: 2508

This solution is not guaranteed to leave the minimal number of arrays.

Build a dictionary that maps each number to its abundance, i.e. the number of arrays it appears in. For example:

1 => 2
2 => 2
3 => 2
4 => 2
5 => 2
6 => 1
7 => 1
...

Check each array in turn: if the abundance of every one of its members is greater than 1, remove the array and decrement the count of each of its numbers in the dictionary.
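The counting pass described above could be sketched roughly as follows. This is in Python rather than C#; a `Dictionary<int, int>` and `HashSet<int>` would play the same roles there. The function name `remove_redundant` is hypothetical, and the sketch assumes a number repeated inside one array should count only once for that array.

```python
from collections import Counter


def remove_redundant(arrays):
    """Return (kept, removed) index lists after one counting pass."""
    # Abundance dictionary: in how many arrays does each number appear?
    # set(arr) ensures duplicates inside one array are counted once.
    counts = Counter(n for arr in arrays for n in set(arr))
    kept, removed = [], []
    for i, arr in enumerate(arrays):
        members = set(arr)
        if all(counts[n] > 1 for n in members):
            # Every member also lives in another remaining array: drop
            # this array and update the abundances.
            for n in members:
                counts[n] -= 1
            removed.append(i)
        else:
            kept.append(i)
    return kept, removed


arrays = [
    [1, 2, 3, 4, 5],          # a
    [2, 4, 6, 7],             # b
    [1, 3, 5, 8, 11, 100],    # c
]
kept, removed = remove_redundant(arrays)
print(kept, removed)  # [1, 2] [0]
```

The result depends on the order in which the arrays are visited. For {1, 2}, {1}, {2}, this pass removes the first array and keeps the other two, although keeping only the first would be enough.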

Upvotes: 4
