Ben Lesh
Ben Lesh

Reputation: 108491

Tricky algorithm... finding multiple combinations of subsets within nested HashSets?

I have a problem where I have to find multiple combinations of subsets within nested hashsets. Basically I have a "master" nested HashSet, and from a collection of "possible" nested HashSets I have to programmatically find the "possibles" that could be simultaneous subsets of the "master".

Lets say I have the following:

            var master = new HashSet<HashSet<string>>(new HashSet<string>[] {
                        new HashSet<string>( new string[] { "A", "B", "C"}),
                        new HashSet<string>( new string[] { "D", "E"}),
                        new HashSet<string>( new string[] { "F"})
                    }
                ); 
            var possible1  = new HashSet<HashSet<string>>(new HashSet<string>[] {
                        new HashSet<string>( new string[] { "A", "B", "C"}),
                        new HashSet<string>( new string[] { "F"})
                    }
                 );
            var possible2 = new HashSet<HashSet<string>>(new HashSet<string>[] {
                        new HashSet<string>( new string[] { "D", "E"})
                    }
                 ); 
            var possible3 = new HashSet<HashSet<string>>(new HashSet<string>[] {
                        new HashSet<string>( new string[] { "F"})
                    }
                 ); 
            var possible4 = new HashSet<HashSet<string>>(new HashSet<string>[] {
                        new HashSet<string>( new string[] { "X", "Y", "Z"})
                    }
                ); 
            var possible5 = new HashSet<HashSet<string>>(new HashSet<string>[] {
                        new HashSet<string>( new string[] { "A", "B" }),
                        new HashSet<string>( new string[] { "D", "E"})
                    }
                ); 

The output I should get from my algorithm should be as follows:

All possible combination subsets:

possible1 and possible2
possible3 and possible5
possible2 and possible3
possible1
possible2
possible3
possible5

I'm trying to figure out the best way to approach this. There is, of course, the brute force option, but I'm trying to avoid that if I can.

I just hope my question was clear enough.

EDIT

To further elaborate on what constitutes a subset, here are some examples, given the master {{"A","B","C"},{"C","D","E",F"},{"X","Y","Z"}} :

Basically each child set needs to be a subset of a corresponding child in the master.

Upvotes: 1

Views: 747

Answers (2)

Jaroslav Jandek
Jaroslav Jandek

Reputation: 9563

Use bruteforce:

public static int IsCsInMaster(HashSet<string> childSubset, List<HashSet<string>> master, int startIndex)
{
    for (int i = startIndex; i < master.Count; i++)
        if (childSubset.IsSubsetOf(master[i])) return i;

    return -1;
}

public static bool IsChildInMaster(List<HashSet<string>> child, List<HashSet<string>> master)
{
    foreach (var childSubset in child) if (IsCsInMaster(childSubset, master, 0) == -1) return false;

    return true;
}

public static bool IsChildInMasterMulti(List<HashSet<string>> child, List<HashSet<string>> master)
{
    Dictionary<int, int> subsetChecker = new Dictionary<int, int>();
    List<IEnumerable<int>> multiMatches = new List<IEnumerable<int>>();
    int subsetIndex;

    // Check for matching subsets.
    for (int i = 0; i < child.Count; i++)
    {
        subsetIndex = 0;
        List<int> indexes = new List<int>();

        while ((subsetIndex = IsCsInMaster(child[i], master, subsetIndex)) != -1)
        {
            indexes.Add(subsetIndex++);
        }

        if (indexes.Count == 1)
        {
            subsetIndex = indexes[0];
            if (subsetChecker.ContainsKey(subsetIndex)) return false;
            else subsetChecker[subsetIndex] = subsetIndex;
        }
        else
        {
            multiMatches.Add(indexes);
        }
    }

    /*** Check for multi-matching subsets. ***/ //got lazy ;)
    var union = multiMatches.Aggregate((aggr, indexes) => aggr.Union(indexes));

    // Filter the union so only unmatched subset indexes remain.
    List<int> filteredUion = new List<int>();
    foreach (int index in union)
    {
        if (!subsetChecker.ContainsKey(index)) filteredUion.Add(index);
    }

    return (filteredUion.Count >= multiMatches.Count);
}

And in code:

IsChildInMasterMulti(possible2, master)

The code does not handle the {{"A","B"},{"A","B"}} case, though. That is a LOT more difficult (flagging used subsets in master, maybe even individual elements - recursively).

Edit2: The third method handles the {{"A","B"},{"A","B"}} case as well (and more).

Upvotes: 1

Jay
Jay

Reputation: 14471

Use the simplest solution possible.

Keep in mind that if someone else has to look at your code they should be able to understand what it's doing with as little effort as possible. I already found it hard to understand from your description what you want to do and I haven't had to read code yet.

If you find that it's too slow after it's working optimize it then.

If possible write unit tests. Unit tests will ensure that your optimized solution is also working correctly and will help others ensure their changes don't break anything.

Upvotes: 0

Related Questions