Reputation: 16077
Suppose I have two lists, how do I iterate through every possible combination of every sublist, such that each item appears once and only once.
I guess an example could be if you have employees and jobs and you want split them into teams, where each employee can only be in one team and each job can only be in one team. Eg
List<string> employees = new List<string>() { "Adam", "Bob"} ;
List<string> jobs = new List<string>() { "1", "2", "3"};
I want
Adam : 1
Bob : 2 , 3
Adam : 1 , 2
Bob : 3
Adam : 1 , 3
Bob : 2
Adam : 2
Bob : 1 , 3
Adam : 2 , 3
Bob : 1
Adam : 3
Bob : 1 , 2
Adam, Bob : 1, 2, 3
I tried using the answer to this stackoverflow question to generate a list of every possible combination of employees and every possible combination of jobs and then select one item from each from each list, but that's about as far as I got.
I don't know the maximum size of the lists, but it would be certainly be less than 100 and there may be other limiting factors (such as each team can have no more than 5 employees)
Update
Not sure whether this can be tidied up more and/or simplified, but this is what I have ended up with so far.
It uses the Group algorithm supplied by Yorye (see his answer below), but I removed the orderby which I don't need and caused problems if the keys are not comparable.
var employees = new List<string>() { "Adam", "Bob" } ;
var jobs = new List<string>() { "1", "2", "3" };
int c= 0;
foreach (int noOfTeams in Enumerable.Range(1, employees.Count))
{
var hs = new HashSet<string>();
foreach( var grouping in Group(Enumerable.Range(1, noOfTeams).ToList(), employees))
{
// Generate a unique key for each group to detect duplicates.
var key = string.Join(":" ,
grouping.Select(sub => string.Join(",", sub)));
if (!hs.Add(key))
continue;
List<List<string>> teams = (from r in grouping select r.ToList()).ToList();
foreach (var group in Group(teams, jobs))
{
foreach (var sub in group)
{
Console.WriteLine(String.Join(", " , sub.Key ) + " : " + string.Join(", ", sub));
}
Console.WriteLine();
c++;
}
}
}
Console.WriteLine(String.Format("{0:n0} combinations for {1} employees and {2} jobs" , c , employees.Count, jobs.Count));
Since I'm not worried about the order of the results, this seems to give me what I need.
Upvotes: 6
Views: 3000
Reputation: 784
Good question.
First of all before you write your code down, lets understand the underlying combinatorics of your question.
Basically, you require that for any partition of set A, you require the same number of parts in set B.
E.g. If you split set A to 3 groups you require that set B will be split to 3 groups as well, if you don't at least one element won't have a corresponding group in the other set.
It's easier to picture it with splitting set A to 1 group we must have one group made from set B as in your example (Adam, Bob : 1, 2, 3) .
So now, we know that set A has n elements and that set B has k elements.
So naturally we cannot request that any set be split more that Min(n,k)
.
Let's say we split both sets to 2 groups (partitions) each, now we have 1*2=2!
unique pairs between the two sets.
Another example is 3 groups each would give us 1*2*3=3!
unique pairs.
However, we're still not finished, after any set is split into several subsets (groups), we can still order the elements in many combinations.
So for m
partitions of a set we need to find how many combinations of placing n
elements in m
partitions.
This can be found by using the Stirling number of the second kind formula:
(Eq 1)
This formula gives you The number of ways of partitioning a set of n
elements into k
nonempty sets.
So the total number of combinations of pairs for all partitions from 1 to min(n,k)
is:
(Eq 2)
In short, it's the sum of all partition combinations of both sets, times all combinations of pairs.
So now we understand how to partition and pair our data we can write the code down:
Code:
So basically, if we look at our final equation (2) we understand the we need four parts of code to our solution.
1. A summation (or loop)
2. A way to get our Stirling sets or partitions from both sets
3. A way to get the Cartesian product of both Stirling sets.
4. A way to permutate through the items of a set. (n!)
On StackOverflow you can find many ways of how to permuatate items and how to find cartesian products, here is a an example (as extension method):
public static IEnumerable<IEnumerable<T>> Permute<T>(this IEnumerable<T> list)
{
if (list.Count() == 1)
return new List<IEnumerable<T>> { list };
return list.Select((a, i1) => Permute(list.Where((b, i2) => i2 != i1)).Select(b => (new List<T> { a }).Union(b)))
.SelectMany(c => c);
}
This was the easy part of the code.
The more difficult part (imho) is finding all possible n
partitions of a set.
So to solve this, I first solved the greater problem of how to find all possible partitions of a set (not just of size n).
I came up with this recursive function:
public static List<List<List<T>>> AllPartitions<T>(this IEnumerable<T> set)
{
var retList = new List<List<List<T>>>();
if (set == null || !set.Any())
{
retList.Add(new List<List<T>>());
return retList;
}
else
{
for (int i = 0; i < Math.Pow(2, set.Count()) / 2; i++)
{
var j = i;
var parts = new [] { new List<T>(), new List<T>() };
foreach (var item in set)
{
parts[j & 1].Add(item);
j >>= 1;
}
foreach (var b in AllPartitions(parts[1]))
{
var x = new List<List<T>>();
x.Add(parts[0]);
if (b.Any())
x.AddRange(b);
retList.Add(x);
}
}
}
return retList;
}
The return value of : List<List<List<T>>>
just means a List of all possible partitions where a partition is a list of sets and a set is a list of elements.
We do not have to use the type List here, but it simplifies indexing.
So now let's put everything together:
Main Code
//Initialize our sets
var employees = new [] { "Adam", "Bob" };
var jobs = new[] { "1", "2", "3" };
//iterate to max number of partitions (Sum)
for (int i = 1; i <= Math.Min(employees.Length, jobs.Length); i++)
{
Debug.WriteLine("Partition to " + i + " parts:");
//Get all possible partitions of set "employees" (Stirling Set)
var aparts = employees.AllPartitions().Where(y => y.Count == i);
//Get all possible partitions of set "jobs" (Stirling Set)
var bparts = jobs.AllPartitions().Where(y => y.Count == i);
//Get cartesian product of all partitions
var partsProduct = from employeesPartition in aparts
from jobsPartition in bparts
select new { employeesPartition, jobsPartition };
var idx = 0;
//for every product of partitions
foreach (var productItem in partsProduct)
{
//loop through the permutations of jobPartition (N!)
foreach (var permutationOfJobs in productItem.jobsPartition.Permute())
{
Debug.WriteLine("Combination: "+ ++idx);
for (int j = 0; j < i; j++)
{
Debug.WriteLine(productItem.employeesPartition[j].ToArrayString() + " : " + permutationOfJobs.ToArray()[j].ToArrayString());
}
}
}
}
Output:
Partition to 1 parts:
Combination: 1
{ Adam , Bob } : { 1 , 2 , 3 }
Partition to 2 parts:
Combination: 1
{ Bob } : { 2 , 3 }
{ Adam } : { 1 }
Combination: 2
{ Bob } : { 1 }
{ Adam } : { 2 , 3 }
Combination: 3
{ Bob } : { 1 , 3 }
{ Adam } : { 2 }
Combination: 4
{ Bob } : { 2 }
{ Adam } : { 1 , 3 }
Combination: 5
{ Bob } : { 3 }
{ Adam } : { 1 , 2 }
Combination: 6
{ Bob } : { 1 , 2 }
{ Adam } : { 3 }
We can easily check our out put just by counting the results.
In this example we have a set of 2 elements, and a set of 3 elements, Equation 2 states that we need S(2,1)S(3,1)1!+S(2,2)S(3,2)2! = 1+6 = 7
which is exactly the number of combinations we got.
For reference here are examples of Stirling number of the second kind:
S(1,1) = 1
S(2,1) = 1
S(2,2) = 1
S(3,1) = 1
S(3,2) = 3
S(3,3) = 1
S(4,1) = 1
S(4,2) = 7
S(4,3) = 6
S(4,4) = 1
Edit 19.6.2012
public static String ToArrayString<T>(this IEnumerable<T> arr)
{
string str = "{ ";
foreach (var item in arr)
{
str += item + " , ";
}
str = str.Trim().TrimEnd(',');
str += "}";
return str;
}
Edit 24.6.2012
The main part of this algorithm is finding the Stirling sets, I've used an inneficient Permutation method, here is a faster one based on the QuickPerm algorithm:
public static IEnumerable<IEnumerable<T>> QuickPerm<T>(this IEnumerable<T> set)
{
int N = set.Count();
int[] a = new int[N];
int[] p = new int[N];
var yieldRet = new T[N];
var list = set.ToList();
int i, j, tmp ;// Upper Index i; Lower Index j
T tmp2;
for (i = 0; i < N; i++)
{
// initialize arrays; a[N] can be any type
a[i] = i + 1; // a[i] value is not revealed and can be arbitrary
p[i] = 0; // p[i] == i controls iteration and index boundaries for i
}
yield return list;
//display(a, 0, 0); // remove comment to display array a[]
i = 1; // setup first swap points to be 1 and 0 respectively (i & j)
while (i < N)
{
if (p[i] < i)
{
j = i%2*p[i]; // IF i is odd then j = p[i] otherwise j = 0
tmp2 = list[a[j]-1];
list[a[j]-1] = list[a[i]-1];
list[a[i]-1] = tmp2;
tmp = a[j]; // swap(a[j], a[i])
a[j] = a[i];
a[i] = tmp;
//MAIN!
// for (int x = 0; x < N; x++)
//{
// yieldRet[x] = list[a[x]-1];
//}
yield return list;
//display(a, j, i); // remove comment to display target array a[]
// MAIN!
p[i]++; // increase index "weight" for i by one
i = 1; // reset index i to 1 (assumed)
}
else
{
// otherwise p[i] == i
p[i] = 0; // reset p[i] to zero
i++; // set new index value for i (increase by one)
} // if (p[i] < i)
} // while(i < N)
}
This will take cut the time in half.
However, most of the CPU cycles go to string building, which is specifically needed for this example.
This will be a make it a bit faster:
results.Add(productItem.employeesPartition[j].Aggregate((a, b) => a + "," + b) + " : " + permutationOfJobs.ToArray()[j].Aggregate((a, b) => a + "," + b));
Compiling in x64 will be better cause these strings are taking a lot of memory.
Upvotes: 5
Reputation: 14522
In my answer I will ignore the last result you had: Adam, Bob: 1, 2, 3
, because it is an exception logically. Skip to the end to see my output.
Explanation:
The idea would be iterating combinations of "which group an element will belong to".
Say you have elements "a, b, c", and you have groups "1, 2", lets have an array of size 3, as the number of elements, which will contain all possible combinations of the groups "1, 2", WITH repetition:
{1, 1, 1} {1, 1, 2} {1, 2, 1} {1, 2, 2}
{2, 1, 1} {2, 1, 2} {2, 2, 1} {2, 2, 2}
Now we shall take each group, and sort of make a key-value collection out of it, with the following logic:
Group of elements[i]
will be the value of comb[i]
.
Example with {1, 2, 1}:
a: group 1
b: group 2
c: group 1
And in a different view, the way you wanted it:
group 1: a, c
group 2: b
After all that, you just have to filter all the combinations that do not contain all groups, because you wanted all groups to have at least one value.
So you should check if all groups appear in a certain combination and filter the ones that don't match, so you'll be left with:
{1, 1, 2} {1, 2, 1} {1, 2, 2}
{2, 1, 2} {2, 2, 1} {2, 1, 1}
Which will result in:
1: a, b
2: c
1: a, c
2: b
1: a
2: b, c
1: b
2: a, c
1: c
2: a, b
1: b, c
2: a
This group-breaking combination logic will work for bigger amount of elements and groups as well. Here is my implementation, which can probably be done better, because even I got a little lost in there while I coded it (it's really not an intuitive problem hehe), but it works good.
Implementation:
public static IEnumerable<ILookup<TValue, TKey>> Group<TKey, TValue>
(List<TValue> keys,
List<TKey> values,
bool allowEmptyGroups = false)
{
var indices = new int[values.Count];
var maxIndex = values.Count - 1;
var nextIndex = maxIndex;
indices[maxIndex] = -1;
while (nextIndex >= 0)
{
indices[nextIndex]++;
if (indices[nextIndex] == keys.Count)
{
indices[nextIndex] = 0;
nextIndex--;
continue;
}
nextIndex = maxIndex;
if (!allowEmptyGroups && indices.Distinct().Count() != keys.Count)
{
continue;
}
yield return indices.Select((keyIndex, valueIndex) =>
new
{
Key = keys[keyIndex],
Value = values[valueIndex]
})
.OrderBy(x => x.Key)
.ToLookup(x => x.Key, x => x.Value);
}
}
Usage:
var employees = new List<string>() { "Adam", "Bob"};
var jobs = new List<string>() { "1", "2", "3"};
var c = 0;
foreach (var group in CombinationsEx.Group(employees, jobs))
{
foreach (var sub in group)
{
Console.WriteLine(sub.Key + ": " + string.Join(", ", sub));
}
c++;
Console.WriteLine();
}
Console.WriteLine(c + " combinations.");
Output:
Adam: 1, 2
Bob: 3
Adam: 1, 3
Bob: 2
Adam: 1
Bob: 2, 3
Adam: 2, 3
Bob: 1
Adam: 2
Bob: 1, 3
Adam: 3
Bob: 1, 2
6 combinations.
UPDATE
Combined keys combinations prototype:
public static IEnumerable<ILookup<TKey[], TValue>> GroupCombined<TKey, TValue>
(List<TKey> keys,
List<TValue> values)
{
// foreach (int i in Enumerable.Range(1, keys.Count))
for (var i = 1; i <= keys.Count; i++)
{
foreach (var lookup in Group(Enumerable.Range(0, i).ToList(), keys))
{
foreach (var lookup1 in
Group(lookup.Select(keysComb => keysComb.ToArray()).ToList(),
values))
{
yield return lookup1;
}
}
}
/*
Same functionality:
return from i in Enumerable.Range(1, keys.Count)
from lookup in Group(Enumerable.Range(0, i).ToList(), keys)
from lookup1 in Group(lookup.Select(keysComb =>
keysComb.ToArray()).ToList(),
values)
select lookup1;
*/
}
There's still a little problem of duplicates, but it produces all results.
Here is what I would use to remove duplicates, as a temporary solution:
var c = 0;
var d = 0;
var employees = new List<string> { "Adam", "Bob", "James" };
var jobs = new List<string> {"1", "2"};
var prevStrs = new List<string>();
foreach (var group in CombinationsEx.GroupCombined(employees, jobs))
{
var currStr = string.Join(Environment.NewLine,
group.Select(sub =>
string.Format("{0}: {1}",
string.Join(", ", sub.Key),
string.Join(", ", sub))));
if (prevStrs.Contains(currStr))
{
d++;
continue;
}
prevStrs.Add(currStr);
Console.WriteLine(currStr);
Console.WriteLine();
c++;
}
Console.WriteLine(c + " combinations.");
Console.WriteLine(d + " duplicates.");
Output:
Adam, Bob, James: 1, 2
Adam, Bob: 1
James: 2
James: 1
Adam, Bob: 2
Adam, James: 1
Bob: 2
Bob: 1
Adam, James: 2
Adam: 1
Bob, James: 2
Bob, James: 1
Adam: 2
7 combinations.
6 duplicates.
Note that it will also produce non-combined groups (if possible - because no empty groups are allowed). To produce ONLY combined-keys you will need to replace this:
for (var i = 1; i <= keys.Count; i++)
With this:
for (var i = 1; i < keys.Count; i++)
In the beginning of the GroupCombined method. Test the method with three employees and three jobs to see how it works exactly.
ANOTHER EDIT:
A better duplicates-handling would be handling duplicate key-combinations in the GroupCombined level:
public static IEnumerable<ILookup<TKey[], TValue>> GroupCombined<TKey, TValue>
(List<TKey> keys,
List<TValue> values)
{
for (var i = 1; i <= keys.Count; i++)
{
var prevs = new List<TKey[][]>();
foreach (var lookup in Group(Enumerable.Range(0, i).ToList(), keys))
{
var found = false;
var curr = lookup.Select(sub => sub.OrderBy(k => k).ToArray())
.OrderBy(arr => arr.FirstOrDefault()).ToArray();
foreach (var prev in prevs.Where(prev => prev.Length == curr.Length))
{
var isDuplicate = true;
for (var x = 0; x < prev.Length; x++)
{
var currSub = curr[x];
var prevSub = prev[x];
if (currSub.Length != prevSub.Length ||
!currSub.SequenceEqual(prevSub))
{
isDuplicate = false;
break;
}
}
if (isDuplicate)
{
found = true;
break;
}
}
if (found)
{
continue;
}
prevs.Add(curr);
foreach (var group in
Group(lookup.Select(keysComb => keysComb.ToArray()).ToList(),
values))
{
yield return group;
}
}
}
}
As it seems, it would be smart to add constraints to the method so that TKey
will be ICompareable<TKey>
and maybe also IEquatable<TKey>
.
This results with 0 duplicates in the end.
Upvotes: 1
Reputation: 49331
Ignoring your other constraints ( such as maximum size of team ), what you are doing is creating a partition of set P and a partition of set Q with the same number of subsets, then finding all combinations of one of the sets to map the first partition to the second.
Michael Orlov's paper has what appears to be a simple algorithm for generating partitions where each iteration uses constant space in this paper. He also provides an algorithm for listing partitions of a given size.
Start with P = { A, B } and Q = { 1, 2, 3 } then the partitions of size 1 are [ P ] and [ Q ], so the only pairing is ( [ P ], [ Q ] )
For partitions of size 2 , P has only two elements so only one partition of size 2 [ { A }, { B } ], Q has three partitions of size 2 [ { 1 }, { 2, 3 } ], [ { 1, 2 }, { 3 } ], [ { 1, 3 }, { 2 } ].
As each partition of Q contains two subsets, there are 2 combinations of each partition, which gives you six pairings:
( [ { A }, { B } ], [ { 1 }, { 2, 3 } ] )
( [ { A }, { B } ], [ { 2, 3 }, { 1 } ] )
( [ { A }, { B } ], [ { 1, 2 }, { 3 } ] )
( [ { A }, { B } ], [ { 3 }, { 1, 2 } ] )
( [ { A }, { B } ], [ { 1, 3 }, { 2 } ] )
( [ { A }, { B } ], [ { 2 }, { 1, 3 } ] )
As the size of one of the original sets was 2, there are no partitions of size 3 of it, so we stop.
Upvotes: 0
Reputation: 9133
Say we have two sets, A and B.
A = {a1, a2, a3} ; B = {b1, b2, b3}
First, let's get a set consisting of tuples containing a subset, and its complement:
{a1} {a2 a3}
{a2} {a1 a3}
{a3} {a1 a2}
{a2, a3} {a1}
...
For this, you can use Rikon's library above, or write your own code. You'll need to do this:
Order matters here; {a1} {a2 a3} and {a2 a3} {a1}
are different tuples, after all.
Then we get a similar set for B. Then, we perform a cross join between the two sets, to get:
{a1} {a2 a3} | {b1} {b2 b3}
{a2} {a1 a3} | {b2} {b1 b3}
...
This pretty much matches your description above. Just look at {Bob, Adam} as one set, and {1, 2, 3} as another set. You'll have to dump some of the empty sets (since a power set includes empty subsets too), depending on your requirements. This is the general gist, though, as far as I can tell. Sorry for the lack of code; I gotta go to sleep :P
Upvotes: 0
Reputation: 2706
Would you be ok using another library? Here is a generic library for combinations (you obviously want the one with no repeat). Then you'd only need to do a foreach over your employees list and run the combination w/ both.
I don't think you're doing yourself any favors from a big O perspective, is efficiency a priority here?
This is from the hip, but this should be the code to get what you want (with that library):
Combinations<string> combinations = new Combinations<string>(jobs, 2);
foreach(IList<string> c in combinations) {
Console.WriteLine(String.Format("{{{0} {1}}}", c[0], c[1]));
}
And then that would need to be applied to each employee
Upvotes: 3