Reputation: 10855
Related to this question, I am wondering the algorithms (and actual code in java/c/c++/python/etc., if you have!) to generate all combinations of r
elements for a list with m
elements in total. Some of these m
elements may be repeated.
Thanks!
Upvotes: 2
Views: 1875
Reputation: 24336
Here is a recursion that I believe is closely related to Jean-Bernard Pellerin's algorithm, in Mathematica.
This takes input as the number of each type of element. The output is in similar form. For example:
{a,a,b,b,c,d,d,d,d} -> {2,2,1,4}
Function:
f[k_, {}, c__] := If[+c == k, {{c}}, {}]
f[k_, {x_, r___}, c___] := Join @@ (f[k, {r}, c, #] & /@ 0~Range~Min[x, k - +c])
Use:
f[4, {2, 2, 1, 4}]
{{0, 0, 0, 4}, {0, 0, 1, 3}, {0, 1, 0, 3}, {0, 1, 1, 2}, {0, 2, 0, 2}, {0, 2, 1, 1}, {1, 0, 0, 3}, {1, 0, 1, 2}, {1, 1, 0, 2}, {1, 1, 1, 1}, {1, 2, 0, 1}, {1, 2, 1, 0}, {2, 0, 0, 2}, {2, 0, 1, 1}, {2, 1, 0, 1}, {2, 1, 1, 0}, {2, 2, 0, 0}}
An explanation of this code was requested. It is a recursive function that takes a variable number of arguments. The first argument is k
, length of subset. The second is a list of counts of each type to select from. The third argument and beyond is used internally by the function to hold the subset (combination) as it is constructed.
This definition therefore is used when there are no more items in the selection set:
f[k_, {}, c__] := If[+c == k, {{c}}, {}]
If the total of the values of the combination (its length) is equal to k
, then return that combination, otherwise return an empty set. (+c
is shorthand for Plus[c]
)
Otherwise:
f[k_, {x_, r___}, c___] := Join @@ (f[k, {r}, c, #] & /@ 0~Range~Min[x, k - +c])
Reading left to right:
Join
is used to flatten out a level of nested lists, so that the result is not an increasingly deep tensor.
f[k, {r}, c, #] &
calls the function, dropping the first position of the selection set (x
), and adding a new element to the combination (#
).
/@ 0 ~Range~ Min[x, k - +c])
for each integer between zero and the lesser of the first element of the selection set, and k
less total of combination, which is the maximum that can be selected without exceeding combination size k
.
Upvotes: 1
Reputation: 134005
I'm going to make this an answer rather than a bunch of comments.
My original comment was:
The CombinationGenerator Java class systematically generates all combinations of n elements, taken r at a time. The algorithm is described by Kenneth H. Rosen, Discrete Mathematics and Its Applications, 2nd edition (NY: McGraw-Hill, 1991), pp. 284-286." See merriampark.com/comb.htm. It has a link to source code.
As you pointed out in your comment, you want unique combinations. So, given the array ["a", "a", "b", "b"]
, you want it to generate aab, abb
. The code I linked generates aab, aab, baa, baa
.
With that array, removing duplicates is very easy. Depending on how you implement it, you either let it generate the duplicates and then filter them after the fact (i.e. selecting unique elements from an array), or you modify the code to include a hash table so that when it generates a combination, it checks the hash table before putting the item into the output array.
Looking something up in a hash table is an O(1)
operation, so either of those is going to be efficient. Doing it after the fact will be a little bit more expensive, because you'll have to copy items. Still, you're talking O(n)
, where n
is the number of combinations generated.
There is one complication: order is irrelevant. That is, given the array ["a", "b", "a", "b"]
, the code will generate aba, abb, aab, bab
. In this case, aba
and aab
are duplicate combinations, as are abb
and bab
, and using a hash table isn't going to remove those duplicates for you. You could, though, create a bit mask for each combination, and use the hash table idea with the bit masks. This would be slightly more complicated, but not terribly so.
If you sort the initial array first, so that duplicate items are adjacent, then the problem goes away and you can use the hash table idea.
There's undoubtedly a way to modify the code to prevent it from generating duplicates. I can see a possible approach, but it would be messy and expensive. It would probably make the algorithm slower than if you just used the hash table idea. The approach I would take:
Sort the input array
Use the linked code to generate the combinations
Use a hash table or some other code to select unique items.
Although ... a thought that occurred to me.
Is it true that if you sort the input array, then any generated duplicates will be adjacent? That is, given the input array ["a", "a", "b", "b"]
, then the generated output will be aab, aab, abb, abb
, in that order. This will be implementation dependent, of course. But if it's true in your implementation, then modifying the algorithm to remove duplicates is a simple matter of checking to see if the current combination is equal to the previous one.
Upvotes: 1
Reputation: 12670
recurse for each element type
int recurseMe(list<list<item>> items, int r, list<item> container)
{
if (r == container.length)
{
//print out your collection;
return 1;
}
else if (container.length > score)
{
return 0;
}
list<item> firstType = items[0];
int score = 0;
for(int i = 0; i < firstType.length; i++)
{
score += recurseMe(items without items[0], r, container + i items from firstType);
}
return score;
}
This takes as input a list containing lists of items, assuming each inner list represents a unique type of item. You may have to build a sorting function to feed as input to this.
//start with a list<item> original;
list<list<item>> grouped = new list<list<item>>();
list<item> sorted = original.sort();//use whichever method for this
list<item> temp = null;
item current = null;
for(int x = 0; x < original.length; x++)
if (sorted[x] == current)
{
temp.add(current);
}
else
{
if (temp != null && temp.isNotEmpty)
grouped.add(temp);
temp = new list<item>();
temp.add(sorted[x]);
}
}
if (temp != null && temp.isNotEmpty)
grouped.add(temp);
//grouped is the result
This sorts the list, then creates sublists containing elements that are the same, inserting them into the list of lists grouped
Upvotes: 2