Reputation: 307
actually I'm looking for combinatory with a limited number of repetitions, I know in python in itertools we already have for no repetitions and with ANY repetition, but I can't found anything for this.
Lets remember, a Combinatory we pick n elements, with a max repetitions, and in the elements, we don't care the order, (A, B, C) is the same as (B, C, A).
Here an example:
A B, C picking 2, repeated 0:
A, B
A, C
B, C
picking 2, repeated 1:
A, A
A, B
A, C
B, B
B, C
C, C
The functions combinations and combinations_with_replacement doesn't give this behavior, is like what I'm looking for is the mix of both.
Lets be clear, the number of repetitions is the max, so with ABCD, pick 3 and repeat 2, AAB would be valid.
Is there a lib or module with somthing like this?
Considerations:
I use big list to apply this, so even If I can filter the results from combinations_with_replacement is not very efficient one by one, I need a generator to not overload the ram.
I would like to avoid this method, or some other more efficient:
def Combs2(data, pick, rep):
for i in itertools.combinations_with_replacement(data, pick):
s = set(i)
show = True
for j in s:
if i.count(j) > (rep+1):
show = False
break
if show:
yield i
I test this code, is soo slow that kills every multiprocessing that I'm using, instead using the cores ends using 1....
Edited:
To show the difference with combinations_with_replacement
We have ABCDE, lets pick 3 elements with 1 rep.
here example, AAB, BBC, BCA
AAA would be not valid.
We can't get this with combinations_with_replacement or combinations.
Upvotes: 4
Views: 221
Reputation: 52008
There is a one-to-one correspondence between the combinations that you seek and k-tuples of bounded non-negative integers with a given target sum. For example,
AAB
, when drawn from ABC
consists of 2 A, 1 B and 0 C
, so the correspondence is AAB <=> (2,1,0)
.
One strategy is to write a generator for such tuples and then decode it as it generates to get the output that you want:
#generates multisets of a given size but with bounded multiplicity
#The following generator generates all tuples of
#non-negative integers of length k
#bounded by d and summing to n
#shouldn't be called if n > k*d
def f(n,k,d,path = ()):
if n == 0:
yield path + (0,)*k
elif k == 1:
yield path + (n,)
else:
lower = max(0,n - (k-1)*d)
upper = min(n,d)
for i in range(lower,upper+1):
yield from f(n-i,k-1,d,path + (i,))
def bounded_combos(items,count,bound):
for t in f(count,len(items),bound):
combo = []
for item,count in zip(items,t):
combo.extend([item]*count)
yield combo
For example,
>>> for c in bounded_combos('ABC',3,2): print(c)
['B', 'C', 'C']
['B', 'B', 'C']
['A', 'C', 'C']
['A', 'B', 'C']
['A', 'B', 'B']
['A', 'A', 'C']
['A', 'A', 'B']
In terms of the number tuples:
>>> for t in f(3,3,2): print(t)
(0, 1, 2)
(0, 2, 1)
(1, 0, 2)
(1, 1, 1)
(1, 2, 0)
(2, 0, 1)
(2, 1, 0)
As far as how many goes, you can work out a recursive formula for the number of combinations with the key idea that if
a_1 + a_2 + ... + a_k = n
then
a_2 + a_3 + ... + a_k = n - a_1
hence the count for a given k and n can be reduced to counts for k-1 and smaller n. A naive implementation of this recursion would involve repeatedly evalauting the same expression, but memoization makes it feasible for large k,n:
def g(n,k,d):
memo = {} #memoization dictionary
def h(n,k):
if (n,k) in memo:
return memo[(n,k)]
else:
if n == 0:
count = 1
elif k == 1:
count = 1
else:
lower = max(0,n - (k-1)*d)
upper = min(n,d)
count = sum(h(n-i,k-1) for i in range(lower,upper+1))
memo[(n,k)] = count
return count
return h(n,k)
Examples:
>>> g(3,3,2)
7
>>> g(10,5,5)
651
>>> g(100,20,10)
18832730699014127291
I don't know of any closed-form formula for these. As an experiment, I evaluated
','.join(str(g(n,n,2)) for n in range(1,11))
and pasted it into the search bar of the On-Line Encyclopedia of Integer Sequences and got a very interesting hit: A002426, the central trinomial coefficients, which are discussed here. If you rerun this experiment with different choices of n,k,d
, you might stumble upon a nice formula for the overall function.
Upvotes: 2