Reputation: 111
I'm trying to generate all the possible permutations of a string like "0000011111" or "000 11 2222 333". I tried using permutations from itertools on "0000011111" like so:
from itertools import permutations
basestring = "0"*5 +"1"*5
perms = [''.join(p) for p in permutations(basestring)]
print(len(perms), perms)
print(len(set(perms)), set(perms))
But the list perms had 3 million entries when there are only 10 C 5 = 252 permutations.
Is there a built in tool I can use that is better at handling permutations of strings with many repeated characters?
Otherwise how is this algorithm for generating the permutations (for "0000 1111 222")?
Start with 2 characters "0000 1111"
Move right most 0 over one "0001 0111" and add it to the list
Continue moving it to the end "0001 1011" -> "0001 1101" -> "0001 1110"
Now move the next 0 over one "0010 0111" -> "0010 1011"
...
Until you get to "1111 0000".
Then for each of the strings generated, repeat the process with 2's.
222 xxxx xxxx -> 22x 2xxx xxxx -> 22x x2xx xxx...
Or am I better off just doing set(perms) to get rid of the repeats? (I need to permute 20 character lists with 3-5 characters where itertools permutations would give me 10e18 strings)
I've been programming casually for 3 years, but only know as much as someone with 1 semester of a programming class.
Upvotes: 5
Views: 1138
Reputation: 23945
I'm not sure how efficient this is but you could try something like:
map = ['0','1','2','3']
perms = []
def multisetPerms(s,result):
if all(v is 0 for v in s):
perms.append(result)
return
for i in xrange(len(s)):
if s[i] > 0:
_s = s[:]
_s[i] = _s[i] - 1
multisetPerms(_s,result + map[i])
multisetPerms([3,2,4],'') # 9!/(3!*2!*4!) = 1260
print (len(perms), perms[0:10])
Output:
(1260, ['000112222', '000121222', '000122122', '000122212', '000122221'
, '000211222', '000212122', '000212212', '000212221', '000221122'])
Upvotes: 0
Reputation: 4164
Firstly let's look at your first example.
from itertools import permutations
basestring = "0"*5 +"1"*5
This gives basestring = [0000011111]
Calling permutations(basestring) without any argument will give you all permutations of length n of the n position string, which is just n! That's indeed a large number for n=10. Is that really what you want?
Next, if you are looking for the permutations of this string of length 5, you need to specify that length 5 in the call to itertools.permutations.
perms = [''.join(p) for p in permutations(basestring,5)]
This will return all permutations of length 5 of all characters in basestring by position, not value. So you will get some duplicates.
As documented in the itertools.permutations documentation see Python 2 version here, the number of permutations of length r on a string of length n returned by that function will be
n!/(n-r)! or in this case 30240 for n=10, r=5.
If you want to remove the duplicates, you can take
set(perms)
The number of combinations returned by this will be len(set(perms)) = 2^5 or 32. This is the number of strings of length k that can be formed from an "alphabet" of length n, which is n^k. The "alphabet" is the unique characters in your basestring - there are 2 of them (0 and 1) so you can form 32 unique strings of length 5 from that.
Upvotes: 0