Reputation: 1013
I use two digits in an integer to represent one element. I.e. 3345512
represents the four elements [3,34,55,12]
. Next, I repeatedly add one to the integer to get another sequence of elements. When generating the sequences like this, I get permutations of the same sequence, i.e. 3341255 = [3,34,12,55]
, which in my case is equivalent to the number 3345512 = [3,34,55,12]
. So I would like to avoid permutations of a sequence I already encountered. I do not want to store the numbers as they grow very large (10^30
and more). I tried to use a bloom filter, but it could not handle the amount of elements. Is there a trivial solution to generate the sequences without permutations?
[EDIT] Here is a tiny python script, that should work. For the sake of better comprehensibility, I use a single digit with if s[idx] == 9:
instead of if s[idx] == 99:
If you have a more simple solution I will accept it as an answer.
import time
s = [1]
while True:
idx = 0
while not idx+1 == len(s) and not s[idx] < s[idx+1]:
s[idx] = 1
idx += 1
if s[idx] == 9:
s[idx] = 1
s.append(1)
else:
s[idx] += 1
print repr(s)
time.sleep(0.7)
Upvotes: 0
Views: 362
Reputation: 134005
What you're asking for is combinations rather than permutations. It looks like you're trying to get unique combinations of 4 items from a list of 100. That is, your possible values are 00
through 99
.
You could generate all of the combinations with a nested loop, like this:
for (i = 0 to 96)
for (j = i + 1 to 97)
for (k = j + 1 to 98)
for (l = k + 1 to 99)
write i, j, k, l
That guarantees that you won't get the same combination.
You'll also note that in the generated combinations:
i < j < k < l
If you make your sequences always satisfy that inequality, then you won't get a duplicate combination.
So your example 3345512
would never be generated. It would be 3123455
.
Given that, when you increment from 3123499
, you don't go to 3123500
, but rather to 3123536
. Basically, if you overflow one position, you increment the next position, and the original position becomes (next position + 1). So 3999999
increments to 4050607
.
Obviously, you can't do this with a simple integer increment. I would suggest using a 4-byte value and a little bit of logic.
Here's another way to do it.
Imagine your alphabet is only 4 characters, [0, 1, 2, 3]
. There is a total of 16 possible unique combinations. You want the unique combinations of two items.
Now, consider a 4-bit number that can hold the values 0 through 15. You can map that number to the unique combinations so that the number 3 (0011 binary) corresponds to the combination 0, 1. That is, bit 0 is set and bit 1 is set. It should be clear that a number with 2 bits set corresponds to a unique 2-character combination. In the case of our 2-combinations in the alphabet of 4 items, we have:
0, 1 (0011)
0, 2 (0101)
0, 3 (1001)
1, 2 (0110)
1, 3 (1010)
2, 3 (1100)
With this, you can increment an integer, see if it has two bits set, and then translate from set bits to characters in your alphabet.
You can do exactly the same thing with selecting 4-combinations from a set of 100 characters. Working with a 100-bit number is a bit difficult but not impossible. Since you're just incrementing, you can do it with a pair of 64 bit numbers.
Determining how many bits are set is easy to do the naive way. The Bit Twiddling Hacks page shows several faster ways.
There are several ways to solve this problem, all of which can be parameterized to select unique k-combinations from a list of n items.
Upvotes: 2
Reputation: 3056
To check if the given sequence has already been generated and can be skipped you should generate "previous" permutation of the current sequence that would be generated if counting from zero, and compare it with the starting number.
You can find this permutation the following way:
If the numbers are in the increasing order - for example [3] [4] [5] [6] - you can stop, as there is no smaller permutation of those numbers.
If the sequence has some non-increasing part - for example [3] [5] [4] [8] [6] [7] [8] [9]. This sequence has two non-increasing parts. 5-4 and 8-6.
Locate the last number which is bigger than it's successor. It's the first 8.
[3] [5] [4] [8] [6] [7] [8] [9]
Find the biggest number behind the 8, that's actually smaller. It's 7.
[3] [5] [4] [8] [6] [7] [8] [9]
Move 7 to the position of 8, and leave all other numbers in decreasing order.
[3] [5] [4] [7] [9] [8] [8] [6]
Now if this sequence is smaller than the starting number - you know that it was not yet generated. Otherwise you can skip it.
Upvotes: 0