Reputation: 152
I have a list
of integers which is being continuously modified in a loop and I need to tell if its content repeats itself after a certain amount of iterations to break the loop.
If it doesn't, the list
will eventually modify to []
or the loop terminates when a certain limit of iterations is reached. My solution so far:
def modify(numlist, a, b):
numlist = [(x * a) for x in numlist]
for i in range((len(numlist) - 1), 0, -1):
if numlist[i] >= b: numlist[i - 1] += numlist[i] // b
numlist[i] = numlist[i] % b
numlist[0] %= b
while numlist[-1] == 0:
numlist.pop(-1)
if numlist == []: break
numlist = [1, 2, 3, 4, 5]
listHistory = [numlist]
a, b = someValue, anotherValue
n = 0
while numlist != [] and n <= limit:
modify(numlist, int(a), int(b))
listHistory.append(numlist)
if numlist in listHistory: break
n += 1
limit
can be very large (ca. 10**6
- 10**7
) and checking the current numlist
against all its previous versions becomes really slow.
Is there a more efficient way to do this or even a method to predetermine if the modification is periodic by the lists initial content and given a
, b
?
Upvotes: 1
Views: 166
Reputation: 291
OK, got something.
If you look at the last element in your list, lets call it m
. What happens to it it gets multiplied by a and then taken modulo b. It never gets mixed with any other element, so if a configuration of the list has to repeat itself the following must hold:
m*a^n=m modulo b
<==>a^n=1 modulo b
< >a^(n+1)=a modulo b
This is a problem where you can make use of Fermats little theorem If a and b are coprimes, then
a^phi(b)=1 modulo b
where phi
is Eulers totient function.
So this reduces the amount of list configurations which you have to store in your history drastically. You only have to store it every phi(b)
steps.
I found an implementation of phi here: Computing Eulers Totient Function
UPDATE:
Ok, I found a quick solution if you were to do += list[i] % b
instead of += list[i] // b
. Otherwise you need b^4*phi(b)
steps in the worst case
UPDATE2:
I rewrote the code in C
(see below) to make it faster and implemented the "tortoise and the hare" algorithm proposed by @user2357112. This way i can check some million loops per second what should be way faster than the python implementation.
I tried it for some different value combinations:
a b steps b^4*phi(b) (b/GCD(b,a))^4*phi(b/GCD(n,a)) (b/GCD(b,a))^4*phi(b/GCD(n,a))/steps
2 37 67469796 67469796 67469796 1
3 37 33734898 67469796 67469796 2
4 37 33734898 67469796 67469796 2
5 37 67469796 67469796 67469796 1
6 37 7496644 67469796 67469796 9
7 37 16867449 67469796 67469796 4
36 37 3748322 67469796 67469796 18
2 36 39366 20155392 629856 16
3 36 256 20155392 82944 27648
4 36 19683 20155392 39366 2
5 36 5038848 20155392 20155392 4
So you see where this is going: The cycle length seems always to be a divisor of (b/GCD(b,a))^4*phi(b/GCD(n,a))
, so the worst case is (b/GCD(b,a))^4*phi(b/GCD(n,a))
steps as suspected
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void modify(int *, int, int);
void printl(int * );
int main(int argc, const char*argv[])
{
int l[5]={1,2,3,4,5};
int lz[5]={1,2,3,4,5};
int i=1,a,b,n;
if (argc<4) {
printf("Not enough arguments!!\n");
exit(1);
}
a=atoi(argv[1]);
b=atoi(argv[2]);
n=atoi(argv[3]);
modify(l,a,b);
while (i<n) {
modify(l,a,b);
modify(l,a,b);
modify(lz,a,b);
i++;
if (memcmp(l,lz,sizeof(l))==0) {
printf("success!\n");
break;
}
if (i%1000000==0) printf("Step %d.000.000\n",i/1000000);
}
printf("Final step: %d\n",i);
printl(l);
printl(lz);
return 0;
}
void modify(int * li, int a, int b) {
int i=0;
while (i<=4) {
li[i]*=a;
i++;
}
i=4;
while (i>=1) {
if (li[i]>=b) {
li[i-1]+=li[i]/b;
}
li[i]=li[i]%b;
i--;
}
li[0]=li[0]%b;
}
void printl(int * li) {
printf("l=(%d,%d,%d,%d,%d)\n",li[0],li[1],li[2],li[3],li[4]);
Upvotes: 1
Reputation: 7187
First please allow me to say, all lists are periodic, if you consider a large enough period.
That said, this might be a good place to use a bloom filter, EG: https://pypi.python.org/pypi/drs-bloom-filter/
Bloom filters are sets that can perform set membership tests pretty quickly, and can add things to the set without actually storing the data of the element. This means it's probabilistic, but you can adjust the probability. So you might use a bloom filter test for a quick check, and upon detecting a match using the bloom filter, confirm the result using your slow+deterministic algorithm.
Usage looks like:
In [1]: import bloom_filter_mod
In [2]: bloom_filter = bloom_filter_mod.Bloom_filter(1000000, 0.01)
In [3]: for i in range(10):
...: bloom_filter.add(i)
...:
In [4]: for i in range(0, 20, 2):
...: if i in bloom_filter:
...: print('{} present'.format(i))
...:
0 present
2 present
4 present
6 present
8 present
The 1000000 is the maximum number of elements you want to store in the filter, and the 0.01 is the maximum probability of a false positive when full.
So you could "store" each subsequence in the filter, and quickly detect recurrences.
Upvotes: 0
Reputation: 281585
Your list
(which you really ought to rename, by the way) stores a number mod b**something
in base b
. Every run of modify
multiplies the number by a
, then truncates zeros at the end of the representation.
Call the number originally represented by the list n
, and call the original length of the list l
. If this process terminates, it will do so at the first iteration k
such that b**l
divides n * a**k
, which will only ever occur if and only if all prime factors of b**l / gcd(n, b**l)
are factors of a
. This is simple to determine:
def all_prime_factors_of_first_divide_second(a, b):
while a != 1:
factor = gcd(a, b)
if factor == 1:
return False
while not a % factor:
a //= factor
return True
Upvotes: 0