Marcel Tesch
Marcel Tesch

Reputation: 152

testing whether or not a list modification is periodic

I have a list of integers which is being continuously modified in a loop and I need to tell if its content repeats itself after a certain amount of iterations to break the loop.

If it doesn't, the list will eventually modify to [] or the loop terminates when a certain limit of iterations is reached. My solution so far:

def modify(numlist, a, b):
    numlist = [(x * a) for x in numlist]
    for i in range((len(numlist) - 1), 0, -1):
        if numlist[i] >= b: numlist[i - 1] += numlist[i] // b
        numlist[i] = numlist[i] % b
    numlist[0] %= b
    while numlist[-1] == 0:
        numlist.pop(-1)
        if numlist == []: break 

numlist = [1, 2, 3, 4, 5]
listHistory = [numlist]
a, b = someValue, anotherValue

n = 0
while numlist != [] and n <= limit:
    modify(numlist, int(a), int(b))
    listHistory.append(numlist)
    if numlist in listHistory: break
    n += 1

limit can be very large (ca. 10**6 - 10**7) and checking the current numlist against all its previous versions becomes really slow.

Is there a more efficient way to do this or even a method to predetermine if the modification is periodic by the lists initial content and given a, b?

Upvotes: 1

Views: 166

Answers (3)

MaSp
MaSp

Reputation: 291

OK, got something. If you look at the last element in your list, lets call it m. What happens to it it gets multiplied by a and then taken modulo b. It never gets mixed with any other element, so if a configuration of the list has to repeat itself the following must hold:

m*a^n=m modulo b
<==>a^n=1 modulo b
<  >a^(n+1)=a modulo b

This is a problem where you can make use of Fermats little theorem If a and b are coprimes, then

a^phi(b)=1 modulo b

where phi is Eulers totient function.

So this reduces the amount of list configurations which you have to store in your history drastically. You only have to store it every phi(b) steps.

I found an implementation of phi here: Computing Eulers Totient Function

UPDATE:

Ok, I found a quick solution if you were to do += list[i] % b instead of += list[i] // b. Otherwise you need b^4*phi(b) steps in the worst case

UPDATE2:

I rewrote the code in C (see below) to make it faster and implemented the "tortoise and the hare" algorithm proposed by @user2357112. This way i can check some million loops per second what should be way faster than the python implementation. I tried it for some different value combinations:

a  b    steps     b^4*phi(b)  (b/GCD(b,a))^4*phi(b/GCD(n,a))    (b/GCD(b,a))^4*phi(b/GCD(n,a))/steps
2  37   67469796  67469796    67469796                          1
3  37   33734898  67469796    67469796                          2
4  37   33734898  67469796    67469796                          2
5  37   67469796  67469796    67469796                          1
6  37   7496644   67469796    67469796                          9
7  37   16867449  67469796    67469796                          4
36 37   3748322   67469796    67469796                          18
2  36   39366     20155392    629856                            16
3  36   256       20155392    82944                             27648
4  36   19683     20155392    39366                             2
5  36   5038848   20155392    20155392                          4

So you see where this is going: The cycle length seems always to be a divisor of (b/GCD(b,a))^4*phi(b/GCD(n,a)), so the worst case is (b/GCD(b,a))^4*phi(b/GCD(n,a)) steps as suspected

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void modify(int *, int, int);
void printl(int * );
int main(int argc, const char*argv[])
{
  int l[5]={1,2,3,4,5};
  int lz[5]={1,2,3,4,5};
  int i=1,a,b,n;
  if (argc<4) {
    printf("Not enough arguments!!\n");
    exit(1);
  }
  a=atoi(argv[1]);
  b=atoi(argv[2]);
  n=atoi(argv[3]);
  modify(l,a,b);
  while (i<n) {
    modify(l,a,b);
    modify(l,a,b);
    modify(lz,a,b);
    i++;
    if (memcmp(l,lz,sizeof(l))==0) {
      printf("success!\n");
      break;
    }
    if (i%1000000==0) printf("Step %d.000.000\n",i/1000000);
  }
  printf("Final step:  %d\n",i);
  printl(l);
  printl(lz);
  return 0;

}
void  modify(int * li, int a, int b) {
  int i=0;
  while (i<=4) {
   li[i]*=a;
   i++;
  }
  i=4;
  while (i>=1) {
    if (li[i]>=b) {
      li[i-1]+=li[i]/b;
    }
    li[i]=li[i]%b;
    i--;

  }
  li[0]=li[0]%b;
}
void printl(int * li) {
  printf("l=(%d,%d,%d,%d,%d)\n",li[0],li[1],li[2],li[3],li[4]);

Upvotes: 1

dstromberg
dstromberg

Reputation: 7187

First please allow me to say, all lists are periodic, if you consider a large enough period.

That said, this might be a good place to use a bloom filter, EG: https://pypi.python.org/pypi/drs-bloom-filter/

Bloom filters are sets that can perform set membership tests pretty quickly, and can add things to the set without actually storing the data of the element. This means it's probabilistic, but you can adjust the probability. So you might use a bloom filter test for a quick check, and upon detecting a match using the bloom filter, confirm the result using your slow+deterministic algorithm.

Usage looks like:

In [1]: import bloom_filter_mod

In [2]: bloom_filter = bloom_filter_mod.Bloom_filter(1000000, 0.01)

In [3]: for i in range(10):
   ...:     bloom_filter.add(i)
   ...:

In [4]: for i in range(0, 20, 2):
   ...:     if i in bloom_filter:
   ...:         print('{} present'.format(i))
   ...:
0 present
2 present
4 present
6 present
8 present

The 1000000 is the maximum number of elements you want to store in the filter, and the 0.01 is the maximum probability of a false positive when full.

So you could "store" each subsequence in the filter, and quickly detect recurrences.

Upvotes: 0

user2357112
user2357112

Reputation: 281585

Your list (which you really ought to rename, by the way) stores a number mod b**something in base b. Every run of modify multiplies the number by a, then truncates zeros at the end of the representation.

Call the number originally represented by the list n, and call the original length of the list l. If this process terminates, it will do so at the first iteration k such that b**l divides n * a**k, which will only ever occur if and only if all prime factors of b**l / gcd(n, b**l) are factors of a. This is simple to determine:

def all_prime_factors_of_first_divide_second(a, b):
    while a != 1:
        factor = gcd(a, b)
        if factor == 1:
            return False
        while not a % factor:
            a //= factor
    return True

Upvotes: 0

Related Questions