Jonah Fleming
Jonah Fleming

Reputation: 1185

How to stop loop running out of memory?

I've come back to programming after a long haitus so please forgive any stupid errors/inefficient code.

I am creating an encryption program that uses the RSA method of encryption which involves finding the coprimes of numbers to generate a key. I am using the Euclidean algorithm to generate highest common factors and then add the coprime to the list if HCF == 1. I generate two lists of coprimes for different numbers then compare to find coprimes in both sets. The basic code is below:

def gcd(a, b):
    while b:
        a,b=b,a%b
    return a


def coprimes(n):
    cp = []
    for i in range(1,n):
        if gcd(i, n) == 1:
            cp.append(i)
    print(cp)

def compare(n,m):
    a = coprimes(n)
    b = coprimes(m)
    c = []
    for i in a:
        if i in b:
            c.append(i)
    print(c)

This code works perfectly for small numbers and gives me what I want but execution takes forever and is finally Killed when comupting for extremely large numbers in the billions range, which is necessary for even a moderate level of security. I assume this is a memory issue but I cant work out how to do this in a non memory intensive way. I tried multiprocessing but that just made my computer unusable due to the amount of processes running.

How can I calculate the coprimes of large numbers and then compare two sets of coprimes in an efficent and workable way?

Upvotes: 0

Views: 333

Answers (2)

Alain T.
Alain T.

Reputation: 42139

You could change the strategy and approach this from the perspective of common prime factors. The common coprimes between n and m will be all numbers that are not divisible by any of their common prime factors.

def primeFactors(N):
    p = 2
    while p*p<=N:
        count = 0
        while N%p == 0:
            count += 1
            N //= p
        if count: yield p
        p += 1 + (p&1)
    if N>1: yield N

import math
def compare2(n,m):
    # skip list for multiples of common prime factors
    skip = { p:p for p in primeFactors(math.gcd(n,m)) } 
    for x in range(1,min(m,n)):
        if x in skip:
            p = skip[x]                 # skip multiple of common prime
            m = x + p                   # determine next multiple to skip
            while m in skip: m += p     # for that prime
            skip[m] = p
        else:
            yield x                     # comon coprime of n and m 

The performance is considerably better than matching lists of coprimes, especially on larger numbers:

from timeit import timeit

timeit(lambda:list(compare2(10**5,2*10**5)),number=1)
# 0.025 second

timeit(lambda:list(compare2(10**6,2*10**6)),number=1)
# 0.196 second

timeit(lambda:list(compare2(10**7,2*10**7)),number=1)
# 2.18 seconds

timeit(lambda:list(compare2(10**8,2*10**8)),number=1)
# 20.3 seconds

timeit(lambda:list(compare2(10**9,2*10**9)),number=1)
# 504 seconds

At some point, building lists of all the coprimes becomes a bottleneck and you should just use/process them as they come out of the generator (for example to count how many there are):

timeit(lambda:sum(1 for _ in compare2(10**9,2*10**9)),number=1)
# 341 seconds

Another way to approach this, which is somewhat slower than the prime factor approach but much simpler to code, would be to list coprimes of the gcd between n and m:

import math
def compare3(n,m):
    d = math.gcd(n,m)
    for c in range(1,min(n,m)):
        if math.gcd(c,d) == 1:
            yield c

timeit(lambda:list(compare3(10**6,2*10**6)),number=1)
# 0.28 second

timeit(lambda:list(compare3(10**7,2*10**7)),number=1)
# 2.84 seconds

timeit(lambda:list(compare3(10**8,2*10**8)),number=1)
# 30.8 seconds

Given that it uses no memory resource, it could be advantageous in some cases:

timeit(lambda:sum(1 for _ in compare3(10**9,2*10**9)),number=1)
# 326 seconds

Upvotes: 1

Mitch
Mitch

Reputation: 3416

If the only thing you're worried about is running out of memory here you could use generators.

def coprimes(n):
    for i in range(1,n):
        if gcd(i, n) == 1:
            yield i

This way you can use the coprime value then discard it once you don't need it. However, nothing is going to change the fact your code is O(N^2) and will always perform slow for large primes. And this assumes Euclid's algorithm is constant time, which it is not.

Upvotes: 3

Related Questions