Reputation: 1185
I've come back to programming after a long haitus so please forgive any stupid errors/inefficient code.
I am creating an encryption program that uses the RSA method of encryption which involves finding the coprimes of numbers to generate a key. I am using the Euclidean algorithm to generate highest common factors and then add the coprime to the list if HCF == 1. I generate two lists of coprimes for different numbers then compare to find coprimes in both sets. The basic code is below:
def gcd(a, b):
while b:
a,b=b,a%b
return a
def coprimes(n):
cp = []
for i in range(1,n):
if gcd(i, n) == 1:
cp.append(i)
print(cp)
def compare(n,m):
a = coprimes(n)
b = coprimes(m)
c = []
for i in a:
if i in b:
c.append(i)
print(c)
This code works perfectly for small numbers and gives me what I want but execution takes forever and is finally Killed
when comupting for extremely large numbers in the billions range, which is necessary for even a moderate level of security.
I assume this is a memory issue but I cant work out how to do this in a non memory intensive way. I tried multiprocessing but that just made my computer unusable due to the amount of processes running.
How can I calculate the coprimes of large numbers and then compare two sets of coprimes in an efficent and workable way?
Upvotes: 0
Views: 333
Reputation: 42139
You could change the strategy and approach this from the perspective of common prime factors. The common coprimes between n and m will be all numbers that are not divisible by any of their common prime factors.
def primeFactors(N):
p = 2
while p*p<=N:
count = 0
while N%p == 0:
count += 1
N //= p
if count: yield p
p += 1 + (p&1)
if N>1: yield N
import math
def compare2(n,m):
# skip list for multiples of common prime factors
skip = { p:p for p in primeFactors(math.gcd(n,m)) }
for x in range(1,min(m,n)):
if x in skip:
p = skip[x] # skip multiple of common prime
m = x + p # determine next multiple to skip
while m in skip: m += p # for that prime
skip[m] = p
else:
yield x # comon coprime of n and m
The performance is considerably better than matching lists of coprimes, especially on larger numbers:
from timeit import timeit
timeit(lambda:list(compare2(10**5,2*10**5)),number=1)
# 0.025 second
timeit(lambda:list(compare2(10**6,2*10**6)),number=1)
# 0.196 second
timeit(lambda:list(compare2(10**7,2*10**7)),number=1)
# 2.18 seconds
timeit(lambda:list(compare2(10**8,2*10**8)),number=1)
# 20.3 seconds
timeit(lambda:list(compare2(10**9,2*10**9)),number=1)
# 504 seconds
At some point, building lists of all the coprimes becomes a bottleneck and you should just use/process them as they come out of the generator (for example to count how many there are):
timeit(lambda:sum(1 for _ in compare2(10**9,2*10**9)),number=1)
# 341 seconds
Another way to approach this, which is somewhat slower than the prime factor approach but much simpler to code, would be to list coprimes of the gcd between n and m:
import math
def compare3(n,m):
d = math.gcd(n,m)
for c in range(1,min(n,m)):
if math.gcd(c,d) == 1:
yield c
timeit(lambda:list(compare3(10**6,2*10**6)),number=1)
# 0.28 second
timeit(lambda:list(compare3(10**7,2*10**7)),number=1)
# 2.84 seconds
timeit(lambda:list(compare3(10**8,2*10**8)),number=1)
# 30.8 seconds
Given that it uses no memory resource, it could be advantageous in some cases:
timeit(lambda:sum(1 for _ in compare3(10**9,2*10**9)),number=1)
# 326 seconds
Upvotes: 1
Reputation: 3416
If the only thing you're worried about is running out of memory here you could use generators.
def coprimes(n):
for i in range(1,n):
if gcd(i, n) == 1:
yield i
This way you can use the coprime value then discard it once you don't need it. However, nothing is going to change the fact your code is O(N^2) and will always perform slow for large primes. And this assumes Euclid's algorithm is constant time, which it is not.
Upvotes: 3