Reputation: 31
Given an integer number, I want to find the biggest prime number under it. For example:
input 20 -> output 19
input 100 -> output 97.
I already have the simple program given below, but I'm curious about how to make it faster.
def isPrime(x):
for j in range(2,int(x**0.5)+1):
if x%j==0:
return False
return True
def findPrimeNum(num):
for i in range(num-1,1,-1):
if isPrime(i):
return i
findPrimeNum(600851475143) # -> 600851475067
Upvotes: 3
Views: 2537
Reputation: 1
def primeNumbers(nth_number:int):
prime_number=[True]*(nth_number+1)
prime_number[0],prime_number[1]=False,False
#1)check for odd ith iteration only
for i in range(3,int(nth_number**0.5)+1,2):
if prime_number[i]:
#2) remove all odd number from prime number list that is
for j in range(i*i,nth_number+1,i):
prime_number[j]=False
#include 2 also as we iterated only on odd prime finding
prime_result=[2]
prime_result.extend([i for i in range(3,nth_number,2) if prime_number[i]])
return prime_result
print(primeNumbers(100))
output:
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
Upvotes: 0
Reputation: 16747
Very interesting question you have, thanks!
I implemented from scratch fully working Python solution based on following algorithms: Sieve of Eratosthenes (fast generation of primes), Miller-Rabin Probablity Test, Previous Prime through Sieving.
Full algorithm is as following:
We're given number N. And need to find first prime not bigger than N.
Through Sieve of Eratosthenes first I generate around 10 000 smallest prime numbers.
Sieve of boolean values is created that would be used to sieve the range [N - offset, N]. By sieving I mean marking which numbers are Definitely composite, and which are probably prime.
Sieve of boolean value is marked with True at places where we have composite values. This is done by taking every prime P (out of generated) and marking positions in sieve N - N % P
and N - N % P - P
and N - N % P - 2 * P
..., i.e. starting from N - N % P
with step P backwards we mark sieve entries with True.
Those values that were marked with False in sieve are probably prime values, while True marks are certainly composite. Hence now we check only False entries if they are prime, by using Miller-Rabin Primality Test.
Before running a script install just single time PIP packages python -m pip install numpy gmpy2 sympy
.
For 2048-bit input random integer it takes only several seconds to compute previous prime:
SieveTime 0.847 sec, MillerTime 2.810 sec, MillerCnt 188
ReferenceTime 7.736 sec
N 2^2046.803, PrevPrime offset 4254, PrevPrime Reference offset 4254
import numpy as np, gmpy2, math, random, time, sympy
def GenPrimes_SieveOfEratosthenes(end):
# https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
composites = np.zeros((end,), dtype = np.uint8)
composites[:2] = 1
for p, comp in enumerate(composites):
if comp:
continue
if p * p >= len(composites):
break
composites[p * p :: p] = 1
return np.array([i for i in range(len(composites)) if not composites[i]], dtype = np.uint32)
def IsProbablyPrime_MillerRabin(n, *, cnt = 32):
# According to https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test
if n == 2:
return True
if n < 2 or n & 1 == 0:
return False
def Num(n):
return gmpy2.mpz(n)
n = Num(n)
d, r = Num(n - 1), 0
while d & 1 == 0:
d >>= 1
r += 1
nm1 = Num(n - 1)
for i in range(cnt):
a = Num(random.randint(2, n - 2))
x = pow(a, d, n)
if x == 1 or x == nm1:
continue
prp = False
for j in range(r - 1):
x *= x
x %= n
if x == nm1:
prp = True
break
if prp:
continue
return False
return True
def PrevPrime(N):
primes = GenPrimes_SieveOfEratosthenes(1 << 17)
approx_dist = math.log(N)
sieve_time, miller_time, miller_cnt = 0, 0, 0
for icycle in range(100):
sieve_size = round(approx_dist * 1.5)
sieve_begin = N - sieve_size * (icycle + 1)
sieve_last = N - sieve_size * icycle
sieve = np.zeros((1 + sieve_last - sieve_begin,), dtype = np.uint8)
tb = time.time()
for p in primes:
sieve[np.arange(sieve_last - sieve_last % p - sieve_begin, -1, -int(p))] = 1
sieve_time += time.time() - tb
tb = time.time()
for i in range(len(sieve) - 1, -1, -1):
if sieve[i]:
continue
x = sieve_begin + i
miller_cnt += 1
if IsProbablyPrime_MillerRabin(x):
miller_time += time.time() - tb
print(f'SieveTime {sieve_time:.3f} sec, MillerTime {miller_time:.3f} sec, MillerCnt {miller_cnt}', flush = True)
return x
miller_time += time.time() - tb
def PrevPrime_Reference(N):
tim = time.time()
for i in range(1 << 14):
p = N - i
if (p & 1) == 0:
continue
if sympy.isprime(p):
tim = time.time() - tim
print(f'ReferenceTime {tim:.3f} sec', flush = True)
return p
def Main():
N = random.randrange(1 << 2048)
p = PrevPrime(N)
p0 = PrevPrime_Reference(N)
print(f'N 2^{math.log2(N):.3f}, PrevPrime offset {N - p}, PrevPrime Reference offset {N - p0}', flush = True)
if __name__ == '__main__':
Main()
Upvotes: 2
Reputation: 5409
The best way is probably to mirror what several number theory libraries use to implement an efficient next_prime function. The basic layout is identical to yours: an isPrime(x) function, and a loop counting down. Those provide two areas for optimization.
Optimizing isPrime The recommended approach is to use a number theory library for Python like gmpy2, and a probabilistic primality test like Miller-Rabin repeated an appropriate number of times. This scales fairly well with larger numbers, and you can even get a deterministic primality test by using Miller-Rabin with enough small bases.
Optimizing iteration over possible primes There is already a comprehensive guide for optimizing next_prime in practice, which basically involves choosing the first k primes (e.g. 2, 3 and 5) and computing all residues modulo their product that could potentially be a prime. Then, rather than iterating over all smaller numbers, you jump between those residues only. To adapt this to previousPrime, we just jump between residues in the opposite direction.
For 2, 3, and 5, those residues mod 30 are L = [1, 7, 11, 13, 17, 19, 23, 29]. For an integer x > 30, say x = 30*q + r, you would do a binary search over L for the largest element less than r, and iterate backwards in this list:
residues = [1, 7, 11, 13, 17, 19, 23, 29]
small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
def prevPrime(num):
if num <= 2:
return 0
q, r = divmod(num, 30)
if q == 0:
return small_primes[bisect_left(small_primes, num) - 1]
num = 30 * q
if r <= 2:
num -= 30
q -= 1
idx = len(residues) - 1
else:
idx = bisect_left(residues, r) - 1
num += residues[idx]
while not (isPrime(num)):
idx -= 1
if idx < 0:
q -= 1
idx = len(residues) - 1
num = 30 * q + residues[idx]
return num
Upvotes: 1