aspie
aspie

Reputation: 43

Python: isPrime function much slower after adding lookup of pregenerated list?

I am solving Project Euler problems with Python. I have seen some solvers use isPrime() functions that simply test whether x % y == 0 for all y from 2 to x ** 0.5. That is not efficient and I want to write a better isPrime() function, based on the num % 30 test. This is what I came up with:

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
primality = [1, 7, 11, 13, 17, 19, 23, 29]

def isPrime(num):
    if not type(num) in (int, long):
        raise ValueError
    if num in primes:
        return True
    elif num < 2:
        return False
    elif num % 30 not in primality:
        return False
    else:
        for prime in primes[3:]:
            if num % prime == 0:
                return False
        seed, sqrt, tryfactor = 1, getIntSquareRoot(num), 1
        while tryfactor < sqrt:
            for trymod in primality:
                tryfactor = seed * 30 + trymod
                if num % tryfactor == 0 and not(num == tryfactor):
                    return False
            seed += 1
        return True

Problem 7 is to find the 10001st prime. So I decided to make the code append all these primes to a list which subsequent problems can refer to. I thought that given a 5-digit number num, num in primes would be a much faster operation that repeated num % tryfactor. For parameters where primes[-1] < num < (primes[-1] ** 0.2) it should still be faster to get tryfactor values from the list than repeatedly generating them through tryfactor = seed * 30 + trymod.

So I came up with the following:

def problem7():
    count, seed = len(primes), 1
    while True:
        for modulus in primality:
            num = seed * 30 + modulus
            if isPrime(num):
                count += 1
                primes.append(num)
                if count > 10000:
                    return num
        seed += 1

def isPrimeB(num):
    if not type(num) in (int, long):
        raise ValueError
    if num in primes:
        return True
    elif num < 2:
        return False
    elif num % 30 not in primality:
        return False
    else:
        for prime in primes[3:]:
            if num % prime == 0:
                return False
        seed, sqrt, tryfactor = 1, getIntSquareRoot(num), 1
        while tryfactor < sqrt:
            for trymod in primality:
                tryfactor = seed * 30 + trymod
                if num % tryfactor == 0 and not(num == tryfactor):
                    return False
            seed += 1
        return True

Of course I expect the code for problem 7 to be much slower, because generating the list of primes a few seconds. I also expect the code for later problems calling isPrime() (such as 10, 27, 35, 41 and 58) to run much faster.

However, I got a shock when the code for problems 27, 35, 41 and 58 became much slower. Could someone explain why looking up values in a list is much slower than calculating them? Or is there a mistake in my code? What else can I do to make the isPrime() function more efficient?

Upvotes: 0

Views: 410

Answers (2)

user448810
user448810

Reputation: 17866

@Freakish answered your question about why isPrimeB is slow. Let me propose a couple of alternatives to what you have written.

Here is my version of primality checking with a 2,3,5-wheel, which is the same algorithm as your isPrime function but stated rather differently:

def isPrime(n):
    d, w, wheel = 2, 0, [1,2,2,4,2,4,2,4,6,2,6]
    while d*d <= n:
        if n%d == 0: return False
        d = d + wheel[w]
        w = 3 if w == 10 else w+1
    return True

There are a couple of ways to compute the nth prime. One way uses a Sieve of Eratosthenes. Number theory tells us that the nth prime is always less than n(log n + log log n) with logarithms to base e, so you could sieve up to the limit and discard the excess primes. Here's the sieve:

def primes(n):
    m = (n-1)//2; b = [True] * m
    i, p, ps = 0, 3, [2]
    while i < m:
        if b[i]:
            ps.append(p)
            for j in range(2*i*i+6*i+3, m, p):
                b[j] = False
        i, p = i+1, p+2
    return ps

So, to get the 10001st prime:

>>> 10000 * ( log(10000) + log(log(10000)) )
114306.67178344031
>>> (primes(114306))[10000]
104743

Another alternative generates candidate primes using a 2,3,5,7-wheel and confirms their primality by a Miller-Rabin test to the three bases 2, 7, 61, which is sufficient for primes less then 2^32 (actually, a little bit bigger):

def genPrimes(): # valid to 2^32
    def isPrime(n):
        def isSpsp(n, a):
            d, s = n-1, 0
            while d%2 == 0:
                d /= 2; s += 1
            t = pow(a,d,n)
            if t == 1: return True
            while s > 0:
                if t == n-1: return True
                t = (t*t) % n; s -= 1
            return False
        for p in [2, 7, 61]:
            if n % p == 0: return n == p
            if not isSpsp(n, p): return False
        return True
    w, wheel = 0, [1,2,2,4,2,4,2,4,6,2,6,4,2,4,\
        6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,\
        2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10]
    p = 2; yield p
    while True:
        p = p + wheel[w]
        w = 4 if w == 51 else w + 1
        if isPrime(p): yield p

Then the nth prime can be computed by the expression next(itertools.islice(genPrimes(), n, n+1)):

>>> next(itertools.islice(genPrimes(), 10000, 10001))
104743

Both methods return the 10001st prime instantly, as soon as you press the enter key.

If your interested in programming with prime numbers (or you just want to solve the prime number problems in Project Euler), you might be interested in this essay or in these entries at my blog.

Upvotes: 1

freakish
freakish

Reputation: 56487

The reason it is slower is because the list lookup is O(n). Instead of using lists use sets:

primes = set()
primes.add(num)

The num in primes check will be now O(1).

Also forget about this "optimization": primes[3:]. It actually slows down your code, since it recreates the list (note that it won't work anyway if you switch to sets).

Finally you can implement the Sieve of Eratosthenes (although there are more sophisticated sieves) to generate primes fast.

Upvotes: 5

Related Questions