Marko Savolainen
Marko Savolainen

Reputation: 35

Why is Sieve of Eratosthenes so slow compared to "optimized" iterated prime search? (Python 3)

I've tried some different versions of Sieve both my own and from different pages. All of them have one thing in common though. They're slower than the iterative method I'm using, which goes agains everything I've read about it.Is it me or is this because python has slow list handling?

from time import time
from random import randint

def isPrime(n):
    if (n <= 3):
        return (n >= 2)
    i = 5
    if (n % 2 == 0 or n % 3 == 0) :
        return False
    for i in range(5, int(n**.5)+1, 6):
        if (n%i == 0 or n%(i+2) == 0):
            return False
    return True

def sieveOfEratosthenes(n):
    array = [True for i in range((n-1)//2)]
    i = 3
    for e in array:
        if e is True:
            for k in range(i**2, n+1, 2*i):
                array[k//2-1] = False
        i += 2
    return [2]+[2*i+1 for i, e in enumerate(array, 1) if e is True]

def runIsPrime(arr):
    start = time()
    for e in arr:
        isPrime(e)
    print("Primes:\t\t", time()-start)

def runSieveOfEratosthenes(arr):
    start = time()
    primes = sieveOfEratosthenes(max(arr))
    #print("Sie. mid.:\t", time() - start)
    for e in arr:
        e in primes
    print("Sieve tot.:\t", time()-start)

upto = 10**6
times = 10**3
arr = [randint(2, upto) for i in range(times)]

runIsPrime(arr)
runSieveOfEratosthenes(arr)

Gives a print out:

Primes:      0.0019884109497070312 
Sieve tot.:  1.1940925121307373

Upvotes: 2

Views: 109

Answers (1)

awesoon
awesoon

Reputation: 33671

Your sieveOfEratosthenes function returns a list, and checking if a value is in list is O(n). However, even after changing your code to use a set, the sieve will be slower than your straightforward approach:

def runSieveOfEratosthenes(arr):
    start = time()
    primes = set(sieveOfEratosthenes(max(arr)))
    #print("Sie. mid.:\t", time() - start)
    for e in arr:
        e in primes
    print("Sieve tot.:\t", time()-start)

upto = 10**6
times = 10**3
arr = [randint(2, upto) for i in range(times)]

runIsPrime(arr)
runSieveOfEratosthenes(arr)

Output:

Primes:      0.0013408660888671875
Sieve tot.:  0.19181394577026367

This is because your functions actually differ in complexity. isPrime checks up to the root and exits early, which allows it to quickly process small compound numbers. sieveOfEratosthenes builds a sieve, which is O(n log log n) and it does not depend on the length of the list. Let's see if we pass list of a bigger size with primes only:

In [57]: ps = [996361] * 10**4

In [58]: runIsPrime(ps)
Primes:      0.20616984367370605

In [59]: runSieveOfEratosthenes(ps)
Sieve tot.:  0.1783740520477295

You can see, that runIsPrime performance decreased dramatically, while runSieveOfEratosthenes did not change a lot. If you think there may be a cache involved here, let's prepare list of different primes:

In [64]: ps2 = [x for x in range(10**5, 10**6) if isPrime(x)]

In [65]: len(ps2)
Out[65]: 68906

In [66]: ps2[-1]
Out[66]: 999983

In [67]: runIsPrime(ps2)
Primes:      0.9789869785308838

In [68]: runSieveOfEratosthenes(ps2)
Sieve tot.:  0.18288207054138184

As you can see, runIsPrime execution time grows with the array length, while runSieveOfEratosthenes time stays more or less the same. Let's increase the array size even more:

In [69]: runIsPrime(ps2 * 10)
Primes:      9.91222095489502

In [70]: runSieveOfEratosthenes(ps2 * 10)
Sieve tot.:  0.2231128215789795

As a conclusion - if you need to check if a number is prime or not, you can just iterate up to root of the number - this will be faster than building a sieve. If you have a list of numbers, you may want to run some tests, and if the list is small, you may still go with the first approach. But when you have a large list of numbers to check, it is better to build a sieve.

Upvotes: 6

Related Questions