Reputation: 13703

Understanding Sieve of Eratosthenes in Python

I've found an example code in python that gives out all prime numbers upto n but I simply don't get it, Why does it does what it does?

I've read the wikipedia article about the Sieve of Eratosthenes but simply have no idea about how this works.

pp = 2
ps = [pp]
lim = raw_input("Generate prime numbers up to what number? : ")
while pp < int(lim):
    pp += 1
    for a in ps:
        if pp%a==0:
            break
        else:
            ps.append(pp)


print set(ps)

An explanation of how the loop works would be appreciated.

EDIT - Figured out that the code's all wrong for it denotes 25 as a prime and through more intensive searching found that this ain't no sieve, can someone show an generator which utilizes the sieves in python and explain it

Upvotes: 1

Answers (4)

dansalmo

Reputation: 11696

Since no one has yet to show a true sieve or explain it, I will try.

The basic method is to start counting at 2 and eliminate 2*2 and all higher multiples of 2 (ie 4, 6, 8...) since none of them can be prime. 3 survived the first round so it is prime and now we eliminate 3*3 and all higher multiples of 3 (ie 9, 12, 15...). 4 was eliminated, 5 survived etc. The squaring of each prime is an optimization that makes use of the fact that all smaller multiples of each new prime will have been eliminated in previous rounds. Only the prime numbers will be left as you count and eliminate non-primes using this process.

Here is a very simple version, notice it does not use modulo division or roots:

def primes(n): # Sieve of Eratosthenes
    prime, sieve = [], set()
    for q in xrange(2, n+1):
        if q not in sieve:
            prime.append(q)
            sieve.update(range(q*q, n+1, q))
    return prime

>>> primes(100)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73
79, 83, 89, 97]

The simple approach above is surprisingly fast but does not make use of the fact that primes can be only odd numbers.

Here is a generator based version that is faster than any other I have found but hits a Python memory limit at n = 10**8 on my machine.

def pgen(n): # Fastest Eratosthenes generator
    yield 2
    sieve = set()
    for q in xrange(3, n+1, 2):
        if q not in sieve:
            yield q
            sieve.update(range(q*q, n+1, q+q))

>>> timeit('n in pgen(n)', setup="from __main__ import pgen; n=10**6", number=10)
5.987867565927445

Here is a slightly slower but much more memory efficient generator version:

def pgen(maxnum): # Sieve of Eratosthenes generator
    yield 2
    np_f = {}
    for q in xrange(3, maxnum+1, 2):
        f = np_f.pop(q, None)
        if f:
            while f != np_f.setdefault(q+f, f):
                q += f
        else:
            yield q
            np = q*q
            if np < maxnum:
                np_f[np] = q+q

>>> timeit('n in pgen(n)', setup="from __main__ import pgen; n=10**6", number=10)
7.420101730225724

>>> list(pgen(10))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

To test if a number is prime just do:

>>> 539 in pgen(539)
False
>>> 541 in pgen(541)
True

Here are some hints as to how this more memory efficient version works. It uses a dict to store only the bare minimum of information, the next non-prime numbers (as keys) along with their factors (as values). As each non-prime is found in the dict, it is removed and the next non-prime key is added with the same factor value.

Upvotes: 5

Will Ness

Reputation: 71099

That code is an attempt at using trial division to produce a sequence of primes.

To correct it:

pp = 2
ps = [pp]
lim = raw_input("Generate prime numbers up to what number? : ")
while pp < int(lim):
    pp += 1
    for a in ps:
        if pp%a==0:
            break
    else:                # unindent
        ps.append(pp)    #  this

To make it much more efficient (in fact, optimal) trial division:

pp = 2
ps = [pp]
lim = raw_input("Generate prime numbers up to what number? : ")
while pp < int(lim):
    pp += 1
    for a in ps:
        if a*a > pp:         # stop
            ps.append(pp)    #  early
            break
        if pp%a==0:
            break

Upvotes: 1

rspencer

Reputation: 2739

Firstly, this is not a sieve.

This is how it works. pp is the number that we are going to test. In each iteration of the while loop, we go over all the known primes (ps) and check if they divide pp. If one of them does, pp is not a prime, and we move to the next number. Otherwise, we add pp to the list of primes before moving on.

The line pp%a==0 is basically saying "the remander of pp when divided by a is zero", ie a divides pp and pp is not prime.

This continues until the number we are checking is larger than some upper limit that we have set (lim)

[EDIT: this is a sieve]

isPrime = [True for i in range(lim)]
isPrime[0] = False
isPrime[1] = False

for i in range(lim):
    if isPrime[i]:
        for n in range(2*i, lim, i):
            isPrime[n] = False

This is not the most efficient sieve (more efficient ones do things in the for n in range(2*i, lim, i): line, but it will work, and isPrime[i] will be true iff i is prime.

Upvotes: 3

Sukrit Kalra

Reputation: 34503

The above implementation produces wrong answers. I've done some changes to the code.

But, here's how the above code works.

pp = 2
ps = [pp]

We know that the first prime number is 2, so, we generate a list containing only the number 2.

lim = raw_input("Generate prime numbers up to what number? : ")

The above line takes an input from the user, which gives us the upper limit of the prime numbers to generate.

while pp < int(lim):    # 1
      pp += 1           # 2
      primeFlag = True  # 3
      for a in ps:      # 4
          if pp%a==0:
             primeFlag = False
          break
      if primeFlag:     # 5
          ps.append(pp)

The numbered lines do the following things.

Runs a loop until the upper limit is reached.
Increments the pp variable by 1.
Sets a flag variable which is used for testing if the number is prime.
This for loop iterates over the list of prime numbers stored in ps and checks that the current number, pp is divisible by any one of those numbers, if yes, then the number is not prime and the primeFlag is set to False and we break out of the inner for loop.
If the number was not divisible by any of the primes before it, then it must be a prime, hence, the variable primeFlag is True and the if statement appends the list ps with pp.

Upvotes: 1

Understanding Sieve of Eratosthenes in Python

Answers (4)

Related Questions