Reputation: 6962
I am using Python 2.7
Of the two implementations of the Sieve of Eratosthenes that I wrote, erat() and erat2(), erat2() has the advantage that a second call completes in much less time, because it keeps its results around between calls.
def erat2(num, isprime = [2]):
    if num > len(isprime) + 2:
        last_original = len(isprime) + 2
        isprime += [num for num in xrange(last_original, num + 1)]
        for i in xrange(2, num + 1):
            if isprime[i-2]:
                if i <= last_original:
                    j = last_original//i + 1
                else:
                    j = 2
                temp = j * i
                while temp <= num:
                    isprime[temp-2] = 0
                    j += 1
                    temp = j * i
    return filter(lambda x: x != 0, isprime[:num - 1])
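The speed-up on the second run comes from Python's mutable-default-argument behaviour: isprime = [2] is evaluated once, at function definition time, and the same list object is then shared across all calls. A minimal sketch of the same trick (hypothetical squares function, not part of my sieve code; Python 3 print syntax):

```python
# A mutable default argument is created once and shared across calls,
# so work done on the first call survives into later calls.
def squares(n, cache=[]):
    # grow the shared cache only when a larger n is requested
    while len(cache) < n:
        k = len(cache) + 1
        cache.append(k * k)
    return cache[:n]

print(squares(3))  # computes [1, 4, 9]
print(squares(5))  # reuses [1, 4, 9], computes only 16 and 25
```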
def erat(num):
    isprime = [num for num in xrange(2, num + 1)]
    for i in xrange(2, num + 1):
        if isprime[i-2]:
            j = 2
            temp = j * i
            while temp <= num:
                isprime[temp-2] = 0
                j += 1
                temp = j * i
    return filter(lambda x: x != 0, isprime)
import time

def t():
    num = 100000000
    i = 10
    while i < num:
        s = time.time()
        erat2(i)
        x = time.time() - s
        print "%10d %10f" %(i,x),
        s = time.time()
        erat(i)
        y = time.time() - s
        print " %10f" %(y)
        i *= 10
To support the claim that the second run is much faster, some timing analysis is given below. The first column is the test input, the second column is the timing of erat2(), and the third is the timing of erat(). As you can see, erat2()'s time drops by roughly a factor of 7 on the second run.
>>> t()
10 0.000000 0.000000
100 0.000000 0.000000
1000 0.000000 0.000000
10000 0.010000 0.010000
100000 0.100000 0.110000
1000000 1.231000 1.410000
10000000 13.605000 15.081000
>>> t()
10 0.000000 0.000000
100 0.000000 0.000000
1000 0.000000 0.000000
10000 0.000000 0.020000
100000 0.020000 0.140000
1000000 0.170000 1.550000
10000000 1.770000 15.752000
The problem I am facing is that memory usage spikes for inputs beyond this test range.
EDIT:
I found a small optimization for both erat() and erat2() that increases the speed: I changed the lambda function from
lambda x: x != 0
to
lambda x: x
The same result, but slightly faster: about one second faster for num = 10000000.
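On the same theme, filter() also accepts None as its first argument, which keeps the truthy items with no per-item function call at all and may be faster still (worth timing on your machine; list() is shown so the snippet also runs under Python 3, where filter returns an iterator):

```python
data = [2, 3, 0, 5, 0, 7]
# filter(None, seq) keeps truthy items without any lambda-call overhead
assert list(filter(None, data)) == [2, 3, 5, 7]
# identical result to the lambda version
assert list(filter(lambda x: x, data)) == [2, 3, 5, 7]
```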
EDIT2:
Used vartec's and btilly's suggestions to improve erat() into erat3(). Here is the improved implementation along with a timing check. I also found that placing expressions directly inside the xrange() call led to a performance loss, so I hoisted them into a variable.
import math

def erat3(num):
    ''' Improved Sieve of Eratosthenes: stores odd numbers only '''
    if num < 2:
        return []
    isprime = [num for num in xrange(3, num + 1, 2)]
    #placing temp2's expression directly in the xrange call => performance loss
    temp2 = int(math.sqrt(num)) + 1
    for i in xrange(3, temp2, 2):
        if isprime[(i-3)/2]:
            j = 3
            temp = j * i
            while temp <= num:
                isprime[(temp-3)/2] = 0
                j += 2
                temp = j * i
    return [2] + filter(lambda x: x, isprime)
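For anyone reading this on Python 3, here is a port of erat3 (my own sketch, not a drop-in replacement: it uses // for the index arithmetic, and starts crossing off at i * i rather than 3 * i, the standard refinement, since smaller odd multiples were already handled by smaller primes):

```python
import math

def erat3_py3(num):
    """Odd-only sieve: the value v is stored at index (v - 3) // 2."""
    if num < 2:
        return []
    isprime = list(range(3, num + 1, 2))
    limit = int(math.sqrt(num)) + 1
    for i in range(3, limit, 2):
        if isprime[(i - 3) // 2]:
            # cross off odd multiples of i, starting at i*i
            for temp in range(i * i, num + 1, 2 * i):
                isprime[(temp - 3) // 2] = 0
    return [2] + [v for v in isprime if v]

print(erat3_py3(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```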
Timing for erat() and erat3()
>>> t()
10 0.000000 0.000000
100 0.000000 0.000000
1000 0.000000 0.000000
10000 0.010000 0.010000
100000 0.110000 0.040000
1000000 1.241000 0.530000
10000000 14.131000 6.111000
Upvotes: 0
Views: 132
Reputation: 304255
You can use single bits for the isprime array. Python doesn't give you a really fast way to manipulate individual bits, but one simple approach is to use a long as a bit field.
is_prime = (1 << num) - 1
and use this to check if a number has been sieved
is_prime & (1 << x)
Here is how to apply it with minimal changes to your second function:
def erat(num):
    isprime = (1 << (num + 1)) - 1      # one bit per number 0..num, all set
    for i in xrange(2, num + 1):
        if isprime & (1 << i):
            j = 2
            temp = j * i
            while temp <= num:
                isprime &= (1 << (num + 1)) - 1 - (1 << temp)   # clear bit temp
                j += 1
                temp = j * i
    return [i for i in xrange(2, num + 1) if isprime & (1 << i)]
It's probably worthwhile storing (1 << num)
in a local variable to save building it over and over. As @btilly points out, with a little extra work you can save half the space by only keeping track of odd numbers.
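As a quick sanity check of the bit operations involved (my own illustration, using num + 1 bits so the number num itself gets a bit; runs under both Python 2 and 3):

```python
num = 10
is_prime = (1 << (num + 1)) - 1    # bits 0..num all set
is_prime &= ~(1 << 4)              # clear bit 4: mark 4 as composite
assert not (is_prime & (1 << 4))   # 4 is sieved out
assert is_prime & (1 << 5)         # 5 is still marked prime
```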
Upvotes: 0
Reputation: 46417
It is common to have a tradeoff between memory and performance. Which matters more to you depends on your application.
In this case I would suggest mitigating it by using BitVector (see https://pypi.python.org/pypi/BitVector for details) so that the data structure you create is much more compact.
Also, special-casing 2 and storing bits only for odd numbers will double performance and halve memory, at the cost of slightly more code complexity. That is probably worth it.
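BitVector is a third-party package; as a rough stdlib illustration of the same idea (one byte per odd number rather than one bit, and Python 3 syntax, so this is a sketch of the layout rather than of BitVector's API), an odd-only sieve looks like this:

```python
import math

def odd_sieve(num):
    # flags[i] says whether the odd number 2*i + 3 is still considered prime;
    # 2 is special-cased, even numbers are never stored
    if num < 2:
        return []
    flags = bytearray([1] * ((num - 1) // 2))
    for i in range(3, int(math.sqrt(num)) + 1, 2):
        if flags[(i - 3) // 2]:
            for m in range(i * i, num + 1, 2 * i):
                flags[(m - 3) // 2] = 0
    return [2] + [2 * i + 3 for i, f in enumerate(flags) if f]

print(odd_sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```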
Upvotes: 2