Aseem Bansal
Aseem Bansal

Reputation: 6962

Is there some way to decrease memory requirement in these implementations of Sieve of Eratosthenes

I am using Python 2.7

Among the two implementations of Sieve of Eratosthenes erat() and erat2() that I wrote erat2() has the benefit that on the 2nd run of erat2() it gives the results in comparatively much lesser time.

def erat2(num, isprime = [2]):
    if num > len(isprime) + 2:

        last_original = len(isprime) + 2
        isprime += [num for num in xrange(last_original ,num + 1)]

        for i in xrange(2,num + 1):
            if isprime[i-2]:
                if i <= last_original:
                    j = last_original//i + 1
                else:
                    j = 2
                temp = j * i
                while temp <= num:
                    isprime[temp-2] = 0
                    j += 1
                    temp = j * i

    return filter(lambda x: x != 0, isprime[:num - 1])

def erat(num):
    isprime = [num for num in xrange(2,num + 1)]
    for i in xrange(2,num + 1):
        if isprime[i-2]:
            j = 2
            temp = j * i
            while temp <= num:
                isprime[temp-2] = 0
                j += 1
                temp = j * i

    return filter(lambda x: x != 0, isprime)

import time
def t():
    num = 100000000
    i = 10
    while i < num:
        s = time.time()
        erat2(i)
        x = time.time() - s

        print "%10d %10f" %(i,x),

        s = time.time()
        erat(i)
        y = time.time() - s

        print " %10f" %(y)

        i *= 10

To support the fact that on the second run of the code the results will be much faster some timing analysis is given here. The first column is test input. The second column is for timing of erat2() and third is for timing of erat(). As is clear the time is reduced by factor of 7 in second run.

>>> t()
        10   0.000000    0.000000
       100   0.000000    0.000000
      1000   0.000000    0.000000
     10000   0.010000    0.010000
    100000   0.100000    0.110000
   1000000   1.231000    1.410000
  10000000  13.605000   15.081000
>>> t()
        10   0.000000    0.000000
       100   0.000000    0.000000
      1000   0.000000    0.000000
     10000   0.000000    0.020000
    100000   0.020000    0.140000
   1000000   0.170000    1.550000
  10000000   1.770000   15.752000

The problem that I am facing is that the memory usage spikes after this test input.

EDIT:

I found a little optimization for both functions erat() and erat2() to increase the speed. Changed the lambda function from

lambda x: x != 0

to

lambda x: x

The same result but slightly faster. One second faster for num = 10000000.

EDIT2:

Used vartec and btilly's suggestions. Improved erat() to erat3(). Here are the improved implementation alongwith timing check. Also found that placing expressions in xrange function led to performance loss. Added variable to improve performance.

def erat3(num):
    ''' Improves Sieve of eratosthenes '''
    #REQUIRES MATH MODULE
    if num < 2:
        return []

    isprime = [num for num in xrange(3,num + 1,2)]
    #temp2's expression placed in xrange function => performance-loss
    temp2 = int(math.sqrt(num)) + 1
    for i in xrange(3, temp2 ,2):
        if isprime[(i-3)/2]:
            j = 3
            temp = j * i
            while temp <= num:
                isprime[(temp-3)/2] = 0
                j += 2
                temp = j * i

    return [2] + filter(lambda x: x, isprime)

Timing for erat() and erat3()

>>> t()
        10   0.000000    0.000000
       100   0.000000    0.000000
      1000   0.000000    0.000000
     10000   0.010000    0.010000
    100000   0.110000    0.040000
   1000000   1.241000    0.530000
  10000000  14.131000    6.111000

Upvotes: 0

Views: 132

Answers (2)

John La Rooy
John La Rooy

Reputation: 304255

You can use single bits for the isprime array. Python doesn't give you a really fast way to manipulate the bits, but one simple way is to use a long.

is_prime = (1 << num) - 1

and use this to check if a number has been sieved

is_prime & (1 << x)

Here is how to apply it with minimal changed to your second function

def erat(num):
    isprime = (1 << num) - 1
    for i in xrange(2,num + 1):
        if isprime & (1 << i):
            j = 2
            temp = j * i
            while temp <= num:
                isprime &= (1 << num) - 1 - (1 << temp)
                j += 1
                temp = j * i
    return [i for i in range(num) if isprime&(1 << i)]

It's probably worthwhile storing (1 << num) in a local variable to save building it over and over. As @btilly points out, with a little extra work you can save half the space by only keeping track of odd numbers.

Upvotes: 0

btilly
btilly

Reputation: 46417

It is common to have a tradeoff between memory and performance. Which matters more to you depends on your application.

In this case I would suggest mitigating that by using BitVector (see https://pypi.python.org/pypi/BitVector for details) so that the data structure you create is much more compact.

Also in this case special casing 2 and only storing odd bits will double performance, and halve memory, at the cost of slightly more code complexity. That is probably worth it.

Upvotes: 2

Related Questions