Reputation: 7664
In my python code, I have the following line
vals = [True] * n
This code works when n
is a small number but I want this to work when n ~ 100,000,000
without getting a MemoryError
I even tried using a dictionary like this but it still wont work
vals = {}
for i in range(0, n):
vals[i] = True
Any suggestions?
Edit
For those of you wondering what I am trying to do. I am trying to implement the Sieve of Eratosthenes algorithm to generate prime numbers. The vals
variable keeps track of the composite numbers which are found later on in the algorithm
Edit 2 I don't think so I can use generators because I need to able to change values later on like this:
vals = [True] * n
for i in range(0, n):
if condition:
vals[i] = False
Upvotes: 0
Views: 490
Reputation: 36036
Do not keep all the data in memory, only keep what is needed at the moment or keep some other representation so generate it on the fly.
In this case, you should probably use a "thin air" representation instead:
Alternatively, you can offload your data to disk and e.g. use offsets to access elements (that would be rather slow) or write a class with a list-like interface that would use disk and a cache, loading and offloading data when needed (mmap
is a good backing choice IMO).
Upvotes: 1
Reputation: 2909
Check these out:
http://stromberg.dnsalias.org/~strombrg/bits_mod/
http://stromberg.dnsalias.org/~strombrg/sieve/
The first is a bit array type that uses at least 31 bits of multiple ints.
The second is a sieve built on that bit array type.
If you still need something that won't fit into memory, you might look into modifying bits_mod to use mmap, as done in, for example: http://stromberg.dnsalias.org/~strombrg/drs-bloom-filter/
Here's some timing information, using pypy3 2.4.0, in case you decide to go the numpy route, but remain curious how pypy3 could do:
$ time ./sieve 100000000 > /dev/null
Creating bit array
Clearing multiples of 2
Clearing multiples of 3
Clearing multiples of 5
Clearing multiples of 7
...
Clearing multiples of 9931
Clearing multiples of 9941
Clearing multiples of 9949
Clearing multiples of 9967
Clearing multiples of 9973
real 1m10.749s
user 1m8.133s
sys 0m2.405s
Upvotes: -2
Reputation: 2396
You can use generators
vals = lambda n: (True for _ in range(n))
EDIT: since OP asked for a numpy implementation
n = 100000000
prime = lambda n: list(filter(lambda x: (x % np.arange(2, 1+int(math.sqrt(x)))).all(), range(2, n+1)))
Upvotes: 0
Reputation: 2562
Using numpy
you can handle your large list:
import numpy as np
vals = np.ones(100000000,bool)
print(vals)
# [ True True True ..., True True True]
Upvotes: 3
Reputation: 1773
You might want to use a generator:
def vals():
for i in range(0, n):
yield True
For compatibility with Python 2.x, it might be preferable to use xrange
:
try:
xrange
except NameError:
xrange = range
def vals():
for i in xrange(0, n):
yield True
Otherwise, your program will silently run out of resources if run with an older interpreter.
Upvotes: 0