Reputation: 11533
it would contain at most 1000 x 1000 x 1000 elements, which is too big for python dictionary.
with dict, around 30 x 1000 x 1000 elements, on my machine it already consumed 2gb of memory and everything got stoned.
any modules that can handle 3-dimension array whose value would be only True/False? I check bitarray http://pypi.python.org/pypi/bitarray, which seems reasonable and coded in C, however it seems more like a bit-stream instead of an array, since it supports only 1 dimension.
Upvotes: 2
Views: 7462
Reputation: 6085
numpy is your friend:
import numpy as np
a = np.zeros((1000,1000,1000), dtype=bool)
a[1,10,100] = True
Has a memory footprint as little as possible.
If you really need you can also look at the defaultdict class container in the collections module, which doesn't store the values that are of the default value. But if it's not really a must, use numpy.
Upvotes: 6
Reputation: 365597
numpy
has already been suggested by EnricoGiampieri, and if you can use this, you should.
Otherwise, there are two choices:
A jagged array, as suggested by NPE, would be a list
of list
of bitarray
s. This allows you to have jagged bounds—e.g., each row could be a different width, or even independently resizable:
bits3d = [[bitarray.bitarray(1000) for y in range(1000)] for x in range(1000)]
myvalue = bits3d[x][y][z]
Alternatively, as suggested by Xymostech, do your own indexing on a 1-D array:
bits3d = bitarray.bitarray(1000*1000*1000)
myvalue = bits3d[x + y*1000 + z*1000*1000]
Either way, you'd probably want to wrap this up in a class, so you can do this:
bits3d = BitArray(1000, 1000, 1000)
myvalue = bits3d[x, y, z]
That's as easy as:
class Jagged3DBitArray(object):
def __init__(self, xsize, ysize, zsize):
self.lll = [[bitarray(zsize) for y in range(ysize)]
for x in range(xsize)]
def __getitem__(self, key):
x, y, z = key
return self.lll[x][y][z]
def __setitem__(self, key, value):
x, y, z = key
self.lll[x][y][z] = value
class Fixed3DBitArray(object):
def __init__(self, xsize, ysize, zsize):
self.xsize, self.ysize, self.zsize = xsize, ysize, zsize
self.b = bitarray(xsize * ysize * zsize)
def __getitem__(self, key):
x, y, z = key
return self.b[x + y * self.ysize + z * self.ysize * self.zsize]
def __setitem__(self, key, value):
x, y, z = key
self.b[x + y * self.ysize + z * self.ysize * self.zsize] = value
Of course if you want more functionality (like slicing), you have to write a bit more.
The jagged array will use a bit more memory (after all, you have the overhead of 1M bitarray
objects and 1K list
objects), and may be a bit slower, but this usually won't make much difference.
The important deciding factor should be whether it's inherently an error for your data to have jagged rows. If so, use the second solution; if it might be useful to have jagged or resizable rows, use the former. (Keeping in mind that I'd use numpy
over either solution, if at all possible.)
Upvotes: 2
Reputation: 113905
How about getting inspired by unix file permissions? 755
is read,write,execute for owner and read,execute for everyone else. This is because 7
translates to binary 111
.
So your 1000x1000x1000 bool array could be a 1000x1000 list of int
s in which the binary representation of each int gives you a 1000 "bit" string representing the bool array.
All of that should fit in under 1GB of memory
Upvotes: 0
Reputation: 500167
How about a list of lists of bitarrays, perhaps wrapped into your own class with a nice API?
Alternatively, an 3D NumPy array of integers, with your own code packing/unpacking multiple booleans into each integer.
Upvotes: 2