Reputation: 79
I am new to Python. I am using Python 2.7.
I want to create a 2D array. I know how to do it using a list, but the data is large and a list takes too much memory.
In order to save memory, I want to use array rather than list.
This was inspired by the advice "Use array.array('l') instead of list for the (integer) values" given in the answer to "Huge memory usage of loading large dictionaries in memory".
Can this method work for 2D array?
Upvotes: 2
Views: 1948
Reputation: 123473
You can't really create a 2D array.array() because its elements are restricted to characters, integers, and floating-point numbers. Instead, you could store your data in a regular one-dimensional array and access it through some helper functions.
Here's an illustration of what I'm trying to describe:
from array import array
INFO_SIZE = 3 # Number of entries used to store info at beginning of array.
WIDTH, HEIGHT = 1000, 1000 # Dimensions.
array2d = array('l', (0 for _ in range(INFO_SIZE + WIDTH*HEIGHT)))
array2d[:INFO_SIZE] = array('l', (INFO_SIZE, WIDTH, HEIGHT)) # save array info
def get_elem(two_d_array, i, j):
    info_size, width, height = two_d_array[:INFO_SIZE]
    return two_d_array[info_size + j*width + i]

def set_elem(two_d_array, i, j, value):
    info_size, width, height = two_d_array[:INFO_SIZE]
    two_d_array[info_size + j*width + i] = value

import sys
print(format(sys.getsizeof(array2d), ",d")) # -> 4,091,896
print(get_elem(array2d, 999, 999)) # -> 0
set_elem(array2d, 999, 999, 42)
print(get_elem(array2d, 999, 999)) # -> 42
As you can see, the size of array2d is only slightly more (relatively speaking) than the size of the data itself (4,000,000 bytes in this case). You could dispense with the functions altogether and just do the offset calculation in-line to avoid the overhead of calling a function on each access. On the other hand, if that's not a big concern, you could go even further and encapsulate all the logic in a generalized class Array2D.
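As a minimal sketch of the in-line offset calculation and of how little overhead the flat array carries: the name `flat` is illustrative, and the byte counts depend on the platform and Python version (the 'l' typecode is a C long, which is 4 bytes on Windows but typically 8 on 64-bit Linux and macOS), so no figures are shown.

```python
import sys
from array import array

WIDTH, HEIGHT = 1000, 1000

# Flat, zero-filled array of C longs. Multiplying a one-element array
# avoids building a temporary list or generator first.
flat = array('l', [0]) * (WIDTH * HEIGHT)

# In-line row-major offset: element (i, j) lives at index j*WIDTH + i.
i, j = 999, 999
flat[j * WIDTH + i] = 42
value = flat[j * WIDTH + i]

# Raw payload versus what the array object reports for itself.
payload_bytes = flat.itemsize * len(flat)
array_bytes = sys.getsizeof(flat)

print(value)
print("item size: {0} bytes".format(flat.itemsize))
print("overhead beyond raw data: {0} bytes".format(array_bytes - payload_bytes))
```

If 4-byte elements are enough for the data, the 'i' typecode (C int) is worth checking with `itemsize` on the target platform before committing to 'l'.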
Encapsulating the implementation in a Class
Here's an example of that generalized class Array2D I mentioned. It has the advantage that it can be used in the more natural array-like fashion of passing two integers to the indexing operator, i.e. my_array2d[i, j] (column index first, then row index, as the offset calculation j*width + i in the code shows), instead of calling standalone functions to retrieve or set the values of its elements.
import array
from array import array as Array
import string
import sys
# Determine dictionary of valid typecodes and default initializer values.
_typecodes = dict()
for code in string.ascii_lowercase + string.ascii_uppercase:  # Assume single ASCII chars.
    initializer = 0
    try:
        Array(code, [initializer])
    except ValueError:
        continue  # Skip
    except TypeError:
        initializer = u'\x20'  # Assume it's a Unicode character.
    _typecodes[code] = initializer

class Array2D(object):
    """Partial implementation of preallocated 2D array.array()."""

    def __init__(self, width, height, typecode, initializer=None):
        if typecode not in _typecodes:
            raise NotImplementedError
        self.width, self.height, self._typecode = width, height, typecode
        if initializer is None:  # Fall back to the default for this typecode.
            initializer = _typecodes[typecode]
        self.data = Array(typecode, (initializer for _ in range(width * height)))

    def __getitem__(self, key):
        i, j = key  # i: column index, j: row index
        return self.data[j*self.width + i]

    def __setitem__(self, key, value):
        i, j = key
        self.data[j*self.width + i] = value

    def __sizeof__(self):
        # In Python 2, sys.getsizeof() only calls this for new-style
        # classes, hence the (object) base class above.
        return sum(map(sys.getsizeof, (self.width, self.height, self.data)))

    @property
    def typecode(self):
        return self._typecode

    @property
    def itemsize(self):
        return self.data.itemsize

array2d = Array2D(1000, 1000, 'l')  # 1 million signed longs (4 bytes each on this platform).
print(format(sys.getsizeof(array2d), ',d')) # -> 4,091,936
print(format(array2d.itemsize, ',d')) # -> 4
print(array2d[999, 999]) # -> 0
array2d[999, 999] = 42
print(array2d[999, 999]) # -> 42
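One thing the flat-offset approach does not give for free is per-axis bounds checking: an out-of-range column index can still produce a valid flat offset and silently land in a different row. The following is a minimal, self-contained sketch of one way to guard against that. The class name `CheckedArray2D` and the helper `_offset` are hypothetical, not part of the answer's code, and the sketch assumes a numeric typecode.

```python
from array import array


class CheckedArray2D(object):
    """Hypothetical bounds-checked variant of a flat 2D array."""

    def __init__(self, width, height, typecode='l'):
        # Assumes a numeric typecode, so 0 is a valid initial value.
        self.width, self.height = width, height
        self.data = array(typecode, [0]) * (width * height)

    def _offset(self, key):
        i, j = key  # i: column (0..width-1), j: row (0..height-1)
        if not (0 <= i < self.width and 0 <= j < self.height):
            raise IndexError("index (%d, %d) out of range" % (i, j))
        return j * self.width + i

    def __getitem__(self, key):
        return self.data[self._offset(key)]

    def __setitem__(self, key, value):
        self.data[self._offset(key)] = value


grid = CheckedArray2D(4, 3)
grid[3, 2] = 7  # last column, last row
print(grid[3, 2])

try:
    grid[4, 0]  # without the check this would read the cell at (0, 1)
except IndexError as exc:
    print("rejected: {0}".format(exc))
```

The check costs a couple of comparisons per access, so whether it is worth it depends on how hot the access path is.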
Upvotes: 4
Reputation: 123473
The question you refer to is about dictionaries, not arrays. Anyhow, you could do this, which creates a list of arrays of 4-byte integers initialized to zero, which is effectively a 2D array:
from array import array
width, height = 1000, 1000
array2d = [array('l', (0 for _ in xrange(width))) for _ in xrange(height)]
array2d[999][999] = 42
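A caveat when building a list of arrays like this: each row has to be created separately, as the list comprehension above does. The sketch below, using small illustrative dimensions and the made-up names `rows` and `aliased`, shows what goes wrong if the outer list is built by multiplication instead.

```python
from array import array

width, height = 4, 3

# Each row is its own array object, so rows can be modified independently.
rows = [array('l', [0]) * width for _ in range(height)]
rows[2][3] = 42

# Pitfall: multiplying the outer list repeats references to ONE array,
# so a write through any row shows up in all of them.
aliased = [array('l', [0]) * width] * height
aliased[2][3] = 42

print([row[3] for row in rows])     # -> [0, 0, 42]
print([row[3] for row in aliased])  # -> [42, 42, 42]
```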
Upvotes: 1
Reputation: 620
In Python, arrays are lists.
The memory advantage in the other question was gained from not using a dictionary.
In general you will not see memory savings in moving "from a list to a 2d array".
Give me a sample of your data and I will update my answer.
Upvotes: -1