Shirley

Reputation: 79

Can array.array() be used to define a 2d array?

I am new to Python and am using Python 2.7. I want to create a 2D array. I know how to do it with a list, but the data is large and a list uses too much memory. To save memory, I want to use array rather than list. This was inspired by the suggestion "Use array.array('l') instead of list for the (integer) values" given in the answer to Huge memory usage of loading large dictionaries in memory.

Can this method work for 2D array?

Upvotes: 2

Views: 1948

Answers (3)

martineau

Reputation: 123473

You can't really create a 2d array.array() because its elements are restricted to characters, integers, and floating point numbers, so an array cannot contain other arrays. Instead you could store your data in a regular one-dimensional array and access it through some helper functions.

Here's an illustration of what I'm trying to describe:

from array import array

INFO_SIZE = 3  # Number of entries used to store info at beginning of array.
WIDTH, HEIGHT = 1000, 1000  # Dimensions.

array2d = array('l', (0 for _ in range(INFO_SIZE + WIDTH*HEIGHT)))
array2d[:INFO_SIZE] = array('l', (INFO_SIZE, WIDTH, HEIGHT))  # save array info

def get_elem(two_d_array, i, j):
    info_size, width, height = two_d_array[:INFO_SIZE]
    return two_d_array[info_size + j*width + i]

def set_elem(two_d_array, i, j, value):
    info_size, width, height = two_d_array[:INFO_SIZE]
    two_d_array[info_size + j*width + i] = value


import sys
print(format(sys.getsizeof(array2d), ",d"))  # -> 4,091,896

print(get_elem(array2d, 999, 999))           # -> 0
set_elem(array2d, 999, 999, 42)
print(get_elem(array2d, 999, 999))           # -> 42

As you can see the size of array2d is only slightly more (relatively speaking) than the size of the data itself (4,000,000 bytes in this case). You could dispense with the functions altogether and just do the offset calculation in-line to avoid the overhead of calling a function to do it on each access. On the other hand, if that's not a big concern, you could go even further and encapsulate all the logic in a generalized class Array2D.
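A minimal sketch of the in-line offset calculation mentioned above. It assumes the dimensions are kept in ordinary variables rather than in a header at the start of the array, so there is no INFO_SIZE offset; the names flat, i, and j are illustrative only.

```python
from array import array

WIDTH, HEIGHT = 1000, 1000  # Dimensions, kept outside the array here.

# Flat, zero-filled storage: one entry per cell, in row-major order.
flat = array('l', [0]) * (WIDTH * HEIGHT)

i, j = 999, 999              # Column and row of the element to access.
flat[j*WIDTH + i] = 42       # Write with the offset computed in-line.
value = flat[j*WIDTH + i]    # Read back the same way.
print(value)                 # -> 42
```

This trades readability for speed: every access site has to repeat the j*WIDTH + i expression correctly.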

Update

Encapsulating the implementation in a Class

Here's an example of the generalized class Array2D I mentioned. It has the advantage of supporting the more natural array-like syntax of passing two integers to the indexing operator, i.e. my_array2d[i, j] (where i is the column and j is the row), instead of calling standalone functions to retrieve or set the values of its elements.

from array import array as Array
import string
import sys


# Determine dictionary of valid typecodes and default initializer values.
_typecodes = dict()
for code in string.ascii_lowercase + string.ascii_uppercase:  # Assume single ASCII chars.
    initializer = 0
    try:
        Array(code, [initializer])
    except ValueError:
        continue  # Skip
    except TypeError:
        initializer = u'\x20'  # Assume it's a Unicode character.

    _typecodes[code] = initializer


class Array2D:
    """Partial implementation of preallocated 2D array.array()."""
    def __init__(self, width, height, typecode, initializer=None):
        if typecode not in _typecodes:
            raise NotImplementedError
        self.width, self.height, self._typecode = width, height, typecode
        if initializer is None:  # Fall back to the typecode's default value.
            initializer = _typecodes[typecode]
        self.data = Array(typecode, (initializer for _ in range(width * height)))

    def __getitem__(self, key):
        i, j = key
        return self.data[j*self.width + i]

    def __setitem__(self, key, value):
        i, j = key
        self.data[j*self.width + i] = value

    def __sizeof__(self):
        # Not called by sys.getsizeof() in Python 2 (although it should be).
        return sum(map(sys.getsizeof, (self.width, self.height, self.data)))

    @property
    def typecode(self):
        return self._typecode

    @property
    def itemsize(self):
        return self.data.itemsize


array2d = Array2D(1000, 1000, 'l')  # 1 million signed longs (4 bytes each here).
print(format(sys.getsizeof(array2d), ',d'))  # -> 4,091,936
print(format(array2d.itemsize, ',d'))        # -> 4
print(array2d[999, 999])                     # -> 0
array2d[999, 999] = 42
print(array2d[999, 999])                     # -> 42
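One thing the class above does not do is bounds checking: an i equal to width silently lands on the first element of the next row. A small sketch of a check that __getitem__ and __setitem__ could call before indexing, assuming a hypothetical helper named checked_offset (not part of the original class):

```python
def checked_offset(i, j, width, height):
    """Return the flat index for column i, row j, or raise IndexError."""
    if not (0 <= i < width and 0 <= j < height):
        raise IndexError("index (%d, %d) out of range" % (i, j))
    return j*width + i

print(checked_offset(999, 999, 1000, 1000))  # -> 999999
```

The extra comparison adds a little overhead per access, so whether it is worth it depends on how hot the access path is.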

Upvotes: 4

martineau

Reputation: 123473

The question you refer to is about dictionaries, not arrays. Anyhow, you could do the following, which creates a list of arrays of long integers (at least 4 bytes each, depending on the platform) initialized to zero and is effectively a 2D array:

from array import array

width, height = 1000, 1000
array2d = [array('l', (0 for _ in xrange(width))) for _ in xrange(height)]

array2d[999][999] = 42
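To see what this layout costs, here is a rough sketch that adds up the size of the outer list and of each row array. It uses range so that it runs on both Python 2 and 3; the names rows, total, and data_bytes are illustrative, and the exact figures depend on the platform and interpreter version.

```python
import sys
from array import array

width, height = 1000, 1000
# One array per row; array('l', [0]) * width builds a zero-filled row.
rows = [array('l', [0]) * width for _ in range(height)]

# sys.getsizeof(list) counts only the list's own pointer table, so the
# size of every row array has to be added separately.
total = sys.getsizeof(rows) + sum(sys.getsizeof(row) for row in rows)
data_bytes = width * height * rows[0].itemsize  # Raw payload only.

print(format(total, ',d'))       # Payload plus per-row and list overhead.
print(format(data_bytes, ',d'))
```

The difference between the two numbers is the per-row object overhead plus the outer list, which stays small as long as the rows are long.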

Upvotes: 1

avoid3d

Reputation: 620

In Python, arrays are lists.

The memory advantage in the other question was gained from not using a dictionary.

In general you will not see memory savings in moving "from a list to a 2d array".

Give me a sample of your data and I will update my answer.

Upvotes: -1
