Reputation: 429
I have a project in an embedded system (NodeMCU running MicroPython), where I need to store a very large array of variables, which have values of either 0 or 1. I need to be able to read/write them individually or via loops in a convenient way. For this example, I am filling the array with random integers between 0 and 1:
import random

N = 50
table = [[random.randint(0, 1) for i in range(N)] for j in range(N)]
On my NodeMCU, even such a small array (2500 items) is enough to exceed the NodeMCU memory limits, crashing my script. I suppose this is because in Python an int is an object with a lot of overhead. Since in my case I do not need the capacity of an int variable (actually, a 0 or 1 could be stored as a single bit), how can I create and fill an array with the least memory-consuming variables? Say, like in this example, randomizing between 0 and 1. I reviewed uctypes, but as I'm new to Python, I couldn't get them to work. Or is there another way? How can I create such an array with the least memory usage possible?
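For context: on desktop CPython (not MicroPython, but the proportions are similar) you can get a rough sense of the per-element overhead with sys.getsizeof. This is an illustrative sketch, not a measurement on the NodeMCU itself:

```python
import sys

N = 50

# A list of N lists of N small ints: each list slot holds a pointer
# (8 bytes on a 64-bit build), before even counting the int objects.
table = [[0] * N for _ in range(N)]
list_bytes = sys.getsizeof(table) + sum(sys.getsizeof(row) for row in table)

# The same 2500 values packed one per bit: 313 bytes of payload total.
packed = bytearray((N * N + 7) // 8)

print(list_bytes, sys.getsizeof(packed))
```

The exact numbers differ between CPython and MicroPython, but the ratio makes the point: the list-of-lists costs tens of kilobytes in container overhead alone, while the packed representation is a few hundred bytes.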
Upvotes: 2
Views: 1978
Reputation: 174
I would use a bytearray to store the data. You can store 8 bits per byte by calculating the index of the byte, and of the bit within that byte, that corresponds to a particular row and column. Once you have the indexes, you can use bit shifts and bitwise operators to set, clear, or retrieve the value for that row and column.
class bitarray2d():
    """create 2d array of bits cols wide and rows high"""
    def __init__(self, cols, rows):
        # round up so a partial final byte is still allocated
        self.bits = bytearray((cols * rows + 7) >> 3)
        self.cols = cols
        self.rows = rows

    def set(self, column, row):
        """set the bit that corresponds to the column and row given"""
        bit = (row * self.cols) + column
        self.bits[bit >> 3] |= 1 << (7 - (bit % 8))

    def clear(self, column, row):
        """clear the bit that corresponds to the column and row given"""
        bit = (row * self.cols) + column
        self.bits[bit >> 3] &= ~(1 << (7 - (bit % 8)))

    def get(self, column, row):
        """get the value of the bit that corresponds to the column and row given"""
        bit = (row * self.cols) + column
        return 1 if self.bits[bit >> 3] & (1 << (7 - (bit % 8))) else 0

    def print(self):
        """print the bitarray as a 2d array"""
        for row in range(self.rows):
            for column in range(self.cols):
                print(self.get(column, row), end="")
            print()

    def line(self, x0, y0, x1, y1):
        """draw a line from x0, y0 to x1, y1 (Bresenham's algorithm)"""
        steep = abs(y1 - y0) > abs(x1 - x0)
        if steep:
            x0, y0 = y0, x0
            x1, y1 = y1, x1
        if x0 > x1:
            x0, x1 = x1, x0
            y0, y1 = y1, y0
        dx = x1 - x0
        dy = abs(y1 - y0)
        err = dx >> 1
        ystep = 1 if y0 < y1 else -1
        while x0 <= x1:
            if steep:
                self.set(y0, x0)
            else:
                self.set(x0, y0)
            err -= dy
            if err < 0:
                y0 += ystep
                err += dx
            x0 += 1
# Test routine
columns = 64
rows = 64
bits = bitarray2d(columns, rows)
bits.line(0, 0, columns-1, rows-1)
bits.line(columns-1, 0, 0, rows-1)
bits.print()
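For the 50x50 grid of random bits from the question, the same indexing scheme can also be used without a class. A minimal standalone sketch, using random.getrandbits(1) as a cheap way to draw a random bit:

```python
import random

N = 50
bits = bytearray((N * N + 7) >> 3)  # 313 bytes instead of 2500 int objects

def set_bit(i, value):
    """set flat bit index i to 0 or 1"""
    if value:
        bits[i >> 3] |= 1 << (7 - (i % 8))
    else:
        bits[i >> 3] &= ~(1 << (7 - (i % 8)))

def get_bit(i):
    """return the value (0 or 1) at flat bit index i"""
    return 1 if bits[i >> 3] & (1 << (7 - (i % 8))) else 0

# fill with random bits, addressing cell (row, col) as row * N + col
for row in range(N):
    for col in range(N):
        set_bit(row * N + col, random.getrandbits(1))
```

random.getrandbits is available in MicroPython's random module, so this should run unchanged on the NodeMCU.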
Upvotes: 1
Reputation: 95873
You are correct that int objects have a lot of overhead. However, in this case, these int objects may actually be cached (in CPython they would be), so the only overhead should be just the pointer. Still, a pointer requires a machine word, so to avoid relying on such implementation details, and to pack things more tightly, you could use actual arrays; you are currently using a list of list objects.
The array module provides object-oriented wrappers around primitive, C-like numeric arrays. Unfortunately, it does not provide multidimensional arrays.
So you could try something like:
import array
import random

N = 50

def build_table(N):
    rng = range(N)
    result = []
    for _ in rng:
        arr = array.array('B')  # unsigned byte
        for _ in rng:
            arr.append(random.randint(0, 1))
        result.append(arr)
    return result

table = build_table(N)
If this were CPython, I would suggest the bitarray module for maximum efficiency. I have no idea if it is available for MicroPython, but you could implement something like it yourself, likely on top of an array.array. There are many examples of this; it is sort of a classic data structure from the times when memory was measured in bytes. Here's just one example from the Python wiki.
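A minimal sketch of such a structure built on top of array.array. The class name and flat-index API here are illustrative, not the bitarray module's actual interface:

```python
import array

class BitArray:
    """n bits packed into an array of unsigned bytes, indexed flat"""
    def __init__(self, n):
        self.n = n
        self.data = array.array('B', bytes((n + 7) // 8))

    def __setitem__(self, i, value):
        mask = 1 << (i % 8)
        if value:
            self.data[i >> 3] |= mask
        else:
            # mask to 0xFF so the result stays a valid unsigned byte
            self.data[i >> 3] &= ~mask & 0xFF

    def __getitem__(self, i):
        return (self.data[i >> 3] >> (i % 8)) & 1

bits = BitArray(2500)  # the questioner's 50x50 grid, flattened
bits[1234] = 1
print(bits[1234])      # -> 1
```

A 2D view falls out of this by indexing with row * cols + column, as in the other answer.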
Upvotes: 0