Avikalp Gupta

Reputation: 63

Storing an integer less than 256 within a byte of memory in Python

I have a huge array (of arrays) of integers in the range 0-255. Since I know the range of the integers, I want to reduce the space they occupy by storing each integer in a single byte.

In C++, I would simply use char to store the integers, but I cannot find a way to do this in Python.

>>> import sys
>>> a = 10
>>> sys.getsizeof(a)
24
>>> b = chr(a)
>>> sys.getsizeof(b)
38
>>> c = bytearray(1)
>>> c[0] = b
>>> c[0]
10
>>> sys.getsizeof(c[0])
24
>>> c
bytearray(b'\n')
>>> sys.getsizeof(c)
50

I have searched the data types available in Python, but I cannot find one for which sys.getsizeof() returns 1. Is there a space-efficient way of storing such integers?

Upvotes: 0

Views: 1154

Answers (4)

Maxim Egorushkin

Reputation: 136208

You can use NumPy arrays with the uint8 dtype for that, for example:

import numpy as np

byte_array = np.empty(10, np.uint8) # an array of 10 uninitialized bytes

See other numpy array constructors for more details.

Upvotes: 2

Steve Barnes

Reputation: 28370

If you are dealing with huge arrays, you will probably be best off using NumPy, which includes a lot of array tools.

There is some fixed per-array overhead, but it is minimal:

import numpy as np
import sys

a = np.array([0]*10000, np.uint8)    
len(a)
# 10000
sys.getsizeof(a)
# 10048
sys.getsizeof(a[0])
# 13
a = np.array([0]*1000000, np.uint8)
sys.getsizeof(a)
# 1000048

Upvotes: 1

user2357112

Reputation: 280207

sys.getsizeof(c[0]) doesn't report the actual amount of memory used to store the first element of c. Accessing c[0] makes Python construct an integer object (or fetch one from the small integer cache) to represent the value, but the bytearray does store the value as one byte.

This is more obvious with a larger bytearray:

>>> sys.getsizeof(bytearray([5]*1000))
1168

You can see that this bytearray couldn't possibly be using more than 1 byte per element, or it would be at least 2000 bytes in size. (The excess space is due to overallocation to accommodate additional elements, and some object overhead.)
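
One rough way to check this yourself is to compare two containers that differ only in length, so the fixed object overhead cancels out. This is only a sketch and assumes CPython, where sys.getsizeof includes the container's own buffer; the helper name marginal_bytes_per_element is made up for illustration.

```python
import sys

def marginal_bytes_per_element(make, n=1_000_000):
    """Estimate storage per extra element by differencing two container sizes."""
    return (sys.getsizeof(make(2 * n)) - sys.getsizeof(make(n))) / n

# bytearray(n) creates a zero-filled bytearray of length n.
per_byte = marginal_bytes_per_element(lambda n: bytearray(n))

# [0] * n creates a list holding n references to the same int object.
# getsizeof counts only the references here, not the int objects themselves.
per_ref = marginal_bytes_per_element(lambda n: [0] * n)

print("bytearray:", per_byte, "bytes per element")
print("list:     ", per_ref, "bytes per element (references only)")
```

On a typical 64-bit build, the bytearray figure should come out at about one byte per element, while the list needs a pointer-sized slot for each element.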

Upvotes: 4

recursive

Reputation: 86064

There is a bytes class for the purpose of storing a packed sequence of bytes. I don't think there's an easy way of storing just a single number using one byte of memory.

Documentation for bytes

>>> bytes.fromhex('2Ef0 F1f2  ')
b'.\xf0\xf1\xf2'

>>> sys.getsizeof(bytes.fromhex(''))
33
>>> sys.getsizeof(bytes.fromhex('dead'))
35
>>> sys.getsizeof(bytes.fromhex('deadbeef'))
37
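
For the "array of arrays" case in the question, one possible approach (a sketch, assuming Python 3 and made-up sample data) is to pack each inner list into its own bytes or bytearray object. Indexing still returns ordinary ints, and values outside 0-255 are rejected.

```python
# Hypothetical sample data: each inner list holds integers in the range 0-255.
rows = [[0, 17, 255], [3, 4, 5]]

# Immutable packed rows: one byte per element plus a fixed per-object overhead.
packed = [bytes(row) for row in rows]

# Mutable packed rows, for when elements need to be updated in place.
mutable = [bytearray(row) for row in rows]
mutable[1][0] = 200

# Indexing a bytes or bytearray object yields a plain int.
print(packed[0][2], mutable[1][0])

# Values outside 0-255 raise ValueError instead of being silently truncated.
try:
    bytes([256])
    rejected = False
except ValueError:
    rejected = True
print("out-of-range value rejected:", rejected)
```

Each row object still carries its own fixed overhead, so this pays off mainly when the rows are long.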

Upvotes: 1
