Reputation: 63
I have a huge array (of arrays) of integers in the range 0-255. Since I know the range of the integers, I want to minimize the space they occupy by storing each integer in a single byte.
In C++, I would simply use char to store the integers, but I cannot find an equivalent in Python.
>>> import sys
>>> a = 10
>>> sys.getsizeof(a)
24
>>> b = chr(a)
>>> sys.getsizeof(b)
38
>>> c = bytearray(1)
>>> c[0] = b
>>> c[0]
10
>>> sys.getsizeof(c[0])
24
>>> c
bytearray(b'\n')
>>> sys.getsizeof(c)
50
I have searched the data types available in Python, but I cannot find one for which sys.getsizeof() returns 1.
Is there a space-efficient way of storing such integers?
Upvotes: 0
Views: 1154
Reputation: 136208
You can use numpy arrays for that. E.g.:
import numpy as np
byte_array = np.empty(10, np.uint8) # an array of 10 uninitialized bytes
See other numpy array constructors for more details.
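For the array-of-arrays case in the question, a minimal sketch might look like the following. It assumes all rows have the same length, and the `rows` data is made up for illustration:

```python
import numpy as np

# Hypothetical sample data: nested lists of integers in the range 0-255.
rows = [[0, 10, 255], [7, 8, 9]]

# dtype=np.uint8 stores every element in exactly one byte.
grid = np.array(rows, dtype=np.uint8)

print(grid.shape)     # dimensions of the 2-D array
print(grid.itemsize)  # bytes per element
print(grid.nbytes)    # bytes used by the element data alone
```

Note that `nbytes` counts only the element buffer; the array object itself adds a small fixed header on top of that.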
Upvotes: 2
Reputation: 28370
If you are dealing with huge arrays, you will probably be best off using numpy, which also provides a lot of array tools.
There is some fixed overhead per array, but it is minimal:
import numpy as np
import sys
a = np.array([0]*10000, np.uint8)
len(a)
# 10000
sys.getsizeof(a)
# 10048
sys.getsizeof(a[0])
# 13
a = np.array([0]*1000000, np.uint8)
sys.getsizeof(a)
# 1000048
Upvotes: 1
Reputation: 280207
sys.getsizeof(c[0]) doesn't report the actual amount of memory used to store the first element of c. Accessing c[0] makes Python construct an integer object (or fetch one from the small integer cache) to represent the value, but the bytearray does store the value as one byte.
This is more obvious with a larger bytearray:
>>> sys.getsizeof(bytearray([5]*1000))
1168
You can see that this bytearray couldn't possibly be using more than 1 byte per element, or it would be at least 2000 bytes in size. (The excess space is due to overallocation to accommodate additional elements, and some object overhead.)
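One way to check this yourself is to subtract the size of an empty bytearray and divide by the element count. This is only a rough sketch: it assumes Python 3 on CPython, the exact sizes vary between versions, and the names `n`, `per_element`, and `data` are illustrative.

```python
import sys

n = 1_000_000
empty_size = sys.getsizeof(bytearray())
full_size = sys.getsizeof(bytearray(n))  # n zero-initialised bytes

# Marginal cost per stored element, with the fixed object overhead subtracted.
per_element = (full_size - empty_size) / n
print(per_element)

data = bytearray([10, 200, 255])
value = data[1]        # reading builds (or reuses) an ordinary int object
data[0] = 42           # writing stores the value back as a single byte

try:
    data[0] = 256      # values outside 0-255 are rejected
    rejected = False
except ValueError:
    rejected = True
print(value, data[0], rejected)
```

For an array of arrays, a plain list of such bytearrays (one per row) keeps the per-element cost at about one byte, with the fixed object overhead paid once per row.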
Upvotes: 4
Reputation: 86064
There is a bytes class for the purpose of storing a packed sequence of bytes. I don't think there's an easy way of storing just a single number using one byte of memory.
>>> bytes.fromhex('2Ef0 F1f2 ')
b'.\xf0\xf1\xf2'
>>> sys.getsizeof(bytes.fromhex(''))
33
>>> sys.getsizeof(bytes.fromhex('dead'))
35
>>> sys.getsizeof(bytes.fromhex('deadbeef'))
37
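If the values are already integers rather than hex text, a bytes object can probably be built from them directly. A small sketch, assuming Python 3 on CPython (the variable names are illustrative):

```python
import sys

packed = bytes([10, 200, 255])  # build from any iterable of ints in 0-255
first = packed[0]               # indexing yields an int in Python 3

# Growth in reported size when 1000 more bytes are stored.
growth = sys.getsizeof(bytes(1000)) - sys.getsizeof(bytes(0))
print(first, growth)

try:
    packed[0] = 1               # bytes objects do not support item assignment
    mutable = True
except TypeError:
    mutable = False
print(mutable)
```

Because bytes is immutable, it suits data that is written once and then only read; for values that need updating in place, bytearray is the mutable counterpart.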
Upvotes: 1