Reputation: 1032
I am using numpy.asarray in my project to handle arrays because it is far more efficient than default Python lists. I also need to watch memory use when allocating the array, because my program can receive gigabytes of data. While checking numpy.asarray, I found that the data type is inferred from the input unless stated explicitly. I have the following array:
np.asarray([list(map(int, list(x))) for x in X])
When I print X.dtype, I get int64. Since the array X always contains binary values, 0 or 1, I thought of using dtype=np.int8 to reduce the memory needed for the allocation. But I am not sure whether this is a good idea. Should I stick with the default int64? Could int8 lose any precision in a way I have not thought of?
Thank you.
Upvotes: 0
Views: 618
Reputation: 2092
From the NumPy manual, "Array types and conversions between types":
Data type    Description
...
int8         Byte (-128 to 127)
...
If you are only going to put binary values in the array, then it will be just fine. You won't lose any precision.
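A quick way to convince yourself is to build the same data with both dtypes and compare. This is only a minimal sketch: `rows` is a made-up stand-in for the asker's X, assumed to be strings of binary digits.

```python
import numpy as np

# Hypothetical input standing in for X: strings of binary digits.
rows = ["0101", "1100", "0011"]

as_int64 = np.asarray([list(map(int, r)) for r in rows], dtype=np.int64)
as_int8 = np.asarray([list(map(int, r)) for r in rows], dtype=np.int8)

# Every 0/1 value is representable in int8, so the arrays compare equal.
same_values = np.array_equal(as_int64, as_int8)
print(same_values)

# Caveat: element-wise arithmetic between int8 arrays stays in int8,
# so results outside -128..127 wrap around instead of growing.
wrapped = np.array([127], dtype=np.int8) + np.array([1], dtype=np.int8)
print(wrapped)
```

The stored 0s and 1s are safe. The only thing to watch is later arithmetic that accumulates values past the int8 range while staying in int8.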
You could even set the data type to bool_, which is also stored as one byte per element, so it needs no more memory than int8, and it works as an int in arithmetic too.
>>> import numpy as np
>>> x = np.asarray([1,0,1,0], dtype=np.bool_)
>>> x
array([ True, False, True, False], dtype=bool)
>>> x + 2
array([3, 2, 3, 2])
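To put numbers on the memory difference, you can compare nbytes for the three dtypes. The array length here is an arbitrary value chosen for illustration, not something from the question.

```python
import numpy as np

n = 1000000  # arbitrary element count, for illustration only
bits = np.zeros(n, dtype=np.int64)

# nbytes is the size of the array's data buffer in bytes.
sizes = {}
for name, dt in [("int64", np.int64), ("int8", np.int8), ("bool_", np.bool_)]:
    sizes[name] = bits.astype(dt).nbytes

print(sizes)
```

int64 takes 8 bytes per element, while int8 and bool_ each take 1, so either choice cuts the data buffer to one eighth of the default.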
Upvotes: 2