Reputation: 602
I'm writing code that converts integers into padded 8-bit strings. I would then like to write those strings to a binary file. I am having problems figuring out the proper dtype to use for the numpy array that holds the bits.
In the following code, when the bin_data variable is set up with dtype=np.int8, the output is:
$ python bool_dtype.py
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 1, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
[0 0 0 0 1 0 0 0 0]
16
When bin_data is set up with dtype=np.bool_, every element reads as True, as shown below:
$ python bool_dtype.py
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 1, bool(a[j]) = True
a[j] = 1, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 0, bool(a[j]) = True
a[j] = 1, bool(a[j]) = True
a[j] = 1, bool(a[j]) = True
[ True True True True True True True True True]
16
When I look at the xxd dump of the data written with dtype=np.int8, I see, as expected, a whole byte being used to represent each bit (1 or 0), i.e. 00000001 or 00000000. Using dtype=np.bool_ leads to the same problem.
Why does bool always evaluate to True when reading an array element?
How can I store the data more efficiently when I write it to the file, so that a single bit is not stored as a byte but is instead just concatenated onto the previous element?
Here is the code in question. Thanks!
#!/usr/bin/python2.7
import numpy as np
import os

# x = np.zeros(200,dtype=np.bool_)
# for i in range(0,len(x)):
#     if i%2 != 1:
#         x[i] = 1

data_size = 2
data = np.random.randint(0,9,data_size)
tx=''
for i in range(0,data_size):
    tx += chr(data[i])
data = tx

a = np.zeros(8,dtype=np.int8)
bin_data = np.zeros(len(data)*8,dtype=np.bool_)

# each i is a character byte in data string
for i in range(0,len(data)):
    # formats data in 8bit binary without the 0b prefix
    a = format(ord(data[i]),'b').zfill(8)
    for j in range(0,len(a)):
        bin_data[i*len(a) + j] = a[j]
        print("a[j] = {}, bool(a[j]) = {}").format(a[j], bool(a[j]))

print bin_data[1:10]
print len(bin_data)

path = os.getcwd()
path = path + '/bool_data.bin'
data_file = open(path, "wb")
data_file.write(bin_data)
data_file.close()
This is what I expect to see when using dtype=np.bool_:
>>> import numpy as np
>>> a = np.zeros(2,dtype=np.bool_)
>>> a
array([False, False], dtype=bool)
>>> a[1] = 1
>>> a
array([False, True], dtype=bool)
Upvotes: 3
Views: 1192
Reputation: 2212
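On the first question, here is a minimal sketch, assuming the cause is Python's truthiness rule for strings: `a[j]` is a one-character string, and any non-empty string, including '0', is truthy, so `bool(a[j])` is always True. Comparing the character against '1' (or converting it with `int()`) gives the intended bit, and `np.packbits` then stores eight bits per byte instead of one. The helper name `bits_from_bytes` is hypothetical and not part of the original post; the sample values 4 and 1 are chosen only for illustration.

```python
import numpy as np

# Any non-empty string is truthy in Python, including the character '0'.
assert bool('0') is True
assert bool(int('0')) is False


def bits_from_bytes(values):
    """Return a bool array with the 8 bits (MSB first) of each value in 0..255.

    Hypothetical helper used only for this illustration.
    """
    bits = np.zeros(len(values) * 8, dtype=np.bool_)
    for i, v in enumerate(values):
        a = format(v, 'b').zfill(8)        # e.g. 4 -> '00000100'
        for j, ch in enumerate(a):
            # Compare the character itself; do not rely on bool(ch).
            bits[i * 8 + j] = (ch == '1')
    return bits


bits = bits_from_bytes([4, 1])
packed = np.packbits(bits)   # 16 bits packed into 2 bytes of dtype uint8
```

With these inputs, `packed` holds the original two byte values again, so writing `packed` to the file takes 2 bytes rather than 16.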
Edit:
If your boolean array has a length that isn't a multiple of 8, then after packing and unpacking it will be zero-padded to make the length a multiple of 8. In this case, you have two options.
The first is to truncate the data to a multiple of 8 bits before packing:
data = data[:8*(len(data)//8)]
The second is to store the number of leftover bits in an extra leading byte when packing:
bool_data = np.array([True, True, True])
nbits = len(bool_data)
rem = nbits % 8
nbytes = nbits // 8
if rem: nbytes += 1
data = np.empty(1+nbytes, dtype=np.uint8)
data[0] = rem
data[1:] = np.packbits(bool_data)
and then use that byte to strip the padding when unpacking:
rem = data[0]
bool_data = np.unpackbits(data[1:])
if rem:
    bool_data = bool_data[:-(8-rem)]
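The remainder-byte approach can be sketched as a full round trip through a file. This is an illustration under assumptions, not a definitive implementation: the helper names `save_bits` and `load_bits`, the temporary directory, and the 11-element sample array are all hypothetical. It relies only on `np.packbits`, `np.unpackbits`, `ndarray.tofile` and `np.fromfile`.

```python
import os
import tempfile

import numpy as np


def save_bits(path, bool_data):
    """Write a bool array as [remainder byte][packed bits] (hypothetical helper)."""
    rem = len(bool_data) % 8             # bits used in the last packed byte (0 = all 8)
    packed = np.packbits(bool_data)      # zero-pads the last byte if needed
    data = np.empty(1 + len(packed), dtype=np.uint8)
    data[0] = rem
    data[1:] = packed
    data.tofile(path)


def load_bits(path):
    """Read back a bool array written by save_bits (hypothetical helper)."""
    data = np.fromfile(path, dtype=np.uint8)
    rem = int(data[0])
    bits = np.unpackbits(data[1:])
    if rem:
        bits = bits[:-(8 - rem)]         # drop the zero padding added by packbits
    return bits.astype(np.bool_)


path = os.path.join(tempfile.mkdtemp(), 'bool_data.bin')
original = np.array([True, False, True, True, False, True,
                     True, False, True, True, True])   # 11 bits, not a multiple of 8
save_bits(path, original)
restored = load_bits(path)
file_size = os.path.getsize(path)        # 1 header byte + 2 packed bytes
```

The header costs one extra byte per file, but in exchange the original length is recovered exactly, which the truncation option cannot do.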
Upvotes: 6