Reputation: 152675
I'm trying to convert several masks (boolean arrays) into a bitmask with NumPy. My approach works, but I feel that I'm doing too many operations.
For example, to create the bitmask I use:
import numpy as np
flags = [
    np.array([True, False, False]),
    np.array([False, True, False]),
    np.array([False, True, False])
]
flag_bits = np.zeros(3, dtype=np.int8)
for idx, flag in enumerate(flags):
    flag_bits += flag.astype(np.int8) << idx  # equivalent to flag * 2 ** idx
Which gives me the expected "bitmask":
>>> flag_bits
array([1, 6, 0], dtype=int8)
>>> [np.binary_repr(bit, width=7) for bit in flag_bits]
['0000001', '0000110', '0000000']
However, I feel that the casting to int8 and the addition to the flag_bits array in particular are too complicated. Is there any NumPy functionality I missed that could be used to create such a "bitmask" array?
Note: I'm calling an external function that expects such a bitmask, otherwise I would stick with the boolean arrays.
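One candidate for the kind of built-in asked about here is np.packbits. The following is a minimal sketch, assuming the flags list above and at most 8 flags (packbits packs into 8-bit bytes); the bitorder parameter requires NumPy 1.17 or newer, and the names packed and flag_bits are used only for illustration.

```python
import numpy as np

flags = [
    np.array([True, False, False]),
    np.array([False, True, False]),
    np.array([False, True, False]),
]

# Stack to shape (n_flags, n_elements) and pack along the flag axis.
# bitorder="little" makes the first flag the least significant bit.
packed = np.packbits(np.array(flags), axis=0, bitorder="little")

# With up to 8 flags the result has shape (1, n_elements) and dtype uint8;
# cast to int8 only if the external function requires that dtype.
flag_bits = packed[0].astype(np.int8)
print(flag_bits)  # [1 6 0]
```

With more than 8 flags, packbits returns one row per group of 8, so the rows would still need to be combined by hand.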
Upvotes: 6
Views: 5600
Reputation: 221604
Here's an approach that gets directly to the string bitmask by writing into a single-character view of the output array:
out = np.repeat('0000000',3).astype('S7')
out.view('S1').reshape(-1,7)[:,-3:] = np.asarray(flags).astype(int)[::-1].T
Sample run -
In [41]: flags
Out[41]:
[array([ True, False, False], dtype=bool),
array([False, True, False], dtype=bool),
array([False, True, False], dtype=bool)]
In [42]: out = np.repeat('0000000',3).astype('S7')
In [43]: out.view('S1').reshape(-1,7)[:,-3:] = np.asarray(flags).astype(int)[::-1].T
In [44]: out
Out[44]:
array([b'0000001', b'0000110', b'0000000'],
dtype='|S7')
Using the same matrix-multiplication strategy as discussed in detail in @Marat's solution, but with a vectorized scaling array, gives us flag_bits:
np.dot(2**np.arange(3),flags)
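A minimal, self-contained sketch of that one-liner, assuming the flags list from the question. The name weights is introduced here for illustration, and the size is taken from len(flags) instead of the hard-coded 3. The result is checked against the question's shift-and-add loop.

```python
import numpy as np

flags = [
    np.array([True, False, False]),
    np.array([False, True, False]),
    np.array([False, True, False]),
]

# Weight k is 2**k, so flag k lands in bit k of the result.
weights = 2 ** np.arange(len(flags))  # array([1, 2, 4])

# (n_flags,) dot (n_flags, n_elements) -> one integer per element.
flag_bits = np.dot(weights, flags).astype(np.int8)

# Reference result from the question's loop.
expected = np.zeros(3, dtype=np.int8)
for idx, flag in enumerate(flags):
    expected += flag.astype(np.int8) << idx

print(flag_bits)                            # [1 6 0]
print(np.array_equal(flag_bits, expected))  # True
```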
Upvotes: 1
Reputation: 57033
How about this (with a conversion to int8 added, if desired):
flag_bits = (np.transpose(flags) << np.arange(len(flags))).sum(axis=1)\
.astype(np.int8)
#array([1, 6, 0], dtype=int8)
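To make the broadcasting in that expression easier to follow, here is a step-by-step sketch, assuming the flags list from the question. The intermediate names (columns, shifted, by_sum, by_or) are illustrative, and the explicit cast to an integer dtype before shifting is an addition for clarity rather than part of the original one-liner.

```python
import numpy as np

flags = [
    np.array([True, False, False]),
    np.array([False, True, False]),
    np.array([False, True, False]),
]

# One row per element, one column per flag: shape (3 elements, 3 flags).
columns = np.transpose(flags).astype(np.int64)

# Broadcasting shifts column k left by k bits.
shifted = columns << np.arange(len(flags))
print(shifted.tolist())  # [[1, 0, 0], [0, 2, 4], [0, 0, 0]]

# Each flag occupies its own bit, so OR-ing a row gives the same value as summing it.
by_sum = shifted.sum(axis=1)
by_or = np.bitwise_or.reduce(shifted, axis=1)
print(by_sum.tolist(), by_or.tolist())  # [1, 6, 0] [1, 6, 0]
```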
Upvotes: 2
Reputation: 15738
>>> x = np.array([2**i for i in range(np.shape(flags)[1])])
>>> np.dot(flags, x)
array([1, 2, 2])
How it works: in a bitmask, every bit is effectively an original array element multiplied by a power of 2 according to its position, e.g. 2 = False * 1 + True * 2 + False * 4. This can be represented as a matrix multiplication, which is really efficient in NumPy.
So, the first line uses a list comprehension to create these weights: x = [1, 2, 4, 8, ..., 2^(n-1)].
Then each row of flags is multiplied element-wise by x and the products are summed (this is how matrix multiplication works). At the end, we get the bitmask.
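The weighted sum can be taken along either axis of flags, and the operand order of np.dot decides which one. A small sketch, assuming the flags list from the question (flags_2d, weights, per_flag and per_element are illustrative names):

```python
import numpy as np

flags = [
    np.array([True, False, False]),
    np.array([False, True, False]),
    np.array([False, True, False]),
]
flags_2d = np.array(flags)    # shape (n_flags, n_elements)
weights = 2 ** np.arange(3)   # array([1, 2, 4])

# flags on the left: each flag array (row) is packed into one number.
per_flag = np.dot(flags_2d, weights)
print(per_flag)      # [1 2 2]

# weights on the left: each element position (column) is packed,
# with flag k contributing bit k, as in the question's loop.
per_element = np.dot(weights, flags_2d)
print(per_element)   # [1 6 0]
```

The second form reproduces the [1, 6, 0] result shown in the question.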
Upvotes: 2