MSeifert

Reputation: 152675

Creating a "bitmask" from several boolean numpy arrays

I'm trying to convert several masks (boolean arrays) to a bitmask with NumPy. That works in theory, but I feel that I'm doing too many operations.

For example to create the bitmask I use:

import numpy as np

flags = [
    np.array([True, False, False]),
    np.array([False, True, False]),
    np.array([False, True, False])
]

flag_bits = np.zeros(3, dtype=np.int8)
for idx, flag in enumerate(flags):
    flag_bits += flag.astype(np.int8) << idx  # equivalent to flag * 2 ** idx

Which gives me the expected "bitmask":

>>> flag_bits 
array([1, 6, 0], dtype=int8)

>>> [np.binary_repr(bit, width=7) for bit in flag_bits]
['0000001', '0000110', '0000000']

However, I feel that the casting to int8 and the addition to the flag_bits array in particular are too complicated. Is there any NumPy functionality I missed that could be used to create such a "bitmask" array?

Note: I'm calling an external function that expects such a bitmask, otherwise I would stick with the boolean arrays.

Upvotes: 6

Views: 5600

Answers (3)

Divakar

Reputation: 221604

Here's an approach that builds the string bitmask directly, by viewing the fixed-width strings as single characters and assigning into the last columns -

out = np.repeat('0000000',3).astype('S7')
out.view('S1').reshape(-1,7)[:,-3:] = np.asarray(flags).astype(int)[::-1].T

Sample run -

In [41]: flags
Out[41]: 
[array([ True, False, False], dtype=bool),
 array([False,  True, False], dtype=bool),
 array([False,  True, False], dtype=bool)]

In [42]: out = np.repeat('0000000',3).astype('S7')

In [43]: out.view('S1').reshape(-1,7)[:,-3:] = np.asarray(flags).astype(int)[::-1].T

In [44]: out
Out[44]: 
array([b'0000001', b'0000110', b'0000000'], 
      dtype='|S7')

Using the same matrix-multiplication strategy as discussed in detail in @Marat's solution, but with a vectorized scaling array, gives us flag_bits -

np.dot(2**np.arange(3),flags)
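
A minimal sketch of the same one-liner generalized to any number of flag arrays, reusing the flags list from the question. The names weights and flag_bits_int8 are just illustrative, and the final cast assumes the external function really wants int8 -

```python
import numpy as np

flags = [
    np.array([True, False, False]),
    np.array([False, True, False]),
    np.array([False, True, False]),
]

# One weight per flag array: 1, 2, 4, ... (flag i sets bit i, worth 2**i).
weights = 2 ** np.arange(len(flags))

# Vector-matrix product: sum over i of weights[i] * flags[i].
flag_bits = np.dot(weights, flags)

# np.dot returns NumPy's default integer dtype here, so cast explicitly
# if the consumer expects int8 (safe only for up to 7 flags).
flag_bits_int8 = flag_bits.astype(np.int8)
print(flag_bits_int8)
```

Note that the weights go on the left of the dot product, so each flag array as a whole is scaled by its own power of two.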

Upvotes: 1

DYZ

Reputation: 57033

How about this (added conversion to int8, if desired):

flag_bits = (np.transpose(flags) << np.arange(len(flags))).sum(axis=1)\
             .astype(np.int8)
#array([1, 6, 0], dtype=int8)
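
To show what the one-liner does step by step, here is a rough sketch with the intermediates named (stacked, shifts and shifted are my own names, not part of the answer). It also casts to an integer type before shifting and combines the bits with a bitwise OR instead of a sum; the two agree here because every flag occupies its own bit:

```python
import numpy as np

flags = [
    np.array([True, False, False]),
    np.array([False, True, False]),
    np.array([False, True, False]),
]

# Shape (n_elements, n_flags): row j holds every flag for element j.
stacked = np.transpose(flags).astype(np.int64)

# Shift amounts 0, 1, 2, ... broadcast across the last axis,
# so column i is shifted left by i bits.
shifts = np.arange(len(flags))
shifted = stacked << shifts

# Combine the bits of each row into a single integer.
flag_bits = np.bitwise_or.reduce(shifted, axis=1).astype(np.int8)
print(flag_bits)
```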

Upvotes: 2

Marat

Reputation: 15738

>>> x = np.array([2**i for i in range(len(flags))])
>>> np.dot(x, flags)
array([1, 6, 0])

How it works: in a bitmask, every bit is effectively an original array element multiplied by a power of 2 according to its position, e.g. 6 = False * 1 + True * 2 + True * 4. This can be represented as matrix multiplication, which is very efficient in NumPy.

So, the first line is a list comprehension that creates these weights: x = [1, 2, 4, 8, ..., 2^(n-1)] for n flag arrays.

Then each row in flags is multiplied by the corresponding element in x and the rows are summed up (this is how matrix multiplication works). At the end, we get the bitmask.
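
To make the weighted-sum idea concrete, here is a small sketch that writes the sum out by hand for the three flags from the question and compares it with the dot product. As a cross-check it also uses np.packbits with bitorder="little"; that call is not part of the answer above, it assumes NumPy 1.17 or newer, and it only covers up to 8 flags per output byte.

```python
import numpy as np

flags = np.array([
    [True, False, False],
    [False, True, False],
    [False, True, False],
])

# Written out by hand: flag i contributes 2**i wherever it is set.
by_hand = 1 * flags[0] + 2 * flags[1] + 4 * flags[2]

# The same weighted sum as a vector-matrix product.
weights = np.array([2 ** i for i in range(len(flags))])
as_dot = np.dot(weights, flags)

# Cross-check: pack the flags along axis 0, first flag as the lowest bit.
# The result has shape (1, n_elements) and dtype uint8.
packed = np.packbits(flags, axis=0, bitorder="little")[0]

print(by_hand, as_dot, packed)
```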

Upvotes: 2
