Blue Moon

Reputation: 449

How to create an array of the binary digits of given unsigned integers with NumPy?

I have an array of numbers between 0 and 3 and I want to create a 2D array of their binary digits.

In the future I may need arrays of numbers between 0 and 7, or between 0 and 15.

Currently my array is defined like this:

a = np.array([[0], [1], [2], [3]], dtype=np.uint8)

I used NumPy's unpackbits function:

b = np.unpackbits(a, axis=1)

and the result is this:

array([[0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 0, 1, 1]], dtype=uint8)

As you can see, it created a 2D array with 8 columns, while I'm looking for a 2D array with 2 columns.

Here is my desired array:

array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])

Is this related to the uint8 data type?

What would you suggest?

Upvotes: 1

Views: 376

Answers (4)

norok2

Reputation: 26906

One way of approaching the problem is to adapt your b to match your desired output with simple slicing, similar to what is suggested in @GrzegorzSkibinski's answer:

import numpy as np


def gen_bits_by_val(values):
    # Keep at least one column: with n == 0, [:, -0:] would select all 8.
    n = max(int(values.max()).bit_length(), 1)
    return np.unpackbits(values, axis=1)[:, -n:].copy()


print(gen_bits_by_val(a))
# [[0 0]
#  [0 1]
#  [1 0]
#  [1 1]]

Alternatively, you could create a look-up table, similar to what is suggested in @WarrenWeckesser's answer, using the following:

import numpy as np


def gen_bits_by_num(n):
    values = np.arange(2 ** n, dtype=np.uint8).reshape(-1, 1)
    return np.unpackbits(values, axis=1)[:, -n:].copy()


bits2 = gen_bits_by_num(2)
print(bits2)
# [[0 0]
#  [0 1]
#  [1 0]
#  [1 1]]

which allows for the kinds of uses indicated there, e.g.:

bits4 = gen_bits_by_num(4)
print(bits4[[1, 3, 12]])
# [[0 0 0 1]
#  [0 0 1 1]
#  [1 1 0 0]]

EDIT

Considering @PaulPanzer's answer, the line:

return np.unpackbits(values, axis=1)[:, -n:]

has been replaced with:

return np.unpackbits(values, axis=1)[:, -n:].copy()

which is more memory efficient, because the copy no longer keeps the full 8-column array alive.

It could also have been replaced with:

return np.unpackbits(values << (8 - n), axis=1, count=n)

with a similar effect.
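
As a minimal sketch, the shift-and-count variant can be wrapped in a small function. The name unpack_low_bits and the reshape to a column are assumptions made for illustration; the count keyword itself requires a NumPy version that supports it in unpackbits.

```python
import numpy as np


def unpack_low_bits(values, n):
    """Return the n lowest bits of each value, most significant bit first.

    Hypothetical helper: assumes 1 <= n <= 8 and values that fit in n bits.
    """
    if not 1 <= n <= 8:
        raise ValueError("n must be between 1 and 8")
    # unpackbits only accepts uint8; one value per row.
    values = np.asarray(values, dtype=np.uint8).reshape(-1, 1)
    # Move the n bits of interest to the left edge of each byte ...
    shifted = np.left_shift(values, np.uint8(8 - n))
    # ... so that count=n keeps exactly those bits.
    return np.unpackbits(shifted, axis=1, count=n)


print(unpack_low_bits([1, 5, 6], 3))
# [[0 0 1]
#  [1 0 1]
#  [1 1 0]]
```

Values that need more than n bits silently lose their high bits in the shift, so it may be worth validating the input first.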

Upvotes: 1

Paul Panzer

Reputation: 53089

You can use the count keyword. It keeps the first count bits along the axis and cuts off the rest on the right, so you also have to shift the bits you want to the left before applying unpackbits.

b = np.unpackbits(a<<6, axis=1, count=2)
b
# array([[0, 0],
#        [0, 1],
#        [1, 0],
#        [1, 1]], dtype=uint8)

This produces a "clean" array:

b.flags
#  C_CONTIGUOUS : True
#  F_CONTIGUOUS : False
#  OWNDATA : True
#  WRITEABLE : True
#  ALIGNED : True
#  WRITEBACKIFCOPY : False
#  UPDATEIFCOPY : False

In contrast, slicing the full 8-column output of unpackbits is in a sense a memory leak because the discarded columns will stay in memory as long as the slice lives.
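
A minimal sketch of how this can be checked, assuming the same a as in the question: a plain slice keeps a reference to the array it was cut from in its base attribute, whereas a copy does not. The variable names are chosen only for illustration.

```python
import numpy as np

a = np.array([[0], [1], [2], [3]], dtype=np.uint8)

sliced = np.unpackbits(a, axis=1)[:, -2:]          # view into the 8-column array
copied = np.unpackbits(a, axis=1)[:, -2:].copy()   # independent 2-column array
counted = np.unpackbits(a << 6, axis=1, count=2)   # built with 2 columns directly

# A view holds on to the array it was sliced from.
print(sliced.base is not None)   # True
# A copy owns its data, so the 8-column array can be freed.
print(copied.base is None)       # True
# All three hold the same bits.
print(np.array_equal(sliced, copied) and np.array_equal(copied, counted))  # True
```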

Upvotes: 1

Warren Weckesser

Reputation: 114921

For such a small number of bits, you can use a lookup table.

For example, here bits2 is an array with shape (4, 2) that holds the bits of the integers 0, 1, 2, and 3. Index bits2 with the values from a to get the bits:

In [43]: bits2 = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

In [44]: a = np.array([[0], [1], [2], [3]], dtype=np.uint8)

In [45]: bits2[a[:, 0]]
Out[45]: 
array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])

This works fine for 3 or 4 bits, too:

In [46]: bits4 = np.array([[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 1, 1],
    ...:                   [0, 1, 0, 0], [0, 1, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1],
    ...:                   [1, 0, 0, 0], [1, 0, 0, 1], [1, 0, 1, 0], [1, 0, 1, 1],
    ...:                   [1, 1, 0, 0], [1, 1, 0, 1], [1, 1, 1, 0], [1, 1, 1, 1]])

In [47]: bits4
Out[47]: 
array([[0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0],
       [0, 0, 1, 1],
       [0, 1, 0, 0],
       [0, 1, 0, 1],
       [0, 1, 1, 0],
       [0, 1, 1, 1],
       [1, 0, 0, 0],
       [1, 0, 0, 1],
       [1, 0, 1, 0],
       [1, 0, 1, 1],
       [1, 1, 0, 0],
       [1, 1, 0, 1],
       [1, 1, 1, 0],
       [1, 1, 1, 1]])

In [48]: x = np.array([0, 1, 5, 14, 9, 8, 15])

In [49]: bits4[x]
Out[49]: 
array([[0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 1, 0, 1],
       [1, 1, 1, 0],
       [1, 0, 0, 1],
       [1, 0, 0, 0],
       [1, 1, 1, 1]])
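
For larger tables, typing the rows by hand gets tedious. As a sketch under stated assumptions, the table could instead be generated with shifts and a mask; make_bit_table is a hypothetical helper name, not something NumPy provides.

```python
import numpy as np


def make_bit_table(n):
    """Row i holds the n binary digits of i, most significant bit first."""
    values = np.arange(2 ** n)[:, None]   # column of 0 .. 2**n - 1
    shifts = np.arange(n - 1, -1, -1)     # n-1, ..., 1, 0
    # Broadcasting gives one column per bit position; & 1 isolates that bit.
    return ((values >> shifts) & 1).astype(np.uint8)


bits3 = make_bit_table(3)
x = np.array([0, 5, 7])
print(bits3[x])
# [[0 0 0]
#  [1 0 1]
#  [1 1 1]]
```

The table has 2 ** n rows, so this stays practical only for a small number of bits.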

Upvotes: 0

Georgina Skibinski

Reputation: 13397

You can truncate b to keep only the columns from the first one that contains a 1 onward:

b = b[:, np.flatnonzero(b.max(axis=0))[0]:]
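
A minimal end-to-end sketch of this truncation, assuming the a from the question. The fallback for an all-zero input (keeping a single column) is an assumption added here; without it, indexing the first nonzero column fails when no column contains a 1.

```python
import numpy as np

a = np.array([[0], [1], [2], [3]], dtype=np.uint8)
b = np.unpackbits(a, axis=1)

# Indices of the columns that contain at least one 1.
used = np.flatnonzero(b.any(axis=0))
# Assumed fallback: keep only the last column if every value is 0.
first = used[0] if used.size else b.shape[1] - 1
# Copy so the result does not keep the 8-column array alive.
trimmed = b[:, first:].copy()

print(trimmed)
# [[0 0]
#  [0 1]
#  [1 0]
#  [1 1]]
```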

Upvotes: 0
